git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 00/18] make test "linting" more comprehensive
@ 2022-09-01  0:29 Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 01/18] t: add skeleton chainlint.pl Eric Sunshine via GitGitGadget
                   ` (18 more replies)
  0 siblings, 19 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine

A while back, Peff successfully nerd-sniped[1] me into tackling a
long-brewing idea I had about (possibly) improving "chainlint" performance
by linting all tests in all scripts with a single command invocation instead
of running "sed" 26800+ times (once for each test). The new linter
introduced by this series can check all test definitions in the entire
project in a single invocation, and each test definition is checked only
once no matter how many times the test is actually run (unlike chainlint.sed
which will check a test repeatedly if, for instance, the test is run in a
loop). Moreover, all test definitions in the project are "linted" even if
some of those tests would not run on a particular platform or under a
certain configuration (unlike chainlint.sed which only lints tests which
actually run).

The new linter is a good deal smarter than chainlint.sed and understands not
just shell syntax but also some semantics of test construction, unlike
chainlint.sed which is merely heuristics-based. For instance, the new linter
recognizes cases when a broken &&-chain is legitimate, such as when "$?" is
handled explicitly or when a failure is signaled directly with "false", in
which case the &&-chain leading up to the "false" is immaterial, as well as
other cases. Unlike chainlint.sed, it recognizes that a semicolon after the
last command in a compound statement is harmless, thus won't interpret the
semicolon as breaking the &&-chain.

The new linter also provides considerably better coverage for broken
&&-chains. The "magic exit code 117" &&-chain checker built into test-lib.sh
only works for top-level command invocations; it doesn't work within "{...}"
groups, "(...)" subshells, "$(...)" substitutions, or within bodies of
compound statements, such as "if", "for", "while", "case", etc.
chainlint.sed partly fills the gap by catching broken &&-chains in "(...)"
subshells one level deep, but bugs can still lurk behind broken &&-chains in
the other cases. The new linter catches broken &&-chains within all those
constructs to any depth.

Another important improvement is that the new linter understands that shell
loops do not terminate automatically when a command in the loop body fails,
and that the condition needs to be handled explicitly by the test author by
using "|| return 1" (or "|| exit 1" in a subshell) to signal failure.
Consequently, the new linter will complain when a loop is lacking "|| return
1" (or "|| exit 1").

Finally, unlike chainlint.sed which (not surprisingly) is implemented in
"sed", the new linter is written in Perl, thus should be more accessible to
a wider audience, and is structured as a traditional top-down parser which
makes it much easier to reason about.

The new linter could eventually subsume other linting tasks such as
check-nonportable-shell.pl (which itself takes a couple seconds to run on my
machine), though it probably should be renamed to something other than
"chainlint" since it is no longer targeted only at spotting &&-chain breaks,
but that can wait for another day.

Ævar offered some sensible comments[2,3] about optimizing the Makefile rules
related to chainlint, but those optimizations are not tackled here for a few
reasons: (1) this series is already quite long, (2) I'd like to keep the
series focused on its primary goal of installing a new and improved linter,
(3) these patches do not make the Makefile situation any worse[4], and (4)
those optimizations can easily be done atop this series[5].

Junio: This series is nominally atop es/t4301-sed-portability-fix which is
in "next", and es/fix-chained-tests, es/test-chain-lint, and es/chainlint,
all of which are already in "master".

Dscho: This series conflicts with some patches carried only by the Git for
Windows project; the resolutions are obvious and simple. The new linter also
identifies some problems in tests carried only by the Git for Windows
project.

[1] https://lore.kernel.org/git/YJzGcZpZ+E9R0gYd@coredump.intra.peff.net/
[2]
https://lore.kernel.org/git/RFC-patch-1.1-bb3f1577829-20211213T095456Z-avarab@gmail.com/
[3] https://lore.kernel.org/git/211213.86tufc8oop.gmgdl@evledraar.gmail.com/
[4]
https://lore.kernel.org/git/CAPig+cSFtpt6ExbVDbcx3tZodrKFuM-r2GMW4TQ2tJmLvHBFtQ@mail.gmail.com/
[5] https://lore.kernel.org/git/211214.86tufbbbu3.gmgdl@evledraar.gmail.com/

Eric Sunshine (18):
  t: add skeleton chainlint.pl
  chainlint.pl: add POSIX shell lexical analyzer
  chainlint.pl: add POSIX shell parser
  chainlint.pl: add parser to validate tests
  chainlint.pl: add parser to identify test definitions
  chainlint.pl: validate test scripts in parallel
  chainlint.pl: don't require `return|exit|continue` to end with `&&`
  t/Makefile: apply chainlint.pl to existing self-tests
  chainlint.pl: don't require `&` background command to end with `&&`
  chainlint.pl: don't flag broken &&-chain if `$?` handled explicitly
  chainlint.pl: don't flag broken &&-chain if failure indicated
    explicitly
  chainlint.pl: complain about loops lacking explicit failure handling
  chainlint.pl: allow `|| echo` to signal failure upstream of a pipe
  t/chainlint: add more chainlint.pl self-tests
  test-lib: retire "lint harder" optimization hack
  test-lib: replace chainlint.sed with chainlint.pl
  t/Makefile: teach `make test` and `make prove` to run chainlint.pl
  t: retire unused chainlint.sed

 contrib/buildsystems/CMakeLists.txt           |   2 +-
 t/Makefile                                    |  49 +-
 t/README                                      |   5 -
 t/chainlint.pl                                | 730 ++++++++++++++++++
 t/chainlint.sed                               | 399 ----------
 t/chainlint/blank-line-before-esac.expect     |  18 +
 t/chainlint/blank-line-before-esac.test       |  19 +
 t/chainlint/block.expect                      |  15 +-
 t/chainlint/block.test                        |  15 +-
 t/chainlint/chain-break-background.expect     |   9 +
 t/chainlint/chain-break-background.test       |  10 +
 t/chainlint/chain-break-continue.expect       |  12 +
 t/chainlint/chain-break-continue.test         |  13 +
 t/chainlint/chain-break-false.expect          |   9 +
 t/chainlint/chain-break-false.test            |  10 +
 t/chainlint/chain-break-return-exit.expect    |  19 +
 t/chainlint/chain-break-return-exit.test      |  23 +
 t/chainlint/chain-break-status.expect         |   9 +
 t/chainlint/chain-break-status.test           |  11 +
 t/chainlint/chained-block.expect              |   9 +
 t/chainlint/chained-block.test                |  11 +
 t/chainlint/chained-subshell.expect           |  10 +
 t/chainlint/chained-subshell.test             |  13 +
 .../command-substitution-subsubshell.expect   |   2 +
 .../command-substitution-subsubshell.test     |   3 +
 t/chainlint/complex-if-in-cuddled-loop.expect |   2 +-
 t/chainlint/double-here-doc.expect            |   2 +
 t/chainlint/double-here-doc.test              |  12 +
 t/chainlint/dqstring-line-splice.expect       |   3 +
 t/chainlint/dqstring-line-splice.test         |   7 +
 t/chainlint/dqstring-no-interpolate.expect    |  11 +
 t/chainlint/dqstring-no-interpolate.test      |  15 +
 t/chainlint/empty-here-doc.expect             |   3 +
 t/chainlint/empty-here-doc.test               |   5 +
 t/chainlint/exclamation.expect                |   4 +
 t/chainlint/exclamation.test                  |   8 +
 t/chainlint/for-loop-abbreviated.expect       |   5 +
 t/chainlint/for-loop-abbreviated.test         |   6 +
 t/chainlint/for-loop.expect                   |   4 +-
 t/chainlint/function.expect                   |  11 +
 t/chainlint/function.test                     |  13 +
 t/chainlint/here-doc-indent-operator.expect   |   5 +
 t/chainlint/here-doc-indent-operator.test     |  13 +
 t/chainlint/here-doc-multi-line-string.expect |   3 +-
 t/chainlint/if-condition-split.expect         |   7 +
 t/chainlint/if-condition-split.test           |   8 +
 t/chainlint/if-in-loop.expect                 |   2 +-
 t/chainlint/if-in-loop.test                   |   2 +-
 t/chainlint/loop-detect-failure.expect        |  15 +
 t/chainlint/loop-detect-failure.test          |  17 +
 t/chainlint/loop-detect-status.expect         |  18 +
 t/chainlint/loop-detect-status.test           |  19 +
 t/chainlint/loop-in-if.expect                 |   2 +-
 t/chainlint/loop-upstream-pipe.expect         |  10 +
 t/chainlint/loop-upstream-pipe.test           |  11 +
 t/chainlint/multi-line-string.expect          |  11 +-
 t/chainlint/nested-loop-detect-failure.expect |  31 +
 t/chainlint/nested-loop-detect-failure.test   |  35 +
 t/chainlint/nested-subshell.expect            |   2 +-
 t/chainlint/one-liner-for-loop.expect         |   9 +
 t/chainlint/one-liner-for-loop.test           |  10 +
 t/chainlint/return-loop.expect                |   5 +
 t/chainlint/return-loop.test                  |   6 +
 t/chainlint/semicolon.expect                  |   2 +-
 t/chainlint/sqstring-in-sqstring.expect       |   4 +
 t/chainlint/sqstring-in-sqstring.test         |   5 +
 t/chainlint/t7900-subtree.expect              |  13 +-
 t/chainlint/token-pasting.expect              |  27 +
 t/chainlint/token-pasting.test                |  32 +
 t/chainlint/while-loop.expect                 |   4 +-
 t/t0027-auto-crlf.sh                          |   7 +-
 t/t3070-wildmatch.sh                          |   5 -
 t/test-lib.sh                                 |  12 +-
 73 files changed, 1439 insertions(+), 449 deletions(-)
 create mode 100755 t/chainlint.pl
 delete mode 100644 t/chainlint.sed
 create mode 100644 t/chainlint/blank-line-before-esac.expect
 create mode 100644 t/chainlint/blank-line-before-esac.test
 create mode 100644 t/chainlint/chain-break-background.expect
 create mode 100644 t/chainlint/chain-break-background.test
 create mode 100644 t/chainlint/chain-break-continue.expect
 create mode 100644 t/chainlint/chain-break-continue.test
 create mode 100644 t/chainlint/chain-break-false.expect
 create mode 100644 t/chainlint/chain-break-false.test
 create mode 100644 t/chainlint/chain-break-return-exit.expect
 create mode 100644 t/chainlint/chain-break-return-exit.test
 create mode 100644 t/chainlint/chain-break-status.expect
 create mode 100644 t/chainlint/chain-break-status.test
 create mode 100644 t/chainlint/chained-block.expect
 create mode 100644 t/chainlint/chained-block.test
 create mode 100644 t/chainlint/chained-subshell.expect
 create mode 100644 t/chainlint/chained-subshell.test
 create mode 100644 t/chainlint/command-substitution-subsubshell.expect
 create mode 100644 t/chainlint/command-substitution-subsubshell.test
 create mode 100644 t/chainlint/double-here-doc.expect
 create mode 100644 t/chainlint/double-here-doc.test
 create mode 100644 t/chainlint/dqstring-line-splice.expect
 create mode 100644 t/chainlint/dqstring-line-splice.test
 create mode 100644 t/chainlint/dqstring-no-interpolate.expect
 create mode 100644 t/chainlint/dqstring-no-interpolate.test
 create mode 100644 t/chainlint/empty-here-doc.expect
 create mode 100644 t/chainlint/empty-here-doc.test
 create mode 100644 t/chainlint/exclamation.expect
 create mode 100644 t/chainlint/exclamation.test
 create mode 100644 t/chainlint/for-loop-abbreviated.expect
 create mode 100644 t/chainlint/for-loop-abbreviated.test
 create mode 100644 t/chainlint/function.expect
 create mode 100644 t/chainlint/function.test
 create mode 100644 t/chainlint/here-doc-indent-operator.expect
 create mode 100644 t/chainlint/here-doc-indent-operator.test
 create mode 100644 t/chainlint/if-condition-split.expect
 create mode 100644 t/chainlint/if-condition-split.test
 create mode 100644 t/chainlint/loop-detect-failure.expect
 create mode 100644 t/chainlint/loop-detect-failure.test
 create mode 100644 t/chainlint/loop-detect-status.expect
 create mode 100644 t/chainlint/loop-detect-status.test
 create mode 100644 t/chainlint/loop-upstream-pipe.expect
 create mode 100644 t/chainlint/loop-upstream-pipe.test
 create mode 100644 t/chainlint/nested-loop-detect-failure.expect
 create mode 100644 t/chainlint/nested-loop-detect-failure.test
 create mode 100644 t/chainlint/one-liner-for-loop.expect
 create mode 100644 t/chainlint/one-liner-for-loop.test
 create mode 100644 t/chainlint/return-loop.expect
 create mode 100644 t/chainlint/return-loop.test
 create mode 100644 t/chainlint/sqstring-in-sqstring.expect
 create mode 100644 t/chainlint/sqstring-in-sqstring.test
 create mode 100644 t/chainlint/token-pasting.expect
 create mode 100644 t/chainlint/token-pasting.test


base-commit: d42b38dfb5edf1a7fddd9542d722f91038407819
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1322%2Fsunshineco%2Fchainlintperl-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1322/sunshineco/chainlintperl-v1
Pull-Request: https://github.com/git/git/pull/1322
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 01/18] t: add skeleton chainlint.pl
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01 12:27   ` Ævar Arnfjörð Bjarmason
  2022-09-01  0:29 ` [PATCH 02/18] chainlint.pl: add POSIX shell lexical analyzer Eric Sunshine via GitGitGadget
                   ` (17 subsequent siblings)
  18 siblings, 1 reply; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

Although chainlint.sed usefully identifies broken &&-chains in tests, it
has several shortcomings which include:

  * only detects &&-chain breakage in subshells (one-level deep)

  * does not check for broken top-level &&-chains; that task is left to
    the "magic exit code 117" checker built into test-lib.sh, however,
    that detection does not extend to `{...}` blocks, `$(...)`
    expressions, or compound statements such as `if...fi`,
    `while...done`, `case...esac`

  * uses heuristics, which makes it (potentially) fallible and difficult
    to tweak to handle additional real-world cases

  * written in `sed` and employs advanced `sed` operators which are
    probably not well-known to many programmers, thus the pool of people
    who can maintain it is likely small

  * manually simulates recursion into subshells which makes it much more
    difficult to reason about than, say, a traditional top-down parser

  * checks each test as the test is run, which can get expensive for
    tests which are run repeatedly by functions or loops since their
    bodies will be checked over and over (tens or hundreds of times)
    unnecessarily

To address these shortcomings, begin implementing a more functional and
precise test linter which understands shell syntax and semantics rather
than employing heuristics, thus is able to recognize structural problems
with tests beyond broken &&-chains.

The new linter is written in Perl, thus should be more accessible to a
wider audience, and is structured as a traditional top-down parser which
makes it much easier to reason about, and allows it to inspect compound
statements within test bodies to any depth.

Furthermore, it can check all test definitions in the entire project in
a single invocation rather than having to be invoked once per test, and
each test definition is checked only once no matter how many times the
test is actually run.

At this stage, the new linter is just a skeleton containing boilerplate
which handles command-line options, collects and reports statistics, and
feeds its arguments -- paths of test scripts -- to a (presently)
do-nothing script parser for validation. Subsequent changes will flesh
out the functionality.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.pl | 115 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 115 insertions(+)
 create mode 100755 t/chainlint.pl

diff --git a/t/chainlint.pl b/t/chainlint.pl
new file mode 100755
index 00000000000..e8ab95c7858
--- /dev/null
+++ b/t/chainlint.pl
@@ -0,0 +1,115 @@
+#!/usr/bin/env perl
+#
+# Copyright (c) 2021-2022 Eric Sunshine <sunshine@sunshineco.com>
+#
+# This tool scans shell scripts for test definitions and checks those tests for
+# problems, such as broken &&-chains, which might hide bugs in the tests
+# themselves or in behaviors being exercised by the tests.
+#
+# Input arguments are pathnames of shell scripts containing test definitions,
+# or globs referencing a collection of scripts. For each problem discovered,
+# the pathname of the script containing the test is printed along with the test
+# name and the test body with a `?!FOO?!` annotation at the location of each
+# detected problem, where "FOO" is a tag such as "AMP" which indicates a broken
+# &&-chain. Returns zero if no problems are discovered, otherwise non-zero.
+
+use warnings;
+use strict;
+use File::Glob;
+use Getopt::Long;
+
+my $show_stats;
+my $emit_all;
+
+package ScriptParser;
+
+sub new {
+	my $class = shift @_;
+	my $self = bless {} => $class;
+	$self->{output} = [];
+	$self->{ntests} = 0;
+	return $self;
+}
+
+sub parse_cmd {
+	return undef;
+}
+
+# main contains high-level functionality for processing command-line switches,
+# feeding input test scripts to ScriptParser, and reporting results.
+package main;
+
+my $getnow = sub { return time(); };
+my $interval = sub { return time() - shift; };
+if (eval {require Time::HiRes; Time::HiRes->import(); 1;}) {
+	$getnow = sub { return [Time::HiRes::gettimeofday()]; };
+	$interval = sub { return Time::HiRes::tv_interval(shift); };
+}
+
+sub show_stats {
+	my ($start_time, $stats) = @_;
+	my $walltime = $interval->($start_time);
+	my ($usertime) = times();
+	my ($total_workers, $total_scripts, $total_tests, $total_errs) = (0, 0, 0, 0);
+	for (@$stats) {
+		my ($worker, $nscripts, $ntests, $nerrs) = @$_;
+		print(STDERR "worker $worker: $nscripts scripts, $ntests tests, $nerrs errors\n");
+		$total_workers++;
+		$total_scripts += $nscripts;
+		$total_tests += $ntests;
+		$total_errs += $nerrs;
+	}
+	printf(STDERR "total: %d workers, %d scripts, %d tests, %d errors, %.2fs/%.2fs (wall/user)\n", $total_workers, $total_scripts, $total_tests, $total_errs, $walltime, $usertime);
+}
+
+sub check_script {
+	my ($id, $next_script, $emit) = @_;
+	my ($nscripts, $ntests, $nerrs) = (0, 0, 0);
+	while (my $path = $next_script->()) {
+		$nscripts++;
+		my $fh;
+		unless (open($fh, "<", $path)) {
+			$emit->("?!ERR?! $path: $!\n");
+			next;
+		}
+		my $s = do { local $/; <$fh> };
+		close($fh);
+		my $parser = ScriptParser->new(\$s);
+		1 while $parser->parse_cmd();
+		if (@{$parser->{output}}) {
+			my $s = join('', @{$parser->{output}});
+			$emit->("# chainlint: $path\n" . $s);
+			$nerrs += () = $s =~ /\?![^?]+\?!/g;
+		}
+		$ntests += $parser->{ntests};
+	}
+	return [$id, $nscripts, $ntests, $nerrs];
+}
+
+sub exit_code {
+	my $stats = shift @_;
+	for (@$stats) {
+		my ($worker, $nscripts, $ntests, $nerrs) = @$_;
+		return 1 if $nerrs;
+	}
+	return 0;
+}
+
+Getopt::Long::Configure(qw{bundling});
+GetOptions(
+	"emit-all!" => \$emit_all,
+	"stats|show-stats!" => \$show_stats) or die("option error\n");
+
+my $start_time = $getnow->();
+my @stats;
+
+my @scripts;
+push(@scripts, File::Glob::bsd_glob($_)) for (@ARGV);
+unless (@scripts) {
+	show_stats($start_time, \@stats) if $show_stats;
+	exit;
+}
+
+push(@stats, check_script(1, sub { shift(@scripts); }, sub { print(@_); }));
+show_stats($start_time, \@stats) if $show_stats;
+exit(exit_code(\@stats));
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 02/18] chainlint.pl: add POSIX shell lexical analyzer
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 01/18] t: add skeleton chainlint.pl Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01 12:32   ` Ævar Arnfjörð Bjarmason
  2022-09-01  0:29 ` [PATCH 03/18] chainlint.pl: add POSIX shell parser Eric Sunshine via GitGitGadget
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

Begin fleshing out chainlint.pl by adding a lexical analyzer for the
POSIX shell command language. The sole entry point Lexer::scan_token()
returns the next token from the input. It will be called by the upcoming
shell language parser.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.pl | 177 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 177 insertions(+)

diff --git a/t/chainlint.pl b/t/chainlint.pl
index e8ab95c7858..81ffbf28bf3 100755
--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -21,6 +21,183 @@ use Getopt::Long;
 my $show_stats;
 my $emit_all;
 
+# Lexer tokenizes POSIX shell scripts. It is roughly modeled after section 2.3
+# "Token Recognition" of POSIX chapter 2 "Shell Command Language". Although
+# similar to lexical analyzers for other languages, this one differs in a few
+# substantial ways due to quirks of the shell command language.
+#
+# For instance, in many languages, newline is just whitespace like space or
+# TAB, but in shell a newline is a command separator, thus a distinct lexical
+# token. A newline is significant and returned as a distinct token even at the
+# end of a shell comment.
+#
+# In other languages, `1+2` would typically be scanned as three tokens
+# (`1`, `+`, and `2`), but in shell it is a single token. However, the similar
+# `1 + 2`, which embeds whitepace, is scanned as three token in shell, as well.
+# In shell, several characters with special meaning lose that meaning when not
+# surrounded by whitespace. For instance, the negation operator `!` is special
+# when standing alone surrounded by whitespace; whereas in `foo!uucp` it is
+# just a plain character in the longer token "foo!uucp". In many other
+# languages, `"string"/foo:'string'` might be scanned as five tokens ("string",
+# `/`, `foo`, `:`, and 'string'), but in shell, it is just a single token.
+#
+# The lexical analyzer for the shell command language is also somewhat unusual
+# in that it recursively invokes the parser to handle the body of `$(...)`
+# expressions which can contain arbitrary shell code. Such expressions may be
+# encountered both inside and outside of double-quoted strings.
+#
+# The lexical analyzer is responsible for consuming shell here-doc bodies which
+# extend from the line following a `<<TAG` operator until a line consisting
+# solely of `TAG`. Here-doc consumption begins when a newline is encountered.
+# It is legal for multiple here-doc `<<TAG` operators to be present on a single
+# line, in which case their bodies must be present one following the next, and
+# are consumed in the (left-to-right) order the `<<TAG` operators appear on the
+# line. A special complication is that the bodies of all here-docs must be
+# consumed when the newline is encountered even if the parse context depth has
+# changed. For instance, in `cat <<A && x=$(cat <<B &&\n`, bodies of here-docs
+# "A" and "B" must be consumed even though "A" was introduced outside the
+# recursive parse context in which "B" was introduced and in which the newline
+# is encountered.
+package Lexer;
+
+sub new {
+	my ($class, $parser, $s) = @_;
+	bless {
+		parser => $parser,
+		buff => $s,
+		heretags => []
+	} => $class;
+}
+
+sub scan_heredoc_tag {
+	my $self = shift @_;
+	${$self->{buff}} =~ /\G(-?)/gc;
+	my $indented = $1;
+	my $tag = $self->scan_token();
+	$tag =~ s/['"\\]//g;
+	push(@{$self->{heretags}}, $indented ? "\t$tag" : "$tag");
+	return "<<$indented$tag";
+}
+
+sub scan_op {
+	my ($self, $c) = @_;
+	my $b = $self->{buff};
+	return $c unless $$b =~ /\G(.)/sgc;
+	my $cc = $c . $1;
+	return scan_heredoc_tag($self) if $cc eq '<<';
+	return $cc if $cc =~ /^(?:&&|\|\||>>|;;|<&|>&|<>|>\|)$/;
+	pos($$b)--;
+	return $c;
+}
+
+sub scan_sqstring {
+	my $self = shift @_;
+	${$self->{buff}} =~ /\G([^']*'|.*\z)/sgc;
+	return "'" . $1;
+}
+
+sub scan_dqstring {
+	my $self = shift @_;
+	my $b = $self->{buff};
+	my $s = '"';
+	while (1) {
+		# slurp up non-special characters
+		$s .= $1 if $$b =~ /\G([^"\$\\]+)/gc;
+		# handle special characters
+		last unless $$b =~ /\G(.)/sgc;
+		my $c = $1;
+		$s .= '"', last if $c eq '"';
+		$s .= '$' . $self->scan_dollar(), next if $c eq '$';
+		if ($c eq '\\') {
+			$s .= '\\', last unless $$b =~ /\G(.)/sgc;
+			$c = $1;
+			next if $c eq "\n"; # line splice
+			# backslash escapes only $, `, ", \ in dq-string
+			$s .= '\\' unless $c =~ /^[\$`"\\]$/;
+			$s .= $c;
+			next;
+		}
+		die("internal error scanning dq-string '$c'\n");
+	}
+	return $s;
+}
+
+sub scan_balanced {
+	my ($self, $c1, $c2) = @_;
+	my $b = $self->{buff};
+	my $depth = 1;
+	my $s = $c1;
+	while ($$b =~ /\G([^\Q$c1$c2\E]*(?:[\Q$c1$c2\E]|\z))/gc) {
+		$s .= $1;
+		$depth++, next if $s =~ /\Q$c1\E$/;
+		$depth--;
+		last if $depth == 0;
+	}
+	return $s;
+}
+
+sub scan_subst {
+	my $self = shift @_;
+	my @tokens = $self->{parser}->parse(qr/^\)$/);
+	$self->{parser}->next_token(); # closing ")"
+	return @tokens;
+}
+
+sub scan_dollar {
+	my $self = shift @_;
+	my $b = $self->{buff};
+	return $self->scan_balanced('(', ')') if $$b =~ /\G\((?=\()/gc; # $((...))
+	return '(' . join(' ', $self->scan_subst()) . ')' if $$b =~ /\G\(/gc; # $(...)
+	return $self->scan_balanced('{', '}') if $$b =~ /\G\{/gc; # ${...}
+	return $1 if $$b =~ /\G(\w+)/gc; # $var
+	return $1 if $$b =~ /\G([@*#?$!0-9-])/gc; # $*, $1, $$, etc.
+	return '';
+}
+
+sub swallow_heredocs {
+	my $self = shift @_;
+	my $b = $self->{buff};
+	my $tags = $self->{heretags};
+	while (my $tag = shift @$tags) {
+		my $indent = $tag =~ s/^\t// ? '\\s*' : '';
+		$$b =~ /(?:\G|\n)$indent\Q$tag\E(?:\n|\z)/gc;
+	}
+}
+
+sub scan_token {
+	my $self = shift @_;
+	my $b = $self->{buff};
+	my $token = '';
+RESTART:
+	$$b =~ /\G[ \t]+/gc; # skip whitespace (but not newline)
+	return "\n" if $$b =~ /\G#[^\n]*(?:\n|\z)/gc; # comment
+	while (1) {
+		# slurp up non-special characters
+		$token .= $1 if $$b =~ /\G([^\\;&|<>(){}'"\$\s]+)/gc;
+		# handle special characters
+		last unless $$b =~ /\G(.)/sgc;
+		my $c = $1;
+		last if $c =~ /^[ \t]$/; # whitespace ends token
+		pos($$b)--, last if length($token) && $c =~ /^[;&|<>(){}\n]$/;
+		$token .= $self->scan_sqstring(), next if $c eq "'";
+		$token .= $self->scan_dqstring(), next if $c eq '"';
+		$token .= $c . $self->scan_dollar(), next if $c eq '$';
+		$self->swallow_heredocs(), $token = $c, last if $c eq "\n";
+		$token = $self->scan_op($c), last if $c =~ /^[;&|<>]$/;
+		$token = $c, last if $c =~ /^[(){}]$/;
+		if ($c eq '\\') {
+			$token .= '\\', last unless $$b =~ /\G(.)/sgc;
+			$c = $1;
+			next if $c eq "\n" && length($token); # line splice
+			goto RESTART if $c eq "\n"; # line splice
+			$token .= '\\' . $c;
+			next;
+		}
+		die("internal error scanning character '$c'\n");
+	}
+	return length($token) ? $token : undef;
+}
+
 package ScriptParser;
 
 sub new {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 03/18] chainlint.pl: add POSIX shell parser
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 01/18] t: add skeleton chainlint.pl Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 02/18] chainlint.pl: add POSIX shell lexical analyzer Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 04/18] chainlint.pl: add parser to validate tests Eric Sunshine via GitGitGadget
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

Continue fleshing out chainlint.pl by adding a general purpose recursive
descent parser for the POSIX shell command language. Although never
invoked directly, upcoming parser subclasses will extend its
functionality for specific purposes, such as plucking test definitions
from input scripts and applying domain-specific knowledge to perform
test validation.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.pl | 243 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 243 insertions(+)

diff --git a/t/chainlint.pl b/t/chainlint.pl
index 81ffbf28bf3..cdf136896be 100755
--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -198,6 +198,249 @@ RESTART:
 	return length($token) ? $token : undef;
 }
 
+# ShellParser parses POSIX shell scripts (with minor extensions for Bash). It
+# is a recursive descent parser very roughly modeled after section 2.10 "Shell
+# Grammar" of POSIX chapter 2 "Shell Command Language".
+package ShellParser;
+
+sub new {
+	my ($class, $s) = @_;
+	my $self = bless {
+		buff => [],
+		stop => [],
+		output => []
+	} => $class;
+	$self->{lexer} = Lexer->new($self, $s);
+	return $self;
+}
+
+sub next_token {
+	my $self = shift @_;
+	return pop(@{$self->{buff}}) if @{$self->{buff}};
+	return $self->{lexer}->scan_token();
+}
+
+sub untoken {
+	my $self = shift @_;
+	push(@{$self->{buff}}, @_);
+}
+
+sub peek {
+	my $self = shift @_;
+	my $token = $self->next_token();
+	return undef unless defined($token);
+	$self->untoken($token);
+	return $token;
+}
+
+sub stop_at {
+	my ($self, $token) = @_;
+	return 1 unless defined($token);
+	my $stop = ${$self->{stop}}[-1] if @{$self->{stop}};
+	return defined($stop) && $token =~ $stop;
+}
+
+sub expect {
+	my ($self, $expect) = @_;
+	my $token = $self->next_token();
+	return $token if defined($token) && $token eq $expect;
+	push(@{$self->{output}}, "?!ERR?! expected '$expect' but found '" . (defined($token) ? $token : "<end-of-input>") . "'\n");
+	$self->untoken($token) if defined($token);
+	return ();
+}
+
+sub optional_newlines {
+	my $self = shift @_;
+	my @tokens;
+	while (my $token = $self->peek()) {
+		last unless $token eq "\n";
+		push(@tokens, $self->next_token());
+	}
+	return @tokens;
+}
+
+sub parse_group {
+	my $self = shift @_;
+	return ($self->parse(qr/^}$/),
+		$self->expect('}'));
+}
+
+sub parse_subshell {
+	my $self = shift @_;
+	return ($self->parse(qr/^\)$/),
+		$self->expect(')'));
+}
+
+sub parse_case_pattern {
+	my $self = shift @_;
+	my @tokens;
+	while (defined(my $token = $self->next_token())) {
+		push(@tokens, $token);
+		last if $token eq ')';
+	}
+	return @tokens;
+}
+
+sub parse_case {
+	my $self = shift @_;
+	my @tokens;
+	push(@tokens,
+	     $self->next_token(), # subject
+	     $self->optional_newlines(),
+	     $self->expect('in'),
+	     $self->optional_newlines());
+	while (1) {
+		my $token = $self->peek();
+		last unless defined($token) && $token ne 'esac';
+		push(@tokens,
+		     $self->parse_case_pattern(),
+		     $self->optional_newlines(),
+		     $self->parse(qr/^(?:;;|esac)$/)); # item body
+		$token = $self->peek();
+		last unless defined($token) && $token ne 'esac';
+		push(@tokens,
+		     $self->expect(';;'),
+		     $self->optional_newlines());
+	}
+	push(@tokens, $self->expect('esac'));
+	return @tokens;
+}
+
+sub parse_for {
+	my $self = shift @_;
+	my @tokens;
+	push(@tokens,
+	     $self->next_token(), # variable
+	     $self->optional_newlines());
+	my $token = $self->peek();
+	if (defined($token) && $token eq 'in') {
+		push(@tokens,
+		     $self->expect('in'),
+		     $self->optional_newlines());
+	}
+	push(@tokens,
+	     $self->parse(qr/^do$/), # items
+	     $self->expect('do'),
+	     $self->optional_newlines(),
+	     $self->parse_loop_body(),
+	     $self->expect('done'));
+	return @tokens;
+}
+
+sub parse_if {
+	my $self = shift @_;
+	my @tokens;
+	while (1) {
+		push(@tokens,
+		     $self->parse(qr/^then$/), # if/elif condition
+		     $self->expect('then'),
+		     $self->optional_newlines(),
+		     $self->parse(qr/^(?:elif|else|fi)$/)); # if/elif body
+		my $token = $self->peek();
+		last unless defined($token) && $token eq 'elif';
+		push(@tokens, $self->expect('elif'));
+	}
+	my $token = $self->peek();
+	if (defined($token) && $token eq 'else') {
+		push(@tokens,
+		     $self->expect('else'),
+		     $self->optional_newlines(),
+		     $self->parse(qr/^fi$/)); # else body
+	}
+	push(@tokens, $self->expect('fi'));
+	return @tokens;
+}
+
+sub parse_loop_body {
+	my $self = shift @_;
+	return $self->parse(qr/^done$/);
+}
+
+sub parse_loop {
+	my $self = shift @_;
+	return ($self->parse(qr/^do$/), # condition
+		$self->expect('do'),
+		$self->optional_newlines(),
+		$self->parse_loop_body(),
+		$self->expect('done'));
+}
+
+sub parse_func {
+	my $self = shift @_;
+	return ($self->expect('('),
+		$self->expect(')'),
+		$self->optional_newlines(),
+		$self->parse_cmd()); # body
+}
+
+sub parse_bash_array_assignment {
+	my $self = shift @_;
+	my @tokens = $self->expect('(');
+	while (defined(my $token = $self->next_token())) {
+		push(@tokens, $token);
+		last if $token eq ')';
+	}
+	return @tokens;
+}
+
+my %compound = (
+	'{' => \&parse_group,
+	'(' => \&parse_subshell,
+	'case' => \&parse_case,
+	'for' => \&parse_for,
+	'if' => \&parse_if,
+	'until' => \&parse_loop,
+	'while' => \&parse_loop);
+
+sub parse_cmd {
+	my $self = shift @_;
+	my $cmd = $self->next_token();
+	return () unless defined($cmd);
+	return $cmd if $cmd eq "\n";
+
+	my $token;
+	my @tokens = $cmd;
+	if ($cmd eq '!') {
+		push(@tokens, $self->parse_cmd());
+		return @tokens;
+	} elsif (my $f = $compound{$cmd}) {
+		push(@tokens, $self->$f());
+	} elsif (defined($token = $self->peek()) && $token eq '(') {
+		if ($cmd !~ /\w=$/) {
+			push(@tokens, $self->parse_func());
+			return @tokens;
+		}
+		$tokens[-1] .= join(' ', $self->parse_bash_array_assignment());
+	}
+
+	while (defined(my $token = $self->next_token())) {
+		$self->untoken($token), last if $self->stop_at($token);
+		push(@tokens, $token);
+		last if $token =~ /^(?:[;&\n|]|&&|\|\|)$/;
+	}
+	push(@tokens, $self->next_token()) if $tokens[-1] ne "\n" && defined($token = $self->peek()) && $token eq "\n";
+	return @tokens;
+}
+
+sub accumulate {
+	my ($self, $tokens, $cmd) = @_;
+	push(@$tokens, @$cmd);
+}
+
+sub parse {
+	my ($self, $stop) = @_;
+	push(@{$self->{stop}}, $stop);
+	goto DONE if $self->stop_at($self->peek());
+	my @tokens;
+	while (my @cmd = $self->parse_cmd()) {
+		$self->accumulate(\@tokens, \@cmd);
+		last if $self->stop_at($self->peek());
+	}
+DONE:
+	pop(@{$self->{stop}});
+	return @tokens;
+}
+
 package ScriptParser;
 
 sub new {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 04/18] chainlint.pl: add parser to validate tests
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (2 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 03/18] chainlint.pl: add POSIX shell parser Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 05/18] chainlint.pl: add parser to identify test definitions Eric Sunshine via GitGitGadget
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

Continue fleshing out chainlint.pl by adding TestParser, a parser with
special knowledge about how Git tests should be written; for instance,
it knows that commands within a test body should be chained together
with `&&`. An upcoming parser which plucks test definitions from test
scripts will invoke TestParser for each test body it encounters.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.pl | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/t/chainlint.pl b/t/chainlint.pl
index cdf136896be..ad257106e56 100755
--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -441,6 +441,52 @@ DONE:
 	return @tokens;
 }
 
+# TestParser is a subclass of ShellParser which, beyond parsing shell script
+# code, is also imbued with semantic knowledge of test construction, and checks
+# tests for common problems (such as broken &&-chains) which might hide bugs in
+# the tests themselves or in behaviors being exercised by the tests. As such,
+# TestParser is only called upon to parse test bodies, not the top-level
+# scripts in which the tests are defined.
+package TestParser;
+
+use base 'ShellParser';
+
+sub find_non_nl {
+	my $tokens = shift @_;
+	my $n = shift @_;
+	$n = $#$tokens if !defined($n);
+	$n-- while $n >= 0 && $$tokens[$n] eq "\n";
+	return $n;
+}
+
+sub ends_with {
+	my ($tokens, $needles) = @_;
+	my $n = find_non_nl($tokens);
+	for my $needle (reverse(@$needles)) {
+		return undef if $n < 0;
+		$n = find_non_nl($tokens, $n), next if $needle eq "\n";
+		return undef if $$tokens[$n] !~ $needle;
+		$n--;
+	}
+	return 1;
+}
+
+sub accumulate {
+	my ($self, $tokens, $cmd) = @_;
+	goto DONE unless @$tokens;
+	goto DONE if @$cmd == 1 && $$cmd[0] eq "\n";
+
+	# did previous command end with "&&", "||", "|"?
+	goto DONE if ends_with($tokens, [qr/^(?:&&|\|\||\|)$/]);
+
+	# flag missing "&&" at end of previous command
+	my $n = find_non_nl($tokens);
+	splice(@$tokens, $n + 1, 0, '?!AMP?!') unless $n < 0;
+
+DONE:
+	$self->SUPER::accumulate($tokens, $cmd);
+}
+
 package ScriptParser;
 
 sub new {
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 05/18] chainlint.pl: add parser to identify test definitions
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (3 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 04/18] chainlint.pl: add parser to validate tests Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 06/18] chainlint.pl: validate test scripts in parallel Eric Sunshine via GitGitGadget
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

Finish fleshing out chainlint.pl by adding ScriptParser, a parser which
scans shell scripts for tests defined by test_expect_success() and
test_expect_failure(), plucks the test body from each definition, and
passes it to TestParser for validation. It recognizes test definitions
not only at the top-level of test scripts but also tests synthesized
within compound commands such as loops and function.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.pl | 63 +++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 60 insertions(+), 3 deletions(-)

diff --git a/t/chainlint.pl b/t/chainlint.pl
index ad257106e56..d526723ac00 100755
--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -487,18 +487,75 @@ DONE:
 	$self->SUPER::accumulate($tokens, $cmd);
 }
 
+# ScriptParser is a subclass of ShellParser which identifies individual test
+# definitions within test scripts, and passes each test body through TestParser
+# to identify possible problems. ShellParser detects test definitions not only
+# at the top-level of test scripts but also within compound commands such as
+# loops and function definitions.
 package ScriptParser;
 
+use base 'ShellParser';
+
 sub new {
 	my $class = shift @_;
-	my $self = bless {} => $class;
-	$self->{output} = [];
+	my $self = $class->SUPER::new(@_);
 	$self->{ntests} = 0;
 	return $self;
 }
 
+# extract the raw content of a token, which may be a single string or a
+# composition of multiple strings and non-string character runs; for instance,
+# `"test body"` unwraps to `test body`; `word"a b"42'c d'` to `worda b42c d`
+sub unwrap {
+	my $token = @_ ? shift @_ : $_;
+	# simple case: 'sqstring' or "dqstring"
+	return $token if $token =~ s/^'([^']*)'$/$1/;
+	return $token if $token =~ s/^"([^"]*)"$/$1/;
+
+	# composite case
+	my ($s, $q, $escaped);
+	while (1) {
+		# slurp up non-special characters
+		$s .= $1 if $token =~ /\G([^\\'"]*)/gc;
+		# handle special characters
+		last unless $token =~ /\G(.)/sgc;
+		my $c = $1;
+		$q = undef, next if defined($q) && $c eq $q;
+		$q = $c, next if !defined($q) && $c =~ /^['"]$/;
+		if ($c eq '\\') {
+			last unless $token =~ /\G(.)/sgc;
+			$c = $1;
+			$s .= '\\' if $c eq "\n"; # preserve line splice
+		}
+		$s .= $c;
+	}
+	return $s
+}
+
+sub check_test {
+	my $self = shift @_;
+	my ($title, $body) = map(unwrap, @_);
+	$self->{ntests}++;
+	my $parser = TestParser->new(\$body);
+	my @tokens = $parser->parse();
+	return unless $emit_all || grep(/\?![^?]+\?!/, @tokens);
+	my $checked = join(' ', @tokens);
+	$checked =~ s/^\n//;
+	$checked =~ s/^ //mg;
+	$checked =~ s/ $//mg;
+	$checked .= "\n" unless $checked =~ /\n$/;
+	push(@{$self->{output}}, "# chainlint: $title\n$checked");
+}
+
 sub parse_cmd {
-	return undef;
+	my $self = shift @_;
+	my @tokens = $self->SUPER::parse_cmd();
+	return @tokens unless @tokens && $tokens[0] =~ /^test_expect_(?:success|failure)$/;
+	my $n = $#tokens;
+	$n-- while $n >= 0 && $tokens[$n] =~ /^(?:[;&\n|]|&&|\|\|)$/;
+	$self->check_test($tokens[1], $tokens[2]) if $n == 2; # title body
+	$self->check_test($tokens[2], $tokens[3]) if $n > 2;  # prereq title body
+	return @tokens;
 }
 
 # main contains high-level functionality for processing command-line switches,
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (4 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 05/18] chainlint.pl: add parser to identify test definitions Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01 12:36   ` Ævar Arnfjörð Bjarmason
  2022-09-06 22:35   ` Eric Wong
  2022-09-01  0:29 ` [PATCH 07/18] chainlint.pl: don't require `return|exit|continue` to end with `&&` Eric Sunshine via GitGitGadget
                   ` (12 subsequent siblings)
  18 siblings, 2 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

Although chainlint.pl has undergone a good deal of optimization during
its development -- increasing in speed significantly -- parsing and
validating 1050+ scripts and 16500+ tests via Perl is not exactly
instantaneous. However, perceived performance can be improved by taking
advantage of the fact that there is no interdependence between test
scripts or test definitions, thus parsing and validating can be done in
parallel. The number of available cores is determined automatically but
can be overridden via the --jobs option.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.pl | 50 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/t/chainlint.pl b/t/chainlint.pl
index d526723ac00..898573a9100 100755
--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -15,9 +15,11 @@
 
 use warnings;
 use strict;
+use Config;
 use File::Glob;
 use Getopt::Long;
 
+my $jobs = -1;
 my $show_stats;
 my $emit_all;
 
@@ -569,6 +571,16 @@ if (eval {require Time::HiRes; Time::HiRes->import(); 1;}) {
 	$interval = sub { return Time::HiRes::tv_interval(shift); };
 }
 
+sub ncores {
+	# Windows
+	return $ENV{NUMBER_OF_PROCESSORS} if exists($ENV{NUMBER_OF_PROCESSORS});
+	# Linux / MSYS2 / Cygwin / WSL
+	do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor\s*:/, <>)); } if -r '/proc/cpuinfo';
+	# macOS & BSD
+	return qx/sysctl -n hw.ncpu/ if $^O =~ /(?:^darwin$|bsd)/;
+	return 1;
+}
+
 sub show_stats {
 	my ($start_time, $stats) = @_;
 	my $walltime = $interval->($start_time);
@@ -621,7 +633,9 @@ sub exit_code {
 Getopt::Long::Configure(qw{bundling});
 GetOptions(
 	"emit-all!" => \$emit_all,
+	"jobs|j=i" => \$jobs,
 	"stats|show-stats!" => \$show_stats) or die("option error\n");
+$jobs = ncores() if $jobs < 1;
 
 my $start_time = $getnow->();
 my @stats;
@@ -633,6 +647,40 @@ unless (@scripts) {
 	exit;
 }
 
-push(@stats, check_script(1, sub { shift(@scripts); }, sub { print(@_); }));
+unless ($Config{useithreads} && eval {
+	require threads; threads->import();
+	require Thread::Queue; Thread::Queue->import();
+	1;
+	}) {
+	push(@stats, check_script(1, sub { shift(@scripts); }, sub { print(@_); }));
+	show_stats($start_time, \@stats) if $show_stats;
+	exit(exit_code(\@stats));
+}
+
+my $script_queue = Thread::Queue->new();
+my $output_queue = Thread::Queue->new();
+
+sub next_script { return $script_queue->dequeue(); }
+sub emit { $output_queue->enqueue(@_); }
+
+sub monitor {
+	while (my $s = $output_queue->dequeue()) {
+		print($s);
+	}
+}
+
+my $mon = threads->create({'context' => 'void'}, \&monitor);
+threads->create({'context' => 'list'}, \&check_script, $_, \&next_script, \&emit) for 1..$jobs;
+
+$script_queue->enqueue(@scripts);
+$script_queue->end();
+
+for (threads->list()) {
+	push(@stats, $_->join()) unless $_ == $mon;
+}
+
+$output_queue->end();
+$mon->join();
+
 show_stats($start_time, \@stats) if $show_stats;
 exit(exit_code(\@stats));
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 07/18] chainlint.pl: don't require `return|exit|continue` to end with `&&`
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (5 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 06/18] chainlint.pl: validate test scripts in parallel Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 08/18] t/Makefile: apply chainlint.pl to existing self-tests Eric Sunshine via GitGitGadget
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

In order to check for &&-chain breakage, each time TestParser encounters
a new command, it checks whether the previous command ends with `&&`,
and -- with a couple exceptions -- signals breakage if it does not. The
first exception is that a command may validly end with `||`, which is
commonly employed as `command || return 1` at the very end of a loop
body to terminate the loop early. The second is that piping one
command's output with `|` to another command does not constitute a
&&-chain break (the exit status of the pipe is the exit status of the
final command in the pipe).

However, it turns out that there are a few additional cases found in the
wild in which it is likely safe for `&&` to be missing even when other
commands follow. For instance:

    while {condition-1}
    do
        test {condition-2} || return 1 # or `exit 1` within a subshell
        more-commands
    done

    while {condition-1}
    do
        test {condition-2} || continue
        more-commands
    done

Such cases indicate deliberate thought about failure modes by the test
author, thus flagging them as breaking the &&-chain is not helpful.
Therefore, take these special cases into consideration when checking for
&&-chain breakage.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.pl                             | 20 ++++++++++++++++++--
 t/chainlint/chain-break-continue.expect    | 12 ++++++++++++
 t/chainlint/chain-break-continue.test      | 13 +++++++++++++
 t/chainlint/chain-break-return-exit.expect |  4 ++++
 t/chainlint/chain-break-return-exit.test   |  5 +++++
 t/chainlint/return-loop.expect             |  5 +++++
 t/chainlint/return-loop.test               |  6 ++++++
 7 files changed, 63 insertions(+), 2 deletions(-)
 create mode 100644 t/chainlint/chain-break-continue.expect
 create mode 100644 t/chainlint/chain-break-continue.test
 create mode 100644 t/chainlint/chain-break-return-exit.expect
 create mode 100644 t/chainlint/chain-break-return-exit.test
 create mode 100644 t/chainlint/return-loop.expect
 create mode 100644 t/chainlint/return-loop.test

diff --git a/t/chainlint.pl b/t/chainlint.pl
index 898573a9100..31c444067ce 100755
--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -473,13 +473,29 @@ sub ends_with {
 	return 1;
 }
 
+sub match_ending {
+	my ($tokens, $endings) = @_;
+	for my $needles (@$endings) {
+		next if @$tokens < scalar(grep {$_ ne "\n"} @$needles);
+		return 1 if ends_with($tokens, $needles);
+	}
+	return undef;
+}
+
+my @safe_endings = (
+	[qr/^(?:&&|\|\||\|)$/],
+	[qr/^(?:exit|return)$/, qr/^(?:\d+|\$\?)$/],
+	[qr/^(?:exit|return)$/, qr/^(?:\d+|\$\?)$/, qr/^;$/],
+	[qr/^(?:exit|return|continue)$/],
+	[qr/^(?:exit|return|continue)$/, qr/^;$/]);
+
 sub accumulate {
 	my ($self, $tokens, $cmd) = @_;
 	goto DONE unless @$tokens;
 	goto DONE if @$cmd == 1 && $$cmd[0] eq "\n";
 
-	# did previous command end with "&&", "||", "|"?
-	goto DONE if ends_with($tokens, [qr/^(?:&&|\|\||\|)$/]);
+	# did previous command end with "&&", "|", "|| return" or similar?
+	goto DONE if match_ending($tokens, \@safe_endings);
 
 	# flag missing "&&" at end of previous command
 	my $n = find_non_nl($tokens);
diff --git a/t/chainlint/chain-break-continue.expect b/t/chainlint/chain-break-continue.expect
new file mode 100644
index 00000000000..47a34577100
--- /dev/null
+++ b/t/chainlint/chain-break-continue.expect
@@ -0,0 +1,12 @@
+git ls-tree --name-only -r refs/notes/many_notes |
+while read path
+do
+	test "$path" = "foobar/non-note.txt" && continue
+	test "$path" = "deadbeef" && continue
+	test "$path" = "de/adbeef" && continue
+
+	if test $(expr length "$path") -ne $hexsz
+	then
+		return 1
+	fi
+done
diff --git a/t/chainlint/chain-break-continue.test b/t/chainlint/chain-break-continue.test
new file mode 100644
index 00000000000..f0af71d8bd9
--- /dev/null
+++ b/t/chainlint/chain-break-continue.test
@@ -0,0 +1,13 @@
+git ls-tree --name-only -r refs/notes/many_notes |
+while read path
+do
+# LINT: broken &&-chain okay if explicit "continue"
+	test "$path" = "foobar/non-note.txt" && continue
+	test "$path" = "deadbeef" && continue
+	test "$path" = "de/adbeef" && continue
+
+	if test $(expr length "$path") -ne $hexsz
+	then
+		return 1
+	fi
+done
diff --git a/t/chainlint/chain-break-return-exit.expect b/t/chainlint/chain-break-return-exit.expect
new file mode 100644
index 00000000000..dba292ee89b
--- /dev/null
+++ b/t/chainlint/chain-break-return-exit.expect
@@ -0,0 +1,4 @@
+for i in 1 2 3 4 ; do
+	git checkout main -b $i || return $?
+	test_commit $i $i $i tag$i || return $?
+done
diff --git a/t/chainlint/chain-break-return-exit.test b/t/chainlint/chain-break-return-exit.test
new file mode 100644
index 00000000000..e2b059933aa
--- /dev/null
+++ b/t/chainlint/chain-break-return-exit.test
@@ -0,0 +1,5 @@
+for i in 1 2 3 4 ; do
+# LINT: broken &&-chain okay if explicit "return $?" signals failure
+	git checkout main -b $i || return $?
+	test_commit $i $i $i tag$i || return $?
+done
diff --git a/t/chainlint/return-loop.expect b/t/chainlint/return-loop.expect
new file mode 100644
index 00000000000..cfc0549befe
--- /dev/null
+++ b/t/chainlint/return-loop.expect
@@ -0,0 +1,5 @@
+while test $i -lt $((num - 5))
+do
+	git notes add -m "notes for commit$i" HEAD~$i || return 1
+	i=$((i + 1))
+done
diff --git a/t/chainlint/return-loop.test b/t/chainlint/return-loop.test
new file mode 100644
index 00000000000..f90b1713005
--- /dev/null
+++ b/t/chainlint/return-loop.test
@@ -0,0 +1,6 @@
+while test $i -lt $((num - 5))
+do
+# LINT: "|| return {n}" valid loop escape outside subshell; no "&&" needed
+	git notes add -m "notes for commit$i" HEAD~$i || return 1
+	i=$((i + 1))
+done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 08/18] t/Makefile: apply chainlint.pl to existing self-tests
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (6 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 07/18] chainlint.pl: don't require `return|exit|continue` to end with `&&` Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 09/18] chainlint.pl: don't require `&` background command to end with `&&` Eric Sunshine via GitGitGadget
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

Now that chainlint.pl is functional, take advantage of the existing
chainlint self-tests to validate its operation. (While at it, stop
validating chainlint.sed against the self-tests since it will soon be
retired.)

Due to chainlint.sed implementation limitations leaking into the
self-test "expect" files, a few of them require minor adjustment to make
them compatible with chainlint.pl which does not share those
limitations.

First, because `sed` does not provide any sort of real recursion,
chainlint.sed only emulates recursion into subshells, and each level of
recursion leads to a multiplicative increase in complexity of the `sed`
rules. To avoid substantial complexity, chainlint.sed, therefore, only
emulates subshell recursion one level deep. Any subshell deeper than
that is passed through as-is, which means that &&-chains are not checked
in deeper subshells. chainlint.pl, on the other hand, employs a proper
recursive descent parser, thus checks subshells to any depth and
correctly flags broken &&-chains in deep subshells.

Second, due to sed's line-oriented nature, chainlint.sed, by necessity,
folds multi-line quoted strings into a single line. chainlint.pl, on the
other hand, employs a proper lexical analyzer which preserves quoted
strings as-is, including embedded newlines.

Furthermore, the output of chainlint.sed and chainlint.pl do not match
precisely in terms of whitespace. However, since the purpose of the
self-checks is to verify that the ?!AMP?! annotations are being
correctly added, minor whitespace differences are immaterial. For this
reason, rather than adjusting whitespace in all existing self-test
"expect" files to match the new linter's output, the `check-chainlint`
target ignores whitespace differences. Since `diff -w` is not POSIX,
`check-chainlint` attempts to employ `git diff -w`, and only falls back
to non-POSIX `diff -w` (and `-u`) if `git diff` is not available.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/Makefile                                    | 29 +++++++++++++++----
 t/chainlint/block.expect                      |  2 +-
 t/chainlint/here-doc-multi-line-string.expect |  3 +-
 t/chainlint/multi-line-string.expect          | 11 +++++--
 t/chainlint/nested-subshell.expect            |  2 +-
 t/chainlint/t7900-subtree.expect              | 13 +++++++--
 6 files changed, 46 insertions(+), 14 deletions(-)

diff --git a/t/Makefile b/t/Makefile
index 1c80c0c79a0..11f276774ea 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -38,7 +38,7 @@ T = $(sort $(wildcard t[0-9][0-9][0-9][0-9]-*.sh))
 THELPERS = $(sort $(filter-out $(T),$(wildcard *.sh)))
 TPERF = $(sort $(wildcard perf/p[0-9][0-9][0-9][0-9]-*.sh))
 CHAINLINTTESTS = $(sort $(patsubst chainlint/%.test,%,$(wildcard chainlint/*.test)))
-CHAINLINT = sed -f chainlint.sed
+CHAINLINT = '$(PERL_PATH_SQ)' chainlint.pl
 
 all: $(DEFAULT_TEST_TARGET)
 
@@ -73,10 +73,29 @@ clean-chainlint:
 
 check-chainlint:
 	@mkdir -p '$(CHAINLINTTMP_SQ)' && \
-	sed -e '/^# LINT: /d' $(patsubst %,chainlint/%.test,$(CHAINLINTTESTS)) >'$(CHAINLINTTMP_SQ)'/tests && \
-	sed -e '/^[ 	]*$$/d' $(patsubst %,chainlint/%.expect,$(CHAINLINTTESTS)) >'$(CHAINLINTTMP_SQ)'/expect && \
-	$(CHAINLINT) '$(CHAINLINTTMP_SQ)'/tests | grep -v '^[	]*$$' >'$(CHAINLINTTMP_SQ)'/actual && \
-	diff -u '$(CHAINLINTTMP_SQ)'/expect '$(CHAINLINTTMP_SQ)'/actual
+	for i in $(CHAINLINTTESTS); do \
+		echo "test_expect_success '$$i' '" && \
+		sed -e '/^# LINT: /d' chainlint/$$i.test && \
+		echo "'"; \
+	done >'$(CHAINLINTTMP_SQ)'/tests && \
+	{ \
+		echo "# chainlint: $(CHAINLINTTMP_SQ)/tests" && \
+		for i in $(CHAINLINTTESTS); do \
+			echo "# chainlint: $$i" && \
+			sed -e '/^[ 	]*$$/d' chainlint/$$i.expect; \
+		done \
+	} >'$(CHAINLINTTMP_SQ)'/expect && \
+	$(CHAINLINT) --emit-all '$(CHAINLINTTMP_SQ)'/tests | \
+		grep -v '^[ 	]*$$' >'$(CHAINLINTTMP_SQ)'/actual && \
+	if test -f ../GIT-BUILD-OPTIONS; then \
+		. ../GIT-BUILD-OPTIONS; \
+	fi && \
+	if test -x ../git$$X; then \
+		DIFFW="../git$$X --no-pager diff -w --no-index"; \
+	else \
+		DIFFW="diff -w -u"; \
+	fi && \
+	$$DIFFW '$(CHAINLINTTMP_SQ)'/expect '$(CHAINLINTTMP_SQ)'/actual
 
 test-lint: test-lint-duplicates test-lint-executable test-lint-shell-syntax \
 	test-lint-filenames
diff --git a/t/chainlint/block.expect b/t/chainlint/block.expect
index da60257ebc4..37dbf7d95fa 100644
--- a/t/chainlint/block.expect
+++ b/t/chainlint/block.expect
@@ -1,7 +1,7 @@
 (
 	foo &&
 	{
-		echo a
+		echo a ?!AMP?!
 		echo b
 	} &&
 	bar &&
diff --git a/t/chainlint/here-doc-multi-line-string.expect b/t/chainlint/here-doc-multi-line-string.expect
index 2578191ca8a..be64b26869a 100644
--- a/t/chainlint/here-doc-multi-line-string.expect
+++ b/t/chainlint/here-doc-multi-line-string.expect
@@ -1,4 +1,5 @@
 (
-	cat <<-TXT && echo "multi-line	string" ?!AMP?!
+	cat <<-TXT && echo "multi-line
+	string" ?!AMP?!
 	bap
 )
diff --git a/t/chainlint/multi-line-string.expect b/t/chainlint/multi-line-string.expect
index ab0dadf748e..27ff95218e7 100644
--- a/t/chainlint/multi-line-string.expect
+++ b/t/chainlint/multi-line-string.expect
@@ -1,9 +1,14 @@
 (
-	x="line 1		line 2		line 3" &&
-	y="line 1		line2" ?!AMP?!
+	x="line 1
+		line 2
+		line 3" &&
+	y="line 1
+		line2" ?!AMP?!
 	foobar
 ) &&
 (
-	echo "xyz" "abc		def		ghi" &&
+	echo "xyz" "abc
+		def
+		ghi" &&
 	barfoo
 )
diff --git a/t/chainlint/nested-subshell.expect b/t/chainlint/nested-subshell.expect
index 41a48adaa2b..02e0a9f1bb5 100644
--- a/t/chainlint/nested-subshell.expect
+++ b/t/chainlint/nested-subshell.expect
@@ -6,7 +6,7 @@
 	) >file &&
 	cd foo &&
 	(
-		echo a
+		echo a ?!AMP?!
 		echo b
 	) >file
 )
diff --git a/t/chainlint/t7900-subtree.expect b/t/chainlint/t7900-subtree.expect
index 1cccc7bf7e1..69167da2f27 100644
--- a/t/chainlint/t7900-subtree.expect
+++ b/t/chainlint/t7900-subtree.expect
@@ -1,10 +1,17 @@
 (
-	chks="sub1sub2sub3sub4" &&
+	chks="sub1
+sub2
+sub3
+sub4" &&
 	chks_sub=$(cat <<TXT | sed "s,^,sub dir/,"
 ) &&
-	chkms="main-sub1main-sub2main-sub3main-sub4" &&
+	chkms="main-sub1
+main-sub2
+main-sub3
+main-sub4" &&
 	chkms_sub=$(cat <<TXT | sed "s,^,sub dir/,"
 ) &&
 	subfiles=$(git ls-files) &&
-	check_equal "$subfiles" "$chkms$chks"
+	check_equal "$subfiles" "$chkms
+$chks"
 )
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 09/18] chainlint.pl: don't require `&` background command to end with `&&`
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (7 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 08/18] t/Makefile: apply chainlint.pl to existing self-tests Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 10/18] chainlint.pl: don't flag broken &&-chain if `$?` handled explicitly Eric Sunshine via GitGitGadget
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

The exit status of the `&` asynchronous operator which starts a command
in the background is unconditionally zero, and the few places in the
test scripts which launch commands asynchronously are not interested in
the exit status of the `&` operator (though they often capture the
background command's PID). As such, there is little value in complaining
about broken &&-chain for a command launched in the background, and
doing so would only make busy-work for test authors. Therefore, take
this special case into account when checking for &&-chain breakage.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.pl                            |  2 +-
 t/chainlint/chain-break-background.expect |  9 +++++++++
 t/chainlint/chain-break-background.test   | 10 ++++++++++
 3 files changed, 20 insertions(+), 1 deletion(-)
 create mode 100644 t/chainlint/chain-break-background.expect
 create mode 100644 t/chainlint/chain-break-background.test

diff --git a/t/chainlint.pl b/t/chainlint.pl
index 31c444067ce..ba3fcb0c8e6 100755
--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -483,7 +483,7 @@ sub match_ending {
 }
 
 my @safe_endings = (
-	[qr/^(?:&&|\|\||\|)$/],
+	[qr/^(?:&&|\|\||\||&)$/],
 	[qr/^(?:exit|return)$/, qr/^(?:\d+|\$\?)$/],
 	[qr/^(?:exit|return)$/, qr/^(?:\d+|\$\?)$/, qr/^;$/],
 	[qr/^(?:exit|return|continue)$/],
diff --git a/t/chainlint/chain-break-background.expect b/t/chainlint/chain-break-background.expect
new file mode 100644
index 00000000000..28f9114f42d
--- /dev/null
+++ b/t/chainlint/chain-break-background.expect
@@ -0,0 +1,9 @@
+JGIT_DAEMON_PID= &&
+git init --bare empty.git &&
+> empty.git/git-daemon-export-ok &&
+mkfifo jgit_daemon_output &&
+{
+	jgit daemon --port="$JGIT_DAEMON_PORT" . > jgit_daemon_output &
+	JGIT_DAEMON_PID=$!
+} &&
+test_expect_code 2 git ls-remote --exit-code git://localhost:$JGIT_DAEMON_PORT/empty.git
diff --git a/t/chainlint/chain-break-background.test b/t/chainlint/chain-break-background.test
new file mode 100644
index 00000000000..e10f656b055
--- /dev/null
+++ b/t/chainlint/chain-break-background.test
@@ -0,0 +1,10 @@
+JGIT_DAEMON_PID= &&
+git init --bare empty.git &&
+>empty.git/git-daemon-export-ok &&
+mkfifo jgit_daemon_output &&
+{
+# LINT: exit status of "&" is always 0 so &&-chaining immaterial
+	jgit daemon --port="$JGIT_DAEMON_PORT" . >jgit_daemon_output &
+	JGIT_DAEMON_PID=$!
+} &&
+test_expect_code 2 git ls-remote --exit-code git://localhost:$JGIT_DAEMON_PORT/empty.git
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 10/18] chainlint.pl: don't flag broken &&-chain if `$?` handled explicitly
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (8 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 09/18] chainlint.pl: don't require `&` background command to end with `&&` Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 11/18] chainlint.pl: don't flag broken &&-chain if failure indicated explicitly Eric Sunshine via GitGitGadget
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

There are cases in which tests capture and check a command's exit code
explicitly without employing test_expect_code(). They do so by
intentionally breaking the &&-chain since it would be impossible to
capture "$?" in the failing case if the `status=$?` assignment was part
of the &&-chain. Since such constructs are manually checking the exit
code, their &&-chain breakage is legitimate and safe, thus should not be
flagged. Therefore, stop flagging &&-chain breakage in such cases.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.pl                        |  6 ++++++
 t/chainlint/chain-break-status.expect |  9 +++++++++
 t/chainlint/chain-break-status.test   | 11 +++++++++++
 3 files changed, 26 insertions(+)
 create mode 100644 t/chainlint/chain-break-status.expect
 create mode 100644 t/chainlint/chain-break-status.test

diff --git a/t/chainlint.pl b/t/chainlint.pl
index ba3fcb0c8e6..14e1db3519a 100755
--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -497,6 +497,12 @@ sub accumulate {
 	# did previous command end with "&&", "|", "|| return" or similar?
 	goto DONE if match_ending($tokens, \@safe_endings);
 
+	# if this command handles "$?" specially, then okay for previous
+	# command to be missing "&&"
+	for my $token (@$cmd) {
+		goto DONE if $token =~ /\$\?/;
+	}
+
 	# flag missing "&&" at end of previous command
 	my $n = find_non_nl($tokens);
 	splice(@$tokens, $n + 1, 0, '?!AMP?!') unless $n < 0;
diff --git a/t/chainlint/chain-break-status.expect b/t/chainlint/chain-break-status.expect
new file mode 100644
index 00000000000..f4bada94632
--- /dev/null
+++ b/t/chainlint/chain-break-status.expect
@@ -0,0 +1,9 @@
+OUT=$(( ( large_git ; echo $? 1 >& 3 ) | : ) 3 >& 1) &&
+test_match_signal 13 "$OUT" &&
+
+{ test-tool sigchain > actual ; ret=$? ; } &&
+{
+	test_match_signal 15 "$ret" ||
+	test "$ret" = 3
+} &&
+test_cmp expect actual
diff --git a/t/chainlint/chain-break-status.test b/t/chainlint/chain-break-status.test
new file mode 100644
index 00000000000..a6602a7b99c
--- /dev/null
+++ b/t/chainlint/chain-break-status.test
@@ -0,0 +1,11 @@
+# LINT: broken &&-chain okay if next command handles "$?" explicitly
+OUT=$( ((large_git; echo $? 1>&3) | :) 3>&1 ) &&
+test_match_signal 13 "$OUT" &&
+
+# LINT: broken &&-chain okay if next command handles "$?" explicitly
+{ test-tool sigchain >actual; ret=$?; } &&
+{
+	test_match_signal 15 "$ret" ||
+	test "$ret" = 3
+} &&
+test_cmp expect actual
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 11/18] chainlint.pl: don't flag broken &&-chain if failure indicated explicitly
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (9 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 10/18] chainlint.pl: don't flag broken &&-chain if `$?` handled explicitly Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 12/18] chainlint.pl: complain about loops lacking explicit failure handling Eric Sunshine via GitGitGadget
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

There are quite a few tests which print an error messages and then
explicitly signal failure with `false`, `return 1`, or `exit 1` as the
final command in an `if` branch. In these cases, the tests don't bother
maintaining the &&-chain between `echo` and the explicit "test failed"
indicator. Since such constructs are manually signaling failure, their
&&-chain breakage is legitimate and safe -- both for the command
immediately preceding `false`, `return`, or `exit`, as well as for all
preceding commands in the `if` branch. Therefore, stop flagging &&-chain
breakage in these sorts of cases.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.pl                             |  8 ++++++++
 t/chainlint/chain-break-false.expect       |  9 +++++++++
 t/chainlint/chain-break-false.test         | 10 ++++++++++
 t/chainlint/chain-break-return-exit.expect | 15 +++++++++++++++
 t/chainlint/chain-break-return-exit.test   | 18 ++++++++++++++++++
 t/chainlint/if-in-loop.expect              |  2 +-
 t/chainlint/if-in-loop.test                |  2 +-
 7 files changed, 62 insertions(+), 2 deletions(-)
 create mode 100644 t/chainlint/chain-break-false.expect
 create mode 100644 t/chainlint/chain-break-false.test

diff --git a/t/chainlint.pl b/t/chainlint.pl
index 14e1db3519a..a76a09ecf5e 100755
--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -503,6 +503,14 @@ sub accumulate {
 		goto DONE if $token =~ /\$\?/;
 	}
 
+	# if this command is "false", "return 1", or "exit 1" (which signal
+	# failure explicitly), then okay for all preceding commands to be
+	# missing "&&"
+	if ($$cmd[0] =~ /^(?:false|return|exit)$/) {
+		@$tokens = grep(!/^\?!AMP\?!$/, @$tokens);
+		goto DONE;
+	}
+
 	# flag missing "&&" at end of previous command
 	my $n = find_non_nl($tokens);
 	splice(@$tokens, $n + 1, 0, '?!AMP?!') unless $n < 0;
diff --git a/t/chainlint/chain-break-false.expect b/t/chainlint/chain-break-false.expect
new file mode 100644
index 00000000000..989766fb856
--- /dev/null
+++ b/t/chainlint/chain-break-false.expect
@@ -0,0 +1,9 @@
+if condition not satisified
+then
+	echo it did not work...
+	echo failed!
+	false
+else
+	echo it went okay ?!AMP?!
+	congratulate user
+fi
diff --git a/t/chainlint/chain-break-false.test b/t/chainlint/chain-break-false.test
new file mode 100644
index 00000000000..a5aaff8c8a4
--- /dev/null
+++ b/t/chainlint/chain-break-false.test
@@ -0,0 +1,10 @@
+# LINT: broken &&-chain okay if explicit "false" signals failure
+if condition not satisified
+then
+	echo it did not work...
+	echo failed!
+	false
+else
+	echo it went okay
+	congratulate user
+fi
diff --git a/t/chainlint/chain-break-return-exit.expect b/t/chainlint/chain-break-return-exit.expect
index dba292ee89b..1732d221c32 100644
--- a/t/chainlint/chain-break-return-exit.expect
+++ b/t/chainlint/chain-break-return-exit.expect
@@ -1,3 +1,18 @@
+case "$(git ls-files)" in
+one ) echo pass one ;;
+* ) echo bad one ; return 1 ;;
+esac &&
+(
+	case "$(git ls-files)" in
+	two ) echo pass two ;;
+	* ) echo bad two ; exit 1 ;;
+esac
+) &&
+case "$(git ls-files)" in
+dir/two"$LF"one ) echo pass both ;;
+* ) echo bad ; return 1 ;;
+esac &&
+
 for i in 1 2 3 4 ; do
 	git checkout main -b $i || return $?
 	test_commit $i $i $i tag$i || return $?
diff --git a/t/chainlint/chain-break-return-exit.test b/t/chainlint/chain-break-return-exit.test
index e2b059933aa..46542edf881 100644
--- a/t/chainlint/chain-break-return-exit.test
+++ b/t/chainlint/chain-break-return-exit.test
@@ -1,3 +1,21 @@
+case "$(git ls-files)" in
+one) echo pass one ;;
+# LINT: broken &&-chain okay if explicit "return 1" signals failuire
+*) echo bad one; return 1 ;;
+esac &&
+(
+	case "$(git ls-files)" in
+	two) echo pass two ;;
+# LINT: broken &&-chain okay if explicit "exit 1" signals failuire
+	*) echo bad two; exit 1 ;;
+	esac
+) &&
+case "$(git ls-files)" in
+dir/two"$LF"one) echo pass both ;;
+# LINT: broken &&-chain okay if explicit "return 1" signals failuire
+*) echo bad; return 1 ;;
+esac &&
+
 for i in 1 2 3 4 ; do
 # LINT: broken &&-chain okay if explicit "return $?" signals failure
 	git checkout main -b $i || return $?
diff --git a/t/chainlint/if-in-loop.expect b/t/chainlint/if-in-loop.expect
index 03b82a3e58c..d6514ae7492 100644
--- a/t/chainlint/if-in-loop.expect
+++ b/t/chainlint/if-in-loop.expect
@@ -3,7 +3,7 @@
 	do
 		if false
 		then
-			echo "err" ?!AMP?!
+			echo "err"
 			exit 1
 		fi ?!AMP?!
 		foo
diff --git a/t/chainlint/if-in-loop.test b/t/chainlint/if-in-loop.test
index f0cf19cfada..90c23976fec 100644
--- a/t/chainlint/if-in-loop.test
+++ b/t/chainlint/if-in-loop.test
@@ -3,7 +3,7 @@
 	do
 		if false
 		then
-# LINT: missing "&&" on "echo"
+# LINT: missing "&&" on "echo" okay since "exit 1" signals error explicitly
 			echo "err"
 			exit 1
 # LINT: missing "&&" on "fi"
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 12/18] chainlint.pl: complain about loops lacking explicit failure handling
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (10 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 11/18] chainlint.pl: don't flag broken &&-chain if failure indicated explicitly Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 13/18] chainlint.pl: allow `|| echo` to signal failure upstream of a pipe Eric Sunshine via GitGitGadget
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

Shell `for` and `while` loops do not terminate automatically just
because a command fails within the loop body. Instead, the loop
continues to iterate and eventually returns the exit status of the final
command of the final iteration, which may not be the command which
failed, thus it is possible for failures to go undetected. Consequently,
it is important for test authors to explicitly handle failure within the
loop body by terminating the loop manually upon failure. This can be
done by returning a non-zero exit code from within the loop body
(i.e. `|| return 1`) or exiting (i.e. `|| exit 1`) if the loop is within
a subshell, or by manually checking `$?` and taking some appropriate
action. Therefore, add logic to detect and complain about loops which
lack explicit `return` or `exit`, or `$?` check.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.pl                                | 11 ++++++
 t/chainlint/complex-if-in-cuddled-loop.expect |  2 +-
 t/chainlint/for-loop.expect                   |  4 +--
 t/chainlint/loop-detect-failure.expect        | 15 ++++++++
 t/chainlint/loop-detect-failure.test          | 17 +++++++++
 t/chainlint/loop-detect-status.expect         | 18 ++++++++++
 t/chainlint/loop-detect-status.test           | 19 ++++++++++
 t/chainlint/loop-in-if.expect                 |  2 +-
 t/chainlint/nested-loop-detect-failure.expect | 31 ++++++++++++++++
 t/chainlint/nested-loop-detect-failure.test   | 35 +++++++++++++++++++
 t/chainlint/semicolon.expect                  |  2 +-
 t/chainlint/while-loop.expect                 |  4 +--
 12 files changed, 153 insertions(+), 7 deletions(-)
 create mode 100644 t/chainlint/loop-detect-failure.expect
 create mode 100644 t/chainlint/loop-detect-failure.test
 create mode 100644 t/chainlint/loop-detect-status.expect
 create mode 100644 t/chainlint/loop-detect-status.test
 create mode 100644 t/chainlint/nested-loop-detect-failure.expect
 create mode 100644 t/chainlint/nested-loop-detect-failure.test

diff --git a/t/chainlint.pl b/t/chainlint.pl
index a76a09ecf5e..674b3ddf696 100755
--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -482,6 +482,17 @@ sub match_ending {
 	return undef;
 }
 
+sub parse_loop_body {
+	my $self = shift @_;
+	my @tokens = $self->SUPER::parse_loop_body(@_);
+	# did loop signal failure via "|| return" or "|| exit"?
+	return @tokens if !@tokens || grep(/^(?:return|exit|\$\?)$/, @tokens);
+	# flag missing "return/exit" handling explicit failure in loop body
+	my $n = find_non_nl(\@tokens);
+	splice(@tokens, $n + 1, 0, '?!LOOP?!');
+	return @tokens;
+}
+
 my @safe_endings = (
 	[qr/^(?:&&|\|\||\||&)$/],
 	[qr/^(?:exit|return)$/, qr/^(?:\d+|\$\?)$/],
diff --git a/t/chainlint/complex-if-in-cuddled-loop.expect b/t/chainlint/complex-if-in-cuddled-loop.expect
index 2fca1834095..dac2d0fd1d9 100644
--- a/t/chainlint/complex-if-in-cuddled-loop.expect
+++ b/t/chainlint/complex-if-in-cuddled-loop.expect
@@ -4,6 +4,6 @@
      :
    else
      echo >file
-   fi
+   fi ?!LOOP?!
  done) &&
 test ! -f file
diff --git a/t/chainlint/for-loop.expect b/t/chainlint/for-loop.expect
index 6671b8cd842..a5810c9bddd 100644
--- a/t/chainlint/for-loop.expect
+++ b/t/chainlint/for-loop.expect
@@ -2,10 +2,10 @@
 	for i in a b c
 	do
 		echo $i ?!AMP?!
-		cat <<-EOF
+		cat <<-EOF ?!LOOP?!
 	done ?!AMP?!
 	for i in a b c; do
 		echo $i &&
-		cat $i
+		cat $i ?!LOOP?!
 	done
 )
diff --git a/t/chainlint/loop-detect-failure.expect b/t/chainlint/loop-detect-failure.expect
new file mode 100644
index 00000000000..a66025c39d4
--- /dev/null
+++ b/t/chainlint/loop-detect-failure.expect
@@ -0,0 +1,15 @@
+git init r1 &&
+for n in 1 2 3 4 5
+do
+	echo "This is file: $n" > r1/file.$n &&
+	git -C r1 add file.$n &&
+	git -C r1 commit -m "$n" || return 1
+done &&
+
+git init r2 &&
+for n in 1000 10000
+do
+	printf "%"$n"s" X > r2/large.$n &&
+	git -C r2 add large.$n &&
+	git -C r2 commit -m "$n" ?!LOOP?!
+done
diff --git a/t/chainlint/loop-detect-failure.test b/t/chainlint/loop-detect-failure.test
new file mode 100644
index 00000000000..b9791cc802e
--- /dev/null
+++ b/t/chainlint/loop-detect-failure.test
@@ -0,0 +1,17 @@
+git init r1 &&
+# LINT: loop handles failure explicitly with "|| return 1"
+for n in 1 2 3 4 5
+do
+	echo "This is file: $n" > r1/file.$n &&
+	git -C r1 add file.$n &&
+	git -C r1 commit -m "$n" || return 1
+done &&
+
+git init r2 &&
+# LINT: loop fails to handle failure explicitly with "|| return 1"
+for n in 1000 10000
+do
+	printf "%"$n"s" X > r2/large.$n &&
+	git -C r2 add large.$n &&
+	git -C r2 commit -m "$n"
+done
diff --git a/t/chainlint/loop-detect-status.expect b/t/chainlint/loop-detect-status.expect
new file mode 100644
index 00000000000..0ad23bb35e4
--- /dev/null
+++ b/t/chainlint/loop-detect-status.expect
@@ -0,0 +1,18 @@
+( while test $i -le $blobcount
+do
+	printf "Generating blob $i/$blobcount\r" >& 2 &&
+	printf "blob\nmark :$i\ndata $blobsize\n" &&
+
+	printf "%-${blobsize}s" $i &&
+	echo "M 100644 :$i $i" >> commit &&
+	i=$(($i+1)) ||
+	echo $? > exit-status
+done &&
+echo "commit refs/heads/main" &&
+echo "author A U Thor <author@email.com> 123456789 +0000" &&
+echo "committer C O Mitter <committer@email.com> 123456789 +0000" &&
+echo "data 5" &&
+echo ">2gb" &&
+cat commit ) |
+git fast-import --big-file-threshold=2 &&
+test ! -f exit-status
diff --git a/t/chainlint/loop-detect-status.test b/t/chainlint/loop-detect-status.test
new file mode 100644
index 00000000000..1c6c23cfc9e
--- /dev/null
+++ b/t/chainlint/loop-detect-status.test
@@ -0,0 +1,19 @@
+# LINT: "$?" handled explicitly within loop body
+(while test $i -le $blobcount
+ do
+	printf "Generating blob $i/$blobcount\r" >&2 &&
+	printf "blob\nmark :$i\ndata $blobsize\n" &&
+	#test-tool genrandom $i $blobsize &&
+	printf "%-${blobsize}s" $i &&
+	echo "M 100644 :$i $i" >> commit &&
+	i=$(($i+1)) ||
+	echo $? > exit-status
+ done &&
+ echo "commit refs/heads/main" &&
+ echo "author A U Thor <author@email.com> 123456789 +0000" &&
+ echo "committer C O Mitter <committer@email.com> 123456789 +0000" &&
+ echo "data 5" &&
+ echo ">2gb" &&
+ cat commit) |
+git fast-import --big-file-threshold=2 &&
+test ! -f exit-status
diff --git a/t/chainlint/loop-in-if.expect b/t/chainlint/loop-in-if.expect
index e1be42376c5..6c5d6e5b243 100644
--- a/t/chainlint/loop-in-if.expect
+++ b/t/chainlint/loop-in-if.expect
@@ -4,7 +4,7 @@
 		while true
 		do
 			echo "pop" ?!AMP?!
-			echo "glup"
+			echo "glup" ?!LOOP?!
 		done ?!AMP?!
 		foo
 	fi ?!AMP?!
diff --git a/t/chainlint/nested-loop-detect-failure.expect b/t/chainlint/nested-loop-detect-failure.expect
new file mode 100644
index 00000000000..4793a0e8e12
--- /dev/null
+++ b/t/chainlint/nested-loop-detect-failure.expect
@@ -0,0 +1,31 @@
+for i in 0 1 2 3 4 5 6 7 8 9 ;
+do
+	for j in 0 1 2 3 4 5 6 7 8 9 ;
+	do
+		echo "$i$j" > "path$i$j" ?!LOOP?!
+	done ?!LOOP?!
+done &&
+
+for i in 0 1 2 3 4 5 6 7 8 9 ;
+do
+	for j in 0 1 2 3 4 5 6 7 8 9 ;
+	do
+		echo "$i$j" > "path$i$j" || return 1
+	done
+done &&
+
+for i in 0 1 2 3 4 5 6 7 8 9 ;
+do
+	for j in 0 1 2 3 4 5 6 7 8 9 ;
+	do
+		echo "$i$j" > "path$i$j" ?!LOOP?!
+	done || return 1
+done &&
+
+for i in 0 1 2 3 4 5 6 7 8 9 ;
+do
+	for j in 0 1 2 3 4 5 6 7 8 9 ;
+	do
+		echo "$i$j" > "path$i$j" || return 1
+	done || return 1
+done
diff --git a/t/chainlint/nested-loop-detect-failure.test b/t/chainlint/nested-loop-detect-failure.test
new file mode 100644
index 00000000000..e6f0c1acfb8
--- /dev/null
+++ b/t/chainlint/nested-loop-detect-failure.test
@@ -0,0 +1,35 @@
+# LINT: neither loop handles failure explicitly with "|| return 1"
+for i in 0 1 2 3 4 5 6 7 8 9;
+do
+	for j in 0 1 2 3 4 5 6 7 8 9;
+	do
+		echo "$i$j" >"path$i$j"
+	done
+done &&
+
+# LINT: inner loop handles failure explicitly with "|| return 1"
+for i in 0 1 2 3 4 5 6 7 8 9;
+do
+	for j in 0 1 2 3 4 5 6 7 8 9;
+	do
+		echo "$i$j" >"path$i$j" || return 1
+	done
+done &&
+
+# LINT: outer loop handles failure explicitly with "|| return 1"
+for i in 0 1 2 3 4 5 6 7 8 9;
+do
+	for j in 0 1 2 3 4 5 6 7 8 9;
+	do
+		echo "$i$j" >"path$i$j"
+	done || return 1
+done &&
+
+# LINT: inner & outer loops handles failure explicitly with "|| return 1"
+for i in 0 1 2 3 4 5 6 7 8 9;
+do
+	for j in 0 1 2 3 4 5 6 7 8 9;
+	do
+		echo "$i$j" >"path$i$j" || return 1
+	done || return 1
+done
diff --git a/t/chainlint/semicolon.expect b/t/chainlint/semicolon.expect
index ed0b3707ae9..3aa2259f36c 100644
--- a/t/chainlint/semicolon.expect
+++ b/t/chainlint/semicolon.expect
@@ -15,5 +15,5 @@
 ) &&
 (cd foo &&
 	for i in a b c; do
-		echo;
+		echo; ?!LOOP?!
 	done)
diff --git a/t/chainlint/while-loop.expect b/t/chainlint/while-loop.expect
index 0d3a9b3d128..f272aa21fee 100644
--- a/t/chainlint/while-loop.expect
+++ b/t/chainlint/while-loop.expect
@@ -2,10 +2,10 @@
 	while true
 	do
 		echo foo ?!AMP?!
-		cat <<-EOF
+		cat <<-EOF ?!LOOP?!
 	done ?!AMP?!
 	while true; do
 		echo foo &&
-		cat bar
+		cat bar ?!LOOP?!
 	done
 )
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 13/18] chainlint.pl: allow `|| echo` to signal failure upstream of a pipe
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (11 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 12/18] chainlint.pl: complain about loops lacking explicit failure handling Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 14/18] t/chainlint: add more chainlint.pl self-tests Eric Sunshine via GitGitGadget
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

The use of `|| return` (or `|| exit`) to signal failure within a loop
isn't effective when the loop is upstream of a pipe since the pipe
swallows all upstream exit codes and returns only the exit code of the
final command in the pipeline.

To work around this limitation, tests may adopt an alternative strategy
of signaling failure by emitting text which would never be emitted in
the non-failing case. For instance:

    while condition
    do
        command1 &&
        command2 ||
        echo "impossible text"
    done |
    sort >actual &&

Such usage indicates deliberate thought about failure cases by the test
author, thus flagging them as missing `|| return` (or `|| exit`) is not
helpful. Therefore, take this case into consideration when checking for
explicit loop termination.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.pl                        |  3 +++
 t/chainlint/loop-upstream-pipe.expect | 10 ++++++++++
 t/chainlint/loop-upstream-pipe.test   | 11 +++++++++++
 3 files changed, 24 insertions(+)
 create mode 100644 t/chainlint/loop-upstream-pipe.expect
 create mode 100644 t/chainlint/loop-upstream-pipe.test

diff --git a/t/chainlint.pl b/t/chainlint.pl
index 674b3ddf696..386999ce65d 100755
--- a/t/chainlint.pl
+++ b/t/chainlint.pl
@@ -487,6 +487,9 @@ sub parse_loop_body {
 	my @tokens = $self->SUPER::parse_loop_body(@_);
 	# did loop signal failure via "|| return" or "|| exit"?
 	return @tokens if !@tokens || grep(/^(?:return|exit|\$\?)$/, @tokens);
+	# did loop upstream of a pipe signal failure via "|| echo 'impossible
+	# text'" as the final command in the loop body?
+	return @tokens if ends_with(\@tokens, [qr/^\|\|$/, "\n", qr/^echo$/, qr/^.+$/]);
 	# flag missing "return/exit" handling explicit failure in loop body
 	my $n = find_non_nl(\@tokens);
 	splice(@tokens, $n + 1, 0, '?!LOOP?!');
diff --git a/t/chainlint/loop-upstream-pipe.expect b/t/chainlint/loop-upstream-pipe.expect
new file mode 100644
index 00000000000..0b82ecc4b96
--- /dev/null
+++ b/t/chainlint/loop-upstream-pipe.expect
@@ -0,0 +1,10 @@
+(
+	git rev-list --objects --no-object-names base..loose |
+	while read oid
+	do
+		path="$objdir/$(test_oid_to_path "$oid")" &&
+		printf "%s %d\n" "$oid" "$(test-tool chmtime --get "$path")" ||
+		echo "object list generation failed for $oid"
+	done |
+	sort -k1
+) >expect &&
diff --git a/t/chainlint/loop-upstream-pipe.test b/t/chainlint/loop-upstream-pipe.test
new file mode 100644
index 00000000000..efb77da897c
--- /dev/null
+++ b/t/chainlint/loop-upstream-pipe.test
@@ -0,0 +1,11 @@
+(
+	git rev-list --objects --no-object-names base..loose |
+	while read oid
+	do
+# LINT: "|| echo" signals failure in loop upstream of a pipe
+		path="$objdir/$(test_oid_to_path "$oid")" &&
+		printf "%s %d\n" "$oid" "$(test-tool chmtime --get "$path")" ||
+		echo "object list generation failed for $oid"
+	done |
+	sort -k1
+) >expect &&
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 14/18] t/chainlint: add more chainlint.pl self-tests
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (12 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 13/18] chainlint.pl: allow `|| echo` to signal failure upstream of a pipe Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 15/18] test-lib: retire "lint harder" optimization hack Eric Sunshine via GitGitGadget
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

During the development of chainlint.pl, numerous new self-tests were
created to verify correct functioning beyond the checks already
represented by the existing self-tests. The new checks fall into several
categories:

* behavior of the lexical analyzer for complex cases, such as line
  splicing, token pasting, entering and exiting string contexts inside
  and outside of test script bodies; for instance:

    test_expect_success 'title' '
      x=$(echo "something" |
        sed -e '\''s/\\/\\\\/g'\'' -e '\''s/[[/.*^$]/\\&/g'\''
    '

* behavior of the parser for all compound grammatical constructs, such
  as `if...fi`, `case...esac`, `while...done`, `{...}`, etc., and for
  other legal shell grammatical constructs not covered by existing
  chainlint.sed self-tests, as well as complex cases, such as:

    OUT=$( ((large_git 1>&3) | :) 3>&1 ) &&

* detection of problems, such as &&-chain breakage, from top-level to
  any depth since the existing self-tests do not cover any top-level
  context and only cover subshells one level deep due to limitations of
  chainlint.sed

* address blind spots in chainlint.sed (such as not detecting a broken
  &&-chain on a one-line for-loop in a subshell[1]) which chainlint.pl
  correctly detects

* real-world cases which tripped up chainlint.pl during its development

[1]: https://lore.kernel.org/git/dce35a47012fecc6edc11c68e91dbb485c5bc36f.1661663880.git.gitgitgadget@gmail.com/

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint/blank-line-before-esac.expect     | 18 +++++++++++
 t/chainlint/blank-line-before-esac.test       | 19 +++++++++++
 t/chainlint/block.expect                      | 13 +++++++-
 t/chainlint/block.test                        | 15 ++++++++-
 t/chainlint/chained-block.expect              |  9 ++++++
 t/chainlint/chained-block.test                | 11 +++++++
 t/chainlint/chained-subshell.expect           | 10 ++++++
 t/chainlint/chained-subshell.test             | 13 ++++++++
 .../command-substitution-subsubshell.expect   |  2 ++
 .../command-substitution-subsubshell.test     |  3 ++
 t/chainlint/double-here-doc.expect            |  2 ++
 t/chainlint/double-here-doc.test              | 12 +++++++
 t/chainlint/dqstring-line-splice.expect       |  3 ++
 t/chainlint/dqstring-line-splice.test         |  7 ++++
 t/chainlint/dqstring-no-interpolate.expect    | 11 +++++++
 t/chainlint/dqstring-no-interpolate.test      | 15 +++++++++
 t/chainlint/empty-here-doc.expect             |  3 ++
 t/chainlint/empty-here-doc.test               |  5 +++
 t/chainlint/exclamation.expect                |  4 +++
 t/chainlint/exclamation.test                  |  8 +++++
 t/chainlint/for-loop-abbreviated.expect       |  5 +++
 t/chainlint/for-loop-abbreviated.test         |  6 ++++
 t/chainlint/function.expect                   | 11 +++++++
 t/chainlint/function.test                     | 13 ++++++++
 t/chainlint/here-doc-indent-operator.expect   |  5 +++
 t/chainlint/here-doc-indent-operator.test     | 13 ++++++++
 t/chainlint/if-condition-split.expect         |  7 ++++
 t/chainlint/if-condition-split.test           |  8 +++++
 t/chainlint/one-liner-for-loop.expect         |  9 ++++++
 t/chainlint/one-liner-for-loop.test           | 10 ++++++
 t/chainlint/sqstring-in-sqstring.expect       |  4 +++
 t/chainlint/sqstring-in-sqstring.test         |  5 +++
 t/chainlint/token-pasting.expect              | 27 ++++++++++++++++
 t/chainlint/token-pasting.test                | 32 +++++++++++++++++++
 34 files changed, 336 insertions(+), 2 deletions(-)
 create mode 100644 t/chainlint/blank-line-before-esac.expect
 create mode 100644 t/chainlint/blank-line-before-esac.test
 create mode 100644 t/chainlint/chained-block.expect
 create mode 100644 t/chainlint/chained-block.test
 create mode 100644 t/chainlint/chained-subshell.expect
 create mode 100644 t/chainlint/chained-subshell.test
 create mode 100644 t/chainlint/command-substitution-subsubshell.expect
 create mode 100644 t/chainlint/command-substitution-subsubshell.test
 create mode 100644 t/chainlint/double-here-doc.expect
 create mode 100644 t/chainlint/double-here-doc.test
 create mode 100644 t/chainlint/dqstring-line-splice.expect
 create mode 100644 t/chainlint/dqstring-line-splice.test
 create mode 100644 t/chainlint/dqstring-no-interpolate.expect
 create mode 100644 t/chainlint/dqstring-no-interpolate.test
 create mode 100644 t/chainlint/empty-here-doc.expect
 create mode 100644 t/chainlint/empty-here-doc.test
 create mode 100644 t/chainlint/exclamation.expect
 create mode 100644 t/chainlint/exclamation.test
 create mode 100644 t/chainlint/for-loop-abbreviated.expect
 create mode 100644 t/chainlint/for-loop-abbreviated.test
 create mode 100644 t/chainlint/function.expect
 create mode 100644 t/chainlint/function.test
 create mode 100644 t/chainlint/here-doc-indent-operator.expect
 create mode 100644 t/chainlint/here-doc-indent-operator.test
 create mode 100644 t/chainlint/if-condition-split.expect
 create mode 100644 t/chainlint/if-condition-split.test
 create mode 100644 t/chainlint/one-liner-for-loop.expect
 create mode 100644 t/chainlint/one-liner-for-loop.test
 create mode 100644 t/chainlint/sqstring-in-sqstring.expect
 create mode 100644 t/chainlint/sqstring-in-sqstring.test
 create mode 100644 t/chainlint/token-pasting.expect
 create mode 100644 t/chainlint/token-pasting.test

diff --git a/t/chainlint/blank-line-before-esac.expect b/t/chainlint/blank-line-before-esac.expect
new file mode 100644
index 00000000000..48ed4eb1246
--- /dev/null
+++ b/t/chainlint/blank-line-before-esac.expect
@@ -0,0 +1,18 @@
+test_done ( ) {
+	case "$test_failure" in
+	0 )
+		test_at_end_hook_
+
+		exit 0 ;;
+
+	* )
+		if test $test_external_has_tap -eq 0
+		then
+			say_color error "# failed $test_failure among $msg"
+			say "1..$test_count"
+		fi
+
+		exit 1 ;;
+
+		esac
+}
diff --git a/t/chainlint/blank-line-before-esac.test b/t/chainlint/blank-line-before-esac.test
new file mode 100644
index 00000000000..cecccad19f5
--- /dev/null
+++ b/t/chainlint/blank-line-before-esac.test
@@ -0,0 +1,19 @@
+# LINT: blank line before "esac"
+test_done () {
+	case "$test_failure" in
+	0)
+		test_at_end_hook_
+
+		exit 0 ;;
+
+	*)
+		if test $test_external_has_tap -eq 0
+		then
+			say_color error "# failed $test_failure among $msg"
+			say "1..$test_count"
+		fi
+
+		exit 1 ;;
+
+	esac
+}
diff --git a/t/chainlint/block.expect b/t/chainlint/block.expect
index 37dbf7d95fa..a3bcea492a9 100644
--- a/t/chainlint/block.expect
+++ b/t/chainlint/block.expect
@@ -9,4 +9,15 @@
 		echo c
 	} ?!AMP?!
 	baz
-)
+) &&
+
+{
+	echo a ; ?!AMP?! echo b
+} &&
+{ echo a ; ?!AMP?! echo b ; } &&
+
+{
+	echo "${var}9" &&
+	echo "done"
+} &&
+finis
diff --git a/t/chainlint/block.test b/t/chainlint/block.test
index 0a82fd579f6..4ab69a4afc4 100644
--- a/t/chainlint/block.test
+++ b/t/chainlint/block.test
@@ -11,4 +11,17 @@
 		echo c
 	}
 	baz
-)
+) &&
+
+# LINT: ";" not allowed in place of "&&"
+{
+	echo a; echo b
+} &&
+{ echo a; echo b; } &&
+
+# LINT: "}" inside string not mistaken as end of block
+{
+	echo "${var}9" &&
+	echo "done"
+} &&
+finis
diff --git a/t/chainlint/chained-block.expect b/t/chainlint/chained-block.expect
new file mode 100644
index 00000000000..574cdceb071
--- /dev/null
+++ b/t/chainlint/chained-block.expect
@@ -0,0 +1,9 @@
+echo nobody home && {
+	test the doohicky ?!AMP?!
+	right now
+} &&
+
+GIT_EXTERNAL_DIFF=echo git diff | {
+	read path oldfile oldhex oldmode newfile newhex newmode &&
+	test "z$oh" = "z$oldhex"
+}
diff --git a/t/chainlint/chained-block.test b/t/chainlint/chained-block.test
new file mode 100644
index 00000000000..86f81ece639
--- /dev/null
+++ b/t/chainlint/chained-block.test
@@ -0,0 +1,11 @@
+# LINT: start of block chained to preceding command
+echo nobody home && {
+	test the doohicky
+	right now
+} &&
+
+# LINT: preceding command pipes to block on same line
+GIT_EXTERNAL_DIFF=echo git diff | {
+	read path oldfile oldhex oldmode newfile newhex newmode &&
+	test "z$oh" = "z$oldhex"
+}
diff --git a/t/chainlint/chained-subshell.expect b/t/chainlint/chained-subshell.expect
new file mode 100644
index 00000000000..af0369d3285
--- /dev/null
+++ b/t/chainlint/chained-subshell.expect
@@ -0,0 +1,10 @@
+mkdir sub && (
+	cd sub &&
+	foo the bar ?!AMP?!
+	nuff said
+) &&
+
+cut "-d " -f actual | ( read s1 s2 s3 &&
+test -f $s1 ?!AMP?!
+test $(cat $s2) = tree2path1 &&
+test $(cat $s3) = tree3path1 )
diff --git a/t/chainlint/chained-subshell.test b/t/chainlint/chained-subshell.test
new file mode 100644
index 00000000000..4ff6ddd8cbd
--- /dev/null
+++ b/t/chainlint/chained-subshell.test
@@ -0,0 +1,13 @@
+# LINT: start of subshell chained to preceding command
+mkdir sub && (
+	cd sub &&
+	foo the bar
+	nuff said
+) &&
+
+# LINT: preceding command pipes to subshell on same line
+cut "-d " -f actual | (read s1 s2 s3 &&
+test -f $s1
+test $(cat $s2) = tree2path1 &&
+# LINT: closing subshell ")" correctly detected on same line as "$(...)"
+test $(cat $s3) = tree3path1)
diff --git a/t/chainlint/command-substitution-subsubshell.expect b/t/chainlint/command-substitution-subsubshell.expect
new file mode 100644
index 00000000000..ab2f79e8457
--- /dev/null
+++ b/t/chainlint/command-substitution-subsubshell.expect
@@ -0,0 +1,2 @@
+OUT=$(( ( large_git 1 >& 3 ) | : ) 3 >& 1) &&
+test_match_signal 13 "$OUT"
diff --git a/t/chainlint/command-substitution-subsubshell.test b/t/chainlint/command-substitution-subsubshell.test
new file mode 100644
index 00000000000..321de2951ce
--- /dev/null
+++ b/t/chainlint/command-substitution-subsubshell.test
@@ -0,0 +1,3 @@
+# LINT: subshell nested in subshell nested in command substitution
+OUT=$( ((large_git 1>&3) | :) 3>&1 ) &&
+test_match_signal 13 "$OUT"
diff --git a/t/chainlint/double-here-doc.expect b/t/chainlint/double-here-doc.expect
new file mode 100644
index 00000000000..75477bb1add
--- /dev/null
+++ b/t/chainlint/double-here-doc.expect
@@ -0,0 +1,2 @@
+run_sub_test_lib_test_err run-inv-range-start "--run invalid range start" --run="a-5" <<-EOF &&
+check_sub_test_lib_test_err run-inv-range-start <<-EOF_OUT 3 <<-EOF_ERR
diff --git a/t/chainlint/double-here-doc.test b/t/chainlint/double-here-doc.test
new file mode 100644
index 00000000000..cd584a43573
--- /dev/null
+++ b/t/chainlint/double-here-doc.test
@@ -0,0 +1,12 @@
+run_sub_test_lib_test_err run-inv-range-start \
+	"--run invalid range start" \
+	--run="a-5" <<-\EOF &&
+test_expect_success "passing test #1" "true"
+test_done
+EOF
+check_sub_test_lib_test_err run-inv-range-start \
+	<<-\EOF_OUT 3<<-EOF_ERR
+> FATAL: Unexpected exit with code 1
+EOF_OUT
+> error: --run: invalid non-numeric in range start: ${SQ}a-5${SQ}
+EOF_ERR
diff --git a/t/chainlint/dqstring-line-splice.expect b/t/chainlint/dqstring-line-splice.expect
new file mode 100644
index 00000000000..bf9ced60d4c
--- /dev/null
+++ b/t/chainlint/dqstring-line-splice.expect
@@ -0,0 +1,3 @@
+echo 'fatal: reword option of --fixup is mutually exclusive with' '--patch/--interactive/--all/--include/--only' > expect &&
+test_must_fail git commit --fixup=reword:HEAD~ $1 2 > actual &&
+test_cmp expect actual
diff --git a/t/chainlint/dqstring-line-splice.test b/t/chainlint/dqstring-line-splice.test
new file mode 100644
index 00000000000..b40714439f6
--- /dev/null
+++ b/t/chainlint/dqstring-line-splice.test
@@ -0,0 +1,7 @@
+# LINT: line-splice within DQ-string
+'"
+echo 'fatal: reword option of --fixup is mutually exclusive with'\
+	'--patch/--interactive/--all/--include/--only' >expect &&
+test_must_fail git commit --fixup=reword:HEAD~ $1 2>actual &&
+test_cmp expect actual
+"'
diff --git a/t/chainlint/dqstring-no-interpolate.expect b/t/chainlint/dqstring-no-interpolate.expect
new file mode 100644
index 00000000000..10724987a5f
--- /dev/null
+++ b/t/chainlint/dqstring-no-interpolate.expect
@@ -0,0 +1,11 @@
+grep "^ ! [rejected][ ]*$BRANCH -> $BRANCH (non-fast-forward)$" out &&
+
+grep "^\.git$" output.txt &&
+
+
+(
+	cd client$version &&
+	GIT_TEST_PROTOCOL_VERSION=$version git fetch-pack --no-progress .. $(cat ../input)
+) > output &&
+	cut -d ' ' -f 2 < output | sort > actual &&
+	test_cmp expect actual
diff --git a/t/chainlint/dqstring-no-interpolate.test b/t/chainlint/dqstring-no-interpolate.test
new file mode 100644
index 00000000000..d2f4219cbbb
--- /dev/null
+++ b/t/chainlint/dqstring-no-interpolate.test
@@ -0,0 +1,15 @@
+# LINT: regex dollar-sign eol anchor in double-quoted string not special
+grep "^ ! \[rejected\][ ]*$BRANCH -> $BRANCH (non-fast-forward)$" out &&
+
+# LINT: escaped "$" not mistaken for variable expansion
+grep "^\\.git\$" output.txt &&
+
+'"
+(
+	cd client$version &&
+# LINT: escaped dollar-sign in double-quoted test body
+	GIT_TEST_PROTOCOL_VERSION=$version git fetch-pack --no-progress .. \$(cat ../input)
+) >output &&
+	cut -d ' ' -f 2 <output | sort >actual &&
+	test_cmp expect actual
+"'
diff --git a/t/chainlint/empty-here-doc.expect b/t/chainlint/empty-here-doc.expect
new file mode 100644
index 00000000000..f42f2d41ba8
--- /dev/null
+++ b/t/chainlint/empty-here-doc.expect
@@ -0,0 +1,3 @@
+git ls-tree $tree path > current &&
+cat > expected <<EOF &&
+test_output
diff --git a/t/chainlint/empty-here-doc.test b/t/chainlint/empty-here-doc.test
new file mode 100644
index 00000000000..24fc165de3f
--- /dev/null
+++ b/t/chainlint/empty-here-doc.test
@@ -0,0 +1,5 @@
+git ls-tree $tree path >current &&
+# LINT: empty here-doc
+cat >expected <<\EOF &&
+EOF
+test_output
diff --git a/t/chainlint/exclamation.expect b/t/chainlint/exclamation.expect
new file mode 100644
index 00000000000..2d961a58c66
--- /dev/null
+++ b/t/chainlint/exclamation.expect
@@ -0,0 +1,4 @@
+if ! condition ; then echo nope ; else yep ; fi &&
+test_prerequisite !MINGW &&
+mail uucp!address &&
+echo !whatever!
diff --git a/t/chainlint/exclamation.test b/t/chainlint/exclamation.test
new file mode 100644
index 00000000000..323595b5bd8
--- /dev/null
+++ b/t/chainlint/exclamation.test
@@ -0,0 +1,8 @@
+# LINT: "! word" is two tokens
+if ! condition; then echo nope; else yep; fi &&
+# LINT: "!word" is single token, not two tokens "!" and "word"
+test_prerequisite !MINGW &&
+# LINT: "word!word" is single token, not three tokens "word", "!", and "word"
+mail uucp!address &&
+# LINT: "!word!" is single token, not three tokens "!", "word", and "!"
+echo !whatever!
diff --git a/t/chainlint/for-loop-abbreviated.expect b/t/chainlint/for-loop-abbreviated.expect
new file mode 100644
index 00000000000..a21007a63f1
--- /dev/null
+++ b/t/chainlint/for-loop-abbreviated.expect
@@ -0,0 +1,5 @@
+for it
+do
+	path=$(expr "$it" : ( [^:]*) ) &&
+	git update-index --add "$path" || exit
+done
diff --git a/t/chainlint/for-loop-abbreviated.test b/t/chainlint/for-loop-abbreviated.test
new file mode 100644
index 00000000000..1084eccb89c
--- /dev/null
+++ b/t/chainlint/for-loop-abbreviated.test
@@ -0,0 +1,6 @@
+# LINT: for-loop lacking optional "in [word...]" before "do"
+for it
+do
+	path=$(expr "$it" : '\([^:]*\)') &&
+	git update-index --add "$path" || exit
+done
diff --git a/t/chainlint/function.expect b/t/chainlint/function.expect
new file mode 100644
index 00000000000..a14388e6b9f
--- /dev/null
+++ b/t/chainlint/function.expect
@@ -0,0 +1,11 @@
+sha1_file ( ) {
+	echo "$*" | sed "s#..#.git/objects/&/#"
+} &&
+
+remove_object ( ) {
+	file=$(sha1_file "$*") &&
+	test -e "$file" ?!AMP?!
+	rm -f "$file"
+} ?!AMP?!
+
+sha1_file arg && remove_object arg
diff --git a/t/chainlint/function.test b/t/chainlint/function.test
new file mode 100644
index 00000000000..5ee59562c93
--- /dev/null
+++ b/t/chainlint/function.test
@@ -0,0 +1,13 @@
+# LINT: "()" in function definition not mistaken for subshell
+sha1_file() {
+	echo "$*" | sed "s#..#.git/objects/&/#"
+} &&
+
+# LINT: broken &&-chain in function and after function
+remove_object() {
+	file=$(sha1_file "$*") &&
+	test -e "$file"
+	rm -f "$file"
+}
+
+sha1_file arg && remove_object arg
diff --git a/t/chainlint/here-doc-indent-operator.expect b/t/chainlint/here-doc-indent-operator.expect
new file mode 100644
index 00000000000..fb6cf7285d0
--- /dev/null
+++ b/t/chainlint/here-doc-indent-operator.expect
@@ -0,0 +1,5 @@
+cat > expect <<-EOF &&
+
+cat > expect <<-EOF ?!AMP?!
+
+cleanup
diff --git a/t/chainlint/here-doc-indent-operator.test b/t/chainlint/here-doc-indent-operator.test
new file mode 100644
index 00000000000..c8a6f18eb45
--- /dev/null
+++ b/t/chainlint/here-doc-indent-operator.test
@@ -0,0 +1,13 @@
+# LINT: whitespace between operator "<<-" and tag legal
+cat >expect <<- EOF &&
+header: 43475048 1 $(test_oid oid_version) $NUM_CHUNKS 0
+num_commits: $1
+chunks: oid_fanout oid_lookup commit_metadata generation_data bloom_indexes bloom_data
+EOF
+
+# LINT: not an indented here-doc; just a plain here-doc with tag named "-EOF"
+cat >expect << -EOF
+this is not indented
+-EOF
+
+cleanup
diff --git a/t/chainlint/if-condition-split.expect b/t/chainlint/if-condition-split.expect
new file mode 100644
index 00000000000..ee745ef8d7f
--- /dev/null
+++ b/t/chainlint/if-condition-split.expect
@@ -0,0 +1,7 @@
+if bob &&
+   marcia ||
+   kevin
+then
+	echo "nomads" ?!AMP?!
+	echo "for sure"
+fi
diff --git a/t/chainlint/if-condition-split.test b/t/chainlint/if-condition-split.test
new file mode 100644
index 00000000000..240daa9fd5d
--- /dev/null
+++ b/t/chainlint/if-condition-split.test
@@ -0,0 +1,8 @@
+# LINT: "if" condition split across multiple lines at "&&" or "||"
+if bob &&
+   marcia ||
+   kevin
+then
+	echo "nomads"
+	echo "for sure"
+fi
diff --git a/t/chainlint/one-liner-for-loop.expect b/t/chainlint/one-liner-for-loop.expect
new file mode 100644
index 00000000000..51a3dc7c544
--- /dev/null
+++ b/t/chainlint/one-liner-for-loop.expect
@@ -0,0 +1,9 @@
+git init dir-rename-and-content &&
+(
+	cd dir-rename-and-content &&
+	test_write_lines 1 2 3 4 5 >foo &&
+	mkdir olddir &&
+	for i in a b c; do echo $i >olddir/$i; ?!LOOP?! done ?!AMP?!
+	git add foo olddir &&
+	git commit -m "original" &&
+)
diff --git a/t/chainlint/one-liner-for-loop.test b/t/chainlint/one-liner-for-loop.test
new file mode 100644
index 00000000000..4bd8c066c79
--- /dev/null
+++ b/t/chainlint/one-liner-for-loop.test
@@ -0,0 +1,10 @@
+git init dir-rename-and-content &&
+(
+	cd dir-rename-and-content &&
+	test_write_lines 1 2 3 4 5 >foo &&
+	mkdir olddir &&
+# LINT: one-liner for-loop missing "|| exit"; also broken &&-chain
+	for i in a b c; do echo $i >olddir/$i; done
+	git add foo olddir &&
+	git commit -m "original" &&
+)
diff --git a/t/chainlint/sqstring-in-sqstring.expect b/t/chainlint/sqstring-in-sqstring.expect
new file mode 100644
index 00000000000..cf0b591cf7d
--- /dev/null
+++ b/t/chainlint/sqstring-in-sqstring.expect
@@ -0,0 +1,4 @@
+perl -e '
+	defined($_ = -s $_) or die for @ARGV;
+	exit 1 if $ARGV[0] <= $ARGV[1];
+' test-2-$packname_2.pack test-3-$packname_3.pack
diff --git a/t/chainlint/sqstring-in-sqstring.test b/t/chainlint/sqstring-in-sqstring.test
new file mode 100644
index 00000000000..77a425e0c79
--- /dev/null
+++ b/t/chainlint/sqstring-in-sqstring.test
@@ -0,0 +1,5 @@
+# LINT: SQ-string Perl code fragment within SQ-string
+perl -e '\''
+	defined($_ = -s $_) or die for @ARGV;
+	exit 1 if $ARGV[0] <= $ARGV[1];
+'\'' test-2-$packname_2.pack test-3-$packname_3.pack
diff --git a/t/chainlint/token-pasting.expect b/t/chainlint/token-pasting.expect
new file mode 100644
index 00000000000..342360bcd05
--- /dev/null
+++ b/t/chainlint/token-pasting.expect
@@ -0,0 +1,27 @@
+git config filter.rot13.smudge ./rot13.sh &&
+git config filter.rot13.clean ./rot13.sh &&
+
+{
+    echo "*.t filter=rot13" ?!AMP?!
+    echo "*.i ident"
+} > .gitattributes &&
+
+{
+    echo a b c d e f g h i j k l m ?!AMP?!
+    echo n o p q r s t u v w x y z ?!AMP?!
+    echo '$Id$'
+} > test &&
+cat test > test.t &&
+cat test > test.o &&
+cat test > test.i &&
+git add test test.t test.i &&
+rm -f test test.t test.i &&
+git checkout -- test test.t test.i &&
+
+echo "content-test2" > test2.o &&
+echo "content-test3 - filename with special characters" > "test3 'sq',$x=.o" ?!AMP?!
+
+downstream_url_for_sed=$(
+	printf "%sn" "$downstream_url" |
+	sed -e 's/\/\\/g' -e 's/[[/.*^$]/\&/g'
+)
diff --git a/t/chainlint/token-pasting.test b/t/chainlint/token-pasting.test
new file mode 100644
index 00000000000..b4610ce815a
--- /dev/null
+++ b/t/chainlint/token-pasting.test
@@ -0,0 +1,32 @@
+# LINT: single token; composite of multiple strings
+git config filter.rot13.smudge ./rot13.sh &&
+git config filter.rot13.clean ./rot13.sh &&
+
+{
+    echo "*.t filter=rot13"
+    echo "*.i ident"
+} >.gitattributes &&
+
+{
+    echo a b c d e f g h i j k l m
+    echo n o p q r s t u v w x y z
+# LINT: exit/enter string context and escaped-quote outside of string
+    echo '\''$Id$'\''
+} >test &&
+cat test >test.t &&
+cat test >test.o &&
+cat test >test.i &&
+git add test test.t test.i &&
+rm -f test test.t test.i &&
+git checkout -- test test.t test.i &&
+
+echo "content-test2" >test2.o &&
+# LINT: exit/enter string context and escaped-quote outside of string
+echo "content-test3 - filename with special characters" >"test3 '\''sq'\'',\$x=.o"
+
+# LINT: single token; composite of multiple strings
+downstream_url_for_sed=$(
+	printf "%s\n" "$downstream_url" |
+# LINT: exit/enter string context; "&" inside string not command terminator
+	sed -e '\''s/\\/\\\\/g'\'' -e '\''s/[[/.*^$]/\\&/g'\''
+)
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 15/18] test-lib: retire "lint harder" optimization hack
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (13 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 14/18] t/chainlint: add more chainlint.pl self-tests Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 16/18] test-lib: replace chainlint.sed with chainlint.pl Eric Sunshine via GitGitGadget
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

`test_run_` in test-lib.sh "lints" the body of a test by sending it down
a `sed chainlint.sed | grep` pipeline; this happens once for each test
run by a test script. Although this pipeline may seem relatively cheap
in isolation, it can become expensive when invoked 26800+ times by `make
test`, once for each test run, despite the existence of only 16500+ test
definitions across all tests scripts.

This difference in the number of tests defined in the scripts (16500+)
and the number of tests actually run by `make test` (26800+) is
explained by the fact that some test scripts run a very large number of
small tests, all driven by a series of functions/loops which fill in the
test bodies. This means that certain test definitions are being linted
repeatedly (tens or hundreds of times) unnecessarily. To avoid such
unnecessary work, 2d86a96220 (t: avoid sed-based chain-linting in some
expensive cases, 2021-05-13) added an optimization hack which allows
individual scripts to manually suppress the unnecessary repeated linting
of the same test definition.

However, unlike chainlint.sed which checks a test body as the test is
run, chainlint.pl checks each test definition just once, no matter how
many times the test is run, thus the sort of optimization hack
introduced by 2d86a96220 is no longer needed and can be retired.
Therefore, revert 2d86a96220.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/README             | 5 -----
 t/t0027-auto-crlf.sh | 7 +------
 t/t3070-wildmatch.sh | 5 -----
 t/test-lib.sh        | 7 ++-----
 4 files changed, 3 insertions(+), 21 deletions(-)

diff --git a/t/README b/t/README
index 2f439f96589..979b2d4833d 100644
--- a/t/README
+++ b/t/README
@@ -196,11 +196,6 @@ appropriately before running "make". Short options can be bundled, i.e.
 	this feature by setting the GIT_TEST_CHAIN_LINT environment
 	variable to "1" or "0", respectively.
 
-	A few test scripts disable some of the more advanced
-	chain-linting detection in the name of efficiency. You can
-	override this by setting the GIT_TEST_CHAIN_LINT_HARDER
-	environment variable to "1".
-
 --stress::
 	Run the test script repeatedly in multiple parallel jobs until
 	one of them fails.  Useful for reproducing rare failures in
diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
index a22e0e1382c..a94ac1eae37 100755
--- a/t/t0027-auto-crlf.sh
+++ b/t/t0027-auto-crlf.sh
@@ -387,9 +387,7 @@ test_expect_success 'setup main' '
 	test_tick
 '
 
-# Disable extra chain-linting for the next set of tests. There are many
-# auto-generated ones that are not worth checking over and over.
-GIT_TEST_CHAIN_LINT_HARDER_DEFAULT=0
+
 
 warn_LF_CRLF="LF will be replaced by CRLF"
 warn_CRLF_LF="CRLF will be replaced by LF"
@@ -606,9 +604,6 @@ do
 	checkout_files     ""    "$id" "crlf" true    ""       CRLF  CRLF  CRLF         CRLF_mix_CR  CRLF_nul
 done
 
-# The rest of the tests are unique; do the usual linting.
-unset GIT_TEST_CHAIN_LINT_HARDER_DEFAULT
-
 # Should be the last test case: remove some files from the worktree
 test_expect_success 'ls-files --eol -d -z' '
 	rm crlf_false_attr__CRLF.txt crlf_false_attr__CRLF_mix_LF.txt crlf_false_attr__LF.txt .gitattributes &&
diff --git a/t/t3070-wildmatch.sh b/t/t3070-wildmatch.sh
index f9539968e4c..5d871fde960 100755
--- a/t/t3070-wildmatch.sh
+++ b/t/t3070-wildmatch.sh
@@ -5,11 +5,6 @@ test_description='wildmatch tests'
 TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
-# Disable expensive chain-lint tests; all of the tests in this script
-# are variants of a few trivial test-tool invocations, and there are a lot of
-# them.
-GIT_TEST_CHAIN_LINT_HARDER_DEFAULT=0
-
 should_create_test_file() {
 	file=$1
 
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 377cc1c1203..dc0d0591095 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1091,11 +1091,8 @@ test_run_ () {
 		trace=
 		# 117 is magic because it is unlikely to match the exit
 		# code of other programs
-		if test "OK-117" != "$(test_eval_ "(exit 117) && $1${LF}${LF}echo OK-\$?" 3>&1)" ||
-		   {
-			test "${GIT_TEST_CHAIN_LINT_HARDER:-${GIT_TEST_CHAIN_LINT_HARDER_DEFAULT:-1}}" != 0 &&
-			$(printf '%s\n' "$1" | sed -f "$GIT_BUILD_DIR/t/chainlint.sed" | grep -q '?![A-Z][A-Z]*?!')
-		   }
+		if $(printf '%s\n' "$1" | sed -f "$GIT_BUILD_DIR/t/chainlint.sed" | grep -q '?![A-Z][A-Z]*?!') ||
+			test "OK-117" != "$(test_eval_ "(exit 117) && $1${LF}${LF}echo OK-\$?" 3>&1)"
 		then
 			BUG "broken &&-chain or run-away HERE-DOC: $1"
 		fi
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 16/18] test-lib: replace chainlint.sed with chainlint.pl
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (14 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 15/18] test-lib: retire "lint harder" optimization hack Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-03  5:07   ` Elijah Newren
  2022-09-01  0:29 ` [PATCH 17/18] t/Makefile: teach `make test` and `make prove` to run chainlint.pl Eric Sunshine via GitGitGadget
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

By automatically invoking chainlint.sed upon each test it runs,
`test_run_` in test-lib.sh ensures that broken &&-chains will be
detected early as tests are modified or new are tests created since it
is typical to run a test script manually (i.e. `./t1234-test-script.sh`)
during test development. Now that the implementation of chainlint.pl is
complete, modify test-lib.sh to invoke it automatically instead of
chainlint.sed each time a test script is run.

This change reduces the number of "linter" invocations from 26800+ (once
per test run) down to 1050+ (once per test script), however, a
subsequent change will drop the number of invocations to 1 per `make
test`, thus fully realizing the benefit of the new linter.

Note that the "magic exit code 117" &&-chain checker added by bb79af9d09
(t/test-lib: introduce --chain-lint option, 2015-03-20) which is built
into t/test-lib.sh is retained since it has near zero-cost and
(theoretically) may catch a broken &&-chain not caught by chainlint.pl.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 contrib/buildsystems/CMakeLists.txt | 2 +-
 t/test-lib.sh                       | 9 +++++++--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
index 2237109b57f..ca358a21a5f 100644
--- a/contrib/buildsystems/CMakeLists.txt
+++ b/contrib/buildsystems/CMakeLists.txt
@@ -1076,7 +1076,7 @@ if(NOT ${CMAKE_BINARY_DIR}/CMakeCache.txt STREQUAL ${CACHE_PATH})
 		"string(REPLACE \"\${GIT_BUILD_DIR_REPL}\" \"GIT_BUILD_DIR=\\\"$TEST_DIRECTORY/../${BUILD_DIR_RELATIVE}\\\"\" content \"\${content}\")\n"
 		"file(WRITE ${CMAKE_SOURCE_DIR}/t/test-lib.sh \${content})")
 	#misc copies
-	file(COPY ${CMAKE_SOURCE_DIR}/t/chainlint.sed DESTINATION ${CMAKE_BINARY_DIR}/t/)
+	file(COPY ${CMAKE_SOURCE_DIR}/t/chainlint.pl DESTINATION ${CMAKE_BINARY_DIR}/t/)
 	file(COPY ${CMAKE_SOURCE_DIR}/po/is.po DESTINATION ${CMAKE_BINARY_DIR}/po/)
 	file(COPY ${CMAKE_SOURCE_DIR}/mergetools/tkdiff DESTINATION ${CMAKE_BINARY_DIR}/mergetools/)
 	file(COPY ${CMAKE_SOURCE_DIR}/contrib/completion/git-prompt.sh DESTINATION ${CMAKE_BINARY_DIR}/contrib/completion/)
diff --git a/t/test-lib.sh b/t/test-lib.sh
index dc0d0591095..a65df2fd220 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1091,8 +1091,7 @@ test_run_ () {
 		trace=
 		# 117 is magic because it is unlikely to match the exit
 		# code of other programs
-		if $(printf '%s\n' "$1" | sed -f "$GIT_BUILD_DIR/t/chainlint.sed" | grep -q '?![A-Z][A-Z]*?!') ||
-			test "OK-117" != "$(test_eval_ "(exit 117) && $1${LF}${LF}echo OK-\$?" 3>&1)"
+		if test "OK-117" != "$(test_eval_ "(exit 117) && $1${LF}${LF}echo OK-\$?" 3>&1)"
 		then
 			BUG "broken &&-chain or run-away HERE-DOC: $1"
 		fi
@@ -1588,6 +1587,12 @@ then
 	BAIL_OUT_ENV_NEEDS_SANITIZE_LEAK "GIT_TEST_SANITIZE_LEAK_LOG=true"
 fi
 
+if test "${GIT_TEST_CHAIN_LINT:-1}" != 0
+then
+	"$PERL_PATH" "$TEST_DIRECTORY/chainlint.pl" "$0" ||
+		BUG "lint error (see '?!...!? annotations above)"
+fi
+
 # Last-minute variable setup
 USER_HOME="$HOME"
 HOME="$TRASH_DIRECTORY"
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 17/18] t/Makefile: teach `make test` and `make prove` to run chainlint.pl
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (15 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 16/18] test-lib: replace chainlint.sed with chainlint.pl Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-01  0:29 ` [PATCH 18/18] t: retire unused chainlint.sed Eric Sunshine via GitGitGadget
  2022-09-11  5:28 ` [PATCH 00/18] make test "linting" more comprehensive Jeff King
  18 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

Unlike chainlint.sed which "lints" a single test body at a time, thus is
invoked once per test, chainlint.pl can check all test bodies in all
test scripts with a single invocation. As such, it is akin to other bulk
"linters" run by the Makefile, such as `test-lint-shell-syntax`,
`test-lint-duplicates`, etc.

Therefore, teach `make test` and `make prove` to invoke chainlint.pl
along with the other bulk linters. Also, since the single chainlint.pl
invocation by `make test` or `make prove` has already checked all tests
in all scripts, instruct the individual test scripts not to run
chainlint.pl on themselves unnecessarily.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/Makefile | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/t/Makefile b/t/Makefile
index 11f276774ea..3db48c0cb64 100644
--- a/t/Makefile
+++ b/t/Makefile
@@ -36,14 +36,21 @@ CHAINLINTTMP_SQ = $(subst ','\'',$(CHAINLINTTMP))
 
 T = $(sort $(wildcard t[0-9][0-9][0-9][0-9]-*.sh))
 THELPERS = $(sort $(filter-out $(T),$(wildcard *.sh)))
+TLIBS = $(sort $(wildcard lib-*.sh)) annotate-tests.sh
 TPERF = $(sort $(wildcard perf/p[0-9][0-9][0-9][0-9]-*.sh))
+TINTEROP = $(sort $(wildcard interop/i[0-9][0-9][0-9][0-9]-*.sh))
 CHAINLINTTESTS = $(sort $(patsubst chainlint/%.test,%,$(wildcard chainlint/*.test)))
 CHAINLINT = '$(PERL_PATH_SQ)' chainlint.pl
 
+# `test-chainlint` (which is a dependency of `test-lint`, `test` and `prove`)
+# checks all tests in all scripts via a single invocation, so tell individual
+# scripts not to "chainlint" themselves
+CHAINLINTSUPPRESS = GIT_TEST_CHAIN_LINT=0 && export GIT_TEST_CHAIN_LINT &&
+
 all: $(DEFAULT_TEST_TARGET)
 
 test: pre-clean check-chainlint $(TEST_LINT)
-	$(MAKE) aggregate-results-and-cleanup
+	$(CHAINLINTSUPPRESS) $(MAKE) aggregate-results-and-cleanup
 
 failed:
 	@failed=$$(cd '$(TEST_RESULTS_DIRECTORY_SQ)' && \
@@ -52,7 +59,7 @@ failed:
 	test -z "$$failed" || $(MAKE) $$failed
 
 prove: pre-clean check-chainlint $(TEST_LINT)
-	@echo "*** prove ***"; $(PROVE) --exec '$(TEST_SHELL_PATH_SQ)' $(GIT_PROVE_OPTS) $(T) :: $(GIT_TEST_OPTS)
+	@echo "*** prove ***"; $(CHAINLINTSUPPRESS) $(PROVE) --exec '$(TEST_SHELL_PATH_SQ)' $(GIT_PROVE_OPTS) $(T) :: $(GIT_TEST_OPTS)
 	$(MAKE) clean-except-prove-cache
 
 $(T):
@@ -99,6 +106,9 @@ check-chainlint:
 
 test-lint: test-lint-duplicates test-lint-executable test-lint-shell-syntax \
 	test-lint-filenames
+ifneq ($(GIT_TEST_CHAIN_LINT),0)
+test-lint: test-chainlint
+endif
 
 test-lint-duplicates:
 	@dups=`echo $(T) $(TPERF) | tr ' ' '\n' | sed 's/-.*//' | sort | uniq -d` && \
@@ -121,6 +131,9 @@ test-lint-filenames:
 		test -z "$$bad" || { \
 		echo >&2 "non-portable file name(s): $$bad"; exit 1; }
 
+test-chainlint:
+	@$(CHAINLINT) $(T) $(TLIBS) $(TPERF) $(TINTEROP)
+
 aggregate-results-and-cleanup: $(T)
 	$(MAKE) aggregate-results
 	$(MAKE) clean
@@ -136,4 +149,5 @@ valgrind:
 perf:
 	$(MAKE) -C perf/ all
 
-.PHONY: pre-clean $(T) aggregate-results clean valgrind perf check-chainlint clean-chainlint
+.PHONY: pre-clean $(T) aggregate-results clean valgrind perf \
+	check-chainlint clean-chainlint test-chainlint
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 18/18] t: retire unused chainlint.sed
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (16 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 17/18] t/Makefile: teach `make test` and `make prove` to run chainlint.pl Eric Sunshine via GitGitGadget
@ 2022-09-01  0:29 ` Eric Sunshine via GitGitGadget
  2022-09-02 12:42   ` several messages Johannes Schindelin
  2022-09-11  5:28 ` [PATCH 00/18] make test "linting" more comprehensive Jeff King
  18 siblings, 1 reply; 51+ messages in thread
From: Eric Sunshine via GitGitGadget @ 2022-09-01  0:29 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine, Eric Sunshine

From: Eric Sunshine <sunshine@sunshineco.com>

Retire chainlint.sed since it has been replaced by a more accurate and
functional &&-chain "linter", thus is no longer used.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 t/chainlint.sed | 399 ------------------------------------------------
 1 file changed, 399 deletions(-)
 delete mode 100644 t/chainlint.sed

diff --git a/t/chainlint.sed b/t/chainlint.sed
deleted file mode 100644
index dc4ce37cb51..00000000000
--- a/t/chainlint.sed
+++ /dev/null
@@ -1,399 +0,0 @@
-#------------------------------------------------------------------------------
-# Detect broken &&-chains in tests.
-#
-# At present, only &&-chains in subshells are examined by this linter;
-# top-level &&-chains are instead checked directly by the test framework. Like
-# the top-level &&-chain linter, the subshell linter (intentionally) does not
-# check &&-chains within {...} blocks.
-#
-# Checking for &&-chain breakage is done line-by-line by pure textual
-# inspection.
-#
-# Incomplete lines (those ending with "\") are stitched together with following
-# lines to simplify processing, particularly of "one-liner" statements.
-# Top-level here-docs are swallowed to avoid false positives within the
-# here-doc body, although the statement to which the here-doc is attached is
-# retained.
-#
-# Heuristics are used to detect end-of-subshell when the closing ")" is cuddled
-# with the final subshell statement on the same line:
-#
-#    (cd foo &&
-#        bar)
-#
-# in order to avoid misinterpreting the ")" in constructs such as "x=$(...)"
-# and "case $x in *)" as ending the subshell.
-#
-# Lines missing a final "&&" are flagged with "?!AMP?!", as are lines which
-# chain commands with ";" internally rather than "&&". A line may be flagged
-# for both violations.
-#
-# Detection of a missing &&-link in a multi-line subshell is complicated by the
-# fact that the last statement before the closing ")" must not end with "&&".
-# Since processing is line-by-line, it is not known whether a missing "&&" is
-# legitimate or not until the _next_ line is seen. To accommodate this, within
-# multi-line subshells, each line is stored in sed's "hold" area until after
-# the next line is seen and processed. If the next line is a stand-alone ")",
-# then a missing "&&" on the previous line is legitimate; otherwise a missing
-# "&&" is a break in the &&-chain.
-#
-#    (
-#         cd foo &&
-#         bar
-#    )
-#
-# In practical terms, when "bar" is encountered, it is flagged with "?!AMP?!",
-# but when the stand-alone ")" line is seen which closes the subshell, the
-# "?!AMP?!" violation is removed from the "bar" line (retrieved from the "hold"
-# area) since the final statement of a subshell must not end with "&&". The
-# final line of a subshell may still break the &&-chain by using ";" internally
-# to chain commands together rather than "&&", but an internal "?!AMP?!" is
-# never removed from a line even though a line-ending "?!AMP?!" might be.
-#
-# Care is taken to recognize the last _statement_ of a multi-line subshell, not
-# necessarily the last textual _line_ within the subshell, since &&-chaining
-# applies to statements, not to lines. Consequently, blank lines, comment
-# lines, and here-docs are swallowed (but not the command to which the here-doc
-# is attached), leaving the last statement in the "hold" area, not the last
-# line, thus simplifying &&-link checking.
-#
-# The final statement before "done" in for- and while-loops, and before "elif",
-# "else", and "fi" in if-then-else likewise must not end with "&&", thus
-# receives similar treatment.
-#
-# Swallowing here-docs with arbitrary tags requires a bit of finesse. When a
-# line such as "cat <<EOF" is seen, the here-doc tag is copied to the front of
-# the line enclosed in angle brackets as a sentinel, giving "<EOF>cat <<EOF".
-# As each subsequent line is read, it is appended to the target line and a
-# (whitespace-loose) back-reference match /^<(.*)>\n\1$/ is attempted to see if
-# the content inside "<...>" matches the entirety of the newly-read line. For
-# instance, if the next line read is "some data", when concatenated with the
-# target line, it becomes "<EOF>cat <<EOF\nsome data", and a match is attempted
-# to see if "EOF" matches "some data". Since it doesn't, the next line is
-# attempted. When a line consisting of only "EOF" (and possible whitespace) is
-# encountered, it is appended to the target line giving "<EOF>cat <<EOF\nEOF",
-# in which case the "EOF" inside "<...>" does match the text following the
-# newline, thus the closing here-doc tag has been found. The closing tag line
-# and the "<...>" prefix on the target line are then discarded, leaving just
-# the target line "cat <<EOF".
-#------------------------------------------------------------------------------
-
-# incomplete line -- slurp up next line
-:squash
-/\\$/ {
-	N
-	s/\\\n//
-	bsquash
-}
-
-# here-doc -- swallow it to avoid false hits within its body (but keep the
-# command to which it was attached)
-/<<-*[ 	]*[\\'"]*[A-Za-z0-9_]/ {
-	/"[^"]*<<[^"]*"/bnotdoc
-	s/^\(.*<<-*[ 	]*\)[\\'"]*\([A-Za-z0-9_][A-Za-z0-9_]*\)['"]*/<\2>\1\2/
-	:hered
-	N
-	/^<\([^>]*\)>.*\n[ 	]*\1[ 	]*$/!{
-		s/\n.*$//
-		bhered
-	}
-	s/^<[^>]*>//
-	s/\n.*$//
-}
-:notdoc
-
-# one-liner "(...) &&"
-/^[ 	]*!*[ 	]*(..*)[ 	]*&&[ 	]*$/boneline
-
-# same as above but without trailing "&&"
-/^[ 	]*!*[ 	]*(..*)[ 	]*$/boneline
-
-# one-liner "(...) >x" (or "2>x" or "<x" or "|x" or "&"
-/^[ 	]*!*[ 	]*(..*)[ 	]*[0-9]*[<>|&]/boneline
-
-# multi-line "(...\n...)"
-/^[ 	]*(/bsubsh
-
-# innocuous line -- print it and advance to next line
-b
-
-# found one-liner "(...)" -- mark suspect if it uses ";" internally rather than
-# "&&" (but not ";" in a string)
-:oneline
-/;/{
-	/"[^"]*;[^"]*"/!s/;/; ?!AMP?!/
-}
-b
-
-:subsh
-# bare "(" line? -- stash for later printing
-/^[ 	]*([	]*$/ {
-	h
-	bnextln
-}
-# "(..." line -- "(" opening subshell cuddled with command; temporarily replace
-# "(" with sentinel "^" and process the line as if "(" had been seen solo on
-# the preceding line; this temporary replacement prevents several rules from
-# accidentally thinking "(" introduces a nested subshell; "^" is changed back
-# to "(" at output time
-x
-s/.*//
-x
-s/(/^/
-bslurp
-
-:nextln
-N
-s/.*\n//
-
-:slurp
-# incomplete line "...\"
-/\\$/bicmplte
-# multi-line quoted string "...\n..."?
-/"/bdqstr
-# multi-line quoted string '...\n...'? (but not contraction in string "it's")
-/'/{
-	/"[^'"]*'[^'"]*"/!bsqstr
-}
-:folded
-# here-doc -- swallow it (but not "<<" in a string)
-/<<-*[ 	]*[\\'"]*[A-Za-z0-9_]/{
-	/"[^"]*<<[^"]*"/!bheredoc
-}
-# comment or empty line -- discard since final non-comment, non-empty line
-# before closing ")", "done", "elsif", "else", or "fi" will need to be
-# re-visited to drop "suspect" marking since final line of those constructs
-# legitimately lacks "&&", so "suspect" mark must be removed
-/^[ 	]*#/bnextln
-/^[ 	]*$/bnextln
-# in-line comment -- strip it (but not "#" in a string, Bash ${#...} array
-# length, or Perforce "//depot/path#42" revision in filespec)
-/[ 	]#/{
-	/"[^"]*#[^"]*"/!s/[ 	]#.*$//
-}
-# one-liner "case ... esac"
-/^[ 	^]*case[ 	]*..*esac/bchkchn
-# multi-line "case ... esac"
-/^[ 	^]*case[ 	]..*[ 	]in/bcase
-# multi-line "for ... done" or "while ... done"
-/^[ 	^]*for[ 	]..*[ 	]in/bcont
-/^[ 	^]*while[ 	]/bcont
-/^[ 	]*do[ 	]/bcont
-/^[ 	]*do[ 	]*$/bcont
-/;[ 	]*do/bcont
-/^[ 	]*done[ 	]*&&[ 	]*$/bdone
-/^[ 	]*done[ 	]*$/bdone
-/^[ 	]*done[ 	]*[<>|]/bdone
-/^[ 	]*done[ 	]*)/bdone
-/||[ 	]*exit[ 	]/bcont
-/||[ 	]*exit[ 	]*$/bcont
-# multi-line "if...elsif...else...fi"
-/^[ 	^]*if[ 	]/bcont
-/^[ 	]*then[ 	]/bcont
-/^[ 	]*then[ 	]*$/bcont
-/;[ 	]*then/bcont
-/^[ 	]*elif[ 	]/belse
-/^[ 	]*elif[ 	]*$/belse
-/^[ 	]*else[ 	]/belse
-/^[ 	]*else[ 	]*$/belse
-/^[ 	]*fi[ 	]*&&[ 	]*$/bdone
-/^[ 	]*fi[ 	]*$/bdone
-/^[ 	]*fi[ 	]*[<>|]/bdone
-/^[ 	]*fi[ 	]*)/bdone
-# nested one-liner "(...) &&"
-/^[ 	^]*(.*)[ 	]*&&[ 	]*$/bchkchn
-# nested one-liner "(...)"
-/^[ 	^]*(.*)[ 	]*$/bchkchn
-# nested one-liner "(...) >x" (or "2>x" or "<x" or "|x")
-/^[ 	^]*(.*)[ 	]*[0-9]*[<>|]/bchkchn
-# nested multi-line "(...\n...)"
-/^[ 	^]*(/bnest
-# multi-line "{...\n...}"
-/^[ 	^]*{/bblock
-# closing ")" on own line -- exit subshell
-/^[ 	]*)/bclssolo
-# "$((...))" -- arithmetic expansion; not closing ")"
-/\$(([^)][^)]*))[^)]*$/bchkchn
-# "$(...)" -- command substitution; not closing ")"
-/\$([^)][^)]*)[^)]*$/bchkchn
-# multi-line "$(...\n...)" -- command substitution; treat as nested subshell
-/\$([^)]*$/bnest
-# "=(...)" -- Bash array assignment; not closing ")"
-/=(/bchkchn
-# closing "...) &&"
-/)[ 	]*&&[ 	]*$/bclose
-# closing "...)"
-/)[ 	]*$/bclose
-# closing "...) >x" (or "2>x" or "<x" or "|x")
-/)[ 	]*[<>|]/bclose
-:chkchn
-# mark suspect if line uses ";" internally rather than "&&" (but not ";" in a
-# string and not ";;" in one-liner "case...esac")
-/;/{
-	/;;/!{
-		/"[^"]*;[^"]*"/!s/;/; ?!AMP?!/
-	}
-}
-# line ends with pipe "...|" -- valid; not missing "&&"
-/|[ 	]*$/bcont
-# missing end-of-line "&&" -- mark suspect
-/&&[ 	]*$/!s/$/ ?!AMP?!/
-:cont
-# retrieve and print previous line
-x
-s/^\([ 	]*\)^/\1(/
-s/?!HERE?!/<</g
-n
-bslurp
-
-# found incomplete line "...\" -- slurp up next line
-:icmplte
-N
-s/\\\n//
-bslurp
-
-# check for multi-line double-quoted string "...\n..." -- fold to one line
-:dqstr
-# remove all quote pairs
-s/"\([^"]*\)"/@!\1@!/g
-# done if no dangling quote
-/"/!bdqdone
-# otherwise, slurp next line and try again
-N
-s/\n//
-bdqstr
-:dqdone
-s/@!/"/g
-bfolded
-
-# check for multi-line single-quoted string '...\n...' -- fold to one line
-:sqstr
-# remove all quote pairs
-s/'\([^']*\)'/@!\1@!/g
-# done if no dangling quote
-/'/!bsqdone
-# otherwise, slurp next line and try again
-N
-s/\n//
-bsqstr
-:sqdone
-s/@!/'/g
-bfolded
-
-# found here-doc -- swallow it to avoid false hits within its body (but keep
-# the command to which it was attached)
-:heredoc
-s/^\(.*\)<<\(-*[ 	]*\)[\\'"]*\([A-Za-z0-9_][A-Za-z0-9_]*\)['"]*/<\3>\1?!HERE?!\2\3/
-:hdocsub
-N
-/^<\([^>]*\)>.*\n[ 	]*\1[ 	]*$/!{
-	s/\n.*$//
-	bhdocsub
-}
-s/^<[^>]*>//
-s/\n.*$//
-bfolded
-
-# found "case ... in" -- pass through untouched
-:case
-x
-s/^\([ 	]*\)^/\1(/
-s/?!HERE?!/<</g
-n
-:cascom
-/^[ 	]*#/{
-	N
-	s/.*\n//
-	bcascom
-}
-/^[ 	]*esac/bslurp
-bcase
-
-# found "else" or "elif" -- drop "suspect" from final line before "else" since
-# that line legitimately lacks "&&"
-:else
-x
-s/\( ?!AMP?!\)* ?!AMP?!$//
-x
-bcont
-
-# found "done" closing for-loop or while-loop, or "fi" closing if-then -- drop
-# "suspect" from final contained line since that line legitimately lacks "&&"
-:done
-x
-s/\( ?!AMP?!\)* ?!AMP?!$//
-x
-# is 'done' or 'fi' cuddled with ")" to close subshell?
-/done.*)/bclose
-/fi.*)/bclose
-bchkchn
-
-# found nested multi-line "(...\n...)" -- pass through untouched
-:nest
-x
-:nstslrp
-s/^\([ 	]*\)^/\1(/
-s/?!HERE?!/<</g
-n
-:nstcom
-# comment -- not closing ")" if in comment
-/^[ 	]*#/{
-	N
-	s/.*\n//
-	bnstcom
-}
-# closing ")" on own line -- stop nested slurp
-/^[ 	]*)/bnstcl
-# "$((...))" -- arithmetic expansion; not closing ")"
-/\$(([^)][^)]*))[^)]*$/bnstcnt
-# "$(...)" -- command substitution; not closing ")"
-/\$([^)][^)]*)[^)]*$/bnstcnt
-# closing "...)" -- stop nested slurp
-/)/bnstcl
-:nstcnt
-x
-bnstslrp
-:nstcl
-# is it "))" which closes nested and parent subshells?
-/)[ 	]*)/bslurp
-bchkchn
-
-# found multi-line "{...\n...}" block -- pass through untouched
-:block
-x
-s/^\([ 	]*\)^/\1(/
-s/?!HERE?!/<</g
-n
-:blkcom
-/^[ 	]*#/{
-	N
-	s/.*\n//
-	bblkcom
-}
-# closing "}" -- stop block slurp
-/}/bchkchn
-bblock
-
-# found closing ")" on own line -- drop "suspect" from final line of subshell
-# since that line legitimately lacks "&&" and exit subshell loop
-:clssolo
-x
-s/\( ?!AMP?!\)* ?!AMP?!$//
-s/^\([ 	]*\)^/\1(/
-s/?!HERE?!/<</g
-p
-x
-s/^\([ 	]*\)^/\1(/
-s/?!HERE?!/<</g
-b
-
-# found closing "...)" -- exit subshell loop
-:close
-x
-s/^\([ 	]*\)^/\1(/
-s/?!HERE?!/<</g
-p
-x
-s/^\([ 	]*\)^/\1(/
-s/?!HERE?!/<</g
-b
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 01/18] t: add skeleton chainlint.pl
  2022-09-01  0:29 ` [PATCH 01/18] t: add skeleton chainlint.pl Eric Sunshine via GitGitGadget
@ 2022-09-01 12:27   ` Ævar Arnfjörð Bjarmason
  2022-09-02 18:53     ` Eric Sunshine
  0 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-09-01 12:27 UTC (permalink / raw)
  To: Eric Sunshine via GitGitGadget
  Cc: git, Jeff King, Elijah Newren, Fabian Stelzer,
	Johannes Schindelin, Eric Sunshine


On Thu, Sep 01 2022, Eric Sunshine via GitGitGadget wrote:

> From: Eric Sunshine <sunshine@sunshineco.com>
> [...]
> diff --git a/t/chainlint.pl b/t/chainlint.pl

I really like this overall direction...

> +use warnings;
> +use strict;

I think that in general we're way overdue for at least a :

	use v5.10.1;

Or even something more aggresive, I think we can definitely depend on a
newer version for this bit of dev tooling.

That makes a lot of things in this series more pleasing to look
at. E.g. you could use named $+{} variables for regexes.

> +package ScriptParser;

I really wish this could be changed to just put this in
t/chainlint/ScriptParser.pm early on, we could set @INC appropriately
and "use" these, which...

> +my $getnow = sub { return time(); };
> +my $interval = sub { return time() - shift; };

Would eliminate any scoping concerns about this sort of thing.

> +if (eval {require Time::HiRes; Time::HiRes->import(); 1;}) {
> +	$getnow = sub { return [Time::HiRes::gettimeofday()]; };
> +	$interval = sub { return Time::HiRes::tv_interval(shift); };
> +}

Is this "require" even needed, Time::HiRes is there since 5.7.* says
"corelist -l Time::HIRes".

> [...]
> +sub check_script {
> +	my ($id, $next_script, $emit) = @_;
> +	my ($nscripts, $ntests, $nerrs) = (0, 0, 0);
> +	while (my $path = $next_script->()) {
> +		$nscripts++;
> +		my $fh;
> +		unless (open($fh, "<", $path)) {
> +			$emit->("?!ERR?! $path: $!\n");

If we can depend on v5.10.1 this can surely become:

	use autodie qw(open close);

No?

> +			$nerrs += () = $s =~ /\?![^?]+\?!/g;

y'know if we add some whitespace there we can conform to
https://metacpan.org/dist/perlsecret/view/lib/perlsecret.pod >:) (not
serious...)

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/18] chainlint.pl: add POSIX shell lexical analyzer
  2022-09-01  0:29 ` [PATCH 02/18] chainlint.pl: add POSIX shell lexical analyzer Eric Sunshine via GitGitGadget
@ 2022-09-01 12:32   ` Ævar Arnfjörð Bjarmason
  2022-09-03  6:00     ` Eric Sunshine
  0 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-09-01 12:32 UTC (permalink / raw)
  To: Eric Sunshine via GitGitGadget
  Cc: git, Jeff King, Elijah Newren, Fabian Stelzer,
	Johannes Schindelin, Eric Sunshine


On Thu, Sep 01 2022, Eric Sunshine via GitGitGadget wrote:

> From: Eric Sunshine <sunshine@sunshineco.com>

Just generally on this series:

> +	$tag =~ s/['"\\]//g;

I think this would be a *lot* easier to read if all of these little
regex decls could be split out into some "grammar" class, or other
helper module/namespace. So e.g.:

	my $SCRIPT_QUOTE_RX = qr/['"\\]/;

Then:

> +	return $cc if $cc =~ /^(?:&&|\|\||>>|;;|<&|>&|<>|>\|)$/;

	my $SCRIPT_WHATEVER_RX = qr/
		^(?:
		&&
		|
		\|\|
		[...]
	/x;

etc., i.e. we could then make use of /x to add inline comments to these.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-09-01  0:29 ` [PATCH 06/18] chainlint.pl: validate test scripts in parallel Eric Sunshine via GitGitGadget
@ 2022-09-01 12:36   ` Ævar Arnfjörð Bjarmason
  2022-09-03  7:51     ` Eric Sunshine
  2022-09-06 22:35   ` Eric Wong
  1 sibling, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-09-01 12:36 UTC (permalink / raw)
  To: Eric Sunshine via GitGitGadget
  Cc: git, Jeff King, Elijah Newren, Fabian Stelzer,
	Johannes Schindelin, Eric Sunshine


On Thu, Sep 01 2022, Eric Sunshine via GitGitGadget wrote:

> From: Eric Sunshine <sunshine@sunshineco.com>
>
> Although chainlint.pl has undergone a good deal of optimization during
> its development -- increasing in speed significantly -- parsing and
> validating 1050+ scripts and 16500+ tests via Perl is not exactly
> instantaneous. However, perceived performance can be improved by taking
> advantage of the fact that there is no interdependence between test
> scripts or test definitions, thus parsing and validating can be done in
> parallel. The number of available cores is determined automatically but
> can be overridden via the --jobs option.

Per your CL:
	
	Ævar offered some sensible comments[2,3] about optimizing the Makefile rules
	related to chainlint, but those optimizations are not tackled here for a few
	reasons: (1) this series is already quite long, (2) I'd like to keep the
	series focused on its primary goal of installing a new and improved linter,
	(3) these patches do not make the Makefile situation any worse[4], and (4)
	those optimizations can easily be done atop this series[5].

I have been running with those t/Makefile changesg locally, but didn't
submit them. FWIW that's here:

	https://github.com/git/git/compare/master...avar:git:avar/t-Makefile-use-dependency-graph-for-check-chainlint

Which I'm not entirely sure I'm happy about, and it's jeust about the
chainlint tests, but...

> +sub ncores {
> +	# Windows
> +	return $ENV{NUMBER_OF_PROCESSORS} if exists($ENV{NUMBER_OF_PROCESSORS});
> +	# Linux / MSYS2 / Cygwin / WSL
> +	do { local @ARGV='/proc/cpuinfo'; return scalar(grep(/^processor\s*:/, <>)); } if -r '/proc/cpuinfo';
> +	# macOS & BSD
> +	return qx/sysctl -n hw.ncpu/ if $^O =~ /(?:^darwin$|bsd)/;
> +	return 1;
> +}
> +
>  sub show_stats {
>  	my ($start_time, $stats) = @_;
>  	my $walltime = $interval->($start_time);
> @@ -621,7 +633,9 @@ sub exit_code {
>  Getopt::Long::Configure(qw{bundling});
>  GetOptions(
>  	"emit-all!" => \$emit_all,
> +	"jobs|j=i" => \$jobs,
>  	"stats|show-stats!" => \$show_stats) or die("option error\n");
> +$jobs = ncores() if $jobs < 1;
>  
>  my $start_time = $getnow->();
>  my @stats;
> @@ -633,6 +647,40 @@ unless (@scripts) {
>  	exit;
>  }
>  
> -push(@stats, check_script(1, sub { shift(@scripts); }, sub { print(@_); }));
> +unless ($Config{useithreads} && eval {
> +	require threads; threads->import();
> +	require Thread::Queue; Thread::Queue->import();
> +	1;
> +	}) {
> +	push(@stats, check_script(1, sub { shift(@scripts); }, sub { print(@_); }));
> +	show_stats($start_time, \@stats) if $show_stats;
> +	exit(exit_code(\@stats));
> +}
> +
> +my $script_queue = Thread::Queue->new();
> +my $output_queue = Thread::Queue->new();
> +
> +sub next_script { return $script_queue->dequeue(); }
> +sub emit { $output_queue->enqueue(@_); }
> +
> +sub monitor {
> +	while (my $s = $output_queue->dequeue()) {
> +		print($s);
> +	}
> +}
> +
> +my $mon = threads->create({'context' => 'void'}, \&monitor);
> +threads->create({'context' => 'list'}, \&check_script, $_, \&next_script, \&emit) for 1..$jobs;
> +
> +$script_queue->enqueue(@scripts);
> +$script_queue->end();
> +
> +for (threads->list()) {
> +	push(@stats, $_->join()) unless $_ == $mon;
> +}
> +
> +$output_queue->end();
> +$mon->join();

Maybe I'm misunderstanding this whole thing, but this really seems like
the wrong direction in an otherwise fantastic direction of a series.

I.e. it's *great* that we can do chain-lint without needing to actually
execute the *.sh file, this series adds a lint parser that can parse
those *.sh "at rest".

But in your 16/18 you then do:
	
	+if test "${GIT_TEST_CHAIN_LINT:-1}" != 0
	+then
	+	"$PERL_PATH" "$TEST_DIRECTORY/chainlint.pl" "$0" ||
	+		BUG "lint error (see '?!...!? annotations above)"
	+fi
	
I may just be missing something here, but why not instead just borrow
what I did for "lint-docs" in 8650c6298c1 (doc lint: make "lint-docs"
non-.PHONY, 2021-10-15)?

I.e. if we can run against t0001-init.sh or whatever *once* to see if it
chain-lints OK then surely we could have a rule like:

	t0001-init.sh.chainlint-ok: t0001-init.sh
		perl chainlint.pl $< >$@

Then whenever you change t0001-init.sh we refresh that
t0001-init.sh.chainlint-ok, if the chainlint.pl exits non-zero we'll
fail to make it, and will unlink that t0001-init.sh.chainlint-ok.

That way you wouldn't need any parallelism in the Perl script, because
you'd have "make" take care of it, and the common case of re-testing
where the speed matters would be that we woudln't need to run this at
all, or would only re-run it for the test scripts that changed.

(Obviously a "real" implementation would want to create that ".ok" file
in t/.build/chainlint" or whatever)

A drawback is that you'd probably be slower on the initial run, as you'd
spwn N chainlint.pl. You could use $? instead of $< to get around that,
but that requires some re-structuring, and I've found it to generally
not be worth it.

It would also have the drawback that a:

	./t0001-init.sh

wouldn't run the chain-lint, but this would:

	make T=t0001-init.sh

But if want the former to work we could carry some
"GIT_TEST_VIA_MAKEFILE" variable or whatever, and only run the
test-via-test-lib.sh if it isn't set.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: several messages
  2022-09-01  0:29 ` [PATCH 18/18] t: retire unused chainlint.sed Eric Sunshine via GitGitGadget
@ 2022-09-02 12:42   ` Johannes Schindelin
  2022-09-02 18:16     ` Eric Sunshine
  0 siblings, 1 reply; 51+ messages in thread
From: Johannes Schindelin @ 2022-09-02 12:42 UTC (permalink / raw)
  To: Eric Sunshine via GitGitGadget,
	Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Elijah Newren, Fabian Stelzer, Eric Sunshine

Hi Eric,

On Thu, 1 Sep 2022, Eric Sunshine via GitGitGadget wrote:

>  contrib/buildsystems/CMakeLists.txt           |   2 +-
>  t/Makefile                                    |  49 +-
>  t/README                                      |   5 -
>  t/chainlint.pl                                | 730 ++++++++++++++++++
>  t/chainlint.sed                               | 399 ----------
>  t/chainlint/blank-line-before-esac.expect     |  18 +
>  t/chainlint/blank-line-before-esac.test       |  19 +
>  t/chainlint/block.expect                      |  15 +-
>  t/chainlint/block.test                        |  15 +-
>  t/chainlint/chain-break-background.expect     |   9 +
>  t/chainlint/chain-break-background.test       |  10 +
>  t/chainlint/chain-break-continue.expect       |  12 +
>  t/chainlint/chain-break-continue.test         |  13 +
>  t/chainlint/chain-break-false.expect          |   9 +
>  t/chainlint/chain-break-false.test            |  10 +
>  t/chainlint/chain-break-return-exit.expect    |  19 +
>  t/chainlint/chain-break-return-exit.test      |  23 +
>  t/chainlint/chain-break-status.expect         |   9 +
>  t/chainlint/chain-break-status.test           |  11 +
>  t/chainlint/chained-block.expect              |   9 +
>  t/chainlint/chained-block.test                |  11 +
>  t/chainlint/chained-subshell.expect           |  10 +
>  t/chainlint/chained-subshell.test             |  13 +
>  .../command-substitution-subsubshell.expect   |   2 +
>  .../command-substitution-subsubshell.test     |   3 +
>  t/chainlint/complex-if-in-cuddled-loop.expect |   2 +-
>  t/chainlint/double-here-doc.expect            |   2 +
>  t/chainlint/double-here-doc.test              |  12 +
>  t/chainlint/dqstring-line-splice.expect       |   3 +
>  t/chainlint/dqstring-line-splice.test         |   7 +
>  t/chainlint/dqstring-no-interpolate.expect    |  11 +
>  t/chainlint/dqstring-no-interpolate.test      |  15 +
>  t/chainlint/empty-here-doc.expect             |   3 +
>  t/chainlint/empty-here-doc.test               |   5 +
>  t/chainlint/exclamation.expect                |   4 +
>  t/chainlint/exclamation.test                  |   8 +
>  t/chainlint/for-loop-abbreviated.expect       |   5 +
>  t/chainlint/for-loop-abbreviated.test         |   6 +
>  t/chainlint/for-loop.expect                   |   4 +-
>  t/chainlint/function.expect                   |  11 +
>  t/chainlint/function.test                     |  13 +
>  t/chainlint/here-doc-indent-operator.expect   |   5 +
>  t/chainlint/here-doc-indent-operator.test     |  13 +
>  t/chainlint/here-doc-multi-line-string.expect |   3 +-
>  t/chainlint/if-condition-split.expect         |   7 +
>  t/chainlint/if-condition-split.test           |   8 +
>  t/chainlint/if-in-loop.expect                 |   2 +-
>  t/chainlint/if-in-loop.test                   |   2 +-
>  t/chainlint/loop-detect-failure.expect        |  15 +
>  t/chainlint/loop-detect-failure.test          |  17 +
>  t/chainlint/loop-detect-status.expect         |  18 +
>  t/chainlint/loop-detect-status.test           |  19 +
>  t/chainlint/loop-in-if.expect                 |   2 +-
>  t/chainlint/loop-upstream-pipe.expect         |  10 +
>  t/chainlint/loop-upstream-pipe.test           |  11 +
>  t/chainlint/multi-line-string.expect          |  11 +-
>  t/chainlint/nested-loop-detect-failure.expect |  31 +
>  t/chainlint/nested-loop-detect-failure.test   |  35 +
>  t/chainlint/nested-subshell.expect            |   2 +-
>  t/chainlint/one-liner-for-loop.expect         |   9 +
>  t/chainlint/one-liner-for-loop.test           |  10 +
>  t/chainlint/return-loop.expect                |   5 +
>  t/chainlint/return-loop.test                  |   6 +
>  t/chainlint/semicolon.expect                  |   2 +-
>  t/chainlint/sqstring-in-sqstring.expect       |   4 +
>  t/chainlint/sqstring-in-sqstring.test         |   5 +
>  t/chainlint/t7900-subtree.expect              |  13 +-
>  t/chainlint/token-pasting.expect              |  27 +
>  t/chainlint/token-pasting.test                |  32 +
>  t/chainlint/while-loop.expect                 |   4 +-
>  t/t0027-auto-crlf.sh                          |   7 +-
>  t/t3070-wildmatch.sh                          |   5 -
>  t/test-lib.sh                                 |  12 +-
>  73 files changed, 1439 insertions(+), 449 deletions(-)
>  create mode 100755 t/chainlint.pl
>  delete mode 100644 t/chainlint.sed
>  create mode 100644 t/chainlint/blank-line-before-esac.expect
>  create mode 100644 t/chainlint/blank-line-before-esac.test
>  create mode 100644 t/chainlint/chain-break-background.expect
>  create mode 100644 t/chainlint/chain-break-background.test
>  create mode 100644 t/chainlint/chain-break-continue.expect
>  create mode 100644 t/chainlint/chain-break-continue.test
>  create mode 100644 t/chainlint/chain-break-false.expect
>  create mode 100644 t/chainlint/chain-break-false.test
>  create mode 100644 t/chainlint/chain-break-return-exit.expect
>  create mode 100644 t/chainlint/chain-break-return-exit.test
>  create mode 100644 t/chainlint/chain-break-status.expect
>  create mode 100644 t/chainlint/chain-break-status.test
>  create mode 100644 t/chainlint/chained-block.expect
>  create mode 100644 t/chainlint/chained-block.test
>  create mode 100644 t/chainlint/chained-subshell.expect
>  create mode 100644 t/chainlint/chained-subshell.test
>  create mode 100644 t/chainlint/command-substitution-subsubshell.expect
>  create mode 100644 t/chainlint/command-substitution-subsubshell.test
>  create mode 100644 t/chainlint/double-here-doc.expect
>  create mode 100644 t/chainlint/double-here-doc.test
>  create mode 100644 t/chainlint/dqstring-line-splice.expect
>  create mode 100644 t/chainlint/dqstring-line-splice.test
>  create mode 100644 t/chainlint/dqstring-no-interpolate.expect
>  create mode 100644 t/chainlint/dqstring-no-interpolate.test
>  create mode 100644 t/chainlint/empty-here-doc.expect
>  create mode 100644 t/chainlint/empty-here-doc.test
>  create mode 100644 t/chainlint/exclamation.expect
>  create mode 100644 t/chainlint/exclamation.test
>  create mode 100644 t/chainlint/for-loop-abbreviated.expect
>  create mode 100644 t/chainlint/for-loop-abbreviated.test
>  create mode 100644 t/chainlint/function.expect
>  create mode 100644 t/chainlint/function.test
>  create mode 100644 t/chainlint/here-doc-indent-operator.expect
>  create mode 100644 t/chainlint/here-doc-indent-operator.test
>  create mode 100644 t/chainlint/if-condition-split.expect
>  create mode 100644 t/chainlint/if-condition-split.test
>  create mode 100644 t/chainlint/loop-detect-failure.expect
>  create mode 100644 t/chainlint/loop-detect-failure.test
>  create mode 100644 t/chainlint/loop-detect-status.expect
>  create mode 100644 t/chainlint/loop-detect-status.test
>  create mode 100644 t/chainlint/loop-upstream-pipe.expect
>  create mode 100644 t/chainlint/loop-upstream-pipe.test
>  create mode 100644 t/chainlint/nested-loop-detect-failure.expect
>  create mode 100644 t/chainlint/nested-loop-detect-failure.test
>  create mode 100644 t/chainlint/one-liner-for-loop.expect
>  create mode 100644 t/chainlint/one-liner-for-loop.test
>  create mode 100644 t/chainlint/return-loop.expect
>  create mode 100644 t/chainlint/return-loop.test
>  create mode 100644 t/chainlint/sqstring-in-sqstring.expect
>  create mode 100644 t/chainlint/sqstring-in-sqstring.test
>  create mode 100644 t/chainlint/token-pasting.expect
>  create mode 100644 t/chainlint/token-pasting.test

This looks like it was a lot of work. And that it would be a lot of work
to review, too, and certainly even more work to maintain.

Are we really sure that we want to burden the Git project with this much
stuff that is not actually related to Git's core functionality?

It would be one thing if we could use a well-maintained third-party tool
to do this job. But adding this to our plate? I hope we can avoid that.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: several messages
  2022-09-02 12:42   ` several messages Johannes Schindelin
@ 2022-09-02 18:16     ` Eric Sunshine
  2022-09-02 18:34       ` Jeff King
  0 siblings, 1 reply; 51+ messages in thread
From: Eric Sunshine @ 2022-09-02 18:16 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Eric Sunshine via GitGitGadget,
	Ævar Arnfjörð Bjarmason, Git List, Jeff King,
	Elijah Newren, Fabian Stelzer

On Fri, Sep 2, 2022 at 8:42 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> On Thu, 1 Sep 2022, Eric Sunshine via GitGitGadget wrote:
> >  t/chainlint.pl                                | 730 ++++++++++++++++++
> >  t/chainlint.sed                               | 399 ----------
> >  t/chainlint/blank-line-before-esac.expect     |  18 +
> >  t/chainlint/blank-line-before-esac.test       |  19 +
> >  ...
>
> This looks like it was a lot of work. And that it would be a lot of work
> to review, too, and certainly even more work to maintain.
>
> Are we really sure that we want to burden the Git project with this much
> stuff that is not actually related to Git's core functionality?
>
> It would be one thing if we could use a well-maintained third-party tool
> to do this job. But adding this to our plate? I hope we can avoid that.

I understand your concerns about review and maintenance burden, and
you're not the first to make such observations; when chainlint.sed was
submitted, it was greeted with similar concerns[1,2], all very
understandable. The key takeaway[3] from that conversation, though,
was that, unlike user-facing features which must be reviewed in detail
and maintained in perpetuity, this is a mere developer aid which can
be easily ejected from the project if it ever becomes a maintenance
burden or shows itself to be unreliable. Potential maintenance burden
aside, a very real benefit of such a tool is that it should help
prevent bugs from slipping into the project going forward[4], which is
indeed the aim of all our developer-focused aids.

In more practical terms, despite initial concerns, in the 4+ years
since its introduction, the maintenance cost of chainlint.sed has been
nearly zero. Very early on, there was a report[5] that chainlint.sed
was showing a false-positive in a `contrib` test script; the developer
quickly responded with a fix[6]. The only other maintenance issues
were a couple dead-simple changes[7,8] to shorten "labels" to support
older versions of `sed`. (As for the chainlint self-tests, the
maintenance cost has been exactly zero). My hope is that chainlint.pl
should have a similar track-record, but it can easily be dropped from
the project if not.

[1]: https://lore.kernel.org/git/xmqqk1q11mkj.fsf@gitster-ct.c.googlers.com/
[2]: https://lore.kernel.org/git/20180712165608.GA10515@sigill.intra.peff.net/
[3]: https://lore.kernel.org/git/CAPig+cRmAkiYqFXwRAkQALDoOo-79r2iAumdEJEZhBnETvL-fw@mail.gmail.com/
[4]: https://lore.kernel.org/git/xmqqin5kw7q3.fsf@gitster-ct.c.googlers.com/
[5]: https://lore.kernel.org/git/20180730181356.GA156463@aiede.svl.corp.google.com/
[6]: https://lore.kernel.org/git/20180807082135.60913-1-sunshine@sunshineco.com/
[7]: https://lore.kernel.org/git/20180824152016.20286-5-avarab@gmail.com/
[8]: https://lore.kernel.org/git/d15ed626de65c51ef2ba31020eeb2111fb8e091f.1596675905.git.gitgitgadget@gmail.com/

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: several messages
  2022-09-02 18:16     ` Eric Sunshine
@ 2022-09-02 18:34       ` Jeff King
  2022-09-02 18:44         ` Junio C Hamano
  0 siblings, 1 reply; 51+ messages in thread
From: Jeff King @ 2022-09-02 18:34 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Johannes Schindelin, Eric Sunshine via GitGitGadget,
	Ævar Arnfjörð Bjarmason, Git List, Elijah Newren,
	Fabian Stelzer

On Fri, Sep 02, 2022 at 02:16:21PM -0400, Eric Sunshine wrote:

> > It would be one thing if we could use a well-maintained third-party tool
> > to do this job. But adding this to our plate? I hope we can avoid that.
> 
> I understand your concerns about review and maintenance burden, and
> you're not the first to make such observations; when chainlint.sed was
> submitted, it was greeted with similar concerns[1,2], all very
> understandable. The key takeaway[3] from that conversation, though,
> was that, unlike user-facing features which must be reviewed in detail
> and maintained in perpetuity, this is a mere developer aid which can
> be easily ejected from the project if it ever becomes a maintenance
> burden or shows itself to be unreliable. Potential maintenance burden
> aside, a very real benefit of such a tool is that it should help
> prevent bugs from slipping into the project going forward[4], which is
> indeed the aim of all our developer-focused aids.

Thanks for this response and especially the links. My initial gut
response was similar to Dscho's. Which is not surprising, because it
apparently was also my initial response to chainlint.sed back then. ;)

But I do think that chainlint.sed has proven itself to be both useful
and not much of a maintenance burden. My only real complaint was the
additional runtime in a few corner cases, and that is exactly what
you're addressing here.

I'm not excited about carefully reviewing it. At the same time, given
the low stakes, I'm kind of willing to accept that between the tests and
the results of running it on the current code base, the proof is in the
pudding.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: several messages
  2022-09-02 18:34       ` Jeff King
@ 2022-09-02 18:44         ` Junio C Hamano
  0 siblings, 0 replies; 51+ messages in thread
From: Junio C Hamano @ 2022-09-02 18:44 UTC (permalink / raw)
  To: Jeff King
  Cc: Eric Sunshine, Johannes Schindelin,
	Eric Sunshine via GitGitGadget,
	Ævar Arnfjörð Bjarmason, Git List, Elijah Newren,
	Fabian Stelzer

Jeff King <peff@peff.net> writes:

> Thanks for this response and especially the links. My initial gut
> response was similar to Dscho's. Which is not surprising, because it
> apparently was also my initial response to chainlint.sed back then. ;)
>
> But I do think that chainlint.sed has proven itself to be both useful
> and not much of a maintenance burden. My only real complaint was the
> additional runtime in a few corner cases, and that is exactly what
> you're addressing here.

I have nothing to add to the above ;-)  Thanks all (including Dscho
who made us be more explicit in pros-and-cons).



^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 01/18] t: add skeleton chainlint.pl
  2022-09-01 12:27   ` Ævar Arnfjörð Bjarmason
@ 2022-09-02 18:53     ` Eric Sunshine
  0 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine @ 2022-09-02 18:53 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Eric Sunshine via GitGitGadget, Git List, Jeff King,
	Elijah Newren, Fabian Stelzer, Johannes Schindelin

On Thu, Sep 1, 2022 at 8:32 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> On Thu, Sep 01 2022, Eric Sunshine via GitGitGadget wrote:
> > From: Eric Sunshine <sunshine@sunshineco.com>
> > [...]
> > diff --git a/t/chainlint.pl b/t/chainlint.pl
>
> I really like this overall direction...

Thanks for running an eye over the patches.

> > +use warnings;
> > +use strict;
>
> I think that in general we're way overdue for at least a :
>
>         use v5.10.1;
>
> Or even something more aggresive, I think we can definitely depend on a
> newer version for this bit of dev tooling.

Being stuck with an 11+ year-old primary development machine which
can't be upgraded to a newer OS due to vendor end-of-life declaration,
and with old tools installed, I have little or no interest in bumping
the minimum version, especially since older Perl versions are
perfectly adequate for this task. Undertaking such a version bump
would also be outside the scope of this patch series (and I simply
don't have the free time or desire to pursue it).

> That makes a lot of things in this series more pleasing to look
> at. E.g. you could use named $+{} variables for regexes.

Perhaps, but (1) that would not be very relevant for this script which
typically only extracts "$1", and (2) I've rarely found cases when
named variables help significantly with clarity, but then most of my
real-life regexes generally only extract one or two bits of
information, periodically three, and those bits ("$1", "$2", etc.) are
immediately assigned to variables with meaningful names.

> > +package ScriptParser;
>
> I really wish this could be changed to just put this in
> t/chainlint/ScriptParser.pm early on, we could set @INC appropriately
> and "use" these, which...

I intentionally avoided splitting this into multiple modules because I
wanted it to be easy drop into or adapt to other projects (i.e.
sharness[1]). Of course, it is effectively a shell parser written in
Perl, and it's conceivable that the parser part of it could have uses
outside of Git, so modularizing it might be a good idea, but that's a
task for some future date if such a need arises.

[1]: https://github.com/chriscool/sharness

> > +my $getnow = sub { return time(); };
> > +my $interval = sub { return time() - shift; };
>
> Would eliminate any scoping concerns about this sort of thing.

As above, this is easily addressed if/when someone ever wants to reuse
the code outside of Git for some other purpose. I doubt it's worth
worrying about now.

> > +if (eval {require Time::HiRes; Time::HiRes->import(); 1;}) {
> > +     $getnow = sub { return [Time::HiRes::gettimeofday()]; };
> > +     $interval = sub { return Time::HiRes::tv_interval(shift); };
> > +}
>
> Is this "require" even needed, Time::HiRes is there since 5.7.* says
> "corelist -l Time::HIRes".

Unfortunately, this is needed. The Windows CI instances the Git
project uses don't have Time::HiRes installed (and it's outside the
scope of this series to address shortcomings in the CI
infrastructure).

> > +sub check_script {
> > +     my ($id, $next_script, $emit) = @_;
> > +     my ($nscripts, $ntests, $nerrs) = (0, 0, 0);
> > +     while (my $path = $next_script->()) {
> > +             $nscripts++;
> > +             my $fh;
> > +             unless (open($fh, "<", $path)) {
> > +                     $emit->("?!ERR?! $path: $!\n");
>
> If we can depend on v5.10.1 this can surely become:
>
>         use autodie qw(open close);
>
> No?

No. It's clipped in your response, but the full snippet looks like this:

    unless (open($fh, "<", $path)) {
        $emit->("?!ERR?! $path: $!\n");
        next;
    }

The important point is that I _don't_ want the program to "die" if it
can't open an input file; instead, it should continue processing all
the other input files, and the open-failure should be reported as just
another error/problem it encountered along the way.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 16/18] test-lib: replace chainlint.sed with chainlint.pl
  2022-09-01  0:29 ` [PATCH 16/18] test-lib: replace chainlint.sed with chainlint.pl Eric Sunshine via GitGitGadget
@ 2022-09-03  5:07   ` Elijah Newren
  2022-09-03  5:24     ` Eric Sunshine
  0 siblings, 1 reply; 51+ messages in thread
From: Elijah Newren @ 2022-09-03  5:07 UTC (permalink / raw)
  To: Eric Sunshine via GitGitGadget
  Cc: Git Mailing List, Jeff King,
	Ævar Arnfjörð Bjarmason, Fabian Stelzer,
	Johannes Schindelin, Eric Sunshine

On Wed, Aug 31, 2022 at 5:30 PM Eric Sunshine via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Eric Sunshine <sunshine@sunshineco.com>
>
> By automatically invoking chainlint.sed upon each test it runs,
> `test_run_` in test-lib.sh ensures that broken &&-chains will be
> detected early as tests are modified or new are tests created since it

s/new are tests created/new tests are created/  ?


> is typical to run a test script manually (i.e. `./t1234-test-script.sh`)
> during test development. Now that the implementation of chainlint.pl is
> complete, modify test-lib.sh to invoke it automatically instead of
> chainlint.sed each time a test script is run.
>
> This change reduces the number of "linter" invocations from 26800+ (once
> per test run) down to 1050+ (once per test script), however, a
> subsequent change will drop the number of invocations to 1 per `make
> test`, thus fully realizing the benefit of the new linter.
>
> Note that the "magic exit code 117" &&-chain checker added by bb79af9d09
> (t/test-lib: introduce --chain-lint option, 2015-03-20) which is built
> into t/test-lib.sh is retained since it has near zero-cost and
> (theoretically) may catch a broken &&-chain not caught by chainlint.pl.
>
> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
> ---
>  contrib/buildsystems/CMakeLists.txt | 2 +-
>  t/test-lib.sh                       | 9 +++++++--
>  2 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/contrib/buildsystems/CMakeLists.txt b/contrib/buildsystems/CMakeLists.txt
> index 2237109b57f..ca358a21a5f 100644
> --- a/contrib/buildsystems/CMakeLists.txt
> +++ b/contrib/buildsystems/CMakeLists.txt
> @@ -1076,7 +1076,7 @@ if(NOT ${CMAKE_BINARY_DIR}/CMakeCache.txt STREQUAL ${CACHE_PATH})
>                 "string(REPLACE \"\${GIT_BUILD_DIR_REPL}\" \"GIT_BUILD_DIR=\\\"$TEST_DIRECTORY/../${BUILD_DIR_RELATIVE}\\\"\" content \"\${content}\")\n"
>                 "file(WRITE ${CMAKE_SOURCE_DIR}/t/test-lib.sh \${content})")
>         #misc copies
> -       file(COPY ${CMAKE_SOURCE_DIR}/t/chainlint.sed DESTINATION ${CMAKE_BINARY_DIR}/t/)
> +       file(COPY ${CMAKE_SOURCE_DIR}/t/chainlint.pl DESTINATION ${CMAKE_BINARY_DIR}/t/)
>         file(COPY ${CMAKE_SOURCE_DIR}/po/is.po DESTINATION ${CMAKE_BINARY_DIR}/po/)
>         file(COPY ${CMAKE_SOURCE_DIR}/mergetools/tkdiff DESTINATION ${CMAKE_BINARY_DIR}/mergetools/)
>         file(COPY ${CMAKE_SOURCE_DIR}/contrib/completion/git-prompt.sh DESTINATION ${CMAKE_BINARY_DIR}/contrib/completion/)
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index dc0d0591095..a65df2fd220 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1091,8 +1091,7 @@ test_run_ () {
>                 trace=
>                 # 117 is magic because it is unlikely to match the exit
>                 # code of other programs
> -               if $(printf '%s\n' "$1" | sed -f "$GIT_BUILD_DIR/t/chainlint.sed" | grep -q '?![A-Z][A-Z]*?!') ||
> -                       test "OK-117" != "$(test_eval_ "(exit 117) && $1${LF}${LF}echo OK-\$?" 3>&1)"
> +               if test "OK-117" != "$(test_eval_ "(exit 117) && $1${LF}${LF}echo OK-\$?" 3>&1)"
>                 then
>                         BUG "broken &&-chain or run-away HERE-DOC: $1"
>                 fi
> @@ -1588,6 +1587,12 @@ then
>         BAIL_OUT_ENV_NEEDS_SANITIZE_LEAK "GIT_TEST_SANITIZE_LEAK_LOG=true"
>  fi
>
> +if test "${GIT_TEST_CHAIN_LINT:-1}" != 0
> +then
> +       "$PERL_PATH" "$TEST_DIRECTORY/chainlint.pl" "$0" ||
> +               BUG "lint error (see '?!...!? annotations above)"
> +fi
> +
>  # Last-minute variable setup
>  USER_HOME="$HOME"
>  HOME="$TRASH_DIRECTORY"
> --
> gitgitgadget
>

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 16/18] test-lib: replace chainlint.sed with chainlint.pl
  2022-09-03  5:07   ` Elijah Newren
@ 2022-09-03  5:24     ` Eric Sunshine
  0 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine @ 2022-09-03  5:24 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Eric Sunshine via GitGitGadget, Git Mailing List, Jeff King,
	Ævar Arnfjörð Bjarmason, Fabian Stelzer,
	Johannes Schindelin

On Sat, Sep 3, 2022 at 1:07 AM Elijah Newren <newren@gmail.com> wrote:
> On Wed, Aug 31, 2022 at 5:30 PM Eric Sunshine via GitGitGadget
> > By automatically invoking chainlint.sed upon each test it runs,
> > `test_run_` in test-lib.sh ensures that broken &&-chains will be
> > detected early as tests are modified or new are tests created since it
>
> s/new are tests created/new tests are created/  ?

That does sound better (except perhaps to Yoda).

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/18] chainlint.pl: add POSIX shell lexical analyzer
  2022-09-01 12:32   ` Ævar Arnfjörð Bjarmason
@ 2022-09-03  6:00     ` Eric Sunshine
  0 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine @ 2022-09-03  6:00 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Eric Sunshine via GitGitGadget, Git List, Jeff King,
	Elijah Newren, Fabian Stelzer, Johannes Schindelin

On Thu, Sep 1, 2022 at 8:35 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> On Thu, Sep 01 2022, Eric Sunshine via GitGitGadget wrote:
> Just generally on this series:
>
> > +     $tag =~ s/['"\\]//g;
>
> I think this would be a *lot* easier to read if all of these little
> regex decls could be split out into some "grammar" class, or other
> helper module/namespace. So e.g.:
>
>         my $SCRIPT_QUOTE_RX = qr/['"\\]/;

Taken out of context (as in the quoted snippet), it may indeed be
difficult to understand what that line is doing, however in context
with a meaningful function name:

    sub scan_heredoc_tag {
        ...
        my $tag = $self->scan_token();
        $tag =~ s/['"\\]//g;
        push(@{$self->{heretags}}, $indented ? "\t$tag" : "$tag");
        ...
    }

for someone who is familiar with common heredoc tag quoting/escaping
(i.e. <<'EOF', <<"EOF", <<\EOF), I find the inline character class
`['"\\]` much easier to understand than some opaque name such as
$SCRIPT_QUOTE_RX, doubly so because the definition of the named regex
might be far removed from the actual code which uses it, which would
require going and studying that definition before being able to
understand what this code is doing.

I grasp you made that name up on-the-fly as an example, but that does
highlight another reason why I'd be hesitant to try to pluck out and
name these regexes. Specifically, naming is hard and I don't trust
that I could come up with succinct meaningful names which would convey
what a regex does as well as the actual regex itself conveys what it
does. In context within the well-named function, `s/['"\\]//g` is
obviously stripping quoting/escaping from the tag name; trying to come
up with a succinct yet accurate name to convey that intention is
difficult. And this is just one example. The script is littered with
little regexes like this, and they are almost all unique, thus making
the task of inventing succinct meaningful names extra difficult. And,
as noted above, I'm not at all convinced that plucking the regex out
of its natural context -- thus making the reader go elsewhere to find
the definition of the regex -- would help improve comprehension.

> Then:
>
> > +     return $cc if $cc =~ /^(?:&&|\|\||>>|;;|<&|>&|<>|>\|)$/;
>
>         my $SCRIPT_WHATEVER_RX = qr/
>                 ^(?:
>                 &&
>                 |
>                 \|\|
>                 [...]
>         /x;
>
> etc., i.e. we could then make use of /x to add inline comments to these.

`/x` does make this slightly easier to grok, and this is a an example
of a regex which might be easy to name (i.e. $TWO_CHAR_OPERATOR), but
-- extra mandatory escaping aside -- it's not hard to understand this
one as-is; it's pretty obvious that it's looking for operators `&&`,
`||`, `>>`, `;;`, `<&`, `>&`, `<>`, and `>|`.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-09-01 12:36   ` Ævar Arnfjörð Bjarmason
@ 2022-09-03  7:51     ` Eric Sunshine
  0 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine @ 2022-09-03  7:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Eric Sunshine via GitGitGadget, Git List, Jeff King,
	Elijah Newren, Fabian Stelzer, Johannes Schindelin

On Thu, Sep 1, 2022 at 8:47 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> On Thu, Sep 01 2022, Eric Sunshine via GitGitGadget wrote:
> > Although chainlint.pl has undergone a good deal of optimization during
> > its development -- increasing in speed significantly -- parsing and
> > validating 1050+ scripts and 16500+ tests via Perl is not exactly
> > instantaneous. However, perceived performance can be improved by taking
> > advantage of the fact that there is no interdependence between test
> > scripts or test definitions, thus parsing and validating can be done in
> > parallel. The number of available cores is determined automatically but
> > can be overridden via the --jobs option.
>
> Per your CL:
>
>         Ævar offered some sensible comments[2,3] about optimizing the Makefile rules
>         related to chainlint, but those optimizations are not tackled here for a few
>         reasons: (1) this series is already quite long, (2) I'd like to keep the
>         series focused on its primary goal of installing a new and improved linter,
>         (3) these patches do not make the Makefile situation any worse[4], and (4)
>         those optimizations can easily be done atop this series[5].
>
> I have been running with those t/Makefile changesg locally, but didn't
> submit them. FWIW that's here:
>
>         https://github.com/git/git/compare/master...avar:git:avar/t-Makefile-use-dependency-graph-for-check-chainlint

Thanks for the link. It's nice to see an actual implementation. I
think most of what you wrote in the commit message and the patch
itself are still meaningful following this series.

> > +my $script_queue = Thread::Queue->new();
> > +my $output_queue = Thread::Queue->new();
> > +
> > +my $mon = threads->create({'context' => 'void'}, \&monitor);
> > +threads->create({'context' => 'list'}, \&check_script, $_, \&next_script, \&emit) for 1..$jobs;
>
> Maybe I'm misunderstanding this whole thing, but this really seems like
> the wrong direction in an otherwise fantastic direction of a series.
>
> I.e. it's *great* that we can do chain-lint without needing to actually
> execute the *.sh file, this series adds a lint parser that can parse
> those *.sh "at rest".
>
> But in your 16/18 you then do:
>
>         +if test "${GIT_TEST_CHAIN_LINT:-1}" != 0
>         +then
>         +       "$PERL_PATH" "$TEST_DIRECTORY/chainlint.pl" "$0" ||
>         +               BUG "lint error (see '?!...!? annotations above)"
>         +fi
>
> I may just be missing something here, but why not instead just borrow
> what I did for "lint-docs" in 8650c6298c1 (doc lint: make "lint-docs"
> non-.PHONY, 2021-10-15)?

I may be misunderstanding, but regarding patch [16/18], I think you
answered your own question at the end of your response when you
pointed out the drawback that you wouldn't get linting when running
the test script manually (i.e. `./t1234-test-stuff.sh`). Ensuring that
the linter is invoked when running a test script manually is important
(at least to me) since it's a frequent step when developing a new test
or modifying an existing test. [16/18] is present to ensure that we
still get that behavior.

> I.e. if we can run against t0001-init.sh or whatever *once* to see if it
> chain-lints OK then surely we could have a rule like:
>
>         t0001-init.sh.chainlint-ok: t0001-init.sh
>                 perl chainlint.pl $< >$@
>
> Then whenever you change t0001-init.sh we refresh that
> t0001-init.sh.chainlint-ok, if the chainlint.pl exits non-zero we'll
> fail to make it, and will unlink that t0001-init.sh.chainlint-ok.
>
> That way you wouldn't need any parallelism in the Perl script, because
> you'd have "make" take care of it, and the common case of re-testing
> where the speed matters would be that we woudln't need to run this at
> all, or would only re-run it for the test scripts that changed.

A couple comments regarding parallelism: (1) as mentioned in another
response, when developing the script, I had in mind that it might be
useful for other projects (i.e. `sharness`), thus should be able to
stand on its own without advanced Makefile support, and (2) process
creation on Microsoft Windows is _very_ expensive and slow, so on that
platform, being able to lint all tests in all script with a single
invocation is a big win over running the linter 1050+ times, once for
each test script.

That's not to discredit any of your points... I'm just conveying some
of my thought process.

> (Obviously a "real" implementation would want to create that ".ok" file
> in t/.build/chainlint" or whatever)
>
> A drawback is that you'd probably be slower on the initial run, as you'd
> spwn N chainlint.pl. You could use $? instead of $< to get around that,
> but that requires some re-structuring, and I've found it to generally
> not be worth it.

The $? trick might be something Windows folk would appreciate, and
even those of us in macOS land (at least those of us with old hardware
and OS).

> It would also have the drawback that a:
>
>         ./t0001-init.sh
>
> wouldn't run the chain-lint, but this would:
>
>         make T=t0001-init.sh
>
> But if want the former to work we could carry some
> "GIT_TEST_VIA_MAKEFILE" variable or whatever, and only run the
> test-via-test-lib.sh if it isn't set.

I may be misunderstanding, but isn't the GIT_TEST_CHAIN_LINT variable
useful for this already, as in [16/18]?

Regarding your observations as a whole, I think the extract from the
cover letter which you cited above is relevant to my response. I don't
disagree with your points about using the Makefile to optimize away
unnecessary invocations of the linter, or that doing so can be a
useful future direction. As mentioned in the cover letter, though, I
think that such optimizations are outside the scope of this series
which -- aside from installing an improved linter -- aims to maintain
the status quo; in particular, this series ensures that (1) tests get
linted as they are being written/modified when the developer runs the
script manually `./t1234-test-stuff.sh`, and (2) all tests get linted
upon `make test`.

(The other reason why I'd prefer to see such optimizations applied
atop this series is that I simply don't have the time these days to
devote to major changes of direction in this series, which I think
meets its stated goals without making the situation any worse or
making it any more difficult to apply the optimizations you describe.
And the new linter has been languishing on my computer for far too
long; the implementation has been complete for well over a year, but
it took me this long to finish polishing the patch series. I'd like to
see the new linter make it into the toolchest of other developers
since it can be beneficial; it has already found scores or hundreds[1]
of possible hiding places for bugs due to broken &&-chain or missing
`|| return`, and has sniffed out some actual broken tests[2,3].)

[1]: https://lore.kernel.org/git/20211209051115.52629-1-sunshine@sunshineco.com/
[2]: https://lore.kernel.org/git/20211209051115.52629-3-sunshine@sunshineco.com/
[3]: https://lore.kernel.org/git/7b0784056f3cc0c96e9543ae44d0f5a7b0bf85fa.1661192802.git.gitgitgadget@gmail.com/

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-09-01  0:29 ` [PATCH 06/18] chainlint.pl: validate test scripts in parallel Eric Sunshine via GitGitGadget
  2022-09-01 12:36   ` Ævar Arnfjörð Bjarmason
@ 2022-09-06 22:35   ` Eric Wong
  2022-09-06 22:52     ` Eric Sunshine
  1 sibling, 1 reply; 51+ messages in thread
From: Eric Wong @ 2022-09-06 22:35 UTC (permalink / raw)
  To: Eric Sunshine via GitGitGadget
  Cc: git, Jeff King, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Fabian Stelzer,
	Johannes Schindelin, Eric Sunshine

Eric Sunshine via GitGitGadget <gitgitgadget@gmail.com> wrote:
> +unless ($Config{useithreads} && eval {
> +	require threads; threads->import();

Fwiw, the threads(3perl) manpage has this since 2014:

       The use of interpreter-based threads in perl is officially discouraged.

I was bummed, too :<  but I've decided it wasn't worth the
effort to deal with the problems threads could cause down the
line in future Perl versions.  For example, common libraries
like File::Temp will chdir behind-the-scenes which is
thread-unsafe.

(of course I only care about *BSD and Linux on MMU hardware,
so I use SOCK_SEQPACKET and fork() freely :>)

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-09-06 22:35   ` Eric Wong
@ 2022-09-06 22:52     ` Eric Sunshine
  2022-09-06 23:26       ` Jeff King
  0 siblings, 1 reply; 51+ messages in thread
From: Eric Sunshine @ 2022-09-06 22:52 UTC (permalink / raw)
  To: Eric Wong
  Cc: Eric Sunshine via GitGitGadget, Git List, Jeff King,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin

On Tue, Sep 6, 2022 at 6:35 PM Eric Wong <e@80x24.org> wrote:
> Eric Sunshine via GitGitGadget <gitgitgadget@gmail.com> wrote:
> > +unless ($Config{useithreads} && eval {
> > +     require threads; threads->import();
>
> Fwiw, the threads(3perl) manpage has this since 2014:
>
>        The use of interpreter-based threads in perl is officially discouraged.

Thanks for pointing this out. I did see that, but as no better
alternative was offered, and since I did want this to work on Windows,
I went with it.

> I was bummed, too :<  but I've decided it wasn't worth the
> effort to deal with the problems threads could cause down the
> line in future Perl versions.  For example, common libraries
> like File::Temp will chdir behind-the-scenes which is
> thread-unsafe.
>
> (of course I only care about *BSD and Linux on MMU hardware,
> so I use SOCK_SEQPACKET and fork() freely :>)

I'm not overly worried about the deprecation at the moment since (1)
chainlint.pl isn't a widely used script -- it's audience is very
narrow; (2) the `$Config{useithreads}` conditional can be seen as an
automatic escape-hatch, and (if need be) I can even make `--jobs=1` be
an explicit escape hatch, and there's already --no-chain-lint for an
extreme escape-hatch; (3) the script is pretty much standalone -- it
doesn't rely upon any libraries like File::Temp or others; (4) Ævar
has ideas for using the Makefile for parallelism instead; (5) we can
cross the deprecation-bridge when/if it actually does become a
problem, either by dropping parallelism from chainlint.pl or by
dropping chainlint.pl itself.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-09-06 22:52     ` Eric Sunshine
@ 2022-09-06 23:26       ` Jeff King
  2022-11-21  4:02         ` Eric Sunshine
  0 siblings, 1 reply; 51+ messages in thread
From: Jeff King @ 2022-09-06 23:26 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Eric Wong, Eric Sunshine via GitGitGadget, Git List,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin

On Tue, Sep 06, 2022 at 06:52:26PM -0400, Eric Sunshine wrote:

> On Tue, Sep 6, 2022 at 6:35 PM Eric Wong <e@80x24.org> wrote:
> > Eric Sunshine via GitGitGadget <gitgitgadget@gmail.com> wrote:
> > > +unless ($Config{useithreads} && eval {
> > > +     require threads; threads->import();
> >
> > Fwiw, the threads(3perl) manpage has this since 2014:
> >
> >        The use of interpreter-based threads in perl is officially discouraged.
> 
> Thanks for pointing this out. I did see that, but as no better
> alternative was offered, and since I did want this to work on Windows,
> I went with it.

I did some timings the other night, and I found something quite curious
with the thread stuff.

Here's a hyperfine run of "make" in the t/ directory before any of your
patches. It uses "prove" to do parallelism under the hood:

  Benchmark 1: make
    Time (mean ± σ):     68.895 s ±  0.840 s    [User: 620.914 s, System: 428.498 s]
    Range (min … max):   67.943 s … 69.531 s    3 runs

So that gives us a baseline. Now the first thing I wondered is how bad
it would be to just run chainlint.pl once per script. So I applied up to
that patch:

  Benchmark 1: make
    Time (mean ± σ):     71.289 s ±  1.302 s    [User: 673.300 s, System: 417.912 s]
    Range (min … max):   69.788 s … 72.120 s    3 runs

I was quite surprised that it made things slower! It's nice that we're
only calling it once per script instead of once per test, but it seems
the startup overhead of the script is really high.

And since in this mode we're only feeding it one script at a time, I
tried reverting the "chainlint.pl: validate test scripts in parallel"
commit. And indeed, now things are much faster:

  Benchmark 1: make
    Time (mean ± σ):     61.544 s ±  3.364 s    [User: 556.486 s, System: 384.001 s]
    Range (min … max):   57.660 s … 63.490 s    3 runs

And you can see the same thing just running chainlint by itself:

  $ time perl chainlint.pl /dev/null
  real	0m0.069s
  user	0m0.042s
  sys	0m0.020s

  $ git revert HEAD^{/validate.test.scripts.in.parallel}
  $ time perl chainlint.pl /dev/null
  real	0m0.014s
  user	0m0.010s
  sys	0m0.004s

I didn't track down the source of the slowness. Maybe it's loading extra
modules, or maybe it's opening /proc/cpuinfo, or maybe it's the thread
setup. But it's a surprising slowdown.

Now of course your intent is to do a single repo-wide invocation. And
that is indeed a bit faster. Here it is without the parallel code:

  Benchmark 1: make
    Time (mean ± σ):     61.727 s ±  2.140 s    [User: 507.712 s, System: 377.753 s]
    Range (min … max):   59.259 s … 63.074 s    3 runs

The wall-clock time didn't improve much, but the CPU time did. Restoring
the parallel code does improve the wall-clock time a bit, but at the
cost of some extra CPU:

  Benchmark 1: make
    Time (mean ± σ):     59.029 s ±  2.851 s    [User: 515.690 s, System: 380.369 s]
    Range (min … max):   55.736 s … 60.693 s    3 runs

which makes sense. If I do a with/without of just "make test-chainlint",
the parallelism is buying a few seconds of wall-clock:

  Benchmark 1: make test-chainlint
    Time (mean ± σ):     900.1 ms ± 102.9 ms    [User: 12049.8 ms, System: 79.7 ms]
    Range (min … max):   704.2 ms … 994.4 ms    10 runs

  Benchmark 1: make test-chainlint
    Time (mean ± σ):      3.778 s ±  0.042 s    [User: 3.756 s, System: 0.023 s]
    Range (min … max):    3.706 s …  3.833 s    10 runs

I'm not sure what it all means. For Linux, I think I'd be just as happy
with a single non-parallelized test-chainlint run for each file. But
maybe on Windows the startup overhead is worse? OTOH, the whole test run
is so much worse there. One process per script is not going to be that
much in relative terms either way.

And if we did cache the results and avoid extra invocations via "make",
then we'd want all the parallelism to move to there anyway.

Maybe that gives you more food for thought about whether perl's "use
threads" is worth having.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 00/18] make test "linting" more comprehensive
  2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
                   ` (17 preceding siblings ...)
  2022-09-01  0:29 ` [PATCH 18/18] t: retire unused chainlint.sed Eric Sunshine via GitGitGadget
@ 2022-09-11  5:28 ` Jeff King
  2022-09-11  7:01   ` Eric Sunshine
  18 siblings, 1 reply; 51+ messages in thread
From: Jeff King @ 2022-09-11  5:28 UTC (permalink / raw)
  To: Eric Sunshine via GitGitGadget
  Cc: git, Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin, Eric Sunshine

On Thu, Sep 01, 2022 at 12:29:38AM +0000, Eric Sunshine via GitGitGadget wrote:

> A while back, Peff successfully nerd-sniped[1] me into tackling a
> long-brewing idea I had about (possibly) improving "chainlint" performance

Oops, sorry. :)

I gave this a read-through, and it looks sensible overall. I have to
admit that I did not carefully check all of your regexes. Given the
relatively low stakes of the code (as an internal build-time tool only)
and the set of tests accompanying it, I'm willing to assume it's good
enough until we see counter-examples.

I posted some timings and thoughts on the use of threads elsewhere. But
in the end the timings are close enough that I don't care that much
either way.

I'd also note that I got some first-hand experience with the script as I
merged it with all of my other long-brewing topics, and it found a half
dozen spots, mostly LOOP annotations. At least one was a real "oops,
we'd miss a bug in Git here" spot. Several were "we'd probably notice
the problem because the loop output wouldn't be as expected". One was a
"we're on the left-hand of a pipe, so the exit code doesn't matter
anyway" case, but I am more than happy to fix those if it lets us be
linter-clean.

The output took me a minute to adjust to, just because it feels pretty
jumbled when there are several cases. Mostly this is because the
script eats indentation. So it's hard to see the "# chainlint:" comment
starts, let alone the ?! annotations. Here's an example:

-- >8 --
# chainlint: t4070-diff-pairs.sh
# chainlint: split input across multiple diff-pairs
write_script split-raw-diff "$PERL_PATH" <<-EOF &&

git diff-tree -p -M -C -C base new > expect &&

git diff-tree -r -z -M -C -C base new |
./split-raw-diff &&
for i in diff* ; do
git diff-pairs -p < $i ?!LOOP?!
done > actual &&
test_cmp expect actual
# chainlint: perf/p5305-pack-limits.sh
# chainlint: set up delta islands
head=$(git rev-parse HEAD) &&
git for-each-ref --format="delete %(refname)" |
git update-ref --no-deref --stdin &&

n=0 &&
fork=0 &&
git rev-list --first-parent $head |
while read commit ; do
n=$((n+1)) ?!AMP?!
if test "$n" = 100 ; then
echo "create refs/forks/$fork/master $commit" ?!AMP?!
fork=$((fork+1)) ?!AMP?!
n=0
fi ?!LOOP?!
done |
git update-ref --stdin &&

git config pack.island "refs/forks/([0-9]*)/"
-- 8< --

It wasn't too bad once I got the hang of it, but I wonder if a user
writing a single test for the first time may get a bit overwhelmed.  I
assume that the indentation is removed as part of the normalization (I
notice extra whitespace around "<", too). That might be hard to address.

I wonder if color output for "# chainlint" and "?!" annotations would
help, too. It looks like that may be tricky, though, because the
annotations re-parsed internally in some cases.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 00/18] make test "linting" more comprehensive
  2022-09-11  5:28 ` [PATCH 00/18] make test "linting" more comprehensive Jeff King
@ 2022-09-11  7:01   ` Eric Sunshine
  2022-09-11 18:31     ` Jeff King
  0 siblings, 1 reply; 51+ messages in thread
From: Eric Sunshine @ 2022-09-11  7:01 UTC (permalink / raw)
  To: Jeff King
  Cc: Eric Sunshine via GitGitGadget, Git List, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Fabian Stelzer,
	Johannes Schindelin

On Sun, Sep 11, 2022 at 1:28 AM Jeff King <peff@peff.net> wrote:
> On Thu, Sep 01, 2022 at 12:29:38AM +0000, Eric Sunshine via GitGitGadget wrote:
> > A while back, Peff successfully nerd-sniped[1] me into tackling a
> > long-brewing idea I had about (possibly) improving "chainlint" performance
>
> I gave this a read-through, and it looks sensible overall. I have to
> admit that I did not carefully check all of your regexes. Given the
> relatively low stakes of the code (as an internal build-time tool only)
> and the set of tests accompanying it, I'm willing to assume it's good
> enough until we see counter-examples.

Thanks for the feedback.

> I posted some timings and thoughts on the use of threads elsewhere. But
> in the end the timings are close enough that I don't care that much
> either way.

I ran my eye over that message quickly and have been meaning to dig
into it and give it a proper response but haven't yet found the time.

> I'd also note that I got some first-hand experience with the script as I
> merged it with all of my other long-brewing topics, and it found a half
> dozen spots, mostly LOOP annotations. At least one was a real "oops,
> we'd miss a bug in Git here" spot. Several were "we'd probably notice
> the problem because the loop output wouldn't be as expected". One was a
> "we're on the left-hand of a pipe, so the exit code doesn't matter
> anyway" case, but I am more than happy to fix those if it lets us be
> linter-clean.

Indeed, I'm not super happy about the linter complaining about cases
which obviously can't have an impact on the test's outcome, but (as
mentioned elsewhere in the thread), finally convinced myself that the
relatively low number of these was outweighed by the quite large
number of cases caught by the linter which could have let real
problems slip though. Perhaps some day the linter can be made smarter
about these cases.

> The output took me a minute to adjust to, just because it feels pretty
> jumbled when there are several cases. Mostly this is because the
> script eats indentation. So it's hard to see the "# chainlint:" comment
> starts, let alone the ?! annotations. Here's an example:
> [...snip...]
> It wasn't too bad once I got the hang of it, but I wonder if a user
> writing a single test for the first time may get a bit overwhelmed.  I
> assume that the indentation is removed as part of the normalization (I
> notice extra whitespace around "<", too). That might be hard to address.

The script implements a proper parser and lexer, and the lexer is
tokenizing the input (throwing away whitespace in the process), thus
by the time the parser notices something to complain about with a
"?!FOO?!" annotation, the original whitespace is long gone, and it
just emits the token stream with "?!FOO?!" inserted at the correct
place. In retrospect, the way this perhaps should have been done would
have been for the parser to instruct the lexer to emit a "?!FOO?!"
annotation at the appropriate point in the input stream. But even that
might get a bit hairy since there are cases in which the parser
back-patches by removing some "?!AMP?!" annotations when it has
decided that it doesn't need to complain about &&-chain breakage. I'm
sure it's fixable, but don't know how important it is at this point.

> I wonder if color output for "# chainlint" and "?!" annotations would
> help, too. It looks like that may be tricky, though, because the
> annotations re-parsed internally in some cases.

I had the exact same thought about coloring the "# chainlint:" lines
and "?!FOO?!" annotations, and how helpful that could be to anyone
(not just newcomers). Aside from not having much free time these days,
a big reason I didn't tackle it was because doing so properly probably
means relying upon some third-party Perl module, and I intentionally
wanted to keep the linter independent of add-on modules. Even without
a "coloring" module of some sort, if Perl had a standard `curses`
module (which it doesn't), then it would have been easy enough to ask
`curses` for the proper color codes and apply them as needed. I'm
old-school, so it doesn't appeal to me, but an alternative would be to
assume it's safe to use ANSI color codes, but even that may have to be
done carefully (i.e. checking TERM and accepting only some whitelisted
entries, and worrying about about Windows consoles).

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 00/18] make test "linting" more comprehensive
  2022-09-11  7:01   ` Eric Sunshine
@ 2022-09-11 18:31     ` Jeff King
  2022-09-12 23:17       ` Eric Sunshine
  0 siblings, 1 reply; 51+ messages in thread
From: Jeff King @ 2022-09-11 18:31 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Eric Sunshine via GitGitGadget, Git List, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Fabian Stelzer,
	Johannes Schindelin

On Sun, Sep 11, 2022 at 03:01:41AM -0400, Eric Sunshine wrote:

> > I wonder if color output for "# chainlint" and "?!" annotations would
> > help, too. It looks like that may be tricky, though, because the
> > annotations re-parsed internally in some cases.
> 
> I had the exact same thought about coloring the "# chainlint:" lines
> and "?!FOO?!" annotations, and how helpful that could be to anyone
> (not just newcomers). Aside from not having much free time these days,
> a big reason I didn't tackle it was because doing so properly probably
> means relying upon some third-party Perl module, and I intentionally
> wanted to keep the linter independent of add-on modules. Even without
> a "coloring" module of some sort, if Perl had a standard `curses`
> module (which it doesn't), then it would have been easy enough to ask
> `curses` for the proper color codes and apply them as needed. I'm
> old-school, so it doesn't appeal to me, but an alternative would be to
> assume it's safe to use ANSI color codes, but even that may have to be
> done carefully (i.e. checking TERM and accepting only some whitelisted
> entries, and worrying about about Windows consoles).

We're pretty happy to just use ANSI in the rest of Git, but there is a
complication on Windows. See compat/winansi.c where we decode those
internally into SetConsoleTextAttribute() calls.

I think we can live with it as-is for now and see how people react. If
lots of people are getting confused by the output, then that motivates
finding a solution. If not, then it's probably not worth the time.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 00/18] make test "linting" more comprehensive
  2022-09-11 18:31     ` Jeff King
@ 2022-09-12 23:17       ` Eric Sunshine
  2022-09-13  0:04         ` Jeff King
  0 siblings, 1 reply; 51+ messages in thread
From: Eric Sunshine @ 2022-09-12 23:17 UTC (permalink / raw)
  To: Jeff King
  Cc: Eric Sunshine via GitGitGadget, Git List, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Fabian Stelzer,
	Johannes Schindelin

On Sun, Sep 11, 2022 at 2:31 PM Jeff King <peff@peff.net> wrote:
> On Sun, Sep 11, 2022 at 03:01:41AM -0400, Eric Sunshine wrote:
> > > I wonder if color output for "# chainlint" and "?!" annotations would
> > > help, too. It looks like that may be tricky, though, because the
> > > annotations re-parsed internally in some cases.
> >
> > I had the exact same thought about coloring the "# chainlint:" lines
> > and "?!FOO?!" annotations, and how helpful that could be to anyone
> > (not just newcomers). Aside from not having much free time these days,
> > a big reason I didn't tackle it was because doing so properly probably
> > means relying upon some third-party Perl module, and I intentionally
> > wanted to keep the linter independent of add-on modules. Even without
> > a "coloring" module of some sort, if Perl had a standard `curses`
> > module (which it doesn't), then it would have been easy enough to ask
> > `curses` for the proper color codes and apply them as needed. I'm
> > old-school, so it doesn't appeal to me, but an alternative would be to
> > assume it's safe to use ANSI color codes, but even that may have to be
> > done carefully (i.e. checking TERM and accepting only some whitelisted
> > entries, and worrying about about Windows consoles).
>
> We're pretty happy to just use ANSI in the rest of Git, but there is a
> complication on Windows. See compat/winansi.c where we decode those
> internally into SetConsoleTextAttribute() calls.
>
> I think we can live with it as-is for now and see how people react. If
> lots of people are getting confused by the output, then that motivates
> finding a solution. If not, then it's probably not worth the time.

Well, you nerd-sniped me anyhow. The result is at [1]. Following the
example of t/test-lib.sh, it uses `tput` if available to avoid
hardcoding color codes, and `tput` is invoked lazily, only if it
detects problems in the tests, so a normal (non-problematic) run
doesn't incur the overhead of shelling out to `tput`.

My first attempt just assumed ANSI color codes, but then I discovered
the precedence set by t/test-lib.sh of using `tput`, so I went with
that (since I'm old-school). The ANSI-only version was, of course,
much simpler.

[1]: https://lore.kernel.org/git/pull.1324.git.git.1663023888412.gitgitgadget@gmail.com/

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 00/18] make test "linting" more comprehensive
  2022-09-12 23:17       ` Eric Sunshine
@ 2022-09-13  0:04         ` Jeff King
  0 siblings, 0 replies; 51+ messages in thread
From: Jeff King @ 2022-09-13  0:04 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Eric Sunshine via GitGitGadget, Git List, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Fabian Stelzer,
	Johannes Schindelin

On Mon, Sep 12, 2022 at 07:17:12PM -0400, Eric Sunshine wrote:

> > I think we can live with it as-is for now and see how people react. If
> > lots of people are getting confused by the output, then that motivates
> > finding a solution. If not, then it's probably not worth the time.
> 
> Well, you nerd-sniped me anyhow. The result is at [1]. Following the

It seems we've discovered my true talent. :)

> example of t/test-lib.sh, it uses `tput` if available to avoid
> hardcoding color codes, and `tput` is invoked lazily, only if it
> detects problems in the tests, so a normal (non-problematic) run
> doesn't incur the overhead of shelling out to `tput`.

Ah, of course. I didn't think about the fact that the regular tests
already had to deal with this problem. Following that lead makes perfect
sense.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-09-06 23:26       ` Jeff King
@ 2022-11-21  4:02         ` Eric Sunshine
  2022-11-21 13:28           ` Ævar Arnfjörð Bjarmason
  2022-11-21 18:04           ` Jeff King
  0 siblings, 2 replies; 51+ messages in thread
From: Eric Sunshine @ 2022-11-21  4:02 UTC (permalink / raw)
  To: Jeff King
  Cc: Eric Wong, Eric Sunshine via GitGitGadget, Git List,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin

On Tue, Sep 6, 2022 at 7:27 PM Jeff King <peff@peff.net> wrote:
> I did some timings the other night, and I found something quite curious
> with the thread stuff.
>
> I was quite surprised that it made things slower! It's nice that we're
> only calling it once per script instead of once per test, but it seems
> the startup overhead of the script is really high.
>
> And since in this mode we're only feeding it one script at a time, I
> tried reverting the "chainlint.pl: validate test scripts in parallel"
> commit. And indeed, now things are much faster:
>
>   Benchmark 1: make
>     Time (mean ± σ):     61.544 s ±  3.364 s    [User: 556.486 s, System: 384.001 s]
>     Range (min … max):   57.660 s … 63.490 s    3 runs
>
> And you can see the same thing just running chainlint by itself:
>
>   $ time perl chainlint.pl /dev/null
>   real  0m0.069s
>   user  0m0.042s
>   sys   0m0.020s
>
>   $ git revert HEAD^{/validate.test.scripts.in.parallel}
>   $ time perl chainlint.pl /dev/null
>   real  0m0.014s
>   user  0m0.010s
>   sys   0m0.004s
>
> I didn't track down the source of the slowness. Maybe it's loading extra
> modules, or maybe it's opening /proc/cpuinfo, or maybe it's the thread
> setup. But it's a surprising slowdown.

It is surprising, and unfortunate. Ditching "ithreads" would probably
be a good idea. (more on that below)

> Now of course your intent is to do a single repo-wide invocation. And
> that is indeed a bit faster. Here it is without the parallel code:
>
>   Benchmark 1: make
>     Time (mean ± σ):     61.727 s ±  2.140 s    [User: 507.712 s, System: 377.753 s]
>     Range (min … max):   59.259 s … 63.074 s    3 runs
>
> The wall-clock time didn't improve much, but the CPU time did. Restoring
> the parallel code does improve the wall-clock time a bit, but at the
> cost of some extra CPU:
>
>   Benchmark 1: make
>     Time (mean ± σ):     59.029 s ±  2.851 s    [User: 515.690 s, System: 380.369 s]
>     Range (min … max):   55.736 s … 60.693 s    3 runs
>
> which makes sense. If I do a with/without of just "make test-chainlint",
> the parallelism is buying a few seconds of wall-clock:
>
>   Benchmark 1: make test-chainlint
>     Time (mean ± σ):     900.1 ms ± 102.9 ms    [User: 12049.8 ms, System: 79.7 ms]
>     Range (min … max):   704.2 ms … 994.4 ms    10 runs
>
>   Benchmark 1: make test-chainlint
>     Time (mean ± σ):      3.778 s ±  0.042 s    [User: 3.756 s, System: 0.023 s]
>     Range (min … max):    3.706 s …  3.833 s    10 runs
>
> I'm not sure what it all means. For Linux, I think I'd be just as happy
> with a single non-parallelized test-chainlint run for each file. But
> maybe on Windows the startup overhead is worse? OTOH, the whole test run
> is so much worse there. One process per script is not going to be that
> much in relative terms either way.

Somehow Windows manages to be unbelievably slow no matter what. I
mentioned elsewhere (after you sent this) that I tested on a five or
six year old 8-core dual-boot machine. Booted to Linux, running a
single chainlint.pl invocation using all 8 cores to check all scripts
in the project took under 1 second walltime. The same machine booted
to Windows using all 8 cores took just under two minutes(!) walltime
for the single Perl invocation to check all scripts in the project.

So, at this point, I have no hope for making linting fast on Windows;
it seems to be a lost cause.

> And if we did cache the results and avoid extra invocations via "make",
> then we'd want all the parallelism to move to there anyway.
>
> Maybe that gives you more food for thought about whether perl's "use
> threads" is worth having.

I'm not especially happy about the significant overhead of "ithreads";
on my (old) machine, although it does improve perceived time
significantly, it eats up quite a bit of additional user-time. As
such, I would not be unhappy to see "ithreads" go away, especially
since fast linting on Windows seems unattainable (at least with Perl).

Overall, I think Ævar's plan to parallelize linting via "make" is
probably the way to go.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-11-21  4:02         ` Eric Sunshine
@ 2022-11-21 13:28           ` Ævar Arnfjörð Bjarmason
  2022-11-21 14:07             ` Eric Sunshine
  2022-11-21 18:04           ` Jeff King
  1 sibling, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-21 13:28 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Jeff King, Eric Wong, Eric Sunshine via GitGitGadget, Git List,
	Elijah Newren, Fabian Stelzer, Johannes Schindelin


On Sun, Nov 20 2022, Eric Sunshine wrote:

> On Tue, Sep 6, 2022 at 7:27 PM Jeff King <peff@peff.net> wrote:
>> I did some timings the other night, and I found something quite curious
>> with the thread stuff.
>>
>> I was quite surprised that it made things slower! It's nice that we're
>> only calling it once per script instead of once per test, but it seems
>> the startup overhead of the script is really high.
>>
>> And since in this mode we're only feeding it one script at a time, I
>> tried reverting the "chainlint.pl: validate test scripts in parallel"
>> commit. And indeed, now things are much faster:
>>
>>   Benchmark 1: make
>>     Time (mean ± σ):     61.544 s ±  3.364 s    [User: 556.486 s, System: 384.001 s]
>>     Range (min … max):   57.660 s … 63.490 s    3 runs
>>
>> And you can see the same thing just running chainlint by itself:
>>
>>   $ time perl chainlint.pl /dev/null
>>   real  0m0.069s
>>   user  0m0.042s
>>   sys   0m0.020s
>>
>>   $ git revert HEAD^{/validate.test.scripts.in.parallel}
>>   $ time perl chainlint.pl /dev/null
>>   real  0m0.014s
>>   user  0m0.010s
>>   sys   0m0.004s
>>
>> I didn't track down the source of the slowness. Maybe it's loading extra
>> modules, or maybe it's opening /proc/cpuinfo, or maybe it's the thread
>> setup. But it's a surprising slowdown.
>
> It is surprising, and unfortunate. Ditching "ithreads" would probably
> be a good idea. (more on that below)
>
>> Now of course your intent is to do a single repo-wide invocation. And
>> that is indeed a bit faster. Here it is without the parallel code:
>>
>>   Benchmark 1: make
>>     Time (mean ± σ):     61.727 s ±  2.140 s    [User: 507.712 s, System: 377.753 s]
>>     Range (min … max):   59.259 s … 63.074 s    3 runs
>>
>> The wall-clock time didn't improve much, but the CPU time did. Restoring
>> the parallel code does improve the wall-clock time a bit, but at the
>> cost of some extra CPU:
>>
>>   Benchmark 1: make
>>     Time (mean ± σ):     59.029 s ±  2.851 s    [User: 515.690 s, System: 380.369 s]
>>     Range (min … max):   55.736 s … 60.693 s    3 runs
>>
>> which makes sense. If I do a with/without of just "make test-chainlint",
>> the parallelism is buying a few seconds of wall-clock:
>>
>>   Benchmark 1: make test-chainlint
>>     Time (mean ± σ):     900.1 ms ± 102.9 ms    [User: 12049.8 ms, System: 79.7 ms]
>>     Range (min … max):   704.2 ms … 994.4 ms    10 runs
>>
>>   Benchmark 1: make test-chainlint
>>     Time (mean ± σ):      3.778 s ±  0.042 s    [User: 3.756 s, System: 0.023 s]
>>     Range (min … max):    3.706 s …  3.833 s    10 runs
>>
>> I'm not sure what it all means. For Linux, I think I'd be just as happy
>> with a single non-parallelized test-chainlint run for each file. But
>> maybe on Windows the startup overhead is worse? OTOH, the whole test run
>> is so much worse there. One process per script is not going to be that
>> much in relative terms either way.
>
> Somehow Windows manages to be unbelievably slow no matter what. I
> mentioned elsewhere (after you sent this) that I tested on a five or
> six year old 8-core dual-boot machine. Booted to Linux, running a
> single chainlint.pl invocation using all 8 cores to check all scripts
> in the project took under 1 second walltime. The same machine booted
> to Windows using all 8 cores took just under two minutes(!) walltime
> for the single Perl invocation to check all scripts in the project.
>
> So, at this point, I have no hope for making linting fast on Windows;
> it seems to be a lost cause.

I'd be really interested in seeing e.g. the NYTProf output for that run,
compared with that on *nix (if you could upload the HTML versions of
both somewhere, even better).

Maybe "chainlint.pl" is doing something odd, but this goes against the
usual wisdom about what is and isn't slow in Perl on windows, as I
understand it.

I.e. process star-up etc. is slow there, and I/O's a bit slower, but
once you're started up and e.g. slurping up all of those files & parsing
them you're just running "perl-native" code.

Which shouldn't be much slower at all. A perl compiled with ithreads is
(last I checked) around 10-20% slower, and the Windows version is always
compiled with that (it's needed for "fork" emulation).

But most *nix versions are compiled with that too, and certainly the one
you're using with "threads", so that's not the difference.

So I suspect something odd's going on...

>> And if we did cache the results and avoid extra invocations via "make",
>> then we'd want all the parallelism to move to there anyway.
>>
>> Maybe that gives you more food for thought about whether perl's "use
>> threads" is worth having.
>
> I'm not especially happy about the significant overhead of "ithreads";
> on my (old) machine, although it does improve perceived time
> significantly, it eats up quite a bit of additional user-time. As
> such, I would not be unhappy to see "ithreads" go away, especially
> since fast linting on Windows seems unattainable (at least with Perl).
>
> Overall, I think Ævar's plan to parallelize linting via "make" is
> probably the way to go.

Yeah, but that seems to me to be orthagonal to why it's this slow on
Windows, and if it is that wouldn't help much, except for incremental
re-runs.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-11-21 13:28           ` Ævar Arnfjörð Bjarmason
@ 2022-11-21 14:07             ` Eric Sunshine
  2022-11-21 14:18               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 51+ messages in thread
From: Eric Sunshine @ 2022-11-21 14:07 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff King, Eric Wong, Eric Sunshine via GitGitGadget, Git List,
	Elijah Newren, Fabian Stelzer, Johannes Schindelin

On Mon, Nov 21, 2022 at 8:32 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Sun, Nov 20 2022, Eric Sunshine wrote:
> > Somehow Windows manages to be unbelievably slow no matter what. I
> > mentioned elsewhere (after you sent this) that I tested on a five or
> > six year old 8-core dual-boot machine. Booted to Linux, running a
> > single chainlint.pl invocation using all 8 cores to check all scripts
> > in the project took under 1 second walltime. The same machine booted
> > to Windows using all 8 cores took just under two minutes(!) walltime
> > for the single Perl invocation to check all scripts in the project.
>
> I'd be really interested in seeing e.g. the NYTProf output for that run,
> compared with that on *nix (if you could upload the HTML versions of
> both somewhere, even better).

Unfortunately, I no longer have access to that machine, or usable
Windows in general. Of course, someone else with access to a dual-boot
machine could generate such a report, but whether anyone will offer to
do so is a different matter.

> Maybe "chainlint.pl" is doing something odd, but this goes against the
> usual wisdom about what is and isn't slow in Perl on windows, as I
> understand it.
>
> I.e. process star-up etc. is slow there, and I/O's a bit slower, but
> once you're started up and e.g. slurping up all of those files & parsing
> them you're just running "perl-native" code.
>
> Which shouldn't be much slower at all. A perl compiled with ithreads is
> (last I checked) around 10-20% slower, and the Windows version is always
> compiled with that (it's needed for "fork" emulation).
>
> But most *nix versions are compiled with that too, and certainly the one
> you're using with "threads", so that's not the difference.
>
> So I suspect something odd's going on...

This is all my understanding, as well, which is why I was so surprised
by the difference in speed. Aside from suspecting Windows I/O as the
culprit, another obvious possible culprit would be whatever
mechanism/primitives "ithreads" is using on Windows for
locking/synchronizing and passing messages between threads. I wouldn't
be surprised to learn that those mechanisms/primitives have very high
overhead on that platform.

> > Overall, I think Ævar's plan to parallelize linting via "make" is
> > probably the way to go.
>
> Yeah, but that seems to me to be orthagonal to why it's this slow on
> Windows, and if it is that wouldn't help much, except for incremental
> re-runs.

Oh, I didn't at all mean that `make` parallelism would be helpful on
Windows; I can't imagine that it ever would be (though I could once
again be wrong). What I meant was that `make` parallelism would be a
nice improvement and simplification (of sorts), in general,
considering that I've given up hope of ever seeing linting be speedy
on Windows.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-11-21 14:07             ` Eric Sunshine
@ 2022-11-21 14:18               ` Ævar Arnfjörð Bjarmason
  2022-11-21 14:48                 ` Eric Sunshine
  0 siblings, 1 reply; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-21 14:18 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Jeff King, Eric Wong, Eric Sunshine via GitGitGadget, Git List,
	Elijah Newren, Fabian Stelzer, Johannes Schindelin


On Mon, Nov 21 2022, Eric Sunshine wrote:

> On Mon, Nov 21, 2022 at 8:32 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> On Sun, Nov 20 2022, Eric Sunshine wrote:
>> > Somehow Windows manages to be unbelievably slow no matter what. I
>> > mentioned elsewhere (after you sent this) that I tested on a five or
>> > six year old 8-core dual-boot machine. Booted to Linux, running a
>> > single chainlint.pl invocation using all 8 cores to check all scripts
>> > in the project took under 1 second walltime. The same machine booted
>> > to Windows using all 8 cores took just under two minutes(!) walltime
>> > for the single Perl invocation to check all scripts in the project.
>>
>> I'd be really interested in seeing e.g. the NYTProf output for that run,
>> compared with that on *nix (if you could upload the HTML versions of
>> both somewhere, even better).
>
> Unfortunately, I no longer have access to that machine, or usable
> Windows in general. Of course, someone else with access to a dual-boot
> machine could generate such a report, but whether anyone will offer to
> do so is a different matter.

:(

>> Maybe "chainlint.pl" is doing something odd, but this goes against the
>> usual wisdom about what is and isn't slow in Perl on windows, as I
>> understand it.
>>
>> I.e. process star-up etc. is slow there, and I/O's a bit slower, but
>> once you're started up and e.g. slurping up all of those files & parsing
>> them you're just running "perl-native" code.
>>
>> Which shouldn't be much slower at all. A perl compiled with ithreads is
>> (last I checked) around 10-20% slower, and the Windows version is always
>> compiled with that (it's needed for "fork" emulation).
>>
>> But most *nix versions are compiled with that too, and certainly the one
>> you're using with "threads", so that's not the difference.
>>
>> So I suspect something odd's going on...
>
> This is all my understanding, as well, which is why I was so surprised
> by the difference in speed. Aside from suspecting Windows I/O as the
> culprit, another obvious possible culprit would be whatever
> mechanism/primitives "ithreads" is using on Windows for
> locking/synchronizing and passing messages between threads. I wouldn't
> be surprised to learn that those mechanisms/primitives have very high
> overhead on that platform.

Yeah, that could be, but then...

>> > Overall, I think Ævar's plan to parallelize linting via "make" is
>> > probably the way to go.
>>
>> Yeah, but that seems to me to be orthagonal to why it's this slow on
>> Windows, and if it is that wouldn't help much, except for incremental
>> re-runs.
>
> Oh, I didn't at all mean that `make` parallelism would be helpful on
> Windows; I can't imagine that it ever would be (though I could once
> again be wrong). What I meant was that `make` parallelism would be a
> nice improvement and simplification (of sorts), in general,
> considering that I've given up hope of ever seeing linting be speedy
> on Windows.

...that parallelism probably wouldn't be helpful, as it'll run into
another thing that's slow.

But just ditching the "ithreads" commit from chainlint.pl should make it
much faster, as sequentially parsing all the files isn't that slow, and
as that won't use threads should be much faster then.


^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-11-21 14:18               ` Ævar Arnfjörð Bjarmason
@ 2022-11-21 14:48                 ` Eric Sunshine
  0 siblings, 0 replies; 51+ messages in thread
From: Eric Sunshine @ 2022-11-21 14:48 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff King, Eric Wong, Eric Sunshine via GitGitGadget, Git List,
	Elijah Newren, Fabian Stelzer, Johannes Schindelin

On Mon, Nov 21, 2022 at 9:20 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Mon, Nov 21 2022, Eric Sunshine wrote:
> > Oh, I didn't at all mean that `make` parallelism would be helpful on
> > Windows; I can't imagine that it ever would be (though I could once
> > again be wrong). What I meant was that `make` parallelism would be a
> > nice improvement and simplification (of sorts), in general,
> > considering that I've given up hope of ever seeing linting be speedy
> > on Windows.
>
> But just ditching the "ithreads" commit from chainlint.pl should make it
> much faster, as sequentially parsing all the files isn't that slow, and
> as that won't use threads should be much faster then.

On my (old) machine (with spinning hard drive), `make test-chainlint`
with "ithreads" and warm filesystem cache takes about 3.8 seconds
walltime. Without "ithreads", it takes about 11.3 seconds. So, the
improvement in perceived time is significant. As such, I'm somewhat
hesitant to see "ithreads" dropped from chainlint.pl before `make`
parallelism is implemented. (I can easily see "drop ithreads" as the
final patch of a series which adds `make` parallelism.)

But perhaps I'm focussing too much on my own experience with my old
machine. Maybe linting without "ithreads" and without `make`
parallelism would be "fast enough" for developers using beefier modern
machines... (genuine question/thought since I don't have access to any
beefy modern hardware).

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-11-21  4:02         ` Eric Sunshine
  2022-11-21 13:28           ` Ævar Arnfjörð Bjarmason
@ 2022-11-21 18:04           ` Jeff King
  2022-11-21 18:47             ` Eric Sunshine
  1 sibling, 1 reply; 51+ messages in thread
From: Jeff King @ 2022-11-21 18:04 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Eric Wong, Eric Sunshine via GitGitGadget, Git List,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin

On Sun, Nov 20, 2022 at 11:02:54PM -0500, Eric Sunshine wrote:

> > And if we did cache the results and avoid extra invocations via "make",
> > then we'd want all the parallelism to move to there anyway.
> >
> > Maybe that gives you more food for thought about whether perl's "use
> > threads" is worth having.
> 
> I'm not especially happy about the significant overhead of "ithreads";
> on my (old) machine, although it does improve perceived time
> significantly, it eats up quite a bit of additional user-time. As
> such, I would not be unhappy to see "ithreads" go away, especially
> since fast linting on Windows seems unattainable (at least with Perl).
> 
> Overall, I think Ævar's plan to parallelize linting via "make" is
> probably the way to go.

TBH, I think just running the linter once per test script when the
script is run would be sufficient. That is one extra process per script,
but they are already shell scripts running a bunch of processes. You get
parallelism for free because you're already running the tests in
parallel. You lose out on "don't bother linting because the file hasn't
changed", but I'm not sure that's really worth the extra complexity
overall.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-11-21 18:04           ` Jeff King
@ 2022-11-21 18:47             ` Eric Sunshine
  2022-11-21 18:50               ` Eric Sunshine
  2022-11-21 18:52               ` Jeff King
  0 siblings, 2 replies; 51+ messages in thread
From: Eric Sunshine @ 2022-11-21 18:47 UTC (permalink / raw)
  To: Jeff King
  Cc: Eric Wong, Eric Sunshine via GitGitGadget, Git List,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin

On Mon, Nov 21, 2022 at 1:04 PM Jeff King <peff@peff.net> wrote:
> On Sun, Nov 20, 2022 at 11:02:54PM -0500, Eric Sunshine wrote:
> > Overall, I think Ævar's plan to parallelize linting via "make" is
> > probably the way to go.
>
> TBH, I think just running the linter once per test script when the
> script is run would be sufficient. That is one extra process per script,
> but they are already shell scripts running a bunch of processes. You get
> parallelism for free because you're already running the tests in
> parallel. You lose out on "don't bother linting because the file hasn't
> changed", but I'm not sure that's really worth the extra complexity
> overall.

Hmm, yes, that's appealing (especially since I've essentially given up
on making linting fast on Windows), and it wouldn't be hard to
implement. In fact, it's already implemented by 23a14f3016 (test-lib:
replace chainlint.sed with chainlint.pl, 2022-09-01); making it work
the way you describe would just involve dropping 69b9924b87
(t/Makefile: teach `make test` and `make prove` to run chainlint.pl,
2022-09-01) and 29fb2ec384 (chainlint.pl: validate test scripts in
parallel, 2022-09-01).

I think Ævar's use-case for `make` parallelization was to speed up
git-bisect runs. But thinking about it now, the likelihood of "lint"
problems cropping up during a git-bisect run is effectively nil, in
which case setting GIT_TEST_CHAIN_LINT=1 should be a perfectly
appropriate way to take linting out of the equation when bisecting.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-11-21 18:47             ` Eric Sunshine
@ 2022-11-21 18:50               ` Eric Sunshine
  2022-11-21 18:52               ` Jeff King
  1 sibling, 0 replies; 51+ messages in thread
From: Eric Sunshine @ 2022-11-21 18:50 UTC (permalink / raw)
  To: Jeff King
  Cc: Eric Wong, Eric Sunshine via GitGitGadget, Git List,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin

On Mon, Nov 21, 2022 at 1:47 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
> I think Ævar's use-case for `make` parallelization was to speed up
> git-bisect runs. But thinking about it now, the likelihood of "lint"
> problems cropping up during a git-bisect run is effectively nil, in
> which case setting GIT_TEST_CHAIN_LINT=1 should be a perfectly
> appropriate way to take linting out of the equation when bisecting.

I mean "GIT_TEST_CHAIN_LINT=0", of course.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-11-21 18:47             ` Eric Sunshine
  2022-11-21 18:50               ` Eric Sunshine
@ 2022-11-21 18:52               ` Jeff King
  2022-11-21 19:00                 ` Eric Sunshine
  1 sibling, 1 reply; 51+ messages in thread
From: Jeff King @ 2022-11-21 18:52 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Eric Wong, Eric Sunshine via GitGitGadget, Git List,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin

On Mon, Nov 21, 2022 at 01:47:42PM -0500, Eric Sunshine wrote:

> On Mon, Nov 21, 2022 at 1:04 PM Jeff King <peff@peff.net> wrote:
> > On Sun, Nov 20, 2022 at 11:02:54PM -0500, Eric Sunshine wrote:
> > > Overall, I think Ævar's plan to parallelize linting via "make" is
> > > probably the way to go.
> >
> > TBH, I think just running the linter once per test script when the
> > script is run would be sufficient. That is one extra process per script,
> > but they are already shell scripts running a bunch of processes. You get
> > parallelism for free because you're already running the tests in
> > parallel. You lose out on "don't bother linting because the file hasn't
> > changed", but I'm not sure that's really worth the extra complexity
> > overall.
> 
> Hmm, yes, that's appealing (especially since I've essentially given up
> on making linting fast on Windows), and it wouldn't be hard to
> implement. In fact, it's already implemented by 23a14f3016 (test-lib:
> replace chainlint.sed with chainlint.pl, 2022-09-01); making it work
> the way you describe would just involve dropping 69b9924b87
> (t/Makefile: teach `make test` and `make prove` to run chainlint.pl,
> 2022-09-01) and 29fb2ec384 (chainlint.pl: validate test scripts in
> parallel, 2022-09-01).

Yes, that was one of the modes I timed in my original email. :)

> I think Ævar's use-case for `make` parallelization was to speed up
> git-bisect runs. But thinking about it now, the likelihood of "lint"
> problems cropping up during a git-bisect run is effectively nil, in
> which case setting GIT_TEST_CHAIN_LINT=1 should be a perfectly
> appropriate way to take linting out of the equation when bisecting.

Yes. It's also dumb to run a straight "make test" while bisecting in the
first place, because you are going to run a zillion tests that aren't
relevant to your bisection. Bisecting on "cd t && ./test-that-fails" is
faster, at which point you're only running the one lint process (and if
it really bothers you, you can disable chain lint as you suggest).

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-11-21 18:52               ` Jeff King
@ 2022-11-21 19:00                 ` Eric Sunshine
  2022-11-21 19:28                   ` Jeff King
  2022-11-22  0:11                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 51+ messages in thread
From: Eric Sunshine @ 2022-11-21 19:00 UTC (permalink / raw)
  To: Jeff King
  Cc: Eric Wong, Eric Sunshine via GitGitGadget, Git List,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin

On Mon, Nov 21, 2022 at 1:52 PM Jeff King <peff@peff.net> wrote:
> On Mon, Nov 21, 2022 at 01:47:42PM -0500, Eric Sunshine wrote:
> > I think Ævar's use-case for `make` parallelization was to speed up
> > git-bisect runs. But thinking about it now, the likelihood of "lint"
> > problems cropping up during a git-bisect run is effectively nil, in
> > which case setting GIT_TEST_CHAIN_LINT=1 should be a perfectly
> > appropriate way to take linting out of the equation when bisecting.
>
> Yes. It's also dumb to run a straight "make test" while bisecting in the
> first place, because you are going to run a zillion tests that aren't
> relevant to your bisection. Bisecting on "cd t && ./test-that-fails" is
> faster, at which point you're only running the one lint process (and if
> it really bothers you, you can disable chain lint as you suggest).

I think I misspoke. Dredging up old memories, I think Ævar's use-case
is that he now runs:

    git rebase -i --exec 'make test' ...

in order to ensure that the entire test suite passes for _every_ patch
in a series. (This is due to him having missed a runtime breakage by
only running "make test" after the final patch in a series was
applied, when the breakage was only temporary -- added by one patch,
but resolved by some other later patch.)

Even so, GIT_TEST_CHAIN_LINT=0 should be appropriate here too.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-11-21 19:00                 ` Eric Sunshine
@ 2022-11-21 19:28                   ` Jeff King
  2022-11-22  0:11                   ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 51+ messages in thread
From: Jeff King @ 2022-11-21 19:28 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Eric Wong, Eric Sunshine via GitGitGadget, Git List,
	Elijah Newren, Ævar Arnfjörð Bjarmason,
	Fabian Stelzer, Johannes Schindelin

On Mon, Nov 21, 2022 at 02:00:41PM -0500, Eric Sunshine wrote:

> On Mon, Nov 21, 2022 at 1:52 PM Jeff King <peff@peff.net> wrote:
> > On Mon, Nov 21, 2022 at 01:47:42PM -0500, Eric Sunshine wrote:
> > > I think Ævar's use-case for `make` parallelization was to speed up
> > > git-bisect runs. But thinking about it now, the likelihood of "lint"
> > > problems cropping up during a git-bisect run is effectively nil, in
> > > which case setting GIT_TEST_CHAIN_LINT=1 should be a perfectly
> > > appropriate way to take linting out of the equation when bisecting.
> >
> > Yes. It's also dumb to run a straight "make test" while bisecting in the
> > first place, because you are going to run a zillion tests that aren't
> > relevant to your bisection. Bisecting on "cd t && ./test-that-fails" is
> > faster, at which point you're only running the one lint process (and if
> > it really bothers you, you can disable chain lint as you suggest).
> 
> I think I misspoke. Dredging up old memories, I think Ævar's use-case
> is that he now runs:
> 
>     git rebase -i --exec 'make test' ...
> 
> in order to ensure that the entire test suite passes for _every_ patch
> in a series. (This is due to him having missed a runtime breakage by
> only running "make test" after the final patch in a series was
> applied, when the breakage was only temporary -- added by one patch,
> but resolved by some other later patch.)

Yeah, I do that sometimes, too, especially when heavy refactoring is
involved.

> Even so, GIT_TEST_CHAIN_LINT=0 should be appropriate here too.

Agreed. But also, my original point stands. If you are running 10 CPU
minutes of tests, then a few CPU seconds of linting is not really that
important.

-Peff

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
  2022-11-21 19:00                 ` Eric Sunshine
  2022-11-21 19:28                   ` Jeff King
@ 2022-11-22  0:11                   ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 51+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-22  0:11 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Jeff King, Eric Wong, Eric Sunshine via GitGitGadget, Git List,
	Elijah Newren, Fabian Stelzer, Johannes Schindelin


On Mon, Nov 21 2022, Eric Sunshine wrote:

> On Mon, Nov 21, 2022 at 1:52 PM Jeff King <peff@peff.net> wrote:
>> On Mon, Nov 21, 2022 at 01:47:42PM -0500, Eric Sunshine wrote:
>> > I think Ævar's use-case for `make` parallelization was to speed up
>> > git-bisect runs. But thinking about it now, the likelihood of "lint"
>> > problems cropping up during a git-bisect run is effectively nil, in
>> > which case setting GIT_TEST_CHAIN_LINT=1 should be a perfectly
>> > appropriate way to take linting out of the equation when bisecting.
>>
>> Yes. It's also dumb to run a straight "make test" while bisecting in the
>> first place, because you are going to run a zillion tests that aren't
>> relevant to your bisection. Bisecting on "cd t && ./test-that-fails" is
>> faster, at which point you're only running the one lint process (and if
>> it really bothers you, you can disable chain lint as you suggest).
>
> I think I misspoke. Dredging up old memories, I think Ævar's use-case
> is that he now runs:
>
>     git rebase -i --exec 'make test' ...
>
> in order to ensure that the entire test suite passes for _every_ patch
> in a series. (This is due to him having missed a runtime breakage by
> only running "make test" after the final patch in a series was
> applied, when the breakage was only temporary -- added by one patch,
> but resolved by some other later patch.)
>
> Even so, GIT_TEST_CHAIN_LINT=0 should be appropriate here too.

I'd like to make "make" fast in terms of avoiding its own overhead
before it gets to actual work mainly because of that use-case, but it
helps in general. E.g. if you switch branches we don't compile a file we
don't need to, we shouldn't re-run test checks we don't need either.

For t/ this is:

 - Running chainlint.pl on the file, even if it didn't change
 - Ditto check-non-portable-shell.pl
 - Ditto "non-portable file name(s)" check
 - Ditto "test -x" on all test files

I have a branch where these are all checked using dependencies instead,
e.g. we run a "test -x" on t0071-sort.sh and create a
".build/check-executable/t0071-sort.sh.ok" if that passed, we don't need
to shell out in the common case.

The results of that are, and this is a best case in picking one where
the test itself is cheap:
	
	$ git hyperfine -L rev @{u},HEAD~,HEAD -s 'make CFLAGS=-O3' 'make test T=t0071-sort.sh' -w 1
	Benchmark 1: make test T=t0071-sort.sh' in '@{u}
	  Time (mean ± σ):      1.168 s ±  0.074 s    [User: 1.534 s, System: 0.082 s]
	  Range (min … max):    1.096 s …  1.316 s    10 runs
	
	Benchmark 2: make test T=t0071-sort.sh' in 'HEAD~
	  Time (mean ± σ):     719.1 ms ±  46.1 ms    [User: 910.6 ms, System: 79.7 ms]
	  Range (min … max):   682.0 ms … 828.2 ms    10 runs
	
	Benchmark 3: make test T=t0071-sort.sh' in 'HEAD
	  Time (mean ± σ):     685.0 ms ±  34.2 ms    [User: 645.0 ms, System: 56.8 ms]
	  Range (min … max):   657.6 ms … 773.6 ms    10 runs
	
	Summary
	  'make test T=t0071-sort.sh' in 'HEAD' ran
	    1.05 ± 0.09 times faster than 'make test T=t0071-sort.sh' in 'HEAD~'
	    1.71 ± 0.14 times faster than 'make test T=t0071-sort.sh' in '@{u}'

The @{u} being "master", HEAD~ is "incremant without chainlint.pl", and
"HEAD" is where it's all incremental.

It's very WIP-quality, but I pushed the chainlint.pl part of it as a POC
just now, I did the others a while ago:
https://github.com/avar/git/tree/avar/t-Makefile-break-T-to-file-association


^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2022-11-22  0:51 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-01  0:29 [PATCH 00/18] make test "linting" more comprehensive Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 01/18] t: add skeleton chainlint.pl Eric Sunshine via GitGitGadget
2022-09-01 12:27   ` Ævar Arnfjörð Bjarmason
2022-09-02 18:53     ` Eric Sunshine
2022-09-01  0:29 ` [PATCH 02/18] chainlint.pl: add POSIX shell lexical analyzer Eric Sunshine via GitGitGadget
2022-09-01 12:32   ` Ævar Arnfjörð Bjarmason
2022-09-03  6:00     ` Eric Sunshine
2022-09-01  0:29 ` [PATCH 03/18] chainlint.pl: add POSIX shell parser Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 04/18] chainlint.pl: add parser to validate tests Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 05/18] chainlint.pl: add parser to identify test definitions Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 06/18] chainlint.pl: validate test scripts in parallel Eric Sunshine via GitGitGadget
2022-09-01 12:36   ` Ævar Arnfjörð Bjarmason
2022-09-03  7:51     ` Eric Sunshine
2022-09-06 22:35   ` Eric Wong
2022-09-06 22:52     ` Eric Sunshine
2022-09-06 23:26       ` Jeff King
2022-11-21  4:02         ` Eric Sunshine
2022-11-21 13:28           ` Ævar Arnfjörð Bjarmason
2022-11-21 14:07             ` Eric Sunshine
2022-11-21 14:18               ` Ævar Arnfjörð Bjarmason
2022-11-21 14:48                 ` Eric Sunshine
2022-11-21 18:04           ` Jeff King
2022-11-21 18:47             ` Eric Sunshine
2022-11-21 18:50               ` Eric Sunshine
2022-11-21 18:52               ` Jeff King
2022-11-21 19:00                 ` Eric Sunshine
2022-11-21 19:28                   ` Jeff King
2022-11-22  0:11                   ` Ævar Arnfjörð Bjarmason
2022-09-01  0:29 ` [PATCH 07/18] chainlint.pl: don't require `return|exit|continue` to end with `&&` Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 08/18] t/Makefile: apply chainlint.pl to existing self-tests Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 09/18] chainlint.pl: don't require `&` background command to end with `&&` Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 10/18] chainlint.pl: don't flag broken &&-chain if `$?` handled explicitly Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 11/18] chainlint.pl: don't flag broken &&-chain if failure indicated explicitly Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 12/18] chainlint.pl: complain about loops lacking explicit failure handling Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 13/18] chainlint.pl: allow `|| echo` to signal failure upstream of a pipe Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 14/18] t/chainlint: add more chainlint.pl self-tests Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 15/18] test-lib: retire "lint harder" optimization hack Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 16/18] test-lib: replace chainlint.sed with chainlint.pl Eric Sunshine via GitGitGadget
2022-09-03  5:07   ` Elijah Newren
2022-09-03  5:24     ` Eric Sunshine
2022-09-01  0:29 ` [PATCH 17/18] t/Makefile: teach `make test` and `make prove` to run chainlint.pl Eric Sunshine via GitGitGadget
2022-09-01  0:29 ` [PATCH 18/18] t: retire unused chainlint.sed Eric Sunshine via GitGitGadget
2022-09-02 12:42   ` several messages Johannes Schindelin
2022-09-02 18:16     ` Eric Sunshine
2022-09-02 18:34       ` Jeff King
2022-09-02 18:44         ` Junio C Hamano
2022-09-11  5:28 ` [PATCH 00/18] make test "linting" more comprehensive Jeff King
2022-09-11  7:01   ` Eric Sunshine
2022-09-11 18:31     ` Jeff King
2022-09-12 23:17       ` Eric Sunshine
2022-09-13  0:04         ` Jeff King

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).