Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel

From: Eric Sunshine <sunshine@sunshineco.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Eric Sunshine via GitGitGadget <gitgitgadget@gmail.com>,
	Git List <git@vger.kernel.org>, Jeff King <peff@peff.net>,
	Elijah Newren <newren@gmail.com>,
	Fabian Stelzer <fs@gigacodes.de>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH 06/18] chainlint.pl: validate test scripts in parallel
Date: Sat, 3 Sep 2022 03:51:37 -0400
Message-ID: <CAPig+cThSD12whinyLzhHH9qh+bR7W_AH8ea5GT6B=bd87f2RA@mail.gmail.com>
In-Reply-To: <220901.86bkrzjm6e.gmgdl@evledraar.gmail.com>

On Thu, Sep 1, 2022 at 8:47 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> On Thu, Sep 01 2022, Eric Sunshine via GitGitGadget wrote:
> > Although chainlint.pl has undergone a good deal of optimization during
> > its development -- increasing in speed significantly -- parsing and
> > validating 1050+ scripts and 16500+ tests via Perl is not exactly
> > instantaneous. However, perceived performance can be improved by taking
> > advantage of the fact that there is no interdependence between test
> > scripts or test definitions, thus parsing and validating can be done in
> > parallel. The number of available cores is determined automatically but
> > can be overridden via the --jobs option.
>
> Per your CL:
>
>         Ævar offered some sensible comments[2,3] about optimizing the Makefile rules
>         related to chainlint, but those optimizations are not tackled here for a few
>         reasons: (1) this series is already quite long, (2) I'd like to keep the
>         series focused on its primary goal of installing a new and improved linter,
>         (3) these patches do not make the Makefile situation any worse[4], and (4)
>         those optimizations can easily be done atop this series[5].
>
> I have been running with those t/Makefile changes locally, but didn't
> submit them. FWIW that's here:
>
>         https://github.com/git/git/compare/master...avar:git:avar/t-Makefile-use-dependency-graph-for-check-chainlint

Thanks for the link. It's nice to see an actual implementation. I
think most of what you wrote in the commit message and the patch
itself are still meaningful following this series.

> > +my $script_queue = Thread::Queue->new();
> > +my $output_queue = Thread::Queue->new();
> > +
> > +my $mon = threads->create({'context' => 'void'}, \&monitor);
> > +threads->create({'context' => 'list'}, \&check_script, $_, \&next_script, \&emit) for 1..$jobs;
>
> Maybe I'm misunderstanding this whole thing, but this really seems like
> the wrong direction in an otherwise fantastic direction of a series.
>
> I.e. it's *great* that we can do chain-lint without needing to actually
> execute the *.sh file, this series adds a lint parser that can parse
> those *.sh "at rest".
>
> But in your 16/18 you then do:
>
>         +if test "${GIT_TEST_CHAIN_LINT:-1}" != 0
>         +then
>         +       "$PERL_PATH" "$TEST_DIRECTORY/chainlint.pl" "$0" ||
>         +               BUG "lint error (see '?!...!? annotations above)"
>         +fi
>
> I may just be missing something here, but why not instead just borrow
> what I did for "lint-docs" in 8650c6298c1 (doc lint: make "lint-docs"
> non-.PHONY, 2021-10-15)?

I may be misunderstanding, but regarding patch [16/18], I think you
answered your own question at the end of your response when you
pointed out the drawback that you wouldn't get linting when running
the test script manually (i.e. `./t1234-test-stuff.sh`). Ensuring that
the linter is invoked when running a test script manually is important
(at least to me) since it's a frequent step when developing a new test
or modifying an existing test. [16/18] is present to ensure that we
still get that behavior.

> I.e. if we can run against t0001-init.sh or whatever *once* to see if it
> chain-lints OK then surely we could have a rule like:
>
>         t0001-init.sh.chainlint-ok: t0001-init.sh
>                 perl chainlint.pl $< >$@
>
> Then whenever you change t0001-init.sh we refresh that
> t0001-init.sh.chainlint-ok, if the chainlint.pl exits non-zero we'll
> fail to make it, and will unlink that t0001-init.sh.chainlint-ok.
>
> That way you wouldn't need any parallelism in the Perl script, because
> you'd have "make" take care of it, and the common case of re-testing
> where the speed matters would be that we wouldn't need to run this at
> all, or would only re-run it for the test scripts that changed.

A couple comments regarding parallelism: (1) as mentioned in another
response, when developing the script, I had in mind that it might be
useful for other projects (i.e. `sharness`), thus should be able to
stand on its own without advanced Makefile support, and (2) process
creation on Microsoft Windows is _very_ expensive and slow, so on that
platform, being able to lint all tests in all scripts with a single
invocation is a big win over running the linter 1050+ times, once for
each test script.
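
As a rough illustration of point (2), here is a minimal shell sketch
comparing how many processes each strategy starts. The script names are
placeholders, and `echo lint` is only a stand-in for
`perl chainlint.pl`; nothing here is taken from the series itself.

```shell
# Placeholder script names standing in for the 1050+ real test scripts.
set -- t0001-init.sh t0002-gitfile.sh t0003-attributes.sh

# One process per script: "xargs -n 1" starts the command once per argument.
# "echo lint" is a stand-in for "perl chainlint.pl".
per_script=$(printf '%s\n' "$@" | xargs -n 1 echo lint | wc -l)

# One process for all scripts: xargs hands every argument to a single command.
batched=$(printf '%s\n' "$@" | xargs echo lint | wc -l)

# $((...)) normalizes any padding that wc may add to its output.
echo "per-script invocations: $((per_script)), batched invocations: $((batched))"
```

Each output line corresponds to one started process, so the gap between
the two counts grows with the number of scripts, which is where
expensive process creation hurts.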

That's not to discredit any of your points... I'm just conveying some
of my thought process.

> (Obviously a "real" implementation would want to create that ".ok" file
> in "t/.build/chainlint" or whatever)
>
> A drawback is that you'd probably be slower on the initial run, as you'd
> spawn N chainlint.pl. You could use $? instead of $< to get around that,
> but that requires some re-structuring, and I've found it to generally
> not be worth it.

The $? trick might be something Windows folk would appreciate, and
even those of us in macOS land (at least those of us with old hardware
and OS).

> It would also have the drawback that a:
>
>         ./t0001-init.sh
>
> wouldn't run the chain-lint, but this would:
>
>         make T=t0001-init.sh
>
> But if we want the former to work we could carry some
> "GIT_TEST_VIA_MAKEFILE" variable or whatever, and only run the
> test-via-test-lib.sh if it isn't set.

I may be misunderstanding, but isn't the GIT_TEST_CHAIN_LINT variable
useful for this already, as in [16/18]?
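
To make that concrete, here is a minimal sketch of how the guard quoted
from [16/18] behaves. `should_lint` is a hypothetical helper used only
for illustration; it is not part of test-lib.sh.

```shell
# Hypothetical helper mirroring the guard quoted from [16/18]:
# succeeds (lint wanted) unless GIT_TEST_CHAIN_LINT is explicitly "0".
should_lint () {
	test "${GIT_TEST_CHAIN_LINT:-1}" != 0
}

# Running a script by hand with the variable unset: linting happens.
unset GIT_TEST_CHAIN_LINT
should_lint && echo "unset: lint"

# A caller that has already linted everything sets it to 0: linting is skipped.
GIT_TEST_CHAIN_LINT=0
should_lint || echo "0: skip lint"
```

Under this reading, a Makefile that has already linted all scripts up
front could (as an assumption about how it would be wired up) export
GIT_TEST_CHAIN_LINT=0 before running them, while a manual
`./t1234-test-stuff.sh` run would still get linted by default.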

Regarding your observations as a whole, I think the extract from the
cover letter which you cited above is relevant to my response. I don't
disagree with your points about using the Makefile to optimize away
unnecessary invocations of the linter, or that doing so can be a
useful future direction. As mentioned in the cover letter, though, I
think that such optimizations are outside the scope of this series
which -- aside from installing an improved linter -- aims to maintain
the status quo; in particular, this series ensures that (1) tests get
linted as they are being written/modified when the developer runs the
script manually (`./t1234-test-stuff.sh`), and (2) all tests get linted
upon `make test`.

(The other reason why I'd prefer to see such optimizations applied
atop this series is that I simply don't have the time these days to
devote to major changes of direction in this series, which I think
meets its stated goals without making the situation any worse or
making it any more difficult to apply the optimizations you describe.
And the new linter has been languishing on my computer for far too
long; the implementation has been complete for well over a year, but
it took me this long to finish polishing the patch series. I'd like to
see the new linter make it into the toolchest of other developers
since it can be beneficial; it has already found scores or hundreds[1]
of possible hiding places for bugs due to broken &&-chain or missing
`|| return`, and has sniffed out some actual broken tests[2,3].)

[1]: https://lore.kernel.org/git/20211209051115.52629-1-sunshine@sunshineco.com/
[2]: https://lore.kernel.org/git/20211209051115.52629-3-sunshine@sunshineco.com/
[3]: https://lore.kernel.org/git/7b0784056f3cc0c96e9543ae44d0f5a7b0bf85fa.1661192802.git.gitgitgadget@gmail.com/
