git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "SZEDER Gábor" <szeder.dev@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
	Jeff King <peff@peff.net>
Subject: Re: [PATCH 3/5] cocci: make "coccicheck" rule incremental
Date: Fri, 26 Aug 2022 00:18:28 +0200
Message-ID: <220826.861qt43pcc.gmgdl@evledraar.gmail.com>
In-Reply-To: <20220825194418.GI1735@szeder.dev>


On Thu, Aug 25 2022, SZEDER Gábor wrote:

Thanks for taking a look!

> On Thu, Aug 25, 2022 at 04:36:15PM +0200, Ævar Arnfjörð Bjarmason wrote:
>> * Since we create a single "*.cocci.patch+" we don't know where to
>>   pick up where we left off. Note that (per [1]) we've had a race
>>   condition since 960154b9c17 (coccicheck: optionally batch spatch
>>   invocations, 2019-05-06) which might result in us producing corrupt
>>   patches due to the concurrent appending to "$@+" in the pre-image.
>> 
>>   That bug is now fixed.
>
> There is no bug, because there is no concurrent appending to "$@+".
> The message you mention seems to be irrelevant, as it talks about
> 'xargs -P', but the invocation in '%.cocci.patch' targets never used
> '-P'.

I think this is just confusing, I'll amend/rephrase.

At this point I honestly can't remember whether I'm conflating this with
an issue in my earlier proposed series here (I drafted this a while
ago), or with a rather obscure "hidden" feature that I used for a while
to speed up coccicheck for myself, which is that you can do e.g.:

	make coccicheck SPATCH_BATCH_SIZE="8 -P 8"

That invokes xargs with the "-P 8" option, in batches of 8 files.

960154b9c17 never intended that, and it only worked by accident, but it
*would* work, and unless you got unlucky with the races involved it
would generally speed up your coccicheck.
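
To illustrate why that works at all, here is a minimal shell sketch. It
assumes the recipe interpolates SPATCH_BATCH_SIZE unquoted right after
xargs's "-n" (the recipe isn't quoted in this thread, so that part is an
assumption). "show_argv" is a hypothetical helper that only prints its
arguments, and "spatch" is a placeholder word, nothing is executed:

```shell
# Hypothetical stand-in for the value given on the make command line.
SPATCH_BATCH_SIZE="8 -P 8"

# Hypothetical helper: print each argument on its own line.
show_argv () {
	printf '%s\n' "$@"
}

# Unquoted on purpose: the shell splits the value into three words, so
# xargs would be handed "-n 8 -P 8" rather than one batch-size argument.
show_argv xargs -n $SPATCH_BATCH_SIZE spatch
```

The fourth and fifth words come out as "-P" and "8", which is how the
extra option would sneak into the xargs invocation.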

>> Which is why we'll not depend on $(FOUND_H_SOURCES) but the *.o file
>> corresponding to the *.c file, if it exists already. This means that
>> we can do:
>> 
>>     make all
>>     make coccicheck
>>     make -W column.h coccicheck
>> 
>> By depending on the *.o we piggy-back on
>> COMPUTE_HEADER_DEPENDENCIES. See c234e8a0ecf (Makefile: make the
>> "sparse" target non-.PHONY, 2021-09-23) for prior art of doing that
>> for the *.sp files. E.g.:
>> 
>>     make contrib/coccinelle/free.cocci.patch
>>     make -W column.h contrib/coccinelle/free.cocci.patch
>> 
>> Will take around 15 seconds for the second command on my 8 core box if
>> I didn't run "make" beforehand to create the *.o files. But around 2
>> seconds if I did and we have those "*.o" files.
>> 
>> Notes about the approach of piggy-backing on *.o for dependencies:
>> 
>> * It *is* a trade-off since we'll pay the extra cost of running the C
>>   compiler, but we're probably doing that anyway.
>
> This assumption doesn't hold, and I very much dislike the idea of
> depending on *.o files:

It's my fault for not calling this out more explicitly, but the
dependency on the *.o files is *optional*: if you don't have them
compiled already, a "make coccicheck" will just use "spatch", and
nothing else.

See the CI run/output for this series:
https://github.com/avar/git/runs/8017916844?check_suite_focus=true

>   - Our static-analysis CI job doesn't build Git, now it will have to.

This series doesn't change how that works, but even so, keeping it that
way doesn't seem important to me. E.g. as I noted in [1] coccinelle
will happily run on code that doesn't even compile.

So running the compiler during "static-analysis" (or in another step
that it would depend on, maybe "pedantic" or "sparse") seems like a good
(but separate) change.

You don't want to wonder about odd coccinelle output, only to see it's
trying to make sense of C source that doesn't even compile.

>   - I don't have Coccinelle installed, because my distro doesn't ship
>     it, and though the previous release did ship it, it was outdated.
>     Instead I use Coccinelle's most recent version from a container
>     which doesn't contain any build tools apart from 'make' for 'make
>     coccicheck'.
>
>     With this patch series I can't use this containerized Coccinelle
>     at all, because even though I've already built git on the host,
>     the dependency on *.o files triggers a BUILD-OPTIONS check during
>     'make coccicheck', and due to the missing 'curl-config' the build
>     options do differ, triggering a rebuild, which in the absence of a
>     compiler fails.
>
>     And then the next 'make' on the host will have to rebuild
>     everything again...

There may be some odd interaction here, but it's unclear if you've
actually tried to do this with this series, because unless I've missed
some edge case this should all still work, per the above.

What *won't* work is avoiding potential re-compilation of *.o files if
you *have them already* when you run "make coccicheck". I'm not familiar
with this type of setup, are you saying you're running "make coccicheck"
on a working directory that already has *.o files, but you want it to
ignore the *.o?

That could easily be made optional, but I just assumed that nobody would
care. If you want that, can you try this on top and see if it works for
you?
	
	diff --git a/Makefile b/Makefile
	index 9410a587fc0..11d83c490b4 100644
	--- a/Makefile
	+++ b/Makefile
	@@ -3174,6 +3174,11 @@ TINY_FOUND_H_SOURCES += strbuf.h
	 	$(call mkdir_p_parent_template)
	 	$(QUIET_GEN) >$@
	 
	+SPATCH_USE_O_DEPENDENCIES = yes
	+ifeq ($(COMPUTE_HEADER_DEPENDENCIES),no)
	+SPATCH_USE_O_DEPENDENCIES =
	+endif
	+
	 define cocci-rule
	 
	 ## Rule for .build/$(1).patch/$(2); Params:
	@@ -3181,7 +3186,7 @@ define cocci-rule
	 # 2 = $(2)
	 COCCI_$(1:contrib/coccinelle/%.cocci=%) += .build/$(1).patch/$(2)
	 .build/$(1).patch/$(2): GIT-SPATCH-DEFINES
	-.build/$(1).patch/$(2): $(if $(wildcard $(3)),$(3),$(if $(filter $(USE_TINY_FOUND_H_SOURCES),$(3)),$(TINY_FOUND_H_SOURCES),.build/contrib/coccinelle/FOUND_H_SOURCES))
	+.build/$(1).patch/$(2): $(if $(and $(SPATCH_USE_O_DEPENDENCIES),$(wildcard $(3))),$(3),$(if $(filter $(USE_TINY_FOUND_H_SOURCES),$(3)),$(TINY_FOUND_H_SOURCES),.build/contrib/coccinelle/FOUND_H_SOURCES))
	 .build/$(1).patch/$(2): $(1)
	 .build/$(1).patch/$(2): .build/$(1).patch/% : %
	 	$$(call mkdir_p_parent_template)

I'd need to split up that already long line, but with the *.o files
compiled we get:
	
	$ time make -W column.h contrib/coccinelle/free.cocci.patch COMPUTE_HEADER_DEPENDENCIES=yes
	    CC wt-status.o
	    CC builtin/branch.o
	    CC builtin/clean.o
	    CC builtin/column.o
	    CC builtin/commit.o
	    CC builtin/tag.o
	    CC column.o
	    CC help.o
	    SPATCH .build/contrib/coccinelle/free.cocci.patch/builtin/commit.c
	    SPATCH .build/contrib/coccinelle/free.cocci.patch/builtin/tag.c
	    SPATCH .build/contrib/coccinelle/free.cocci.patch/builtin/column.c
	    SPATCH .build/contrib/coccinelle/free.cocci.patch/column.c
	    SPATCH .build/contrib/coccinelle/free.cocci.patch/wt-status.c
	    SPATCH .build/contrib/coccinelle/free.cocci.patch/builtin/branch.c
	    SPATCH .build/contrib/coccinelle/free.cocci.patch/builtin/clean.c
	    SPATCH .build/contrib/coccinelle/free.cocci.patch/help.c
	    SPATCH MERGE contrib/coccinelle/free.cocci.patch
	
	real    0m0.550s
	user    0m0.505s
	sys     0m0.096s

But if you set COMPUTE_HEADER_DEPENDENCIES=no it'll take ~4s (on that
box), and we'll re-apply the free.cocci rule to all the *.c files (and
this is all with the caching mechanism in 5/5, real "spatch" will be
much slower).
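
For anyone who'd rather not parse the nested $(if ...) in that diff,
here's a rough shell rendering of the choice it makes. This is only a
sketch: "pick_prereq" and its arguments are hypothetical names, I'm
assuming $(3) is the *.o that corresponds to the *.c, and the
TINY_FOUND_H_SOURCES branch is left out for brevity:

```shell
# Sketch of the prerequisite choice made by the $(if $(and ...)) line.
pick_prereq () {
	compute_header_dependencies=$1	# value of COMPUTE_HEADER_DEPENDENCIES
	object=$2			# the *.o for the *.c, e.g. column.o

	# ifeq ($(COMPUTE_HEADER_DEPENDENCIES),no) empties the toggle.
	use_o=yes
	if test "$compute_header_dependencies" = no
	then
		use_o=
	fi

	# $(and A,B) is non-empty only if both A and B are non-empty;
	# $(wildcard ...) is non-empty only if the file exists.
	if test -n "$use_o" && test -e "$object"
	then
		echo "$object"
	else
		echo .build/contrib/coccinelle/FOUND_H_SOURCES
	fi
}

tmp=$(mktemp -d)
: >"$tmp/column.o"
pick_prereq yes "$tmp/column.o"		# the *.o path
pick_prereq no "$tmp/column.o"		# the FOUND_H_SOURCES path
pick_prereq yes "$tmp/missing.o"	# the FOUND_H_SOURCES path
rm -rf "$tmp"
```

So the *.o is only ever used as a prerequisite when it already exists
and the toggle hasn't been switched off.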

If you already have git compiled (or partially compiled) the "happy
path" to avoiding work is almost certainly to re-compile that *.o
because the *.c changed, at which point we can see whether we even need
to re-run any of the coccinelle rules. Usually we'll need to re-run a
far smaller set than the full set we operate on now.

>>  * We can take better advantage of parallelism, while making sure that
>>    we don't racily append to the contrib/coccinelle/swap.cocci.patch
>>    file from multiple workers.
>> 
>>    Before this change running "make coccicheck" would by default end
>>    up pegging just one CPU at the very end for a while, usually as
>>    we'd finish whichever *.cocci rule was the most expensive.
>> 
>>    This could be mitigated by combining "make -jN" with
>>    SPATCH_BATCH_SIZE, see 960154b9c17 (coccicheck: optionally batch
>>    spatch invocations, 2019-05-06). But doing so required careful
>>    juggling, as e.g. setting both to 4 would yield 16 workers.
>
> No, setting both to 4 does yield 4 workers.
>
> SPATCH_BATCH_SIZE has nothing to do with parallelism; it is merely the
> number of C source files that we pass to a single 'spatch' invocation,
> but for any given semantic patch it's still a sequential loop.

Thanks, will fix. I see I conflated SPATCH_BATCH_SIZE with spatch's
--jobs there (although from experimentation that seems to have pretty
limited parallelism).
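
To spell out the corrected understanding, a small sketch of what a batch
size alone does. The file names are placeholders, "run_batches" is a
hypothetical helper, and "echo spatch" stands in for the real command:

```shell
SPATCH_BATCH_SIZE=4

# Hypothetical helper: feed ten placeholder file names to xargs in
# batches. Without -P, xargs runs one command at a time, so each batch
# only starts once the previous one has finished.
run_batches () {
	printf '%s\n' a.c b.c c.c d.c e.c f.c g.c h.c i.c j.c |
	xargs -n "$SPATCH_BATCH_SIZE" echo spatch
}

run_batches
```

With ten files and a batch size of 4 that's three invocations (4, 4 and
2 files), run one after another. The batch size changes how many files
each invocation gets, not how many invocations run at once.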

1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/
