From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Jacob Keller <jacob.e.keller@intel.com>,
git@vger.kernel.org, Jacob Keller <jacob.keller@gmail.com>
Subject: Re: [PATCH] coccicheck: process every source file at once
Date: Wed, 3 Oct 2018 12:16:58 +0200 [thread overview]
Message-ID: <20181003101658.GM23446@localhost> (raw)
In-Reply-To: <20181002195519.GB2014@sigill.intra.peff.net>
On Tue, Oct 02, 2018 at 03:55:19PM -0400, Jeff King wrote:
> On Tue, Oct 02, 2018 at 12:16:42PM -0700, Jacob Keller wrote:
>
> > make coccicheck is used in order to apply coccinelle semantic patches,
> > and see if any of the transformations found within contrib/coccinelle/
> > can be applied to the current code base.
> >
> > Pass every file to a single invocation of spatch, instead of running
> > spatch once per source file.
> >
> > This reduces the time required to run make coccicheck by a significant
> > amount of time:
> >
> > Prior timing of make coccicheck
> > real 6m14.090s
> > user 25m2.606s
> > sys 1m22.919s
> >
> > New timing of make coccicheck
> > real 1m36.580s
> > user 7m55.933s
> > sys 0m18.219s
>
> Yay! This is a nice result.
>
> It's also one of the things that Julia suggested in an earlier thread.
> One thing I wasn't quite sure about after digging into the various
> versions (1.0.4 on Debian stable/unstable, 1.0.6 in experimental, and
> pre-1.0.7 at the bleeding edge) was whether the old versions would be
> similarly helped (or work at all).
>
> I just replicated your results with 1.0.4.deb-3+b2 from Debian stable.
> It's possible there are older versions floating around, but for
> something developer-only like this, I think "in Debian stable" is a
> reasonable enough cutoff.
Linux build jobs on Travis CI run Ubuntu Trusty 14.04 LTS, and
therefore our static analysis build job still runs 1.0.0~rc19.deb-3.
This patch appears to work fine with that version, too, though note
that I also changed the build job to don't run two parallel jobs; for
the reason see below.
> > This is nearly a 4x decrease in the time required to run make
> > coccicheck. This is due to the overhead of restarting spatch for every
> > file. By processing all files at once, we can amortize this startup cost
> > across the total number of files, rather than paying it once per file.
>
> One minor side effect is that we lose the opportunity to parallelize
> quite as much. However, I think the reduction in total CPU makes it well
> worth that (and we still have 8 cocci files and growing, which should
> keep at least 8 cores busy).
One major side effect, however, is the drastically increased memory
usage resulting from processing all files at once. With your patch on
top of current master:
$ for cocci in contrib/coccinelle/*.cocci ; do command time -f 'max RSS: %MkB' make ${cocci}.patch ; done
SPATCH contrib/coccinelle/array.cocci
max RSS: 1537068kB
SPATCH contrib/coccinelle/commit.cocci
max RSS: 1510920kB
SPATCH contrib/coccinelle/free.cocci
max RSS: 1393712kB
SPATCH contrib/coccinelle/object_id.cocci
SPATCH result: contrib/coccinelle/object_id.cocci.patch
max RSS: 1831700kB
SPATCH contrib/coccinelle/qsort.cocci
max RSS: 1384960kB
SPATCH contrib/coccinelle/strbuf.cocci
max RSS: 1395800kB
SPATCH contrib/coccinelle/swap.cocci
max RSS: 1393620kB
SPATCH contrib/coccinelle/xstrdup_or_null.cocci
max RSS: 1371716kB
Without your patch the max RSS lingers around 87048kB - 101980kB,
meaning ~15x - 18x increase
This might cause quite the surprise to developers who are used to run
'make -jN coccicheck'; my tiny laptop came to a grinding halt with
-j4.
I think this side effect should be mentioned in the commit message.
Furthermore, the above mentioned static analysis build job on Travis
CI runs two parallel jobs. Perhaps we should be considerate and
reduce that to a single job, even if that means that he build job will
run longer.
> I think recent versions of Coccinelle will actually parallelize
> internally, too, but my 1.0.4 doesn't seem to. That's probably what I
> was thinking of earlier (but this is still a win even without that).
I think Coccinelle parallelizes only when invoked with -j <N>, but
that option is not documented in 1.0.4. I wrote "documented" instead
of "present", because e.g.:
$ spatch --sp-file contrib/coccinelle/swap.cocci -j 2 a*.c
doesn't abort with "unknown option" and usage, but runs for a bit and
then outputs:
init_defs_builtins: /usr/lib/coccinelle/standard.h
HANDLING: abspath.c advice.c alias.c alloc.c apply.c archive.c archive-tar.c archive-zip.c argv-array.c attr.c
Fatal error: exception Sys_error("swap: No such file or directory")
Without '-j 2' the command runs just fine. I don't know what to make
of it.
> It looks like this also fixes a problem I ran into when doing the oideq
> work, which is that the patch for a header file would get shown multiple
> times (once for each file that includes it). So this is faster _and_
> more correct. Double-yay.
In that case I used to apply the change to the header first, and then
apply the semantic patch one more time.
next prev parent reply other threads:[~2018-10-03 10:17 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-02 19:16 [PATCH] coccicheck: process every source file at once Jacob Keller
2018-10-02 19:55 ` Jeff King
2018-10-02 20:00 ` Jacob Keller
2018-10-02 20:31 ` Jeff King
2018-10-02 20:58 ` Jacob Keller
2018-10-02 21:08 ` Jeff King
2018-10-03 10:16 ` SZEDER Gábor [this message]
2018-10-03 15:05 ` SZEDER Gábor
2018-10-03 15:52 ` Jacob Keller
2018-10-03 17:54 ` Stefan Beller
2018-10-03 15:51 ` Jacob Keller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181003101658.GM23446@localhost \
--to=szeder.dev@gmail.com \
--cc=git@vger.kernel.org \
--cc=jacob.e.keller@intel.com \
--cc=jacob.keller@gmail.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).