From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
"Johannes Sixt" <j6t@kdbg.org>,
"Øystein Walle" <oystwa@gmail.com>,
"Eric Sunshine" <sunshine@sunshineco.com>,
"Taylor Blau" <me@ttaylorr.com>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH v3 00/10] generate-cmdlist.sh: make it (and "make") run faster
Date: Fri, 5 Nov 2021 15:07:58 +0100 [thread overview]
Message-ID: <cover-v3-00.10-00000000000-20211105T135058Z-avarab@gmail.com> (raw)
In-Reply-To: <cover-v2-00.10-00000000000-20211022T193027Z-avarab@gmail.com>
This series makes the rather slow generate-cmdlist.sh shellscript run
in ~20ms on my box instead of ~400ms. This helps a lot with "make"
runtime, e.g. during interactive rebase.
This hopefully addresses all the feedback on v2. I kept the
penultimate patch as explained in an updated commit message it helped
quite a bit in the CI environment, and presumably on various other
setup.
There's also a new ~1.3x speedup of "generate-cmdlist.sh: don't parse
command-list.txt thrice" new at the end here, but on some faster boxes
that's against a relative slowdown in 9/10 compared to Jeff's
8/10[1]. I opted to keep 9/10 to give slower systems where process
spawning isn't as fast the benefit of the doubt. We spend less user
time in 10/10 than in 8/10, but more system time.
1. This is HEAD~n at the tip of the series. So .2 = 8/10, .1 = 9/10 '' = 10/10
$ hyperfine --warmup 20 -L v .2,.1, 'sh generate-cmdlist.sh{v} command-list.txt'
Benchmark #1: sh generate-cmdlist.sh.2 command-list.txt
Time (mean ± σ): 19.5 ms ± 0.2 ms [User: 16.7 ms, System: 8.1 ms]
Range (min … max): 18.9 ms … 20.5 ms 151 runs
Benchmark #2: sh generate-cmdlist.sh.1 command-list.txt
Time (mean ± σ): 30.1 ms ± 0.3 ms [User: 24.8 ms, System: 17.1 ms]
Range (min … max): 29.3 ms … 31.3 ms 97 runs
Benchmark #3: sh generate-cmdlist.sh command-list.txt
Time (mean ± σ): 22.8 ms ± 0.3 ms [User: 15.2 ms, System: 10.1 ms]
Range (min … max): 22.5 ms … 23.7 ms 125 runs
Summary
'sh generate-cmdlist.sh.2 command-list.txt' ran
1.17 ± 0.02 times faster than 'sh generate-cmdlist.sh command-list.txt'
1.54 ± 0.02 times faster than 'sh generate-cmdlist.sh.1 command-list.txt'
Jeff King (1):
generate-cmdlist.sh: do not shell out to "sed"
Johannes Sixt (2):
generate-cmdlist.sh: spawn fewer processes
generate-cmdlist.sh: replace for loop by printf's auto-repeat feature
Ævar Arnfjörð Bjarmason (7):
command-list.txt: sort with "LC_ALL=C sort"
generate-cmdlist.sh: trivial whitespace change
generate-cmdlist.sh: don't call get_categories() from category_list()
generate-cmdlist.sh: run "grep | sort", not "sort | grep"
generate-cmdlist.sh: stop sorting category lines
generate-cmdlist.sh: replace "grep' invocation with a shell version
generate-cmdlist.sh: don't parse command-list.txt thrice
command-list.txt | 22 ++++++-------
generate-cmdlist.sh | 78 ++++++++++++++++++++++++++-------------------
2 files changed, 56 insertions(+), 44 deletions(-)
Range-diff against v2:
1: 96885282988 ! 1: c385e84c04c command-list.txt: sort with "LC_ALL=C sort"
@@ Commit message
separate step, right now the generate-cmdlist.sh script just uses the
order found in this file.
+ Note that this refers to the sort order of the lines in
+ command-list.txt, a subsequent commit will also change how we treat
+ the sort order of the "category" fields, but that's unrelated to this
+ change.
+
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
## command-list.txt ##
@@ command-list.txt: git-request-pull foreignscminterface
git-show-ref plumbinginterrogators
-git-sh-i18n purehelpers
-git-sh-setup purehelpers
- git-sparse-checkout mainporcelain worktree
+ git-sparse-checkout mainporcelain
-git-stash mainporcelain
git-stage complete
+git-stash mainporcelain
@@ command-list.txt: git-var plumbinginterrogators
git-whatchanged ancillaryinterrogators complete
git-worktree mainporcelain
git-write-tree plumbingmanipulators
+@@ command-list.txt: gitfaq guide
+ gitglossary guide
+ githooks guide
+ gitignore guide
+gitk mainporcelain
-+gitweb ancillaryinterrogators
- gitattributes guide
- gitcli guide
- gitcore-tutorial guide
+ gitmailmap guide
+ gitmodules guide
+ gitnamespaces guide
@@ command-list.txt: gitremote-helpers guide
gitrepository-layout guide
gitrevisions guide
@@ command-list.txt: gitremote-helpers guide
-gittutorial-2 guide
gittutorial guide
+gittutorial-2 guide
++gitweb ancillaryinterrogators
gitworkflows guide
2: 5e8fef90e42 ! 2: b4b4c3aa135 generate-cmdlist.sh: trivial whitespace change
@@ Metadata
## Commit message ##
generate-cmdlist.sh: trivial whitespace change
- This makes a subsequent diff smaller, and won't leave us with this
- syntax nit at the end.
+ Add " " before a "|" at the end of a line in generate-cmdlist.sh for
+ consistency with other code in the file. Some of the surrounding code
+ will be modified in subsequent commits.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
3: 6b4de6a6088 = 3: 737cca59d99 generate-cmdlist.sh: spawn fewer processes
4: 074685cf714 = 4: 6ad17ab56c2 generate-cmdlist.sh: don't call get_categories() from category_list()
5: f01c1fd8088 = 5: d7be565b567 generate-cmdlist.sh: run "grep | sort", not "sort | grep"
6: e0b11514b8d = 6: 646363db11f generate-cmdlist.sh: replace for loop by printf's auto-repeat feature
7: f2f37c2963b = 7: d8cc7c246b8 generate-cmdlist.sh: stop sorting category lines
8: 83318d6c0da = 8: aeeecc575fb generate-cmdlist.sh: do not shell out to "sed"
9: 7903dd1f8c2 ! 9: e2702bcc1d0 generate-cmdlist.sh: replace "grep' invocation with a shell version
@@ Commit message
test-lib-functions.sh.
On my *nix system this makes things quite a bit slower compared to
- HEAD~, but since the generate-cmdlist.sh is already quite fast, and
- this likely helps systems where command invocations are more
- expensive (i.e. Windows) let's use this anyway.
-
+ HEAD~:
+ o
'sh generate-cmdlist.sh.old command-list.txt' ran
1.56 ± 0.11 times faster than 'sh generate-cmdlist.sh command-list.txt'
18.00 ± 0.19 times faster than 'sh generate-cmdlist.sh.master command-list.txt'
- Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
+ But when I tried running generate-cmdlist.sh 100 times in CI I found
+ that it helped across the board even on OSX & Linux. I tried testing
+ it in CI with this ad-hoc few-liner:
+
+ for i in $(seq -w 0 11 | sort -nr)
+ do
+ git show HEAD~$i:generate-cmdlist.sh >generate-cmdlist-HEAD$i.sh &&
+ git add generate-cmdlist* &&
+ cp t/t0000-generate-cmdlist.sh t/t00$i-generate-cmdlist.sh || : &&
+ perl -pi -e "s/HEAD0/HEAD$i/g" t/t00$i-generate-cmdlist.sh &&
+ git add t/t00*.sh
+ done && git commit -m"generated it"
+
+ Here HEAD~02 and the t0002* file refers to this change, and HEAD~03
+ and t0003* file to the preceding commit, the relevant results were:
+
+ linux-gcc:
+
+ [12:05:33] t0002-generate-cmdlist.sh .. ok 14 ms ( 0.00 usr 0.00 sys + 3.64 cusr 3.09 csys = 6.73 CPU)
+ [12:05:30] t0003-generate-cmdlist.sh .. ok 32 ms ( 0.00 usr 0.00 sys + 2.66 cusr 1.81 csys = 4.47 CPU)
+
+ osx-gcc:
+
+ [11:58:04] t0002-generate-cmdlist.sh .. ok 80081 ms ( 0.02 usr 0.02 sys + 17.80 cusr 10.07 csys = 27.91 CPU)
+ [11:58:16] t0003-generate-cmdlist.sh .. ok 92127 ms ( 0.02 usr 0.01 sys + 22.54 cusr 14.27 csys = 36.84 CPU)
+
+ vs-test:
+
+ [12:03:14] t0002-generate-cmdlist.sh .. ok 30 s ( 0.02 usr 0.00 sys + 13.14 cusr 26.19 csys = 39.35 CPU)
+ [12:03:20] t0003-generate-cmdlist.sh .. ok 32 s ( 0.00 usr 0.02 sys + 13.25 cusr 26.10 csys = 39.37 CPU)
+
+ I.e. even on *nix running 100 of these in a loop was up to ~2x faster
+ in absolute runtime, I suspect it's due factors that are exacerbated
+ in the CI, e.g. much slower process startup due to some platform
+ limits, or a slower FS.
+
+ The "cut -d" change here is because we're not emitting the
+ 40-character aligned output anymore, i.e. we'll get the output from
+ command_list() now, not an as-is line from command-list.txt.
+
+ This also makes the parsing more reliable, as we could tweak the
+ whitespace alignment without breaking this parser. Let's reword a
+ now-inaccurate comment in "command-list.txt" describing that previous
+ alignment limitation. We'll still need the "### command-list [...]"
+ line due to the "Documentation/cmd-list.perl" logic added in
+ 11c6659d85d (command-list: prepare machinery for upcoming "common
+ groups" section, 2015-05-21).
+
+ There was a proposed change subsequent to this one[3] which continued
+ moving more logic into the "command_list() function, i.e. replaced the
+ "cut | tr | grep" chain in "category_list()" with an argument to
+ "command_list()".
+
+ That change might have had a bit of an effect, but not as much as the
+ preceding commit, so I decided to drop it. The relevant performance
+ numbers from it were:
+
+ linux-gcc:
+
+ [12:05:33] t0001-generate-cmdlist.sh .. ok 13 ms ( 0.00 usr 0.00 sys + 3.33 cusr 2.78 csys = 6.11 CPU)
+ [12:05:33] t0002-generate-cmdlist.sh .. ok 14 ms ( 0.00 usr 0.00 sys + 3.64 cusr 3.09 csys = 6.73 CPU)
+
+ osx-gcc:
+
+ [11:58:03] t0001-generate-cmdlist.sh .. ok 78416 ms ( 0.02 usr 0.01 sys + 11.78 cusr 6.22 csys = 18.03 CPU)
+ [11:58:04] t0002-generate-cmdlist.sh .. ok 80081 ms ( 0.02 usr 0.02 sys + 17.80 cusr 10.07 csys = 27.91 CPU)
+
+ vs-test:
+
+ [12:03:20] t0001-generate-cmdlist.sh .. ok 34 s ( 0.00 usr 0.03 sys + 12.42 cusr 19.55 csys = 32.00 CPU)
+ [12:03:14] t0002-generate-cmdlist.sh .. ok 30 s ( 0.02 usr 0.00 sys + 13.14 cusr 26.19 csys = 39.35 CPU)
+
+ As above HEAD~2 and t0002* are testing the code in this commit (and
+ the line is the same), but HEAD~1 and t0001* are testing that dropped
+ change in [3].
+
+ 1. https://lore.kernel.org/git/cover-v2-00.10-00000000000-20211022T193027Z-avarab@gmail.com/
+ 2. https://lore.kernel.org/git/patch-v2-08.10-83318d6c0da-20211022T193027Z-avarab@gmail.com/
+ 3. https://lore.kernel.org/git/patch-v2-10.10-e10a43756d1-20211022T193027Z-avarab@gmail.com/
+
+ ## command-list.txt ##
+@@
+ # specified here, which can only have "guide" attribute and nothing
+ # else.
+ #
+-### command list (do not change this line, also do not change alignment)
++### command list (do not change this line)
+ # command name category [category] [category]
+ git-add mainporcelain worktree
+ git-am mainporcelain
## generate-cmdlist.sh ##
@@ generate-cmdlist.sh: die () {
@@ generate-cmdlist.sh: die () {
+ while read cmd rest
+ do
+ case "$cmd" in
-+ "#"*)
-+ continue;
++ "#"* | '')
++ # Ignore comments and allow empty lines
++ continue
+ ;;
+ *)
+ case "$exclude_programs" in
-+ *":$cmd:"*)
++ *":$cmd:"*)
+ ;;
+ *)
+ echo "$cmd $rest"
+ ;;
+ esac
+ esac
-+ done
++ done <"$1"
}
category_list () {
-- command_list "$1" |
+ command_list "$1" |
- cut -c 40- |
-+ command_list <"$1" |
+ cut -d' ' -f2- |
tr ' ' '\012' |
grep -v '^$' |
LC_ALL=C sort -u
-@@ generate-cmdlist.sh: define_category_names () {
- print_command_list () {
- echo "static struct cmdname_help command_list[] = {"
-
-- command_list "$1" |
-+ command_list <"$1" |
- while read cmd rest
- do
- synopsis=
@@ generate-cmdlist.sh: print_command_list () {
echo "};"
}
10: e10a43756d1 < -: ----------- generate-cmdlist.sh: replace "cut", "tr" and "grep" with pure-shell
-: ----------- > 10: 100084070fd generate-cmdlist.sh: don't parse command-list.txt thrice
--
2.34.0.rc1.721.ga0c1db665bc
next prev parent reply other threads:[~2021-11-05 14:08 UTC|newest]
Thread overview: 87+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-24 13:16 Why the Makefile is so eager to re-build & re-link Ævar Arnfjörð Bjarmason
2021-06-24 15:16 ` Jeff King
2021-06-24 15:28 ` Ævar Arnfjörð Bjarmason
2021-06-24 21:30 ` Johannes Sixt
2021-06-25 8:34 ` Ævar Arnfjörð Bjarmason
2021-06-25 9:01 ` Ævar Arnfjörð Bjarmason
2021-06-29 2:13 ` Jeff King
2021-10-20 18:39 ` [PATCH 0/8] Makefile: make command-list.h 2-5x as fast with -jN Ævar Arnfjörð Bjarmason
2021-10-20 18:39 ` [PATCH 1/8] command-list.txt: sort with "LC_ALL=C sort" Ævar Arnfjörð Bjarmason
2021-10-20 18:39 ` [PATCH 2/8] generate-cmdlist.sh: trivial whitespace change Ævar Arnfjörð Bjarmason
2021-10-20 18:39 ` [PATCH 3/8] generate-cmdlist.sh: spawn fewer processes Ævar Arnfjörð Bjarmason
2021-10-20 18:39 ` [PATCH 4/8] generate-cmdlist.sh: don't call get_categories() from category_list() Ævar Arnfjörð Bjarmason
2021-10-20 18:39 ` [PATCH 5/8] generate-cmdlist.sh: run "grep | sort", not "sort | grep" Ævar Arnfjörð Bjarmason
2021-10-20 18:39 ` [PATCH 6/8] generate-cmdlist.sh: replace for loop by printf's auto-repeat feature Ævar Arnfjörð Bjarmason
2021-10-21 14:42 ` Jeff King
2021-10-21 16:25 ` Jeff King
2021-10-20 18:39 ` [PATCH 7/8] Makefile: stop having command-list.h depend on a wildcard Ævar Arnfjörð Bjarmason
2021-10-21 14:45 ` Jeff King
2021-10-21 18:24 ` Junio C Hamano
2021-10-21 22:46 ` Øystein Walle
2021-10-20 18:39 ` [PATCH 8/8] Makefile: assert correct generate-cmdlist.sh output Ævar Arnfjörð Bjarmason
2021-10-20 20:35 ` [PATCH 0/8] Makefile: make command-list.h 2-5x as fast with -jN Jeff King
2021-10-20 21:31 ` Taylor Blau
2021-10-20 23:14 ` Ævar Arnfjörð Bjarmason
2021-10-20 23:46 ` Jeff King
2021-10-21 0:48 ` Ævar Arnfjörð Bjarmason
2021-10-21 2:20 ` Taylor Blau
2021-10-22 12:37 ` Ævar Arnfjörð Bjarmason
2021-10-21 14:34 ` Jeff King
2021-10-21 22:34 ` Junio C Hamano
2021-10-22 10:51 ` Ævar Arnfjörð Bjarmason
2021-10-22 18:31 ` Jeff King
2021-10-22 20:50 ` Ævar Arnfjörð Bjarmason
2021-10-21 5:39 ` Eric Sunshine
2021-10-22 19:36 ` [PATCH v2 00/10] Makefile: make generate-cmdlist.sh much faster Ævar Arnfjörð Bjarmason
2021-10-22 19:36 ` [PATCH v2 01/10] command-list.txt: sort with "LC_ALL=C sort" Ævar Arnfjörð Bjarmason
2021-10-25 18:29 ` Junio C Hamano
2021-10-25 21:22 ` Ævar Arnfjörð Bjarmason
2021-10-25 21:26 ` Junio C Hamano
2021-10-22 19:36 ` [PATCH v2 02/10] generate-cmdlist.sh: trivial whitespace change Ævar Arnfjörð Bjarmason
2021-10-22 19:36 ` [PATCH v2 03/10] generate-cmdlist.sh: spawn fewer processes Ævar Arnfjörð Bjarmason
2021-10-22 19:36 ` [PATCH v2 04/10] generate-cmdlist.sh: don't call get_categories() from category_list() Ævar Arnfjörð Bjarmason
2021-10-22 19:36 ` [PATCH v2 05/10] generate-cmdlist.sh: run "grep | sort", not "sort | grep" Ævar Arnfjörð Bjarmason
2021-10-22 19:36 ` [PATCH v2 06/10] generate-cmdlist.sh: replace for loop by printf's auto-repeat feature Ævar Arnfjörð Bjarmason
2021-10-22 19:36 ` [PATCH v2 07/10] generate-cmdlist.sh: stop sorting category lines Ævar Arnfjörð Bjarmason
2021-10-25 16:39 ` Jeff King
2021-10-22 19:36 ` [PATCH v2 08/10] generate-cmdlist.sh: do not shell out to "sed" Ævar Arnfjörð Bjarmason
2021-10-25 16:46 ` Jeff King
2021-10-25 17:52 ` Jeff King
2021-10-22 19:36 ` [PATCH v2 09/10] generate-cmdlist.sh: replace "grep' invocation with a shell version Ævar Arnfjörð Bjarmason
2021-10-23 22:19 ` Junio C Hamano
2021-10-23 22:26 ` Junio C Hamano
2021-10-22 19:36 ` [PATCH v2 10/10] generate-cmdlist.sh: replace "cut", "tr" and "grep" with pure-shell Ævar Arnfjörð Bjarmason
2021-10-23 22:26 ` Junio C Hamano
2021-10-22 21:20 ` [PATCH v2 00/10] Makefile: make generate-cmdlist.sh much faster Taylor Blau
2021-10-23 22:34 ` Junio C Hamano
2021-10-25 16:57 ` Jeff King
2021-11-05 14:07 ` Ævar Arnfjörð Bjarmason [this message]
2021-11-05 14:07 ` [PATCH v3 01/10] command-list.txt: sort with "LC_ALL=C sort" Ævar Arnfjörð Bjarmason
2021-11-05 22:45 ` Junio C Hamano
2021-11-06 4:26 ` Ævar Arnfjörð Bjarmason
2021-11-08 19:18 ` Junio C Hamano
2021-11-05 14:08 ` [PATCH v3 02/10] generate-cmdlist.sh: trivial whitespace change Ævar Arnfjörð Bjarmason
2021-11-05 14:08 ` [PATCH v3 03/10] generate-cmdlist.sh: spawn fewer processes Ævar Arnfjörð Bjarmason
2021-11-05 22:47 ` Junio C Hamano
2021-11-06 4:23 ` Ævar Arnfjörð Bjarmason
2021-11-05 14:08 ` [PATCH v3 04/10] generate-cmdlist.sh: don't call get_categories() from category_list() Ævar Arnfjörð Bjarmason
2021-11-05 14:08 ` [PATCH v3 05/10] generate-cmdlist.sh: run "grep | sort", not "sort | grep" Ævar Arnfjörð Bjarmason
2021-11-05 14:08 ` [PATCH v3 06/10] generate-cmdlist.sh: replace for loop by printf's auto-repeat feature Ævar Arnfjörð Bjarmason
2021-11-05 14:08 ` [PATCH v3 07/10] generate-cmdlist.sh: stop sorting category lines Ævar Arnfjörð Bjarmason
2021-11-05 14:08 ` [PATCH v3 08/10] generate-cmdlist.sh: do not shell out to "sed" Ævar Arnfjörð Bjarmason
2021-11-05 14:08 ` [PATCH v3 09/10] generate-cmdlist.sh: replace "grep' invocation with a shell version Ævar Arnfjörð Bjarmason
2021-11-05 14:08 ` [PATCH v3 10/10] generate-cmdlist.sh: don't parse command-list.txt thrice Ævar Arnfjörð Bjarmason
2021-06-25 21:17 ` Why the Makefile is so eager to re-build & re-link Felipe Contreras
2021-06-29 5:04 ` Eric Sunshine
2021-06-24 23:35 ` Øystein Walle
2021-06-24 23:39 ` Øystein Walle
2021-06-25 0:11 ` Ævar Arnfjörð Bjarmason
2021-07-02 11:58 ` [PATCH] Documentation/Makefile: don't re-build on 'git version' changes Ævar Arnfjörð Bjarmason
2021-07-02 15:53 ` Junio C Hamano
2021-07-03 11:58 ` Ævar Arnfjörð Bjarmason
2021-07-05 19:48 ` Junio C Hamano
2021-07-03 1:05 ` Felipe Contreras
2021-07-03 12:03 ` Ævar Arnfjörð Bjarmason
2021-07-03 18:56 ` Felipe Contreras
2021-07-05 19:38 ` Junio C Hamano
2021-07-06 22:25 ` Felipe Contreras
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover-v3-00.10-00000000000-20211105T135058Z-avarab@gmail.com \
--to=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=j6t@kdbg.org \
--cc=me@ttaylorr.com \
--cc=oystwa@gmail.com \
--cc=peff@peff.net \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).