git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Git List <git@vger.kernel.org>
Subject: Re: [PATCH 1/6] test-lib: introduce test_commit_bulk
Date: Fri, 28 Jun 2019 08:35:28 -0400	[thread overview]
Message-ID: <2d4410a9-fd3e-8b9f-00b5-f8eba4d51b42@gmail.com> (raw)
In-Reply-To: <20190628093911.GA27329@sigill.intra.peff.net>

On 6/28/2019 5:39 AM, Jeff King wrote:
> Some tests need to create a string of commits. Doing this with
> test_commit is very heavy-weight, as it needs at least one process per
> commit (and in fact, uses several).
> 
> For bulk creation, we can do much better by using fast-import, but it's
> often a pain to generate the input. Let's provide a helper to do so.

What a quick turnaround! I'm happy to nerd-snipe you here, and am glad
the result was so positive.

> We'll use t5310 as a guinea pig, as it has three 10-commit loops. Here
> are hyperfine results before and after:
> 
>   [before]
>   Benchmark #1: ./t5310-pack-bitmaps.sh --root=/var/ram/git-tests
>     Time (mean ± σ):      2.846 s ±  0.305 s    [User: 3.042 s, System: 0.919 s]
>     Range (min … max):    2.250 s …  3.210 s    10 runs
> 
>   [after]
>   Benchmark #1: ./t5310-pack-bitmaps.sh --root=/var/ram/git-tests
>     Time (mean ± σ):      2.210 s ±  0.174 s    [User: 2.570 s, System: 0.604 s]
>     Range (min … max):    1.999 s …  2.590 s    10 runs

I ran these tests on my Windows machine, where the process startup time is
a higher cost. The improvement is noticeable from just watching the test lines
pause on the steps creating the commits.

 Before: 30.8-31.2s
  After: 23.5-23.8s

> So we're over 20% faster, while making the callers slightly shorter.

I see about the same relative measurement (~23%). The callers are a bit
cleaner, which is good. They are also slightly less clear of what's
happening, but that's the cost of abstraction. Definitely worth it in
this case!
 
> +# Similar to test_commit, but efficiently create <nr> commits, each with a
> +# unique number $n (from 1 to <nr> by default) in the commit message.
> +#
> +# Usage: test_commit_bulk [options] <nr>
> +#   -C <dir>:
> +#	Run all git commands in directory <dir>
> +#   --ref=<n>:
> +#	ref on which to create commits (default: HEAD)
> +#   --start=<n>:
> +#	number commit messages from <n> (default: 1)
> +#   --message=<msg>:
> +#	use <msg> as the commit mesasge (default: "commit $n")
> +#   --filename=<fn>:
> +#	modify <fn> in each commit (default: $n.t)
> +#   --contents=<string>:
> +#	place <string> in each file (default: "content $n")
> +#   --id=<string>:
> +#	shorthand to use <string> and $n in message, filename, and contents
> +#
> +# The message, filename, and contents strings are evaluated by the shell inside
> +# double-quotes, with $n set to the current commit number. So you can do:
> +#
> +#   test_commit_bulk --filename=file --contents='modification $n'
> +#
> +# to have every commit touch the same file, but with unique content. Spaces are
> +# OK, but you must escape any metacharacters (like backslashes or
> +# double-quotes) you do not want expanded.
> +#

I appreciate all the documentation here!

> +		while test "$total" -gt 0
> +		do
> +			echo "commit $ref" &&
> +			printf 'author %s <%s> %s\n' \
> +				"$GIT_AUTHOR_NAME" \
> +				"$GIT_AUTHOR_EMAIL" \
> +				"$cur_time -0700" &&
> +			printf 'committer %s <%s> %s\n' \
> +				"$GIT_COMMITTER_NAME" \
> +				"$GIT_COMMITTER_EMAIL" \
> +				"$cur_time -0700" &&
> +			echo "data <<EOF" &&
> +			eval "echo \"$message\"" &&
> +			echo "EOF" &&
> +			eval "echo \"M 644 inline $filename\"" &&
> +			echo "data <<EOF" &&
> +			eval "echo \"$contents\"" &&
> +			echo "EOF" &&
> +			echo &&
> +			n=$((n + 1)) &&
> +			cur_time=$((cur_time + 1)) &&
> +			total=$((total - 1)) ||
> +			echo "poison fast-import stream"
> +		done

I am not very good at the nitty-gritty details of our scripts, but
looking at this I wonder if there is a cleaner and possibly faster
way to do this loop. The top thing on my mind are the 'eval "echo X"'
lines. If they start processes, then we can improve the performance.
If not, then it may not be worth it.

In wonder if instead we could create some format string outside the
loop and then pass the values that change between iterations into
that format string.

Thanks,
-Stolee


  reply	other threads:[~2019-06-28 12:35 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-27 17:05 Git Test Coverage Report (Thurs. June 27) Derrick Stolee
2019-06-27 17:35 ` Derrick Stolee
2019-06-28  6:41   ` Jeff King
2019-06-28  9:37     ` [PATCH 0/6] easy bulk commit creation in tests Jeff King
2019-06-28  9:39       ` [PATCH 1/6] test-lib: introduce test_commit_bulk Jeff King
2019-06-28 12:35         ` Derrick Stolee [this message]
2019-06-28 18:05           ` Junio C Hamano
2019-06-29  0:09           ` Jeff King
2019-06-28 17:53         ` Junio C Hamano
2019-06-29  0:14           ` Jeff King
2019-06-28 18:44         ` Ævar Arnfjörð Bjarmason
2019-06-29  0:19           ` Jeff King
2019-06-28 21:32         ` Eric Sunshine
2019-06-28 23:04           ` SZEDER Gábor
2019-06-28 23:46             ` Eric Sunshine
2019-06-29  0:26               ` Jeff King
2019-06-29  8:24               ` SZEDER Gábor
2019-07-01 17:42                 ` Junio C Hamano
2019-06-29  0:25           ` Jeff King
2019-06-28  9:39       ` [PATCH 2/6] t5310: increase the number of bitmapped commits Jeff King
2019-06-28  9:41       ` [PATCH 3/6] t3311: use test_commit_bulk Jeff King
2019-06-28  9:41       ` [PATCH 4/6] t5702: " Jeff King
2019-06-28  9:42       ` [PATCH 5/6] t5703: " Jeff King
2019-06-28  9:42       ` [PATCH 6/6] t6200: " Jeff King
2019-06-28 12:53       ` [PATCH 0/6] easy bulk commit creation in tests Johannes Schindelin
2019-06-29  0:30         ` Jeff King
2019-06-29 16:38           ` Elijah Newren
2019-06-30  6:34             ` Jeff King
2019-06-28 18:49       ` Ævar Arnfjörð Bjarmason
2019-06-29  0:45         ` Jeff King
2019-06-29  4:53       ` [PATCH v2 1/6] test-lib: introduce test_commit_bulk Jeff King
2019-07-01 22:24         ` Junio C Hamano
2019-07-02  5:16           ` Jeff King
2019-07-01 22:28         ` Junio C Hamano
2019-07-02  5:22           ` Jeff King
2019-06-28  6:45   ` Git Test Coverage Report (Thurs. June 27) Jeff King
2019-06-28 12:23     ` Derrick Stolee
2019-06-28 23:59       ` Jeff King
2019-06-29  1:36         ` Derrick Stolee
2019-06-29  5:15           ` Jeff King
2019-06-28  9:47   ` Duy Nguyen
2019-06-28 12:39     ` Derrick Stolee
2019-06-28 13:39   ` Christian Couder

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2d4410a9-fd3e-8b9f-00b5-f8eba4d51b42@gmail.com \
    --to=stolee@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).