git@vger.kernel.org mailing list mirror (one of many)
From: Thomas Gummerer <t.gummerer@gmail.com>
To: git@jeffhostetler.com
Cc: git@vger.kernel.org, gitster@pobox.com, peff@peff.net,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH v11 2/5] p0006-read-tree-checkout: perf test to time read-tree
Date: Tue, 18 Apr 2017 22:40:25 +0100
Message-ID: <20170418214025.GA4989@hank>
In-Reply-To: <20170417213734.55373-3-git@jeffhostetler.com>

On 04/17, git@jeffhostetler.com wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Created t/perf/repos/many-files.sh to generate large, but
> artificial repositories.
> 
> Created t/perf/p0006-read-tree-checkout.sh to measure
> performance on various read-tree, checkout, and update-index
> operations.  This test can run using either artificial repos
> described above or normal repos.
> 
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  t/perf/p0006-read-tree-checkout.sh |  67 ++++++++++++++++++++++
>  t/perf/repos/.gitignore            |   1 +
>  t/perf/repos/many-files.sh         | 110 +++++++++++++++++++++++++++++++++++++
>  3 files changed, 178 insertions(+)
>  create mode 100755 t/perf/p0006-read-tree-checkout.sh
>  create mode 100644 t/perf/repos/.gitignore
>  create mode 100755 t/perf/repos/many-files.sh
> 
> diff --git a/t/perf/p0006-read-tree-checkout.sh b/t/perf/p0006-read-tree-checkout.sh
> new file mode 100755
> index 0000000..78cc23f
> --- /dev/null
> +++ b/t/perf/p0006-read-tree-checkout.sh
> @@ -0,0 +1,67 @@
> +#!/bin/sh
> +#
> +# This test measures the performance of various read-tree
> +# and checkout operations.  It is primarily interested in
> +# the algorithmic costs of index operations and recursive
> +# tree traversal -- and NOT disk I/O on thousands of files.
> +
> +test_description="Tests performance of read-tree"
> +
> +. ./perf-lib.sh
> +
> +test_perf_default_repo

I like that it's now possible to use a real-world repository instead
of being forced to use a synthetic one :)

Is there a reason this uses test_perf_default_repo rather than
test_perf_large_repo?  Generating a large repo seems to be exactly
what repos/many-files.sh is for.
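
A rough sketch of the distinction behind this question, assuming the
perf suite's GIT_PERF_REPO and GIT_PERF_LARGE_REPO environment
variables select the default and the large test repository
respectively.  The helper name pick_perf_repo is hypothetical and
this is not how perf-lib.sh is implemented; it only illustrates that
a "large repo" test would prefer a dedicated large repository when
one is configured:

```shell
# Hypothetical helper, NOT part of perf-lib.sh: print the repository
# path a "large repo" perf test would prefer to run against.
pick_perf_repo () {
	if test -n "${GIT_PERF_LARGE_REPO:-}"
	then
		# A dedicated large repository was configured; use it.
		printf '%s\n' "$GIT_PERF_LARGE_REPO"
	else
		# Assumed fallback: warn and use the default repository.
		echo "warning: GIT_PERF_LARGE_REPO is not set" >&2
		printf '%s\n' "${GIT_PERF_REPO:-.}"
	fi
}
```

With GIT_PERF_LARGE_REPO pointing at a repository produced by
repos/many-files.sh, such a test would run against the generated
ballast without changing the default repo used by other perf tests.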

> +
> +# If the test repo was generated by ./repos/many-files.sh
> +# then we know something about the data shape and branches,
> +# so we can isolate testing to the ballast-related commits
> +# and setup sparse-checkout so we don't have to populate
> +# the ballast files and directories.
> +#
> +# Otherwise, we make some general assumptions about the
> +# repo and consider the entire history of the current
> +# branch to be the ballast.
> +
> +test_expect_success "setup repo" '
> +	if git rev-parse --verify refs/heads/p0006-ballast^{commit}
> +	then
> +		echo Assuming synthetic repo from many-files.sh
> +		git branch br_base            master
> +		git branch br_ballast         p0006-ballast^
> +		git branch br_ballast_alias   p0006-ballast^
> +		git branch br_ballast_plus_1  p0006-ballast
> +		git config --local core.sparsecheckout 1
> +		cat >.git/info/sparse-checkout <<-EOF
> +		/*
> +		!ballast/*
> +		EOF
> +	else
> +		echo Assuming non-synthetic repo...
> +		git branch br_base            $(git rev-list HEAD | tail -n 1)
> +		git branch br_ballast         HEAD^ || error "no ancestor commit from current head"
> +		git branch br_ballast_alias   HEAD^
> +		git branch br_ballast_plus_1  HEAD
> +	fi &&
> +	git checkout -q br_ballast &&
> +	nr_files=$(git ls-files | wc -l)
> +'
> +
> +test_perf "read-tree br_base br_ballast ($nr_files)" '
> +	git read-tree -m br_base br_ballast -n
> +'
> +
> +test_perf "switch between br_base br_ballast ($nr_files)" '
> +	git checkout -q br_base &&
> +	git checkout -q br_ballast
> +'
> +
> +test_perf "switch between br_ballast br_ballast_plus_1 ($nr_files)" '
> +	git checkout -q br_ballast_plus_1 &&
> +	git checkout -q br_ballast
> +'
> +
> +test_perf "switch between aliases ($nr_files)" '
> +	git checkout -q br_ballast_alias &&
> +	git checkout -q br_ballast
> +'
> +
> +test_done
> diff --git a/t/perf/repos/.gitignore b/t/perf/repos/.gitignore
> new file mode 100644
> index 0000000..72e3dc3
> --- /dev/null
> +++ b/t/perf/repos/.gitignore
> @@ -0,0 +1 @@
> +gen-*/
> diff --git a/t/perf/repos/many-files.sh b/t/perf/repos/many-files.sh
> new file mode 100755
> index 0000000..5a1d25e
> --- /dev/null
> +++ b/t/perf/repos/many-files.sh
> @@ -0,0 +1,110 @@
> +#!/bin/sh
> +## Generate test data repository using the given parameters.
> +## When omitted, we create "gen-many-files-d-w-f.git".
> +##
> +## Usage: [-r repo] [-d depth] [-w width] [-f files]
> +##
> +## -r repo: path to the new repo to be generated
> +## -d depth: the depth of sub-directories
> +## -w width: the number of sub-directories at each level
> +## -f files: the number of files created in each directory
> +##
> +## Note that all files will have the same SHA-1 and each
> +## directory at a level will have the same SHA-1, so we
> +## will potentially have a large index, but not a large
> +## ODB.
> +##
> +## Ballast will be created under "ballast/".

I think comments in the git source should start with only a single
'#', as you already do in p0006.
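
The suggested change is mechanical.  A minimal sketch, assuming only
line-leading runs of two or more '#' need collapsing (the function
name single_hash_comments is made up for illustration and is not
part of the patch):

```shell
# Hypothetical filter: collapse a line-leading run of two or more '#'
# into a single '#'.  A "#!/bin/sh" line has only one leading '#', so
# it passes through unchanged.
single_hash_comments () {
	sed 's/^###*/#/'
}

# Example: prints "# Usage: [-r repo] [-d depth] [-w width] [-f files]"
printf '%s\n' '## Usage: [-r repo] [-d depth] [-w width] [-f files]' |
single_hash_comments
```

Running the header of many-files.sh through such a filter would
yield comments in the same style as p0006.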

[...]


Thread overview: 8+ messages
2017-04-17 21:37 [PATCH v11 0/5] read-cache: speed up add_index_entry git
2017-04-17 21:37 ` [PATCH v11 1/5] read-cache: add strcmp_offset function git
2017-04-17 21:37 ` [PATCH v11 2/5] p0006-read-tree-checkout: perf test to time read-tree git
2017-04-18 21:40   ` Thomas Gummerer [this message]
2017-04-19  1:25     ` Jeff King
2017-04-17 21:37 ` [PATCH v11 3/5] read-cache: speed up add_index_entry during checkout git
2017-04-17 21:37 ` [PATCH v11 4/5] read-cache: speed up has_dir_name (part 1) git
2017-04-17 21:37 ` [PATCH v11 5/5] read-cache: speed up has_dir_name (part 2) git
