git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff Hostetler <git@jeffhostetler.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: git@vger.kernel.org, gitster@pobox.com, peff@peff.net,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH v1 2/2] p0005-status: time status on very large repo
Date: Thu, 6 Apr 2017 09:26:15 -0400	[thread overview]
Message-ID: <db545f25-b0a7-2359-e6f5-7a89860c4022@jeffhostetler.com> (raw)
In-Reply-To: <20170405213355.GH8741@aiede.mtv.corp.google.com>



On 4/5/2017 5:33 PM, Jonathan Nieder wrote:
> Hi,
>
> git@jeffhostetler.com wrote:
>
>> +++ b/t/perf/p0005-status.sh
>> @@ -0,0 +1,70 @@
>> +#!/bin/sh
>> +
>> +test_description="Tests performance of read-tree"
>> +
>> +. ./perf-lib.sh
>> +
>> +test_perf_default_repo
>> +test_checkout_worktree
>> +
>> +## usage: dir depth width files
>> +make_paths () {
>> +	for f in $(seq $4)
>> +	do
>> +		echo $1/file$f
>> +	done;
>> +	if test $2 -gt 0;
>> +	then
>> +		for w in $(seq $3)
>> +		do
>> +			make_paths $1/dir$w $(($2 - 1)) $3 $4
>> +		done
>> +	fi
>> +	return 0
>> +}
>> +
>> +fill_index () {
>> +	make_paths $1 $2 $3 $4 |
>> +	sed "s/^/100644 $EMPTY_BLOB	/" |
>> +	git update-index --index-info
>> +	return 0
>> +}
>
> Makes sense.
>
>> +
>> +br_work1=xxx_work1_xxx
>> +
>> +new_dir=xxx_dir_xxx
>> +
>> +## (5, 10, 9) will create 999,999 files.
>> +## (4, 10, 9) will create  99,999 files.
>> +depth=5
>> +width=10
>> +files=9
>> +
>> +export br_work1
>> +
>> +export new_dir
>> +
>> +export depth
>> +export width
>> +export files
>
> Why are these exported?  test_expect_success code (unlike test_per
> code) runs in the same shell as outside, so it doesn't seem necessary.

I'm still trying to grok all of the shell wizardry hidden
in the test_* functions, so some may be novice mistakes here.
However, I couldn't get some of the steps to run in an earlier
draft of it without them.  But I copied this from p0004-read-tree
that I posted in an earlier patch and this version is much simpler
so they may not be necessary here.  I'll double check.

>
>> +
>> +## Inflate the index with thousands of empty files and commit it.
>> +test_expect_success 'inflate the index' '
>> +	git reset --hard &&
>
> What does this line do?

I used update-index to add 1M files and commit it instead of
creating the actual files.  This step causes git to do a full
checkout to populate those 1M files.  The reset --hard is
faster than doing "git checkout HEAD".

>
>> +	git branch $br_work1 &&
>> +	git checkout $br_work1 &&
>
> Is it useful for these parameters to exist in the test script?  I'd
> find this easier to read if it named the branch explicitly.  For
> example:
>
> 	test_expect_success 'set up large index' '
> 		git checkout -B million &&
> 		# (4, 10, 9) would create 99,999 files.
> 		# (5, 10, 9) creates 999,999 files.
> 		fill_index dir 5 10 9 &&
> 		git commit -m "large commit"
> 	'

I got burned when playing with p3400-rebase.sh where
it uses fixed name branches with somewhat common names
such as "upstream".

And since the test script copies $GIT_BUILD_DIR repo
into the trash-directory.* (which the user can override)
we run the risk of also colliding with "dir" or other
such common-ish names.

So I created the variables and gave them unlikely values
and used them as placeholders in the actual test.  If the
user has a collision, they can change the variable's
initialization in one place and not have to guess what
the usage is in the rest of the script.  (Especially
since the trend here is for the receiving function to
just use $1 and $2 type arguments.)

>
>> +'
>> +
>> +## The number of files in the xxx_work1_xxx branch.
>> +nr_work1=$(git ls-files | wc -l)
>> +export nr_work1
>> +
>> +test_perf "read-tree status work1 ($nr_work1)" '
>> +	git read-tree HEAD &&
>> +	git status
>> +'
>
> Looks reasonable.
>
> Thanks and hope that helps,
> Jonathan
>

I'll send out a cleaned up version (and fix the grammar
thing in the commit message).
Thanks
Jeff

  reply	other threads:[~2017-04-06 13:26 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-05 19:55 [PATCH v1 0/2] string-list: use ALLOC_GROW macro when reallocing git
2017-04-05 19:55 ` [PATCH v1 1/2] string-list: use ALLOC_GROW macro when reallocing string_list git
2017-04-05 20:00   ` Stefan Beller
2017-04-05 20:09   ` Jeff King
2017-04-05 21:12     ` Jeff Hostetler
2017-04-05 21:26   ` Jonathan Nieder
2017-04-05 19:56 ` [PATCH v1 2/2] p0005-status: time status on very large repo git
2017-04-05 21:33   ` Jonathan Nieder
2017-04-06 13:26     ` Jeff Hostetler [this message]
2017-04-07  4:29       ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=db545f25-b0a7-2359-e6f5-7a89860c4022@jeffhostetler.com \
    --to=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jeffhost@microsoft.com \
    --cc=jrnieder@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).