git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: phillip.wood@dunelm.org.uk
Cc: Junio C Hamano <gitster@pobox.com>,
	Phillip Wood via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH v3 01/15] diff --color-moved: add perf tests
Date: Fri, 29 Oct 2021 13:06:15 +0200	[thread overview]
Message-ID: <211029.86zgqs3wpx.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <b6f57fc3-75d9-d7d5-7153-28dde066a101@gmail.com>


On Fri, Oct 29 2021, Phillip Wood wrote:

> Hi Junio
>
> On 28/10/2021 22:32, Junio C Hamano wrote:
>> "Phillip Wood via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> 
>>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>>
>>> Add some tests so we can monitor changes to the performance of the
>>> move detection code. The tests record the performance of a single
>>> large diff and a sequence of smaller diffs.
>> "A single large diff" meaning...?
>
> The diff of two commits that are far apart in the history so have lots
> of changes between them
>
>>> +if ! git rev-parse --verify v2.29.0^{commit} >/dev/null
>>> +then
>>> +	skip_all='skipping because tag v2.29.0 was not found'
>>> +	test_done
>>> +fi
>> Hmph.  So this is designed only to be run in a clone of git.git with
>> that tag (and a bit of history, at least to v2.28.0 and 1000 commits)?
>> I am asking primarily because this seems to be the first instance of
>> a test that hardcodes the dependency on our history, instead of
>> allowing the tester to use their favourite history by using the
>> GIT_PERF_LARGE_REPO and GIT_PERF_REPO environment variables.
>
> p3404-rebase-interactive does the same thing. The aim is to have a
> repeatable test rather than just using whatever commit HEAD happens to 
> be pointing at when the test is run as the starting point, if you have
> any ideas for doing that another way I'm happy to change it.

I don't know if it's worth it here, but the following would work:

 1. List all tags in the repository, sorted in reverse order, so e.g.:

    git tag -l 'v*.0' --sort=version:refname

    (The glob can be configurable as an env variable, or we could fall
    back)

 2. Go down that list and find the first pair that matches some limit, I
    think say the first "major" release with 500 commits would qualify

 3. Make it a GIT_PERF_LARGE_REPO test

We've got some perf tests that do similar things. I think you'd find
that with something like this you should able to hand the perf test a
path to git.git, or linux.git, and probably any "major" repository" as
long as it follows a common "we tag our releases at some interval"
pattern.

Or perhaps more simply:

 1. Note the number of commits in the history, per "git rev-list HEAD |
    wc -l" 2.

 2. Then round that down to the nearest 10^x, so for a 250k commit
   repository round down to 100k and diff say the 90k..100kth commits,
   for git.git which has 60k that would be 10k, and the diff is commits
   9k..10k..

It means you'll get a "bump" eventually when say git.git crosses 100k
commits, but it will prorably be stable for any measurement anyone cares
to do, and means that you can get "realistic" measurements for diffing a
big chuck on of history from anything from a tiny repository with >=10
commits, to something truly gargantuan where you'd end up diffing say
900k..1m.
     


  reply	other threads:[~2021-10-29 11:18 UTC|newest]

Thread overview: 92+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-14 13:04 [PATCH 00/10] diff --color-moved[-ws] speedups Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 01/10] diff --color-moved=zerba: fix alternate coloring Phillip Wood via GitGitGadget
2021-06-15  3:24   ` Junio C Hamano
2021-06-15 11:22     ` Phillip Wood
2021-06-14 13:04 ` [PATCH 02/10] diff --color-moved: avoid false short line matches and bad zerba coloring Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 03/10] diff: simplify allow-indentation-change delta calculation Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 04/10] diff --color-moved-ws=allow-indentation-change: simplify and optimize Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 05/10] diff --color-moved: call comparison function directly Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 06/10] diff --color-moved: unify moved block growth functions Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 07/10] diff --color-moved: shrink potential moved blocks as we go Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 08/10] diff --color-moved: stop clearing potential moved blocks Phillip Wood via GitGitGadget
2021-06-14 13:04 ` [PATCH 09/10] diff --color-moved-ws=allow-indentation-change: improve hash lookups Phillip Wood via GitGitGadget
2021-07-09 15:36   ` Elijah Newren
2021-06-14 13:04 ` [PATCH 10/10] diff --color-moved: intern strings Phillip Wood via GitGitGadget
2021-06-16 14:24 ` [PATCH 00/10] diff --color-moved[-ws] speedups Ævar Arnfjörð Bjarmason
2021-06-21 10:03   ` Phillip Wood
2021-07-20 10:36 ` [PATCH v2 00/12] " Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 01/12] diff --color-moved: add perf tests Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 02/12] diff --color-moved=zebra: fix alternate coloring Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 03/12] diff --color-moved: avoid false short line matches and bad zerba coloring Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 04/12] diff: simplify allow-indentation-change delta calculation Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 05/12] diff --color-moved-ws=allow-indentation-change: simplify and optimize Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 06/12] diff --color-moved: call comparison function directly Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 07/12] diff --color-moved: unify moved block growth functions Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 08/12] diff --color-moved: shrink potential moved blocks as we go Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 09/12] diff --color-moved: stop clearing potential moved blocks Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 10/12] diff --color-moved-ws=allow-indentation-change: improve hash lookups Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 11/12] diff: use designated initializers for emitted_diff_symbol Phillip Wood via GitGitGadget
2021-07-20 10:36   ` [PATCH v2 12/12] diff --color-moved: intern strings Phillip Wood via GitGitGadget
2021-07-20 13:38   ` [PATCH v2 00/12] diff --color-moved[-ws] speedups Phillip Wood
2021-10-27 12:04   ` [PATCH v3 00/15] " Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 01/15] diff --color-moved: add perf tests Phillip Wood via GitGitGadget
2021-10-28 21:32       ` Junio C Hamano
2021-10-29 10:24         ` Phillip Wood
2021-10-29 11:06           ` Ævar Arnfjörð Bjarmason [this message]
2021-11-10 11:05             ` Phillip Wood
2021-10-27 12:04     ` [PATCH v3 02/15] diff --color-moved: clear all flags on blocks that are too short Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 03/15] diff --color-moved: factor out function Phillip Wood via GitGitGadget
2021-10-28 21:51       ` Junio C Hamano
2021-10-29 10:35         ` Phillip Wood
2021-10-27 12:04     ` [PATCH v3 04/15] diff --color-moved: rewind when discarding pmb Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 05/15] diff --color-moved=zebra: fix alternate coloring Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 06/15] diff --color-moved: avoid false short line matches and bad zerba coloring Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 07/15] diff: simplify allow-indentation-change delta calculation Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 08/15] diff --color-moved-ws=allow-indentation-change: simplify and optimize Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 09/15] diff --color-moved: call comparison function directly Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 10/15] diff --color-moved: unify moved block growth functions Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 11/15] diff --color-moved: shrink potential moved blocks as we go Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 12/15] diff --color-moved: stop clearing potential moved blocks Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 13/15] diff --color-moved-ws=allow-indentation-change: improve hash lookups Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 14/15] diff: use designated initializers for emitted_diff_symbol Phillip Wood via GitGitGadget
2021-10-27 12:04     ` [PATCH v3 15/15] diff --color-moved: intern strings Phillip Wood via GitGitGadget
2021-10-27 13:28     ` [PATCH v3 00/15] diff --color-moved[-ws] speedups Phillip Wood
2021-11-16  9:49     ` [PATCH v4 " Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 01/15] diff --color-moved: add perf tests Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 02/15] diff --color-moved: clear all flags on blocks that are too short Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 03/15] diff --color-moved: factor out function Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 04/15] diff --color-moved: rewind when discarding pmb Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 05/15] diff --color-moved=zebra: fix alternate coloring Phillip Wood via GitGitGadget
2021-11-22 13:34         ` Johannes Schindelin
2021-11-16  9:49       ` [PATCH v4 06/15] diff --color-moved: avoid false short line matches and bad zerba coloring Phillip Wood via GitGitGadget
2021-11-22 14:18         ` Johannes Schindelin
2021-11-22 19:00           ` Phillip Wood
2021-11-22 21:54             ` Johannes Schindelin
2021-11-16  9:49       ` [PATCH v4 07/15] diff: simplify allow-indentation-change delta calculation Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 08/15] diff --color-moved-ws=allow-indentation-change: simplify and optimize Phillip Wood via GitGitGadget
2021-11-23 14:51         ` Johannes Schindelin
2021-11-16  9:49       ` [PATCH v4 09/15] diff --color-moved: call comparison function directly Phillip Wood via GitGitGadget
2021-11-23 15:09         ` Johannes Schindelin
2021-11-16  9:49       ` [PATCH v4 10/15] diff --color-moved: unify moved block growth functions Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 11/15] diff --color-moved: shrink potential moved blocks as we go Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 12/15] diff --color-moved: stop clearing potential moved blocks Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 13/15] diff --color-moved-ws=allow-indentation-change: improve hash lookups Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 14/15] diff: use designated initializers for emitted_diff_symbol Phillip Wood via GitGitGadget
2021-11-16  9:49       ` [PATCH v4 15/15] diff --color-moved: intern strings Phillip Wood via GitGitGadget
2021-12-08 12:30       ` [PATCH v4 00/15] diff --color-moved[-ws] speedups Johannes Schindelin
2021-12-09 10:29       ` [PATCH v5 " Phillip Wood via GitGitGadget
2021-12-09 10:29         ` [PATCH v5 01/15] diff --color-moved: add perf tests Phillip Wood via GitGitGadget
2021-12-09 10:29         ` [PATCH v5 02/15] diff --color-moved: clear all flags on blocks that are too short Phillip Wood via GitGitGadget
2021-12-09 10:29         ` [PATCH v5 03/15] diff --color-moved: factor out function Phillip Wood via GitGitGadget
2021-12-09 10:29         ` [PATCH v5 04/15] diff --color-moved: rewind when discarding pmb Phillip Wood via GitGitGadget
2021-12-09 10:29         ` [PATCH v5 05/15] diff --color-moved=zebra: fix alternate coloring Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 06/15] diff --color-moved: avoid false short line matches and bad zebra coloring Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 07/15] diff: simplify allow-indentation-change delta calculation Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 08/15] diff --color-moved-ws=allow-indentation-change: simplify and optimize Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 09/15] diff --color-moved: call comparison function directly Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 10/15] diff --color-moved: unify moved block growth functions Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 11/15] diff --color-moved: shrink potential moved blocks as we go Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 12/15] diff --color-moved: stop clearing potential moved blocks Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 13/15] diff --color-moved-ws=allow-indentation-change: improve hash lookups Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 14/15] diff: use designated initializers for emitted_diff_symbol Phillip Wood via GitGitGadget
2021-12-09 10:30         ` [PATCH v5 15/15] diff --color-moved: intern strings Phillip Wood via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=211029.86zgqs3wpx.gmgdl@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=newren@gmail.com \
    --cc=phillip.wood@dunelm.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).