From: Ævar Arnfjörð Bjarmason
To: phillip.wood@dunelm.org.uk
Cc: Junio C Hamano, Phillip Wood via GitGitGadget, git@vger.kernel.org, Elijah Newren
Subject: Re: [PATCH v3 01/15] diff --color-moved: add perf tests
Date: Fri, 29 Oct 2021 13:06:15 +0200
References: <8fc8914a37b3c343cd92bb0255088f7b000ff7f7.1635336262.git.gitgitgadget@gmail.com>
Message-ID: <211029.86zgqs3wpx.gmgdl@evledraar.gmail.com>
X-Mailing-List: git@vger.kernel.org

On Fri, Oct 29 2021, Phillip Wood wrote:

> Hi Junio
>
> On 28/10/2021 22:32, Junio C Hamano wrote:
>> "Phillip Wood via GitGitGadget" writes:
>>
>>> From: Phillip Wood
>>>
>>> Add some
>>> tests so we can monitor changes to the performance of the
>>> move detection code. The tests record the performance of a single
>>> large diff and a sequence of smaller diffs.
>> "A single large diff" meaning...?
>
> The diff of two commits that are far apart in the history so have lots
> of changes between them
>
>>> +if ! git rev-parse --verify v2.29.0^{commit} >/dev/null
>>> +then
>>> +	skip_all='skipping because tag v2.29.0 was not found'
>>> +	test_done
>>> +fi
>> Hmph. So this is designed only to be run in a clone of git.git with
>> that tag (and a bit of history, at least to v2.28.0 and 1000 commits)?
>> I am asking primarily because this seems to be the first instance of
>> a test that hardcodes the dependency on our history, instead of
>> allowing the tester to use their favourite history by using the
>> GIT_PERF_LARGE_REPO and GIT_PERF_REPO environment variables.
>
> p3404-rebase-interactive does the same thing. The aim is to have a
> repeatable test rather than just using whatever commit HEAD happens to
> be pointing at when the test is run as the starting point, if you have
> any ideas for doing that another way I'm happy to change it.

I don't know if it's worth it here, but the following would work:

 1. List all tags in the repository, sorted in reverse order, so e.g.:

    git tag -l 'v*.0' --sort=version:refname

    (The glob can be configurable as an env variable, or we could fall
    back)

 2. Go down that list and find the first pair that matches some limit,
    I think say the first "major" release with 500 commits would qualify

 3. Make it a GIT_PERF_LARGE_REPO test

We've got some perf tests that do similar things. I think you'd find
that with something like this you should be able to hand the perf test a
path to git.git, or linux.git, and probably any "major" repository as
long as it follows a common "we tag our releases at some interval"
pattern.

Or perhaps more simply:

 1. Note the number of commits in the history, per "git rev-list HEAD |
    wc -l"

 2. Then round that down to the nearest 10^x, so for a 250k commit
    repository round down to 100k and diff say the 90k..100kth commits,
    for git.git which has 60k that would be 10k, and the diff is commits
    9k..10k.

It means you'll get a "bump" eventually when say git.git crosses 100k
commits, but it will probably be stable for any measurement anyone cares
to do, and means that you can get "realistic" measurements for diffing a
big chunk of history for anything from a tiny repository with >=10
commits, to something truly gargantuan where you'd end up diffing say
900k..1m.
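
A rough, untested-in-t/perf sketch of the commit-count variant, in case
it helps. The function names round_down_pow10 and pick_diff_range are
made up for illustration, and numbering commits by their position in
"git rev-list --reverse HEAD" is my assumption about what "the Nth
commit" should mean; with a heavily merged history a --first-parent walk
might be the more stable choice. It uses "git rev-list --count HEAD",
which gives the same number as the "wc -l" pipeline.

```shell
# Hypothetical helper: print the largest power of ten that is <= $1.
# Expects a positive integer; prints 1 for anything below 10.
round_down_pow10 () {
	n=$1
	p=1
	while test $((p * 10)) -le "$n"
	do
		p=$((p * 10))
	done
	echo "$p"
}

# Hypothetical helper: print "<old> <new>" commit IDs for the range to
# diff, e.g. the 9,000th and 10,000th commits of a 60k-commit history.
# Assumption: commits are numbered from the oldest end of
# "git rev-list --reverse HEAD". Needs at least 10 commits to produce
# two distinct endpoints.
pick_diff_range () {
	total=$(git rev-list --count HEAD) &&
	upper=$(round_down_pow10 "$total") &&
	lower=$((upper - upper / 10)) &&
	old=$(git rev-list --reverse HEAD | sed -n "${lower}p") &&
	new=$(git rev-list --reverse HEAD | sed -n "${upper}p") &&
	echo "$old $new"
}
```

A perf test could then do something like "set -- $(pick_diff_range)" and
time "git diff --color-moved $1 $2", skipping the test when the
repository has fewer than 10 commits.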
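
For the tag-based variant, the "go down the list and find the first
pair" step could look roughly like this. Again only a sketch:
count_between and pick_tag_pair are hypothetical names, the 500-commit
limit is just the number floated earlier in this mail, and I'm assuming
the tag list is fed in oldest-first so that the chosen pair doesn't
change as new releases get tagged.

```shell
# Hypothetical helper: number of commits reachable from $2 but not $1.
count_between () {
	git rev-list --count "$1..$2"
}

# Hypothetical helper: read tag names on stdin, one per line, and print
# the first adjacent pair "<older> <newer>" that has at least $1 commits
# between them. Returns non-zero if no pair qualifies.
pick_tag_pair () {
	limit=$1
	prev=
	while read -r tag
	do
		if test -n "$prev" &&
		   test "$(count_between "$prev" "$tag")" -ge "$limit"
		then
			echo "$prev $tag"
			return 0
		fi
		prev=$tag
	done
	return 1
}
```

Usage would be along the lines of
"git tag -l 'v*.0' --sort=version:refname | pick_tag_pair 500", with the
perf test setting skip_all when the function returns non-zero.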