From: Derrick Stolee <derrickstolee@github.com>
To: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>, git@vger.kernel.org
Cc: vdye@github.com
Subject: Re: [PATCH v1 4/4] rm: integrate with sparse-index
Date: Thu, 4 Aug 2022 10:48:12 -0400
Message-ID: <999169c6-a727-af2a-3361-51ac7b1f1d80@github.com>
In-Reply-To: <20220803045118.1243087-5-shaoxuan.yuan02@gmail.com>

On 8/3/2022 12:51 AM, Shaoxuan Yuan wrote:
> Enable the sparse index within the `git-rm` command.
> 
> The `p2000` tests demonstrate a ~96% execution time reduction for
> 'git rm' using a sparse index.

Sorry that I got sidetracked yesterday when I was reviewing this
series, but I noticed something looking at these results:
 
> Test                                     before  after
> -------------------------------------------------------------
> 2000.74: git rm -f f2/f4/a (full-v3)     0.66    0.88 +33.0%
> 2000.75: git rm -f f2/f4/a (full-v4)     0.67    0.75 +12.0%

The range of _growth_ here seemed odd, so I wanted to check if this was
due to a small sample size or not.

> 2000.76: git rm -f f2/f4/a (sparse-v3)   1.99    0.08 -96.0%
> 2000.77: git rm -f f2/f4/a (sparse-v4)   2.06    0.07 -96.6%

These numbers are as expected.

>  test_perf_on_all git read-tree -mu HEAD
>  test_perf_on_all git checkout-index -f --all
>  test_perf_on_all git update-index --add --remove $SPARSE_CONE/a
> +test_perf_on_all git rm -f $SPARSE_CONE/a

At first, I was confused why we needed '-f' and thought that maybe
this was turning into a no-op after the first deletion. However, the
test_perf_on_all helper does an "echo >>$SPARSE_CONE/a" beforehand,
so the file exists _in the worktree_ every time. That requires '-f'
since otherwise Git complains that we have modifications.

However, after the first instance the file no longer exists in the
index, so we are losing some testing of the index modification.

We can fix this by resetting the index in each test loop:

  test_perf_on_all "git rm -f $SPARSE_CONE/a && git checkout HEAD -- $SPARSE_CONE/a"
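
To illustrate the point outside the perf suite, here is a minimal sketch
in a throwaway repository. The repository, the file name "a", and the
demo identity are all hypothetical; only the 'git rm -f' followed by
'git checkout HEAD -- <path>' pattern comes from the suggestion above.

```shell
# Hypothetical scratch repository; nothing here comes from p2000.
repo=$(mktemp -d) &&
cd "$repo" &&
git init -q . &&
echo one >a &&
git add a &&
git -c user.name=demo -c user.email=demo@example.com commit -q -m initial &&

# Mimic the helper: modify the worktree file, then force-remove it.
echo two >>a &&
git rm -q -f a &&
# "a" is now gone from the index, so this lists nothing.
after_rm=$(git ls-files -- a) &&

# Restore both the index entry and the worktree file from HEAD.
git checkout HEAD -- a &&
# "a" is tracked again, so the next 'git rm -f' has an entry to remove.
after_checkout=$(git ls-files -- a) &&

printf 'after rm: [%s]\nafter checkout: [%s]\n' "$after_rm" "$after_checkout"
```

Without the checkout step, a second 'git rm -f a' would find no index
entry for the path, which is the lost coverage described above.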

Running this version of the test with GIT_PERF_REPEAT_COUNT=10 and
using the Git repository itself, I get these numbers:

Test                              HEAD~1            HEAD
--------------------------------------------------------------------------
2000.74: git rm ... (full-v3)     0.41(0.37+0.05)   0.43(0.36+0.07) +4.9% 
2000.75: git rm ... (full-v4)     0.38(0.34+0.05)   0.39(0.35+0.05) +2.6% 
2000.76: git rm ... (sparse-v3)   0.57(0.56+0.01)   0.05(0.05+0.00) -91.2%
2000.77: git rm ... (sparse-v4)   0.57(0.55+0.02)   0.03(0.03+0.00) -94.7%

Yes, the 'git checkout' command is contributing to the overall
numbers, but it also already has the performance improvements of
the sparse-index, so it contributes only a little to the times in
the left (HEAD~1) column.

(Also note that the full index cases change only by amounts within
reasonable noise. The repeat count helps there.)
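
As a sanity check on the last column, the percentages can be reproduced
from the rounded timings in the table. This is only a sketch: it assumes
the column is a plain relative change, (after - before) / before, and
pct_change is a hypothetical helper name, not part of the perf tooling.

```shell
# Relative change from a "before" timing to an "after" timing, in percent.
# Assumes a simple (after - before) / before ratio.
pct_change () {
	awk -v before="$1" -v after="$2" \
		'BEGIN { printf "%+.1f%%\n", (after - before) / before * 100 }'
}

pct_change 0.41 0.43   # full-v3:   +4.9%
pct_change 0.38 0.39   # full-v4:   +2.6%
pct_change 0.57 0.05   # sparse-v3: -91.2%
pct_change 0.57 0.03   # sparse-v4: -94.7%
```

The full-index rows move by 0.01-0.02 seconds, which is why they read as
noise, while the sparse rows drop by more than half a second.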

Thanks,
-Stolee
