git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Patrick Steinhardt <ps@pks.im>
Cc: git@vger.kernel.org, "Jeff King" <peff@peff.net>,
	"Felipe Contreras" <felipe.contreras@gmail.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>,
	"Chris Torek" <chris.torek@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>
Subject: Re: [PATCH v2 3/3] connected: implement connectivity check using bitmaps
Date: Tue, 29 Jun 2021 18:44:03 -0400	[thread overview]
Message-ID: <YNuiM8TR5evSeNsN@nand.local> (raw)
In-Reply-To: <YNovuzAsaEb2uIaa@nand.local>

On Mon, Jun 28, 2021 at 04:23:23PM -0400, Taylor Blau wrote:
> I'll try this out myself and see if it's worth it. (As an aside, I'll be
> offline next week, so it may take me a little while to post something to
> the list).

I gave implementing this a shot and it seems to have produced some good
improvements, although there are definitely some areas where it does
better than others.

Here are some results running on linux.git with a cold cache, counting
objects for commit 2ab38c17aa, which I picked deliberately since I know
it has a bitmap:

    $ hyperfine \
      'GIT_READ_COMMIT_TABLE=0 git.compile rev-list --count --objects --use-bitmap-index 2ab38c17aac10bf55ab3efde4c4db3893d8691d2' \
      'GIT_READ_COMMIT_TABLE=1 git.compile rev-list --count --objects --use-bitmap-index 2ab38c17aac10bf55ab3efde4c4db3893d8691d2' \
      --prepare='sync; echo 3 | sudo tee /proc/sys/vm/drop_caches'

		Benchmark #1: GIT_READ_COMMIT_TABLE=0 git.compile rev-list --count --objects --use-bitmap-index 2ab38c17aac10bf55ab3efde4c4db3893d8691d2
			Time (mean ± σ):     141.1 ms ±   2.5 ms    [User: 13.0 ms, System: 64.3 ms]
			Range (min … max):   136.2 ms … 143.4 ms    10 runs

		Benchmark #2: GIT_READ_COMMIT_TABLE=1 git.compile rev-list --count --objects --use-bitmap-index 2ab38c17aac10bf55ab3efde4c4db3893d8691d2
			Time (mean ± σ):      28.7 ms ±   3.2 ms    [User: 6.5 ms, System: 10.0 ms]
			Range (min … max):    22.0 ms …  31.0 ms    21 runs

		Summary
			'GIT_READ_COMMIT_TABLE=1 git.compile rev-list --count --objects --use-bitmap-index 2ab38c17aac10bf55ab3efde4c4db3893d8691d2' ran
				4.91 ± 0.55 times faster than 'GIT_READ_COMMIT_TABLE=0 git.compile rev-list --count --objects --use-bitmap-index 2ab38c17aac10bf55ab3efde4c4db3893d8691d2'

That's sort of a best-case scenario, because we're not doing any
traversal between the bitmapped commits and the traversal tips. But even
if we do have some traversal, the results are still pretty good.
Swapping out 2ab38c17aa for `--branches` yields a 5.02x improvement from
141.0ms down to 28.1ms.

Adding in `--tags` basically negates any improvement (having the commit
table extension eeks out a 1.03x improvement from 645.7ms down to
626.0ms. `perf record` shows that 30% of time is spent outside of the
bitmap code.

If you want to give this a try yourself, I highly recommend generating
your bitmap while packing with `-c pack.writeReverseIndex`. Building a
reverse index on-the-fly also seems to negate any performance
improvements here, so having an on-disk reverse index is more or less a
prerequisite to testing this out.

Extremely gross and inscrutable code can be found on the
'tb/bitmap-commit-table' branch of my fork [1].

Thanks,
Taylor

[1]: https://github.com/git/git/compare/master...ttaylorr:tb/bitmap-commit-table


  reply	other threads:[~2021-06-29 22:44 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-28  5:33 [PATCH v2 0/3] Speed up connectivity checks via bitmaps Patrick Steinhardt
2021-06-28  5:33 ` [PATCH v2 1/3] p5400: add perf tests for git-receive-pack(1) Patrick Steinhardt
2021-06-28  7:49   ` Ævar Arnfjörð Bjarmason
2021-06-29  6:18     ` Patrick Steinhardt
2021-06-29 12:09       ` Ævar Arnfjörð Bjarmason
2021-06-28  5:33 ` [PATCH v2 2/3] receive-pack: skip connectivity checks on delete-only commands Patrick Steinhardt
2021-06-28  8:00   ` Ævar Arnfjörð Bjarmason
2021-06-28  8:06     ` Ævar Arnfjörð Bjarmason
2021-06-29  6:26     ` Patrick Steinhardt
2021-06-30  1:31   ` Jeff King
2021-06-30  1:35     ` Jeff King
2021-06-30 13:52     ` Patrick Steinhardt
2021-06-28  5:33 ` [PATCH v2 3/3] connected: implement connectivity check using bitmaps Patrick Steinhardt
2021-06-28 20:23   ` Taylor Blau
2021-06-29 22:44     ` Taylor Blau [this message]
2021-06-30  2:04       ` Jeff King
2021-06-30  3:07         ` Taylor Blau
2021-06-30  5:45           ` Jeff King
2021-07-02 17:44             ` Taylor Blau
2021-07-02 21:21               ` Jeff King
2021-06-30  1:51   ` Jeff King
2021-07-20 14:26     ` Patrick Steinhardt
2021-08-02  9:37 ` [PATCH v3 0/4] Speed up connectivity checks Patrick Steinhardt
2021-08-02  9:38   ` [PATCH v3 1/4] connected: do not sort input revisions Patrick Steinhardt
2021-08-02 12:49     ` Ævar Arnfjörð Bjarmason
2021-08-03  8:50       ` Patrick Steinhardt
2021-08-04 11:01         ` Ævar Arnfjörð Bjarmason
2021-08-02 19:00     ` Junio C Hamano
2021-08-03  8:55       ` Patrick Steinhardt
2021-08-03 21:47         ` Junio C Hamano
2021-08-02  9:38   ` [PATCH v3 2/4] revision: stop retrieving reference twice Patrick Steinhardt
2021-08-02 12:53     ` Ævar Arnfjörð Bjarmason
2021-08-02  9:38   ` [PATCH v3 3/4] revision: avoid loading object headers multiple times Patrick Steinhardt
2021-08-02 12:55     ` Ævar Arnfjörð Bjarmason
2021-08-05 10:12       ` Patrick Steinhardt
2021-08-02 19:40     ` Junio C Hamano
2021-08-03  9:07       ` Patrick Steinhardt
2021-08-06 14:17         ` Patrick Steinhardt
2021-08-02  9:38   ` [PATCH v3 4/4] revision: avoid hitting packfiles when commits are in commit-graph Patrick Steinhardt
2021-08-02 20:01     ` Junio C Hamano
2021-08-03  9:16       ` Patrick Steinhardt
2021-08-03 21:56         ` Junio C Hamano
2021-08-05 11:01           ` Patrick Steinhardt
2021-08-05 16:16             ` Junio C Hamano
2021-08-04 10:51         ` Ævar Arnfjörð Bjarmason
2021-08-05 11:25   ` [PATCH v4 0/6] Speed up connectivity checks Patrick Steinhardt
2021-08-05 11:25     ` [PATCH v4 1/6] revision: separate walk and unsorted flags Patrick Steinhardt
2021-08-05 18:47       ` Junio C Hamano
2021-08-05 11:25     ` [PATCH v4 2/6] connected: do not sort input revisions Patrick Steinhardt
2021-08-05 18:44       ` Junio C Hamano
2021-08-06  6:00         ` Patrick Steinhardt
2021-08-06 16:50           ` Junio C Hamano
2021-08-05 11:25     ` [PATCH v4 3/6] revision: stop retrieving reference twice Patrick Steinhardt
2021-08-05 11:25     ` [PATCH v4 4/6] revision: avoid loading object headers multiple times Patrick Steinhardt
2021-08-05 11:25     ` [PATCH v4 5/6] commit-graph: split out function to search commit position Patrick Steinhardt
2021-08-05 11:25     ` [PATCH v4 6/6] revision: avoid hitting packfiles when commits are in commit-graph Patrick Steinhardt
2021-08-09  8:00 ` [PATCH v5 0/5] Speed up connectivity checks Patrick Steinhardt
2021-08-09  8:02   ` Patrick Steinhardt
2021-08-09  8:11 ` Patrick Steinhardt
2021-08-09  8:11   ` [PATCH v5 1/5] revision: separate walk and unsorted flags Patrick Steinhardt
2021-08-09  8:11   ` [PATCH v5 2/5] connected: do not sort input revisions Patrick Steinhardt
2021-08-09  8:11   ` [PATCH v5 3/5] revision: stop retrieving reference twice Patrick Steinhardt
2021-08-09  8:11   ` [PATCH v5 4/5] commit-graph: split out function to search commit position Patrick Steinhardt
2021-08-09  8:12   ` [PATCH v5 5/5] revision: avoid hitting packfiles when commits are in commit-graph Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YNuiM8TR5evSeNsN@nand.local \
    --to=me@ttaylorr.com \
    --cc=avarab@gmail.com \
    --cc=chris.torek@gmail.com \
    --cc=felipe.contreras@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).