git@vger.kernel.org mailing list mirror (one of many)
From: 程洋 <chengyang@xiaomi.com>
To: "Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: RE: [External Mail]Re: why git is so slow for a tiny git push?
Date: Wed, 24 Nov 2021 08:07:15 +0000
Message-ID: <feaf3bace8394837ad0ec020d01b545c@xiaomi.com>
In-Reply-To: YWYCIndv/u67lNQU@coredump.intra.peff.net

It seems that "get_object_list_from_bitmap" takes 9 seconds.
Is that expected?

I'm not sure, but here is my guess:
I have 300k refs, but a clone with `--no-tags` only requires "refs/heads/*", so Git has to search and filter refs across the whole bitmap file, which takes a lot of time.
I think JGit handles this in a really smart way: it packs all of refs/heads into one bitmap file and the other refs into another bitmap file, because 90% of clone operations only require refs/heads.


-----Original Message-----
From: 程洋
Sent: Tuesday, November 23, 2021 2:42 PM
To: 'Jeff King' <peff@peff.net>; Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Cc: git@vger.kernel.org
Subject: RE: [External Mail]Re: why git is so slow for a tiny git push?

I have another problem here.
When I try to clone from the remote server, it takes 25 seconds for `Enumerating objects`, and then 1 second for `Counting objects` by bitmap.
I don't understand why a fresh clone needs `Enumerating objects`. Isn't `Counting objects` enough for the server to determine what to send?

Here is the remote server trace:


11:49:12.438519 common-main.c:48             | d0 | main                     | version      |     |           |           |              | 2.33.1.558.g2bd2f258f4.dirty
11:49:12.438556 common-main.c:49             | d0 | main                     | start        |     |  0.000274 |           |              | git daemon --inetd --syslog --export-all --enable=upload-pack --enable=receive-pack --base-path=/home/work/repositories
11:49:12.438607 compat/linux/procinfo.c:170  | d0 | main                     | cmd_ancestry |     |           |           |              | ancestry:[xinetd systemd]
11:49:12.438655 git.c:737                    | d0 | main                     | cmd_name     |     |           |           |              | _run_dashed_ (_run_dashed_)
11:49:12.438668 run-command.c:739            | d0 | main                     | child_start  |     |  0.000390 |           |              | [ch0] class:dashed argv:[git-daemon --inetd --syslog --export-all --enable=upload-pack --enable=receive-pack --base-path=/home/work/repositories]
11:49:12.439555 common-main.c:48             | d1 | main                     | version      |     |           |           |              | 2.33.1.558.g2bd2f258f4.dirty
11:49:12.439589 common-main.c:49             | d1 | main                     | start        |     |  0.000242 |           |              | /usr/libexec/git-core/git-daemon --inetd --syslog --export-all --enable=upload-pack --enable=receive-pack --base-path=/home/work/repositories
11:49:12.439645 compat/linux/procinfo.c:170  | d1 | main                     | cmd_ancestry |     |           |           |              | ancestry:[git xinetd systemd]
11:49:12.439809 run-command.c:739            | d1 | main                     | child_start  |     |  0.000467 |           |              | [ch0] class:? argv:[git upload-pack --strict --timeout=0 .]
11:49:12.440747 common-main.c:48             | d2 | main                     | version      |     |           |           |              | 2.33.1.558.g2bd2f258f4.dirty
11:49:12.440772 common-main.c:49             | d2 | main                     | start        |     |  0.000252 |           |              | /usr/libexec/git-core/git upload-pack --strict --timeout=0 .
11:49:12.440833 compat/linux/procinfo.c:170  | d2 | main                     | cmd_ancestry |     |           |           |              | ancestry:[git-daemon git xinetd systemd]
11:49:12.440853 git.c:456                    | d2 | main                     | cmd_name     |     |           |           |              | upload-pack (_run_dashed_/upload-pack)
11:49:12.441013 protocol.c:76                | d2 | main                     | data         |     |  0.000494 |  0.000494 | transfer     | negotiated-version:2
11:49:12.481208 run-command.c:739            | d2 | main                     | child_start  |     |  0.040684 |           |              | [ch0] class:? argv:[git pack-objects --revs --thin --stdout --progress --delta-base-offset]
11:49:12.482307 common-main.c:48             | d3 | main                     | version      |     |           |           |              | 2.33.1.558.g2bd2f258f4.dirty
11:49:12.482334 common-main.c:49             | d3 | main                     | start        |     |  0.000220 |           |              | /usr/libexec/git-core/git pack-objects --revs --thin --stdout --progress --delta-base-offset
11:49:12.482405 compat/linux/procinfo.c:170  | d3 | main                     | cmd_ancestry |     |           |           |              | ancestry:[git git-daemon git xinetd systemd]
11:49:12.482500 git.c:456                    | d3 | main                     | cmd_name     |     |           |           |              | pack-objects (_run_dashed_/upload-pack/pack-objects)
11:49:12.482632 builtin/pack-objects.c:4140  | d3 | main                     | region_enter | r0  |  0.000522 |           | pack-objects | label:enumerate-objects
11:49:12.482825 progress.c:268               | d3 | main                     | region_enter | r0  |  0.000715 |           | progress     | ..label:Enumerating objects
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
11:49:21.477783 progress.c:329               | d3 | main                     | data         | r0  |  8.995670 |  8.994955 | progress     | ....total_objects:0
11:49:21.477848 progress.c:336               | d3 | main                     | region_leave | r0  |  8.995738 |  8.995023 | progress     | ..label:Enumerating objects
11:49:21.477880 builtin/pack-objects.c:4162  | d3 | main                     | region_leave | r0  |  8.995770 |  8.995248 | pack-objects | label:enumerate-objects
11:49:21.477891 builtin/pack-objects.c:4168  | d3 | main                     | region_enter | r0  |  8.995782 |           | pack-objects | label:prepare-pack
11:49:21.477903 progress.c:268               | d3 | main                     | region_enter | r0  |  8.995794 |           | progress     | ..label:Counting objects
11:49:22.316806 progress.c:329               | d3 | main                     | data         | r0  |  9.834695 |  0.838901 | progress     | ....total_objects:1383396
11:49:22.316848 progress.c:336               | d3 | main                     | region_leave | r0  |  9.834738 |  0.838944 | progress     | ..label:Counting objects
11:49:22.366109 progress.c:268               | d3 | main                     | region_enter | r0  |  9.883998 |           | progress     | ..label:Compressing objects
11:49:34.208323 trace2/tr2_tgt_perf.c:201    | d2 | main                     | signal       |     | 21.767795 |           |              | signo:13
11:49:34.208372 trace2/tr2_tgt_perf.c:201    | d3 | main                     | signal       |     | 21.726219 |           |              | ....signo:13
11:49:34.218767 run-command.c:995            | d1 | main                     | child_exit   |     | 21.779417 | 21.778950 |              | [ch0] pid:48725 code:141
11:49:34.218809 common-main.c:54             | d1 | main                     | exit         |     | 21.779469 |           |              | code:141
11:49:34.218822 trace2/tr2_tgt_perf.c:213    | d1 | main                     | atexit       |     | 21.779482 |           |              | code:141
11:49:34.219135 run-command.c:995            | d0 | main                     | child_exit   |     | 21.780855 | 21.780465 |              | [ch0] pid:48724 code:141
11:49:34.219170 git.c:759                    | d0 | main                     | exit         |     | 21.780893 |           |              | code:141
11:49:34.219182 trace2/tr2_tgt_perf.c:213    | d0 | main                     | atexit       |     | 21.780906 |           |              | code:141
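
To read per-region timings out of a trace like the one above without eyeballing the columns, something like the following may help. It is only a sketch: it assumes the pipe-separated perf layout shown above, and it assumes that the second time column on a region_leave line is the time spent inside that region. The function name `region_durations` and the `TRACE` sample are made up for illustration; the three sample lines are copied from the trace with the padding trimmed.

```python
def region_durations(lines):
    """Map (category, label) -> seconds, taken from region_leave lines."""
    durations = {}
    for line in lines:
        # Columns: time+file | depth | thread | event | region | t_abs | t_rel | category | message
        fields = [f.strip() for f in line.split("|", 8)]
        if len(fields) < 9 or fields[3] != "region_leave":
            continue
        # Nested regions are indented with leading dots in the message column.
        label = fields[8].lstrip(".")
        if label.startswith("label:"):
            label = label[len("label:"):]
        # Assumption: t_rel on region_leave is the elapsed time of the region.
        durations[(fields[7], label)] = float(fields[6])
    return durations


TRACE = [
    "11:49:21.477848 progress.c:336 | d3 | main | region_leave | r0 | 8.995738 | 8.995023 | progress | ..label:Enumerating objects",
    "11:49:21.477880 builtin/pack-objects.c:4162 | d3 | main | region_leave | r0 | 8.995770 | 8.995248 | pack-objects | label:enumerate-objects",
    "11:49:22.316848 progress.c:336 | d3 | main | region_leave | r0 | 9.834738 | 0.838944 | progress | ..label:Counting objects",
]

if __name__ == "__main__":
    for (category, label), seconds in region_durations(TRACE).items():
        print(f"{category:>12}  {label:<22} {seconds:.3f}s")
```

On the three sample lines this reports the enumerate-objects region and the counting region separately, which is the comparison the question above is about.
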
-----Original Message-----
From: Jeff King <peff@peff.net>
Sent: Wednesday, October 13, 2021 5:46 AM
To: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Cc: 程洋 <chengyang@xiaomi.com>; git@vger.kernel.org
Subject: Re: [External Mail]Re: why git is so slow for a tiny git push?


On Tue, Oct 12, 2021 at 12:06:04PM +0200, Ævar Arnfjörð Bjarmason wrote:

> But more generally with these side-indexes it seems to me that the
> code involved might not be considering these sorts of edge cases, i.e.
> my understanding from you above is that if we have bitmaps anywhere
> we'll try to in-memory use them for all the objects in play? Or that
> otherwise having "partial" bitmaps leads to pathological behavior.

Sure, if there was an easy way to know beforehand whether the bitmap was going to help or run into these pathological cases, it would be nice to detect it. I don't know what that is (and I've given it quite a lot of thought over the past 8 years).

I suspect the most promising direction would be to teach the bitmap code to behave more like the regular traversal by just walking down to the UNINTERESTING commits. Right now it gets a complete bitmap for the commits we don't want, and then a bitmap for the ones we do want, and takes a set difference.

It could instead walk both sides in the usual way, filling in the bitmap for each, and then stop when it hits boundary commits. The bitmap for the boundary commit (if we don't have a full one on-disk) is filled in with what's in its tree. That means it's incomplete, and the result might include some extra objects (e.g., if boundary~100 had a blob that went away, but later came back in a descendant that isn't marked uninteresting). That's the same tradeoff the non-bitmap traversal makes.
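
A toy model of the two strategies described above, as a rough sketch rather than anything resembling the real bitmap code: Python sets stand in for bitmaps, the three-commit history and the blob names are invented, and `full_difference` and `boundary_walk` are hypothetical names. The history is A <- B <- C, where blob2 exists in A, goes away in B, and comes back in C; B is the uninteresting commit and C is the wanted one.

```python
def reachable_objects(tips, parents, tree):
    """Every commit reachable from tips, plus every object in those commits' trees."""
    seen, objs, stack = set(), set(), list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        objs.add(commit)
        objs |= tree[commit]
        stack.extend(parents[commit])
    return objs


def full_difference(wants, haves, parents, tree):
    """Complete 'want' bitmap minus complete 'have' bitmap."""
    return reachable_objects(wants, parents, tree) - reachable_objects(haves, parents, tree)


def boundary_walk(wants, haves, parents, tree):
    """Walk from the wants and stop at boundary commits, whose bitmap is
    filled in only from their own tree (so it is incomplete)."""
    uninteresting = set(haves)
    want_objs, have_objs = set(), set()
    seen, stack = set(), list(wants)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        if commit in uninteresting:
            # Boundary: record its tree, but do not walk its ancestors.
            have_objs.add(commit)
            have_objs |= tree[commit]
            continue
        want_objs.add(commit)
        want_objs |= tree[commit]
        stack.extend(parents[commit])
    return want_objs - have_objs


# Hypothetical history: A <- B <- C.
PARENTS = {"A": [], "B": ["A"], "C": ["B"]}
TREE = {
    "A": {"blob1", "blob2"},
    "B": {"blob1"},
    "C": {"blob1", "blob2", "blob3"},
}

if __name__ == "__main__":
    print(sorted(full_difference(["C"], ["B"], PARENTS, TREE)))
    print(sorted(boundary_walk(["C"], ["B"], PARENTS, TREE)))
```

In this toy history the full difference yields only C and blob3, while the boundary walk also yields blob2: the boundary commit B no longer has it in its tree, so the walk cannot tell that the other side already has it from A. The boundary walk never visits A at all, which is where the savings would come from.
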

It would be pretty major surgery to the bitmap code. I haven't actually tried it before.

-Peff