git@vger.kernel.org mailing list mirror (one of many)
* Git fetch slow on local repository with 600k refs
@ 2023-03-13 11:54 程洋
  2023-03-14  8:22 ` Bagas Sanjaya
  2023-03-14 14:29 ` Sean Allred
  0 siblings, 2 replies; 4+ messages in thread
From: 程洋 @ 2023-03-13 11:54 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: 姜浩哲

We run a Gerrit server cluster and use the pull-replication plugin to sync changes between the master and the slave.
When a change is pushed to the master, it notifies the slave, and the slave fetches the change from the master.

However, in a big repository with 600k refs, a fetch takes 5-10 seconds even when fetching a 1-byte change. The GIT_TRACE2_PERF output is below.
As an experiment, I fetched a ref that my slave already has, and git rev-list alone takes about 2 seconds. (I guess it tries to find the remote object among the objects reachable from the local refs, one by one.)
Is there any way to optimize this situation?

19:12:55.931180 common-main.c:48             | d0 | main                     | version      |     |           |           |              | 2.33.1.558.g2bd2f258f4.dirty
19:12:55.931215 common-main.c:49             | d0 | main                     | start        |     |  0.000335 |           |              | git fetch --no-tags git://10.13.8.10/miui/gerrit/base-test.git refs/changes/27/2741927/1:refs/changes/27/2741927/1
19:12:55.931302 compat/linux/procinfo.c:170  | d0 | main                     | cmd_ancestry |     |           |           |              | ancestry:[bash sudo bash miauthd miauthd systemd]
19:12:55.931381 git.c:456                    | d0 | main                     | cmd_name     |     |           |           |              | fetch (fetch)
19:12:55.931566 builtin/fetch.c:1579         | d0 | main                     | region_enter | r0  |  0.000692 |           | fetch        | label:remote_refs
19:12:55.936781 connect.c:167                | d0 | main                     | data         |     |  0.005907 |  0.005215 | transfer     | ..negotiated-version:2
19:12:55.940447 builtin/fetch.c:1582         | d0 | main                     | region_leave | r0  |  0.009573 |  0.008881 | fetch        | label:remote_refs
19:12:56.221133 run-command.c:739            | d0 | main                     | child_start  |     |  0.290252 |           |              | [ch0] class:? argv:[git rev-list --objects --stdin --not --all --quiet --alternate-refs --unsorted-input]
19:12:58.014792 run-command.c:995            | d0 | main                     | child_exit   |     |  2.083899 |  1.793647 |              | [ch0] pid:81860 code:0
19:12:58.014855 builtin/fetch.c:1321         | d0 | main                     | region_enter | r0  |  2.083980 |           | fetch        | label:consume_refs
19:12:58.015412 builtin/fetch.c:1326         | d0 | main                     | region_leave | r0  |  2.084538 |  0.000558 | fetch        | label:consume_refs
19:12:58.015466 run-command.c:739            | d0 | main                     | child_start  |     |  2.084590 |           |              | [ch1] class:? argv:[git maintenance run --auto --no-quiet]
19:12:58.018879 common-main.c:48             | d1 | main                     | version      |     |           |           |              | 2.33.1.558.g2bd2f258f4.dirty
19:12:58.018911 common-main.c:49             | d1 | main                     | start        |     |  0.000324 |           |              | /usr/libexec/git-core/git maintenance run --auto --no-quiet
19:12:58.019011 compat/linux/procinfo.c:170  | d1 | main                     | cmd_ancestry |     |           |           |              | ancestry:[git bash sudo bash miauthd miauthd systemd]
19:12:58.019087 git.c:456                    | d1 | main                     | cmd_name     |     |           |           |              | maintenance (fetch/maintenance)
19:12:58.019276 git.c:714                    | d1 | main                     | exit         |     |  0.000690 |           |              | code:0
19:12:58.019284 trace2/tr2_tgt_perf.c:213    | d1 | main                     | atexit       |     |  0.000698 |           |              | code:0
19:12:58.019386 run-command.c:995            | d0 | main                     | child_exit   |     |  2.088507 |  0.003917 |              | [ch1] pid:81878 code:0
19:12:58.019411 git.c:714                    | d0 | main                     | exit         |     |  2.088538 |           |              | code:0
19:12:58.019419 trace2/tr2_tgt_perf.c:213    | d0 | main                     | atexit       |     |  2.088545 |           |              | code:0
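
To make the hot spot easier to see, here is a minimal sketch (not part of the original report) that ranks the child processes in a GIT_TRACE2_PERF log by elapsed time. It assumes the pipe-separated perf layout shown above, with the event name in the fourth column and the relative time in the seventh. The function name `child_durations` is made up for illustration, and the sample lines are copied from the trace with the column padding trimmed.

```python
import re

# Child-process lines copied from the trace above (column padding trimmed).
TRACE = """\
19:12:56.221133 run-command.c:739 | d0 | main | child_start |  |  0.290252 |           |  | [ch0] class:? argv:[git rev-list --objects --stdin --not --all --quiet --alternate-refs --unsorted-input]
19:12:58.014792 run-command.c:995 | d0 | main | child_exit  |  |  2.083899 |  1.793647 |  | [ch0] pid:81860 code:0
19:12:58.015466 run-command.c:739 | d0 | main | child_start |  |  2.084590 |           |  | [ch1] class:? argv:[git maintenance run --auto --no-quiet]
19:12:58.019386 run-command.c:995 | d0 | main | child_exit  |  |  2.088507 |  0.003917 |  | [ch1] pid:81878 code:0
"""


def child_durations(trace):
    """Return (seconds, command) pairs for top-level children, slowest first."""
    commands = {}
    durations = {}
    for line in trace.splitlines():
        # Columns: time+source | depth | thread | event | region | t_abs | t_rel | category | message
        fields = [field.strip() for field in line.split("|", 8)]
        if len(fields) < 9 or fields[1] != "d0":
            continue
        event, t_rel, message = fields[3], fields[6], fields[8]
        child = re.match(r"\[(ch\d+)\]", message)
        if child is None:
            continue
        if event == "child_start":
            argv = re.search(r"argv:\[(.*)\]", message)
            commands[child.group(1)] = argv.group(1) if argv else "?"
        elif event == "child_exit":
            # On child_exit, the t_rel column holds the child's run time.
            durations[child.group(1)] = float(t_rel)
    return sorted(
        ((seconds, commands.get(name, "?")) for name, seconds in durations.items()),
        reverse=True,
    )


for seconds, command in child_durations(TRACE):
    print(f"{seconds:9.6f}s  {command}")
```

On these four lines, the rev-list child accounts for about 1.79 of the 2.09 seconds the fetch takes in total, while the maintenance child takes about 4 milliseconds.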
#/******本邮件及其附件含有小米公司的保密信息,仅限于发送给上面地址中列出的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本邮件! This e-mail and its attachments contain confidential information from XIAOMI, which is intended only for the person or entity whose address is listed above. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this e-mail in error, please notify the sender by phone or email immediately and delete it!******/#

* Re: Git fetch slow on local repository with 600k refs
  2023-03-13 11:54 Git fetch slow on local repository with 600k refs 程洋
@ 2023-03-14  8:22 ` Bagas Sanjaya
  2023-03-14 11:13   ` [External Mail]Re: " 程洋
  2023-03-14 14:29 ` Sean Allred
  1 sibling, 1 reply; 4+ messages in thread
From: Bagas Sanjaya @ 2023-03-14  8:22 UTC (permalink / raw)
  To: 程洋, git@vger.kernel.org; +Cc: 姜浩哲

On 3/13/23 18:54, 程洋 wrote:
> 19:12:55.931180 common-main.c:48             | d0 | main                     | version      |     |           |           |              | 2.33.1.558.g2bd2f258f4.dirty
> 19:12:55.931215 common-main.c:49             | d0 | main                     | start        |     |  0.000335 |           |              | git fetch --no-tags git://10.13.8.10/miui/gerrit/base-test.git refs/changes/27/2741927/1:refs/changes/27/2741927/1
> 19:12:55.931302 compat/linux/procinfo.c:170  | d0 | main                     | cmd_ancestry |     |           |           |              | ancestry:[bash sudo bash miauthd miauthd systemd]
> 19:12:55.931381 git.c:456                    | d0 | main                     | cmd_name     |     |           |           |              | fetch (fetch)
> 19:12:55.931566 builtin/fetch.c:1579         | d0 | main                     | region_enter | r0  |  0.000692 |           | fetch        | label:remote_refs
> 19:12:55.936781 connect.c:167                | d0 | main                     | data         |     |  0.005907 |  0.005215 | transfer     | ..negotiated-version:2
> 19:12:55.940447 builtin/fetch.c:1582         | d0 | main                     | region_leave | r0  |  0.009573 |  0.008881 | fetch        | label:remote_refs
> 19:12:56.221133 run-command.c:739            | d0 | main                     | child_start  |     |  0.290252 |           |              | [ch0] class:? argv:[git rev-list --objects --stdin --not --all --quiet --alternate-refs --unsorted-input]
> 19:12:58.014792 run-command.c:995            | d0 | main                     | child_exit   |     |  2.083899 |  1.793647 |              | [ch0] pid:81860 code:0
> 19:12:58.014855 builtin/fetch.c:1321         | d0 | main                     | region_enter | r0  |  2.083980 |           | fetch        | label:consume_refs
> 19:12:58.015412 builtin/fetch.c:1326         | d0 | main                     | region_leave | r0  |  2.084538 |  0.000558 | fetch        | label:consume_refs
> 19:12:58.015466 run-command.c:739            | d0 | main                     | child_start  |     |  2.084590 |           |              | [ch1] class:? argv:[git maintenance run --auto --no-quiet]
> 19:12:58.018879 common-main.c:48             | d1 | main                     | version      |     |           |           |              | 2.33.1.558.g2bd2f258f4.dirty
> 19:12:58.018911 common-main.c:49             | d1 | main                     | start        |     |  0.000324 |           |              | /usr/libexec/git-core/git maintenance run --auto --no-quiet
> 19:12:58.019011 compat/linux/procinfo.c:170  | d1 | main                     | cmd_ancestry |     |           |           |              | ancestry:[git bash sudo bash miauthd miauthd systemd]
> 19:12:58.019087 git.c:456                    | d1 | main                     | cmd_name     |     |           |           |              | maintenance (fetch/maintenance)
> 19:12:58.019276 git.c:714                    | d1 | main                     | exit         |     |  0.000690 |           |              | code:0
> 19:12:58.019284 trace2/tr2_tgt_perf.c:213    | d1 | main                     | atexit       |     |  0.000698 |           |              | code:0
> 19:12:58.019386 run-command.c:995            | d0 | main                     | child_exit   |     |  2.088507 |  0.003917 |              | [ch1] pid:81878 code:0
> 19:12:58.019411 git.c:714                    | d0 | main                     | exit         |     |  2.088538 |           |              | code:0
> 19:12:58.019419 trace2/tr2_tgt_perf.c:213    | d0 | main                     | atexit       |     |  2.088545 |           |              | code:0

From above, I see that the hot paths are `git maintenance run` and
`git rev-list`, right?

Next time, please send only plain-text email to this ML, as vger isn't
happy with HTML emails (it most likely treats them as spam).

Thanks.

-- 
An old man doll... just what I always wanted! - Clara


* RE: [External Mail]Re: Git fetch slow on local repository with 600k refs
  2023-03-14  8:22 ` Bagas Sanjaya
@ 2023-03-14 11:13   ` 程洋
  0 siblings, 0 replies; 4+ messages in thread
From: 程洋 @ 2023-03-14 11:13 UTC (permalink / raw)
  To: Bagas Sanjaya, git@vger.kernel.org; +Cc: 姜浩哲



> -----Original Message-----
> From: Bagas Sanjaya <bagasdotme@gmail.com>
> Sent: Tuesday, March 14, 2023 4:22 PM
> To: 程洋 <chengyang@xiaomi.com>; git@vger.kernel.org
> Cc: 姜浩哲 <jianghaozhe1@xiaomi.com>
> Subject: [External Mail]Re: Git fetch slow on local repository with 600k refs
>
>
> On 3/13/23 18:54, 程洋 wrote:
> > 19:12:55.931180 common-main.c:48             | d0 | main                     | version      |     |           |           |              | 2.33.1.558.g2bd2f258f4.dirty
> > 19:12:55.931215 common-main.c:49             | d0 | main                     | start        |     |  0.000335 |           |              | git fetch --no-tags git://10.13.8.10/miui/gerrit/base-test.git refs/changes/27/2741927/1:refs/changes/27/2741927/1
> > 19:12:55.931302 compat/linux/procinfo.c:170  | d0 | main                     | cmd_ancestry |     |           |           |              | ancestry:[bash sudo bash miauthd miauthd systemd]
> > 19:12:55.931381 git.c:456                    | d0 | main                     | cmd_name     |     |           |           |              | fetch (fetch)
> > 19:12:55.931566 builtin/fetch.c:1579         | d0 | main                     | region_enter | r0  |  0.000692 |           | fetch        | label:remote_refs
> > 19:12:55.936781 connect.c:167                | d0 | main                     | data         |     |  0.005907 |  0.005215 | transfer     | ..negotiated-version:2
> > 19:12:55.940447 builtin/fetch.c:1582         | d0 | main                     | region_leave | r0  |  0.009573 |  0.008881 | fetch        | label:remote_refs
> > 19:12:56.221133 run-command.c:739            | d0 | main                     | child_start  |     |  0.290252 |           |              | [ch0] class:? argv:[git rev-list --objects --stdin --not --all --quiet --alternate-refs --unsorted-input]
> > 19:12:58.014792 run-command.c:995            | d0 | main                     | child_exit   |     |  2.083899 |  1.793647 |              | [ch0] pid:81860 code:0
> > 19:12:58.014855 builtin/fetch.c:1321         | d0 | main                     | region_enter | r0  |  2.083980 |           | fetch        | label:consume_refs
> > 19:12:58.015412 builtin/fetch.c:1326         | d0 | main                     | region_leave | r0  |  2.084538 |  0.000558 | fetch        | label:consume_refs
> > 19:12:58.015466 run-command.c:739            | d0 | main                     | child_start  |     |  2.084590 |           |              | [ch1] class:? argv:[git maintenance run --auto --no-quiet]
> > 19:12:58.018879 common-main.c:48             | d1 | main                     | version      |     |           |           |              | 2.33.1.558.g2bd2f258f4.dirty
> > 19:12:58.018911 common-main.c:49             | d1 | main                     | start        |     |  0.000324 |           |              | /usr/libexec/git-core/git maintenance run --auto --no-quiet
> > 19:12:58.019011 compat/linux/procinfo.c:170  | d1 | main                     | cmd_ancestry |     |           |           |              | ancestry:[git bash sudo bash miauthd miauthd systemd]
> > 19:12:58.019087 git.c:456                    | d1 | main                     | cmd_name     |     |           |           |              | maintenance (fetch/maintenance)
> > 19:12:58.019276 git.c:714                    | d1 | main                     | exit         |     |  0.000690 |           |              | code:0
> > 19:12:58.019284 trace2/tr2_tgt_perf.c:213    | d1 | main                     | atexit       |     |  0.000698 |           |              | code:0
> > 19:12:58.019386 run-command.c:995            | d0 | main                     | child_exit   |     |  2.088507 |  0.003917 |              | [ch1] pid:81878 code:0
> > 19:12:58.019411 git.c:714                    | d0 | main                     | exit         |     |  2.088538 |           |              | code:0
> > 19:12:58.019419 trace2/tr2_tgt_perf.c:213    | d0 | main                     | atexit       |     |  2.088545 |           |              | code:0
>
> From above, I see that the hot paths are `git maintenance run` and
> `git rev-list`, right?
>
> Next time, try to send only plain-text email in this ML, as vger isn't happy
> with HTML emails (most likely spam).
>
> Thanks.
>
> --
> An old man doll... just what I always wanted! - Clara

I did send the email from Outlook in plain-text mode. Do you still see my mail as HTML?

BTW, I didn't see any performance issue in maintenance, only in git rev-list.

* Re: Git fetch slow on local repository with 600k refs
  2023-03-13 11:54 Git fetch slow on local repository with 600k refs 程洋
  2023-03-14  8:22 ` Bagas Sanjaya
@ 2023-03-14 14:29 ` Sean Allred
  1 sibling, 0 replies; 4+ messages in thread
From: Sean Allred @ 2023-03-14 14:29 UTC (permalink / raw)
  To: 程洋; +Cc: git@vger.kernel.org, 姜浩哲


程洋 <chengyang@xiaomi.com> writes:

> We run a Gerrit server cluster and use the pull-replication plugin to
> sync changes between the master and the slave.
>
> When a change is pushed to the master, it notifies the slave, and the
> slave fetches the change from the master.
>
> However, in a big repository with 600k refs, a fetch takes 5-10
> seconds even when fetching a 1-byte change. The GIT_TRACE2_PERF output
> is below.
>
> As an experiment, I fetched a ref that my slave already has, and git
> rev-list alone takes about 2 seconds. (I guess it tries to find the
> remote object among the objects reachable from the local refs, one by
> one.)
>
> Is there any way to optimize this situation?

Do you need all those refs as refs -- or are you just looking to keep
the commits?

We found a rather clever solution for the latter, which we're looking to
upstream at some point: collect all such refs into a single 'archive' ref
whose history consists of fake merge commits (there's no actual conflict
resolution happening -- we just use the same tree over and over). We
make each commit message look like show-ref output. For example:

A single ref (refs/archive) pointing to commit (A), with contents

    tree <some arbitrary tree>
    parent <B> [... 500 other commits 'merged' in ...]
    author <system user>
    committer <system user>

    deadbeef0123456788... refs/tags/very/old/release-1
    deadbeef0123456789... refs/tags/very/old/release-2

When we want to pull a ref out of the archive, we have a process in
place to do so. This keeps the total number of refs down and the
fetch/push performance within acceptable limits.
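
As a rough illustration of the idea above (a sketch only, not the process we actually run), the commit message can be written and read back as show-ref style `<object id> <refname>` lines, and the fake merge can be expressed as a `git commit-tree` argument list. The helper names, the placeholder object IDs, and the choice of Git's empty tree as the "arbitrary tree" are all assumptions made for this example. Nothing here invokes git, and updating refs/archive or deleting the archived refs is left out.

```python
# Well-known SHA-1 ID of Git's empty tree, used here as the "arbitrary tree"
# (an assumption for this sketch; any existing tree would do).
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def format_archive_message(refs):
    """Render {refname: object id} as show-ref style lines, sorted by refname."""
    return "".join(f"{oid} {name}\n" for name, oid in sorted(refs.items()))


def parse_archive_message(message):
    """Recover {refname: object id} from an archive commit message."""
    refs = {}
    for line in message.splitlines():
        if not line.strip():
            continue
        oid, name = line.split(" ", 1)
        refs[name] = oid
    return refs


def commit_tree_argv(tree, parents, message):
    """Build (but do not run) a `git commit-tree` call with one -p per parent."""
    argv = ["git", "commit-tree", tree]
    for parent in parents:
        argv += ["-p", parent]
    argv += ["-m", message]
    return argv


# Placeholder object IDs, not real objects.
archived = {
    "refs/tags/very/old/release-1": "1" * 40,
    "refs/tags/very/old/release-2": "2" * 40,
}
message = format_archive_message(archived)
# Parents: the previous archive commit (placeholder) plus the archived commits.
# This assumes the archived IDs are commits; a ref pointing at an annotated
# tag would have to be peeled to its commit before being used as a parent.
parents = ["b" * 40] + sorted(archived.values())
print(commit_tree_argv(EMPTY_TREE, parents, message))
```

Pulling a ref back out of the archive then amounts to parsing the message of the relevant archive commit and recreating the ref at the recorded object ID.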

--
Sean Allred
