git@vger.kernel.org mailing list mirror (one of many)
From: Anthony Muller <anthony@monospace.sh>
To: "git" <git@vger.kernel.org>
Subject: Performance of "git gc..." is extremely bad in some cases
Date: Mon, 08 Mar 2021 21:15:48 +0000
Message-ID: <17813b232e9.e48d03c3862272.7793967418558853913@monospace.sh>

What did you do before the bug happened? (Steps to reproduce your issue)

git clone https://github.com/notracking/hosts-blocklists
cd hosts-blocklists
git reflog expire --all --expire=now && git gc --prune=now --aggressive


What did you expect to happen? (Expected behavior)

Running gc on a ~300 MB repo should not take 1 hour 55 minutes when
running gc on a 2.6 GB repo (LLVM) only takes 24 minutes.


What happened instead? (Actual behavior)

The command took 1h 55m to complete on a ~300 MB repo and used enough
resources that the machine was almost unusable.


What's different between what you expected and what actually happened?

The compression stage accounts for most of the time and resources. Compared
with running something like zlib or lzma over the same data, compression by
itself should not take very long. Even allowing for extra work being done while
objects are compressed, the time gc takes to compress the objects and the
resources it consumes are both unreasonable.

Memory: RSS = 3451152 KB (3.29 GB), VSZ = 29286272 KB (27.92 GB)
Time: 12902.83s user 8995.41s system 315% cpu 1:55:36.73 total
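
As a rough sanity check on those figures, the CPU percentage and the RSS
conversion can be recomputed with awk. This is only a sketch: the helper
names cpu_pct and kb_to_gib are made up for illustration and are not part
of git, and the inputs are copied from the two lines above.

```shell
#!/bin/sh
# Figures copied from the report above.
user=12902.83      # seconds of user CPU time
sys=8995.41        # seconds of system CPU time
wall=6936.73       # 1:55:36.73 expressed in seconds
rss_kb=3451152     # resident set size in KB

# CPU utilisation as a whole-number percentage: (user + sys) / wall * 100.
# %d truncates, which matches the value reported by the shell's time builtin.
cpu_pct() {
    awk -v u="$1" -v s="$2" -v w="$3" 'BEGIN { printf "%d\n", (u + s) / w * 100 }'
}

# Convert KB to GB using 1024-based units, two decimal places.
kb_to_gib() {
    awk -v kb="$1" 'BEGIN { printf "%.2f\n", kb / 1024 / 1024 }'
}

cpu_pct "$user" "$sys" "$wall"   # prints 315
kb_to_gib "$rss_kb"              # prints 3.29
```

In other words, gc kept a little over three of the four cores busy for the
whole run, and roughly 41% of that CPU time (8995.41s of 21898.24s) was
system time rather than user time.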

I've seen this issue with a number of repos, and the size of the repo does not
determine whether it happens. LLVM at 2.6 GB worked flawlessly, a 900 MB
repo never finished, this 300 MB repo takes forever, and if you test something
like chromium, git will just crash.


[System Info]
hardware: 2.9 GHz quad-core i7
git version:
git version 2.30.0
cpu: x86_64
no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Darwin 19.6.0 Darwin Kernel Version 19.6.0: Tue Jan 12 22:13:05 PST 2021; root:xnu-6153.141.16~1/RELEASE_X86_64 x86_64
compiler info: clang: 12.0.0 (clang-1200.0.32.28)
libc info: no libc information available
$SHELL (typically, interactive shell): /usr/local/bin/zsh


