git@vger.kernel.org mailing list mirror (one of many)
* [BUG?] Major performance issue with some commands on our repo's master branch
From: Tassilo Horn @ 2022-06-04  7:39 UTC
  To: git

Hi all,

[spoiler alert: I've figured out the config option causing the problem
while writing this long mail, so you might jump straight to the SOLUTION
section at the bottom of this mail.]

at my day job, I work on a git repo (sadly non-public, proprietary) with
these stats:

- master has about 150000 commits; the last release branch, which I've
  also benchmarked below, has 144000 commits
- the history dates back to 2001
- .git/ is about 1.8 GB

So it's quite big but not unusually big when compared to linux or other
free software projects.

The typical git commands I use (status, fetch, pull, commit, push,
rebase, ...) are all quick.  However, I use the git porcelain Magit [1],
which invokes several plumbing commands in order to present an always
up-to-date, extended status buffer for the currently checked-out branch.
Some of those plumbing commands are extremely slow for no obvious
reason.  The worst offender I could pinpoint is this:

--8<---------------cut here---------------start------------->8---
❯ time git show --no-patch --format="%h %s" "master^{commit}" --
6192a0cfdc6 Merge remote-tracking branch 'origin/SHD_ECORO_3_9_7'

________________________________________________________
Executed in   13.21 secs    fish           external
   usr time   12.99 secs  462.00 micros   12.99 secs
   sys time    0.17 secs  119.00 micros    0.17 secs
--8<---------------cut here---------------end--------------->8---

The interesting thing is that I have this problem only with the master
branch.  When I run it for the last release branch, I get these times:

--8<---------------cut here---------------start------------->8---
❯ time git show --no-patch --format="%h %s" "SHD_ECORO_3_9_7^{commit}" --
994334fc9fb ECOJ-33833 HTML-Formbrief: Bestellungs-Anhänge im KV-Kontext

________________________________________________________
Executed in   22.68 millis    fish           external
   usr time    7.71 millis  761.00 micros    6.95 millis
   sys time   10.47 millis  194.00 micros   10.28 millis
--8<---------------cut here---------------end--------------->8---

So you see, that's a factor of almost 600!  How can that be?
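
As a quick sanity check on that factor, the two wall-clock times above
can simply be divided.  A minimal sketch ("ratio" is just a helper name
made up for this illustration, not a git command):

```shell
# ratio SLOW_MS FAST_MS: print how many times slower the first timing is,
# rounded to a whole number.  Both arguments are in milliseconds.
ratio() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.0f\n", slow / fast }'
}

# 13.21 secs for master vs. 22.68 millis for the release branch
ratio 13210 22.68
```

With the numbers from the two runs above, this prints 582.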

The split between master and the SHD_ECORO_3_X_X series of branches
happened almost 2 years ago, and master is way ahead of those.

--8<---------------cut here---------------start------------->8---
❯ git log --oneline master...origin/SHD_ECORO_3_9_7 | wc -l
5013
--8<---------------cut here---------------end--------------->8---

But there are around 9 merges from the last release branch into master
daily.

--8<---------------cut here---------------start------------->8---
❯ git log --merges --oneline --since 6months | wc -l
1611
--8<---------------cut here---------------end--------------->8---

From memory, the issue didn't pop up all of a sudden but has gotten
worse slowly over time.  I have the impression that it has worsened
faster over the last few months, which might be the result of our
workflow.  Before, I was the merge guy doing two "merge waves" from
the last supported release branch upwards into master once or twice a
day (usually release-branch -> next-release-branch -> master).  About
3 months ago, we switched to a workflow where every developer merges
upwards herself just after committing/pushing to some branch below
master, simply because the branches have diverged so much that you'd
need to be an expert in everything in order to resolve conflicts
sensibly.

I should mention that I haven't seen this issue with any other repo I
have.  But it's also the biggest one I use.  The Emacs repository I
also work on is comparable in the number of commits but has far fewer
merges.

Finally, here's the git bugreport sysinfo section for that machine and
repository.

--8<---------------cut here---------------start------------->8---
[System Info]
git version:
git version 2.36.1
cpu: x86_64
no commit associated with this build
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
uname: Linux 5.18.1-zen1-1-zen #1 ZEN SMP PREEMPT_DYNAMIC Mon, 30 May 2022 17:53:16 +0000 x86_64
compiler info: gnuc: 11.2
libc info: glibc: 2.35
$SHELL (typically, interactive shell): /usr/bin/fish

[Enabled Hooks]
--8<---------------cut here---------------end--------------->8---

SOLUTION
========

While writing this long mail, I've figured out that the performance
penalty is caused by my setting of diff.renameLimit = 10000.  If I
comment out that option in my ~/.gitconfig, the above command finishes
in about 150 millis instead of 13 seconds:

--8<---------------cut here---------------start------------->8---
❯ time git show --no-patch --format="%h %s" "master^{commit}" --
6192a0cfdc6 Merge remote-tracking branch 'origin/SHD_ECORO_3_9_7'

________________________________________________________
Executed in  147.99 millis    fish           external
   usr time  114.52 millis  713.00 micros  113.81 millis
   sys time   34.78 millis  193.00 micros   34.59 millis
--8<---------------cut here---------------end--------------->8---

But there's still the question of why diff.renameLimit has any influence
here when --no-patch is given, so no diff should be generated.
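
For anyone who wants to check whether the same setting is in play, here
is a minimal sketch of reading and unsetting the option.  It works on a
throwaway config file (the "cfg" variable and the temp file are
assumptions made for this illustration) so that it doesn't touch any
real configuration:

```shell
#!/bin/sh
# Sketch: inspect and remove diff.renameLimit in a scratch config file.
cfg=$(mktemp)

# Mimic the setting from my ~/.gitconfig in the scratch file.
git config --file "$cfg" diff.renameLimit 10000
configured=$(git config --file "$cfg" --get diff.renameLimit)
echo "configured: $configured"

# Remove it again; "git config --get" exits non-zero for a missing key.
git config --file "$cfg" --unset diff.renameLimit
if git config --file "$cfg" --get diff.renameLimit >/dev/null; then
    after_unset=set
else
    after_unset=unset
fi
echo "after unset: $after_unset"

rm -f "$cfg"
```

Against a real setup, "git config --show-origin --get diff.renameLimit"
should show which file the value comes from, and "git -c
diff.renameLimit=<n> show ..." overrides it for a single invocation
without editing ~/.gitconfig (<n> being whatever value you want to try).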

Bye,
Tassilo

[1] https://magit.vc/

Thread overview: 12+ messages
2022-06-04  7:39 [BUG?] Major performance issue with some commands on our repo's master branch Tassilo Horn
2022-06-04 20:20 ` Tao Klerks
2022-06-05 10:46   ` Tassilo Horn
2022-06-06  5:18     ` Tao Klerks
2022-06-08 23:36     ` Jeff King
2022-06-09  1:27       ` Kyle Meyer
2022-06-09 15:03         ` Jeff King
2022-06-09 18:23           ` Junio C Hamano
2022-06-09 18:43             ` Jeff King
2022-06-09 20:06               ` Junio C Hamano
2022-06-09  5:51       ` Tassilo Horn
2022-06-09 15:05         ` Jeff King