git@vger.kernel.org mailing list mirror (one of many)
* Git performance on large repository on OS X is slow without core.preloadindex=false
From: Paul Tarjan @ 2020-03-13 23:52 UTC
  To: git

Hi git folks,

I'm working on a git repo for my company and noticed that `git status`
performance was an order of magnitude slower on OS X than on Linux.

tl;dr: Do you know why `git status` by default tries to `lstat`
every file in the repo on OS X but not on Linux? And is the config
option I found, `core.preloadindex=false`, the right recommendation for
using git on OS X?

Some background:
```
$ git --version
git version 2.25.1
```
```
$ sw_vers
ProductName: Mac OS X
ProductVersion: 10.15.3
BuildVersion: 19D76
```
```
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.4 LTS
Release: 18.04
Codename: bionic
```

Here is the initial problem:

On OS X:
```
$ GIT_TRACE_PERFORMANCE=true git status
14:07:25.251118 read-cache.c:2306       performance: 0.020316000 s: read cache .git/index
14:07:29.149814 preload-index.c:147     performance: 3.897873000 s: preload index
14:07:29.152372 read-cache.c:1621       performance: 3.900433000 s: refresh index
14:07:29.163628 diff-lib.c:251          performance: 0.010868000 s:  diff-files
14:07:29.165150 unpack-trees.c:1592     performance: 0.000125000 s: traverse_trees
14:07:29.165833 unpack-trees.c:447      performance: 0.000024000 s: check_updates
14:07:29.165849 unpack-trees.c:1691     performance: 0.001543000 s: unpack_trees
14:07:29.165854 diff-lib.c:537          performance: 0.001620000 s:  diff-index
14:07:29.178017 name-hash.c:600         performance: 0.011799000 s: initialize name hash
14:07:29.778714 dir.c:2606              performance: 0.612725000 s: read directory
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
14:07:29.779696 trace.c:475             performance: 4.549920000 s: git command: git status
```

Linux:
```
$ GIT_TRACE_PERFORMANCE=true git status
17:07:22.665901 read-cache.c:1914       performance: 0.016203489 s: read cache .git/index
17:07:22.736241 preload-index.c:112     performance: 0.070269440 s: preload index
17:07:22.739459 read-cache.c:1472       performance: 0.003190579 s: refresh index
17:07:22.742891 diff-lib.c:250          performance: 0.003279925 s: diff-files
17:07:22.744175 diff-lib.c:527          performance: 0.001076360 s: diff-index
17:07:22.755865 name-hash.c:605         performance: 0.011539610 s: initialize name hash
17:07:22.996122 dir.c:2303              performance: 0.251910028 s: read directory
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
17:07:22.997366 trace.c:420             performance: 0.347834763 s: git command: git status
```

But after I run:
```
$ git config --bool core.preloadindex false
```

I now get this on OS X:
```
$ GIT_TRACE_PERFORMANCE=true git status
16:40:29.487593 read-cache.c:2306       performance: 0.022781000 s: read cache .git/index
16:40:29.853585 read-cache.c:1621       performance: 0.365122000 s: refresh index
16:40:29.857934 diff-lib.c:251          performance: 0.003896000 s:  diff-files
16:40:29.859338 unpack-trees.c:1592     performance: 0.000088000 s: traverse_trees
16:40:29.859982 unpack-trees.c:447      performance: 0.000026000 s: check_updates
16:40:29.859994 unpack-trees.c:1691     performance: 0.001466000 s: unpack_trees
16:40:29.859999 diff-lib.c:537          performance: 0.001523000 s:  diff-index
16:40:29.870882 name-hash.c:600         performance: 0.010529000 s: initialize name hash
16:40:30.525238 dir.c:2606              performance: 0.665120000 s: read directory
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean
16:40:30.526289 trace.c:475             performance: 1.062404000 s: git command: git status
```
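
For anyone who wants to reproduce the comparison without editing the repository config, here is a minimal sketch. It assumes a POSIX shell and relies on `git -c name=value`, which overrides a setting for a single invocation. The function name `compare_preloadindex` is made up for illustration, and the exact trace line format may differ between git versions.

```shell
# Run `git status` once per setting, overriding core.preloadindex for that
# invocation only (`git -c name=value` does not modify any config file).
# compare_preloadindex is a hypothetical helper name, not part of git.
compare_preloadindex() {
    for setting in true false; do
        echo "== core.preloadindex=$setting =="
        # The performance trace goes to stderr; discard the normal status
        # output and keep only the trace summary line for the whole command.
        GIT_TRACE_PERFORMANCE=true git -c core.preloadindex="$setting" status 2>&1 >/dev/null |
            sed -n '/git command:/p'
    done
}
```

Run `compare_preloadindex` from inside the working tree; running it a few times helps separate cold-cache effects from the effect of the setting itself.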

I traced it down to this: on Linux we are making only ~700 lstat
syscalls, but on OS X we're making ~100k.

OS X (using lldb with a breakpoint on lstat64):
```
(lldb) break list
Current breakpoints:
1: name = 'lstat64', locations = 1, resolved = 1, hit count = 103859
Options: ignore: 953389 enabled
  1.1: where = libsystem_kernel.dylib`lstat$INODE64, address = 0x00007fff6a899be4, resolved, hit count = 103859
```

Linux:
```
$ strace git status -uno 2>&1 | grep lstat | wc -l
726
```
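
One thing that may be worth ruling out when comparing the two counts: `strace` only follows additional threads and child processes when given `-f`, and the preload code in `preload-index.c` does its work in threads, so the number above may undercount. The lldb run also omits `-uno`. A rough sketch for counting from a saved trace, assuming `strace` is installed; the helper name `count_lstat` and the trace file path are made up for illustration:

```shell
# Count lstat-family syscalls recorded in an strace log file.
# With `strace -f -o FILE`, each line starts with a PID/TID, so accept the
# syscall name either at the start of the line or after whitespace.
# count_lstat is a hypothetical helper name.
count_lstat() {
    grep -c -E '(^|[[:space:]])lstat(64)?\(' "$1"
}

# Possible usage on Linux (not executed by this sketch):
#   strace -f -o /tmp/status.trace git status -uno
#   count_lstat /tmp/status.trace
```

If the count with `-f` lands near the OS X number, the two platforms are probably doing the same amount of `lstat` work and the difference is in how long each call takes.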

The backtrace for these lstat calls seems to come from these two spots:
https://github.com/git/git/blob/30e9940356dc67959877f4b2417da33ebdefbb79/preload-index.c#L76
https://github.com/git/git/blob/30e9940356dc67959877f4b2417da33ebdefbb79/symlinks.c#L138

Thanks
Paul
