From: Linus Torvalds <torvalds@linux-foundation.org>
To: Junio C Hamano <gitster@pobox.com>,
Git Mailing List <git@vger.kernel.org>
Cc: "Kristian Høgsberg" <krh@redhat.com>
Subject: performance problem: "git commit filename"
Date: Sat, 12 Jan 2008 14:46:08 -0800 (PST)
Message-ID: <alpine.LFD.1.00.0801121426510.2806@woody.linux-foundation.org>
I thought we had fixed this long long ago, but if we did, it has
re-surfaced.
Using an explicit filename with "git commit" is _extremely_ slow. Lookie
here:
[torvalds@woody linux]$ time git commit fs/exec.c
no changes added to commit (use "git add" and/or "git commit -a")
real 0m1.671s
user 0m1.200s
sys 0m0.328s
that's closer to two seconds on a fast machine, with the whole tree
cached!
And for the uncached case, it's just unbearably slow: two and a half
*minutes*.
In contrast, without the filename, it's much faster:
[torvalds@woody linux]$ time git commit
no changes added to commit (use "git add" and/or "git commit -a")
real 0m0.387s
user 0m0.220s
sys 0m0.168s
with the cold-cache case now being "just" 18s (which is still long, but
we're talking eight times faster, and certainly not unbearable!)
Doing an "strace -c" on the thing shows why. In the filename case, we
have:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 32.69    0.000868           0     92299        37 lstat
 17.40    0.000462           0     29958      3993 open
 15.78    0.000419           0      5522           getdents
 15.56    0.000413           0     23165           mmap
 11.37    0.000302           0     23118           munmap
  5.76    0.000153           0     25966         2 close
  1.43    0.000038           0      2845           fstat
...
and in the non-filename case we have
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 53.67    0.000600           0     69227        31 lstat
 23.35    0.000261           0      5522           getdents
 11.09    0.000124           2        55           munmap
  4.20    0.000047           0       285           write
  3.31    0.000037           0      5537      2638 open
  2.33    0.000026           0      2899         1 close
  2.06    0.000023           0      2844           fstat
...
notice how the expensive case has a lot of successful open/mmap/munmap
calls: it is *literally* ignoring the valid entries in the old index
entirely, and re-hashing every single file in the tree! No wonder it is
slow!
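
For readers who want the idea spelled out, here is a rough sketch of why valid index entries matter. It is not git's code: `CachedEntry`, `hash_file` and `refresh` are hypothetical names, the hash is a plain SHA-1 of the file contents (git hashes a blob header plus contents), and only size and mtime are compared, whereas the real index keeps more stat fields. The point is only that an lstat() that matches the cached data lets you skip the open/read/hash step entirely.

```python
import hashlib
import os
import tempfile
from dataclasses import dataclass


@dataclass
class CachedEntry:
    """Hypothetical stand-in for one index entry: stat data plus a content hash."""
    size: int
    mtime_ns: int
    digest: str


def hash_file(path):
    """The expensive path: open the file and read all of it to hash it."""
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()


def refresh(path, cached, counters):
    """Return an up-to-date entry, re-hashing only when the stat data differs."""
    st = os.lstat(path)
    counters["lstat"] += 1
    if cached is not None and (cached.size, cached.mtime_ns) == (st.st_size, st.st_mtime_ns):
        return cached  # cheap path: the cached hash is still trusted
    counters["hash"] += 1
    return CachedEntry(st.st_size, st.st_mtime_ns, hash_file(path))


def demo():
    """Refresh one unchanged file twice and count the work done."""
    counters = {"lstat": 0, "hash": 0}
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "exec.c")
        with open(path, "w") as f:
            f.write("int main(void) { return 0; }\n")
        entry = refresh(path, None, counters)   # no cached entry: must hash
        refresh(path, entry, counters)          # valid cached entry: lstat only
    return counters


if __name__ == "__main__":
    print(demo())
```

Run as a script, the second refresh should cost one lstat and no hashing. Throwing away the cached entries (passing `None` every time in this sketch) is what turns every file into an open/mmap/munmap/close.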
Just counting "lstat()" calls, it's worth noticing that the non-filename
case seems to do three lstats for each index entry (and yes, that's two
too many), but the named file case has upped that to *four* lstats per
entry, and then added the one open/mmap/munmap/close on top of that!
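
The per-entry figures can be checked with a few lines of arithmetic. This is only a sketch: the call counts are copied from the two "strace -c" summaries, and the number of index entries is an assumption, approximated by the munmap count of the named-file run (one map/unmap per tracked file), since the mail does not state the entry count directly.

```python
# Call counts copied from the two "strace -c" summaries in the mail.
named = {"lstat": 92299, "open": 29958, "mmap": 23165, "munmap": 23118, "close": 25966}
plain = {"lstat": 69227, "open": 5537, "munmap": 55, "close": 2899}

# Assumption: the named-file run maps and unmaps each tracked file once,
# so its munmap count approximates the number of index entries.
entries = named["munmap"]


def per_entry(calls, n=entries):
    """Average number of calls per (approximate) index entry."""
    return calls / n


print(f"lstat per entry, no filename: {per_entry(plain['lstat']):.2f}")
print(f"lstat per entry, filename:    {per_entry(named['lstat']):.2f}")
print(f"extra lstat calls:            {named['lstat'] - plain['lstat']}")
```

Under that assumption the ratios come out very close to three and four, and the difference between the two runs is roughly one extra lstat per entry.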
I'm pretty sure we didn't use to do things this badly. And if this is a
regression like I think it is, it should be fixed before a real 1.5.4
release.
I'll try to see if I can see what's up, but I thought I'd better let
others know too, in case I don't have time. I *suspect* (but have nothing
whatsoever to back that up) that this happened as part of making commit
a builtin.
Linus