git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* How to still kill git fetch with too many refs
@ 2013-07-02  3:02 Martin Fick
  2013-07-02  4:07 ` Jeff King
  2013-07-02  9:24 ` Michael Haggerty
  0 siblings, 2 replies; 15+ messages in thread
From: Martin Fick @ 2013-07-02  3:02 UTC (permalink / raw)
  To: git

I have often reported problems with git fetch when there are 
many refs in a repo, and I have been pleasantly surprised 
how many problems I reported were so quickly fixed. :) With 
time, others have created various synthetic test cases to 
ensure that git can handle many many refs.  A simple 
synthetic test case with 1M refs all pointing to the same 
sha1 seems to be easily handled by git these days.  However, 
in our experience with our internal git repo, we still have 
performance issues related to having too many refs, in our 
kernel/msm instance we have around 400K.

When I tried the simple synthetic test case and could not 
reproduce bad results, so I tried something just a little 
more complex and was able to get atrocious results!!! 
Basically, I generate a packed-refs files with many refs 
which each point to a different sha1.  To get a list of 
valid but unique sha1s for the repo, I simply used rev-list.  
The result, a copy of linus' repo with a million unique 
valid refs and a git fetch of a single updated ref taking a 
very long time (55mins and it did not complete yet).  Note, 
with 100K refs it completes in about 2m40s.  It is likely 
not linear since 2m40s * 10 would be ~26m (but the 
difference could also just be how the data in the sha1s are 
ordered).


Here is my small reproducible test case for this issue:

git clone 
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cp -rp linux linux.1Mrefs-revlist

cd linux
echo "Hello" > hello ; git add hello ; git ci -a -m 'hello'
cd ..

cd linux.1Mrefs-revlist
git rev-list HEAD | for nn in $(seq 0 100) ; do for c in 
$(seq 0 10000) ; do  read sha ; echo $sha refs/c/$nn/$c$nn ; 
done ; done > .git/packed-refs

time git fetch file:///$(dirname $PWD)/linux 
refs/heads/master

Any insights as to why it is so slow, and how we could 
possibly speed it up?

Thanks,

-Martin

PS: My tests were performed with git version 1.8.2.1 on 
linux 2.6.32-37-generic #81-Ubuntu SMP 


-- 
The Qualcomm Innovation Center, Inc. is a member of Code 
Aurora Forum, hosted by The Linux Foundation
 

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2013-07-02 17:52 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-02  3:02 How to still kill git fetch with too many refs Martin Fick
2013-07-02  4:07 ` Jeff King
2013-07-02  4:41   ` Jeff King
2013-07-02  5:01     ` Jeff King
2013-07-02  5:19       ` Junio C Hamano
2013-07-02  5:28         ` Jeff King
2013-07-02  6:11           ` [PATCH 0/3] avoid quadratic behavior in fetch-pack Jeff King
2013-07-02  6:16             ` [PATCH 1/3] fetch-pack: avoid quadratic list insertion in mark_complete Jeff King
2013-07-02  6:21             ` [PATCH 2/3] commit.c: make compare_commits_by_commit_date global Jeff King
2013-07-02  6:24             ` [PATCH 3/3] fetch-pack: avoid quadratic behavior in rev_list_push Jeff King
2013-07-02  7:52               ` Eric Sunshine
2013-07-02 17:45             ` [PATCH 0/3] avoid quadratic behavior in fetch-pack Martin Fick
2013-07-02 17:52       ` How to still kill git fetch with too many refs Brandon Casey
2013-07-02  9:24 ` Michael Haggerty
2013-07-02 16:58   ` Martin Fick

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).