Linus Torvalds wrote:
> On Thu, 8 Sep 2005, Chuck Lever wrote:
>
>>in my case the merges were taking significantly longer than a half
>>second.  making this change is certainly not worth it if merges are
>>running fast...
>
> Note that in cold-cache cases, all the expense of read-tree is in actually
> reading the tree objects themselves (a kernel tree has more than a
> thousand subdirectories). Also, a full "git pull" will do the diffstat
> etc, and then the expense ends up being in the actual "git diff" part.
>
> So read-tree itself may be half a second, but a merge ends up having other
> parts.

i measured this using the following test...  i have a linux kernel git
repository under control of stgit, and it has about 70 patches in it.

i did an "stg status" to heat the page cache.  i popped all the patches,
then did an "stg pull origin".  i started oprofile, and did an
"stg push -a".  it took about 9 minutes.  i stopped oprofile and looked
at the results.  roughly 65% of the total CPU time was spent in
libc:memmove.  after instrumenting git, i determined that in the
"stg push" case the critical memmove was the one in add_cache_entry.

note that i'm not saying that the 9 minutes of wall clock time was
entirely due to CPU...

catalin has been steadily improving "stg push", so this time has
shortened by more than half recently.  but i do notice that working in
a kernel repository is significantly slower than working in my git or
stgit source repositories, which are smaller by far.  the small
repositories behave just as i expect: the tools respond quite snappily.

>>they are still read-only with my linked list implementation.
>
> Btw, in the sparse project, we have this really smart "pointer list" data
> structure, which is extremely space- and time-efficient. It ends up
> _looking_ like a linked list, but it batches things up in hunks of 29
> entries (29 pointers plus overhead gives you blocks of 32 longwords, which
> is the allocation size) and thus gives basically a cache-friendly
> doubly-linked list. It knows how to do insertions, traversals etc very
> efficiently.
>
> Any interest?

i'm not married to splay trees.  i think we should explore several
different data structures before picking one, and this one sounds
reasonable to try.
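
to make the memmove cost concrete, here is a rough sketch of the
pattern i believe is being hit.  this is illustrative only, the names
(insert_entry, cache, cache_nr) are mine, and it is not git's actual
add_cache_entry:

/*
 * illustrative sketch, not git's actual code: the index is kept as
 * one sorted array of entry pointers, so inserting anywhere but the
 * end shifts everything after the insertion point.  rebuilding a
 * large index one entry at a time does O(n^2) pointer moves in
 * total, which is where the memmove time goes.  (allocation failures
 * are ignored to keep the sketch short.)
 */
#include <stdlib.h>
#include <string.h>

struct entry;                      /* stand-in for a cache entry */

static struct entry **cache;       /* sorted array of entry pointers */
static int cache_nr, cache_alloc;

/* insert "e" at its sorted position "pos" (0 <= pos <= cache_nr) */
static void insert_entry(struct entry *e, int pos)
{
        if (cache_nr + 1 > cache_alloc) {
                cache_alloc = cache_alloc ? 2 * cache_alloc : 64;
                cache = realloc(cache, cache_alloc * sizeof(*cache));
        }
        /* shift the tail up one slot: O(cache_nr - pos) per insert */
        memmove(cache + pos + 1, cache + pos,
                (cache_nr - pos) * sizeof(*cache));
        cache[pos] = e;
        cache_nr++;
}

with the tens of thousands of paths in a kernel index, most insertions
shift a large chunk of that array, which would line up with memmove
dominating the profile.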
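
and here is my rough reading of the sparse "pointer list" idea,
sketched purely from the description above.  the names and details are
guesses, not sparse's actual API: entries are appended into fixed-size
blocks of 29 pointers, and the blocks themselves form a doubly-linked
ring.

/*
 * rough sketch of a sparse-style "pointer list": a doubly-linked ring
 * of fixed-size blocks, each holding up to 29 pointers, so a block
 * (29 pointers plus nr/prev/next) comes out to roughly 32 longwords
 * per allocation.  traversal chases one block pointer per 29 entries
 * instead of one per entry.  (allocation failures are ignored to keep
 * the sketch short.)
 */
#include <stdlib.h>

#define LIST_NODE_NR 29

struct ptr_list {
        int nr;                         /* entries used in this block */
        struct ptr_list *prev, *next;   /* ring of blocks */
        void *list[LIST_NODE_NR];
};

/* append an entry, allocating a new block only when the tail is full */
static void add_ptr_list(struct ptr_list **head, void *ptr)
{
        struct ptr_list *last = *head ? (*head)->prev : NULL;

        if (!last || last->nr >= LIST_NODE_NR) {
                struct ptr_list *node = calloc(1, sizeof(*node));

                if (!*head) {
                        node->prev = node->next = node;
                        *head = node;
                } else {
                        node->prev = last;
                        node->next = *head;
                        last->next = node;
                        (*head)->prev = node;
                }
                last = node;
        }
        last->list[last->nr++] = ptr;
}

/* walk every entry: FOR_EACH_PTR(list, p) { ... use p ... } */
#define FOR_EACH_PTR(head, ptr)                                         \
        for (struct ptr_list *__b = (head); __b;                        \
             __b = (__b->next == (head)) ? NULL : __b->next)            \
                for (int __i = 0;                                       \
                     __i < __b->nr && ((ptr) = __b->list[__i], 1);      \
                     __i++)

usage would look like add_ptr_list(&list, entry) to append and
FOR_EACH_PTR(list, p) { ... } to walk it.  an insertion in the middle
would still mean shifting or splitting within one block, but that
shift is bounded at 29 entries rather than the whole index, which is
the part that matters for the add_cache_entry case above.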