> The trace_performance functions require manual instrumentation of the code sections you want to measure
Ahh a case of RTFM :)

> Could you post details about your test setup? Are you still using WebKit for your tests?
I'm on Win7 x64, Core i5 M560, WD 7200 rpm laptop HDD, NTFS, no virus scanner, TrueCrypt, no defragger.

I've tried to be a bit smarter with the intent of my code, and this is what I came up with.

diff --git a/cache.h b/cache.h
index 4bf19e3..2e9fb1f 100644
--- a/cache.h
+++ b/cache.h
@@ -294,7 +294,7 @@ extern void free_name_hash(struct index_state *istate);
 #define active_cache_changed (the_index.cache_changed)
 #define active_cache_tree (the_index.cache_tree)
 
-#define read_cache() read_index(&the_index)
+#define read_cache() read_index_preload(&the_index, NULL)
 #define read_cache_from(path) read_index_from(&the_index, (path))
 #define read_cache_preload(pathspec) read_index_preload(&the_index, (pathspec))
 #define is_cache_unborn() is_index_unborn(&the_index)
diff --git a/read-cache.c b/read-cache.c
index c3d5e35..5fb2788 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1866,7 +1866,7 @@ int read_index_unmerged(struct index_state *istate)
  int i;
  int unmerged = 0;
 
- read_index(istate);
+ read_index_preload(istate, NULL);
  for (i = 0; i < istate->cache_nr; i++) {
  struct cache_entry *ce = istate->cache[i];
  struct cache_entry *new_ce;
-- 

Interestingly, when I run on a cleanly checked-out blink repo my changes seem to make performance worse, but on a repo with ignored files in the working tree they seem to help. So as a point of comparison I decided to run the numbers on a repo that does have ignored files in it, in this case msysgit/git after a 'make install'. When I get a few hours I'll try to build blink and re-run the numbers on a much, much larger repo.

This comparison is an average of 3 cold-cache runs of kb/fscache-v4 [a] vs. kb/fscache-v4 with my above changes applied [b], with preloadindex and fscache set to true.

For comparison
git status -s
[a] 3.02s
[b] 2.92s

git reset --hard head
[a] 3.67s
[b] 3.09s

git add -u
[a] 2.89s
[b] 2.08s


I noticed something interesting. Preload index uses 20 threads to do the work. When I was keeping an eye on them in Task Manager, some threads finished quite quickly while others ran a lot longer. The way I understand the code at the moment, each thread gets an equal-sized chunk of work to perform. It's quite likely that even more performance could be obtained out of preload if the work splitting was 'smarter'. My best idea so far would be to use something like a lock-free queue to queue up the work and let the threads pull work off the queue. That way all threads stay busy for longer. A candidate for the implementation would be the liblfds [1] queue. However, my issue with this library, and the reason I haven't tried to integrate it, is simply that the code expressly has no license.
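
To make the idea a bit more concrete, here is a rough sketch of dynamic work splitting. Note that it is not a lock-free queue and not what preload-index does today: it swaps in a simpler technique, a single shared atomic counter from which each thread claims the next small batch of entries. It assumes C11 atomics and POSIX threads are available, and all names in it (work_pool, process_item, run_pool, NUM_THREADS, BATCH_SIZE) are made up for illustration.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NUM_THREADS 4   /* hypothetical; preload currently uses 20 */
#define BATCH_SIZE  64  /* hypothetical number of entries claimed at once */

struct work_pool {
	atomic_size_t next; /* index of the first unclaimed entry */
	size_t total;       /* number of entries to process */
	int *done;          /* done[i] counts how often entry i was processed */
};

/* Stand-in for the real per-entry work (hypothetical). */
static void process_item(struct work_pool *pool, size_t i)
{
	pool->done[i]++;
}

static void *worker(void *arg)
{
	struct work_pool *pool = arg;

	for (;;) {
		/* Atomically claim the next batch; no two threads get the same range. */
		size_t start = atomic_fetch_add(&pool->next, BATCH_SIZE);
		size_t end, i;

		if (start >= pool->total)
			break; /* nothing left to claim */
		end = start + BATCH_SIZE;
		if (end > pool->total)
			end = pool->total;
		for (i = start; i < end; i++)
			process_item(pool, i);
	}
	return NULL;
}

/* Processes entries 0..total-1; returns 0 on success, -1 if a thread failed to start. */
static int run_pool(int *done, size_t total)
{
	struct work_pool pool;
	pthread_t threads[NUM_THREADS];
	int i, started = 0, ret = 0;

	atomic_init(&pool.next, 0);
	pool.total = total;
	pool.done = done;

	for (i = 0; i < NUM_THREADS; i++) {
		if (pthread_create(&threads[i], NULL, worker, &pool)) {
			ret = -1;
			break;
		}
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(threads[i], NULL);
	return ret;
}
```

A caller would pass a zeroed array, e.g. run_pool(done, n), and afterwards every done[i] should be 1. A thread that lands on cheap entries simply comes back for another batch sooner, so no thread sits idle while others still have a large fixed chunk left. Whether this actually beats equal chunks for preload would need measuring.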


[1] http://www.liblfds.org/
