On Tue, Aug 21, 2018 at 05:29:24PM -0400, Jeff King wrote: > 0001.2: rev-list --all --objects 37.07(36.62+0.45) 39.11(38.58+0.51) +5.5% > > Less change, but my overall times were smaller, too, so clearly our > hardware or exact repos are a little bit different. Those numbers seem > pretty consistent in further runs. > > It bisects to 509f6f62a4 (cache: update object ID functions for > the_hash_algo, 2018-07-16). Which make sense. An "--objects" traversal > spends a huge amount of time checking each tree entry to see if we've > processed that object yet, which ends up as hashcmp() in the hash table. > I expect that a fixed 20-byte memcmp() can be optimized a lot more than > one with an arbitrary value. > > Even if _we_ know the value can only take on one of a few values, I > don't know that we have an easy way to tell the compiler that. Possibly > we could improve things by jumping directly to an optimized code path. > Sort of a poor-man's JIT. ;) > > Doing this: > > diff --git a/cache.h b/cache.h > index b1fd3d58ab..9c004a26c9 100644 > --- a/cache.h > +++ b/cache.h > @@ -1023,7 +1023,10 @@ extern const struct object_id null_oid; > > static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2) > { > - return memcmp(sha1, sha2, the_hash_algo->rawsz); > + if (the_hash_algo->rawsz == 20) > + return memcmp(sha1, sha2, 20); > + else > + return memcmp(sha1, sha1, the_hash_algo->rawsz); > } > > static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2) > on top of v2.19-rc0 seems to give me about a 3% speedup (though I might > be imaging it, as there's a bit of noise). A function pointer in > the_hash_algo might make even more sense. It's possible that might be a better solution. I looked into a GCC assertion that the value was either 20 or 32, and that in itself didn't seem to help, at least in the generated code. Your solution is likely better in that regard. We could wire it up to be either 20 or 32 and let people experimenting with other sizes of algorithms just add another branch. I haven't tested how that performs, though. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204