On 2021-04-11 at 11:36:33, Ævar Arnfjörð Bjarmason wrote:
>
> On Sat, Apr 10 2021, brian m. carlson wrote:
>
> > Currently, when we compare two object IDs, we have to take a branch to
> > determine what the hash size is supposed to be.  The compiler can
> > optimize well for a single length, but has trouble when there are two
> > possible lengths.
>
> This would benefit from some performance/perf numbers.  When this code
> was first changed like this in 183a638b7da (hashcmp: assert constant
> hash size, 2018-08-23) we had:
>
>     Test     v2.18.0             v2.19.0-rc0                HEAD
>     ------------------------------------------------------------------------------
>     0001.2:  34.24(33.81+0.43)   34.83(34.42+0.40) +1.7%    33.90(33.47+0.42) -1.0%
>
> Then it was later modified in 0dab7129ab1 (cache: make hashcmp and
> hasheq work with larger hashes, 2018-11-14).

I can do some perf numbers.

> > @@ -205,7 +205,7 @@ static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
> >
> >  static inline int oidcmp(const struct object_id *oid1, const struct object_id *oid2)
> >  {
> > -	return hashcmp(oid1->hash, oid2->hash);
> > +	return memcmp(oid1->hash, oid2->hash, GIT_MAX_RAWSZ);
> >  }
>
> hashcmp is now:
>
>     if (the_hash_algo->rawsz == GIT_MAX_RAWSZ)
>         return memcmp(sha1, sha2, GIT_MAX_RAWSZ);
>     return memcmp(sha1, sha2, GIT_SHA1_RAWSZ);
>
> Wouldn't it make more sense to amend it to just be a memcmp
> wrapper/macro if we're going to not make this conditional on the hash
> algorithm, or are there other callsites where we still want the old way
> of doing it?

No, we can't do that.  With oidcmp, we know the buffer is large enough.
However, in some cases, the buffer in hashcmp is not large enough.  For
example, we may be at the end of a SHA-1 tree object and we'd segfault.
I did try that and I quickly found that it was totally broken.

-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US
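
A minimal self-contained C sketch of the safety argument above.  The
names mirror Git's hash.h, but two details are assumptions made purely
for illustration: Git's real hashcmp() reads the length from the global
the_hash_algo rather than taking a rawsz parameter, and the fixed-width
oidcmp() is only safe if the unused trailing bytes of a SHA-1 struct
object_id are consistently zero-padded.

    #include <string.h>

    #define GIT_SHA1_RAWSZ 20
    #define GIT_MAX_RAWSZ  32   /* SHA-256 */

    struct object_id {
            /*
             * Fixed GIT_MAX_RAWSZ bytes of storage; for SHA-1 only the
             * first 20 bytes are significant, the rest assumed zeroed.
             */
            unsigned char hash[GIT_MAX_RAWSZ];
    };

    /*
     * Safe: both operands are struct object_id, so comparing all
     * GIT_MAX_RAWSZ bytes never reads outside the objects, and the
     * zero padding keeps the result correct for SHA-1.
     */
    static inline int oidcmp(const struct object_id *oid1,
                             const struct object_id *oid2)
    {
            return memcmp(oid1->hash, oid2->hash, GIT_MAX_RAWSZ);
    }

    /*
     * Not reducible to a fixed-size memcmp: sha1/sha2 may point into a
     * raw buffer, e.g. the hash of the last entry in a SHA-1 tree
     * object, which holds only GIT_SHA1_RAWSZ bytes; reading 32 bytes
     * there would run off the end of the buffer.
     */
    static inline int hashcmp(const unsigned char *sha1,
                              const unsigned char *sha2,
                              size_t rawsz)   /* illustration only */
    {
            if (rawsz == GIT_MAX_RAWSZ)
                    return memcmp(sha1, sha2, GIT_MAX_RAWSZ);
            return memcmp(sha1, sha2, GIT_SHA1_RAWSZ);
    }

The branch gives the compiler a compile-time-constant length on each
path, which is the optimization the quoted commits were after; oidcmp()
can drop the branch entirely because its operand type guarantees the
larger buffer.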