git@vger.kernel.org mailing list mirror (one of many)
From: git@jeffhostetler.com
To: git@vger.kernel.org
Cc: gitster@pobox.com, peff@peff.net,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: [PATCH 2/6] hashmap: allow memihash computation to be continued
Date: Wed, 22 Mar 2017 17:14:21 +0000
Message-ID: <1490202865-31325-3-git-send-email-git@jeffhostetler.com>
In-Reply-To: <1490202865-31325-1-git-send-email-git@jeffhostetler.com>

From: Jeff Hostetler <jeffhost@microsoft.com>

Add a variant of memihash() that allows the hash computation
to be continued.  There are times when we compute the hash on
a full path and then the hash on just the path to the parent
directory.  Hashing the directory prefix twice like this can be
expensive on large repositories.

With this, we can hash the parent directory first and then
continue the computation to include the "/filename".

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
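
To illustrate the idea in the commit message, here is a minimal
standalone sketch, not part of the patch.  It assumes the standard
32-bit FNV-1 constants for FNV32_BASE and FNV32_PRIME and expresses
memihash() as a continuation seeded with FNV32_BASE, which may differ
from how hashmap.c actually spells it.  hash_dir_then_file() and
check_hash_dir_then_file() are hypothetical helpers made up for this
example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Standard 32-bit FNV-1 constants; assumed to match those in hashmap.c. */
#define FNV32_BASE ((unsigned int) 0x811c9dc5)
#define FNV32_PRIME ((unsigned int) 0x01000193)

/* Same loop as in the patch: fold ASCII case, then multiply and xor. */
unsigned int memihash_cont(unsigned int hash_seed, const void *buf, size_t len)
{
	unsigned int hash = hash_seed;
	const unsigned char *ucbuf = buf;
	while (len--) {
		unsigned int c = *ucbuf++;
		if (c >= 'a' && c <= 'z')
			c -= 'a' - 'A';
		hash = (hash * FNV32_PRIME) ^ c;
	}
	return hash;
}

/* Sketch of memihash(): a continuation that starts from FNV32_BASE. */
unsigned int memihash(const void *buf, size_t len)
{
	return memihash_cont(FNV32_BASE, buf, len);
}

/*
 * Hypothetical helper: hash "dir" once, then extend it to "dir/name".
 * Stores the directory hash in *dir_hash and returns the full-path hash.
 */
unsigned int hash_dir_then_file(const char *dir, const char *name,
				unsigned int *dir_hash)
{
	unsigned int hash = memihash(dir, strlen(dir));
	*dir_hash = hash;
	hash = memihash_cont(hash, "/", 1);
	return memihash_cont(hash, name, strlen(name));
}

/* Self-check: the two-step result must equal hashing the full path. */
void check_hash_dir_then_file(void)
{
	const char *full = "Documentation/git.txt";
	unsigned int dir_hash;
	unsigned int h = hash_dir_then_file("Documentation", "git.txt", &dir_hash);

	assert(dir_hash == memihash("Documentation", strlen("Documentation")));
	assert(h == memihash(full, strlen(full)));
}
```

Because each step only folds one more byte into the running value,
seeding the loop with the parent directory's hash gives the same
result as hashing the whole path from scratch, so the directory
prefix is scanned only once.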
 hashmap.c | 17 +++++++++++++++++
 hashmap.h |  1 +
 2 files changed, 18 insertions(+)

diff --git a/hashmap.c b/hashmap.c
index b10b642..505e63f 100644
--- a/hashmap.c
+++ b/hashmap.c
@@ -50,6 +50,23 @@ unsigned int memihash(const void *buf, size_t len)
 	return hash;
 }
 
+/*
+ * Incorporate another chunk of data into a memihash
+ * computation.
+ */
+unsigned int memihash_cont(unsigned int hash_seed, const void *buf, size_t len)
+{
+	unsigned int hash = hash_seed;
+	unsigned char *ucbuf = (unsigned char *) buf;
+	while (len--) {
+		unsigned int c = *ucbuf++;
+		if (c >= 'a' && c <= 'z')
+			c -= 'a' - 'A';
+		hash = (hash * FNV32_PRIME) ^ c;
+	}
+	return hash;
+}
+
 #define HASHMAP_INITIAL_SIZE 64
 /* grow / shrink by 2^2 */
 #define HASHMAP_RESIZE_BITS 2
diff --git a/hashmap.h b/hashmap.h
index ab7958a..45eda69 100644
--- a/hashmap.h
+++ b/hashmap.h
@@ -12,6 +12,7 @@ extern unsigned int strhash(const char *buf);
 extern unsigned int strihash(const char *buf);
 extern unsigned int memhash(const void *buf, size_t len);
 extern unsigned int memihash(const void *buf, size_t len);
+extern unsigned int memihash_cont(unsigned int hash_seed, const void *buf, size_t len);
 
 static inline unsigned int sha1hash(const unsigned char *sha1)
 {
-- 
2.7.4


Thread overview: 14+ messages
2017-03-22 17:14 [PATCH 0/6] thread lazy_init_name_hash git
2017-03-22 17:14 ` [PATCH 1/6] name-hash: specify initial size for istate.dir_hash table git
2017-03-22 17:14 ` git [this message]
2017-03-22 17:14 ` [PATCH 3/6] hashmap: Add disallow_rehash setting git
2017-03-22 17:14 ` [PATCH 4/6] name-hash: perf improvement for lazy_init_name_hash git
2017-03-22 17:14 ` [PATCH 5/6] name-hash: add test-lazy-init-name-hash git
2017-03-22 17:14 ` [PATCH 6/6] name-hash: add perf test for lazy_init_name_hash git
2017-03-22 18:02 ` [PATCH 0/6] thread lazy_init_name_hash Stefan Beller
2017-03-22 19:22   ` Jeff Hostetler
2017-03-22 19:38 ` Junio C Hamano
2017-03-22 20:54   ` Junio C Hamano
2017-03-22 21:04     ` Jeff Hostetler
2017-03-22 21:29       ` Junio C Hamano
2017-03-22 20:39 ` Junio C Hamano
