git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / Atom feed
* [PATCH 00/16] Speed up Counting Objects with bitmap data
@ 2013-06-24 23:22 Vicent Marti
  2013-06-24 23:22 ` [PATCH 01/16] list-objects: mark tree as unparsed when we free its buffer Vicent Marti
                   ` (16 more replies)
  0 siblings, 17 replies; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:22 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

Hello friends and enemies from the lovevely Git Mailing list.

I bring to you a patch series that implement a quite interesting performance
optimization: the removal of the "Counting Objects" phase during `pack-objects`
by using a pre-computed bitmap to find the reachable objects in the packfile.

As you probably know, Shawn Pearce designed this approach a few months ago and
implemented it for JGit, with very exciting results.

This is a not-so-straightforward port of his original design: The general approach
is the same, but unfortunately we were not able to re-use JGit's original on-disk
format for the `.bitmap` files.

There is a full technical spec for the new format (v2) in patch 09, including
benchmarks and rationale for the new design. The gist of it is that JGit's
original format is not `mmap`able (JGit tends to not mmap anything), and that
becomes very costly in practice with `upload-pack`, which spawns a new process
for every upload.

The header and metadata for both formats are however compatible, so it should be
trivial to update JGit to read/write this format too. I intend to do this on the
coming weeks, and I also hope that the v2 implementation will be slightly faster
than the actual, even with the shortcomings of the JVM.

The patch series, although massive, is rather straightforward.

Most of the patches are isolated refactorings that enable access to a few functions
that were previously hidden (re. packfile data). These functions are needed for
reading and writing the bitmap indexes.

Patch 03 is worth noting because it implements a performance optimization for
`pack-objects` which isn't particularly good in normal invocations (~10% speed up)
but that will show great benefits in later patches when it comes to writing the
bitmap indexes.

Patch 10 is the core of the series, implementing the actual loading of bitmap indexes
and optimizing the Counting Objects phase of `pack-objects`. Like with every other
patch that offers performance improvements, sample benchmarks are provided (spoiler:
they are pretty fucking cool).

Patch 11 and 16 are samples of using the new Bitmap traversal API to speed up other
parts of Git (`rev-list --objects` and `rev-list --count`, respectively).

Patch 12, 13 and 15 implement the actual writing of bitmap indexes. Like JGit, patch
12 enables writing a bitmap index as part of the `pack-objects` process (and hence
as part of a normal `gc` run). On top of that, I implemented a new plumbing command
in patch 15 that allows to write bitmap indexes for already-existing packfiles.

I'd love your feedback on the design and implementation of this feature. I deem it
rather stable, as we've been testing it on production on the world's largest Git
host (Git Hub Dot Com The Web Site) with good results, so I'd love it to have it
upstreamed on Core Git.

Strawberry kisses,
vmg

Jeff King (1):
  list-objects: mark tree as unparsed when we free its buffer

Vicent Marti (15):
  sha1_file: refactor into `find_pack_object_pos`
  pack-objects: use a faster hash table
  pack-objects: make `pack_name_hash` global
  revision: allow setting custom limiter function
  sha1_file: export `git_open_noatime`
  compat: add endinanness helpers
  ewah: compressed bitmap implementation
  documentation: add documentation for the bitmap format
  pack-objects: use bitmaps when packing objects
  rev-list: add bitmap mode to speed up lists
  pack-objects: implement bitmap writing
  repack: consider bitmaps when performing repacks
  sha1_file: implement `nth_packed_object_info`
  write-bitmap: implement new git command to write bitmaps
  rev-list: Optimize --count using bitmaps too

 Documentation/technical/bitmap-format.txt |  235 ++++++++
 Makefile                                  |   11 +
 builtin.h                                 |    1 +
 builtin/pack-objects.c                    |  362 +++++++-----
 builtin/pack-objects.h                    |   33 ++
 builtin/rev-list.c                        |   35 +-
 builtin/write-bitmap.c                    |  256 +++++++++
 cache.h                                   |    5 +
 ewah/bitmap.c                             |  229 ++++++++
 ewah/ewah_bitmap.c                        |  703 ++++++++++++++++++++++++
 ewah/ewah_io.c                            |  199 +++++++
 ewah/ewah_rlw.c                           |  124 +++++
 ewah/ewok.h                               |  194 +++++++
 ewah/ewok_rlw.h                           |  114 ++++
 git-compat-util.h                         |   28 +
 git-repack.sh                             |   10 +-
 git.c                                     |    1 +
 khash.h                                   |  329 +++++++++++
 list-objects.c                            |    1 +
 pack-bitmap-write.c                       |  520 ++++++++++++++++++
 pack-bitmap.c                             |  855 +++++++++++++++++++++++++++++
 pack-bitmap.h                             |   64 +++
 pack-write.c                              |    2 +
 revision.c                                |    5 +
 revision.h                                |    2 +
 sha1_file.c                               |   57 +-
 26 files changed, 4212 insertions(+), 163 deletions(-)
 create mode 100644 Documentation/technical/bitmap-format.txt
 create mode 100644 builtin/pack-objects.h
 create mode 100644 builtin/write-bitmap.c
 create mode 100644 ewah/bitmap.c
 create mode 100644 ewah/ewah_bitmap.c
 create mode 100644 ewah/ewah_io.c
 create mode 100644 ewah/ewah_rlw.c
 create mode 100644 ewah/ewok.h
 create mode 100644 ewah/ewok_rlw.h
 create mode 100644 khash.h
 create mode 100644 pack-bitmap-write.c
 create mode 100644 pack-bitmap.c
 create mode 100644 pack-bitmap.h

-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 01/16] list-objects: mark tree as unparsed when we free its buffer
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
@ 2013-06-24 23:22 ` Vicent Marti
  2013-06-24 23:22 ` [PATCH 02/16] sha1_file: refactor into `find_pack_object_pos` Vicent Marti
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:22 UTC (permalink / raw)
  To: git; +Cc: Jeff King

From: Jeff King <peff@peff.net>

We free the tree buffer during traversal to save memory.
However, we do not reset the "parsed" flag, which leaves a
landmine for the next person to use the tree. When they call
parse_tree it will do nothing, and they will segfault when
they try to access the buffer.

This hasn't mattered until now because most rev-list
traversals would exit the program immediately afterwards,
but the bitmap writer wants to access the trees twice.
---
 list-objects.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/list-objects.c b/list-objects.c
index 3dd4a96..1251180 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -125,6 +125,7 @@ static void process_tree(struct rev_info *revs,
 	strbuf_setlen(base, baselen);
 	free(tree->buffer);
 	tree->buffer = NULL;
+	tree->object.parsed = 0;
 }
 
 static void mark_edge_parents_uninteresting(struct commit *commit,
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 02/16] sha1_file: refactor into `find_pack_object_pos`
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
  2013-06-24 23:22 ` [PATCH 01/16] list-objects: mark tree as unparsed when we free its buffer Vicent Marti
@ 2013-06-24 23:22 ` Vicent Marti
  2013-06-25 13:59   ` Thomas Rast
  2013-06-24 23:23 ` [PATCH 03/16] pack-objects: use a faster hash table Vicent Marti
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:22 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

Looking up the offset in the packfile for a given SHA1 involves the
following:

	- Finding the position in the index for the given SHA1
	- Accessing the offset cache in the index for the found position

There are cases however where we'd like to find the position of a SHA1
in the index without looking up the packfile offset (e.g. when accessing
information that has been indexed based on index offsets).

This refactoring implements `find_pack_object_pos`, returning the
position in the index, and re-implements `find_pack_entry_one`(returning
the actual offset in the packfile) to use the new function.
---
 cache.h     |    1 +
 sha1_file.c |   27 +++++++++++++++++----------
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/cache.h b/cache.h
index ec8240f..a29645e 100644
--- a/cache.h
+++ b/cache.h
@@ -1101,6 +1101,7 @@ extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *, int, int);
 extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t);
 extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t);
+extern int find_pack_entry_pos(const unsigned char *sha1, struct packed_git *p);
 extern off_t find_pack_entry_one(const unsigned char *, struct packed_git *);
 extern int is_pack_valid(struct packed_git *);
 extern void *unpack_entry(struct packed_git *, off_t, enum object_type *, unsigned long *);
diff --git a/sha1_file.c b/sha1_file.c
index 0af19c0..371e295 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2205,8 +2205,7 @@ off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n)
 	}
 }
 
-off_t find_pack_entry_one(const unsigned char *sha1,
-				  struct packed_git *p)
+int find_pack_entry_pos(const unsigned char *sha1, struct packed_git *p)
 {
 	const uint32_t *level1_ofs = p->index_data;
 	const unsigned char *index = p->index_data;
@@ -2219,7 +2218,7 @@ off_t find_pack_entry_one(const unsigned char *sha1,
 
 	if (!index) {
 		if (open_pack_index(p))
-			return 0;
+			return -1;
 		level1_ofs = p->index_data;
 		index = p->index_data;
 	}
@@ -2243,12 +2242,9 @@ off_t find_pack_entry_one(const unsigned char *sha1,
 
 	if (use_lookup < 0)
 		use_lookup = !!getenv("GIT_USE_LOOKUP");
+
 	if (use_lookup) {
-		int pos = sha1_entry_pos(index, stride, 0,
-					 lo, hi, p->num_objects, sha1);
-		if (pos < 0)
-			return 0;
-		return nth_packed_object_offset(p, pos);
+		return sha1_entry_pos(index, stride, 0, lo, hi, p->num_objects, sha1);
 	}
 
 	do {
@@ -2259,13 +2255,24 @@ off_t find_pack_entry_one(const unsigned char *sha1,
 			printf("lo %u hi %u rg %u mi %u\n",
 			       lo, hi, hi - lo, mi);
 		if (!cmp)
-			return nth_packed_object_offset(p, mi);
+			return mi;
 		if (cmp > 0)
 			hi = mi;
 		else
 			lo = mi+1;
 	} while (lo < hi);
-	return 0;
+
+	return -1;
+}
+
+off_t find_pack_entry_one(const unsigned char *sha1, struct packed_git *p)
+{
+	int pos;
+
+	if ((pos = find_pack_entry_pos(sha1, p)) < 0)
+		return 0;
+
+	return nth_packed_object_offset(p, (uint32_t)pos);
 }
 
 int is_pack_valid(struct packed_git *p)
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 03/16] pack-objects: use a faster hash table
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
  2013-06-24 23:22 ` [PATCH 01/16] list-objects: mark tree as unparsed when we free its buffer Vicent Marti
  2013-06-24 23:22 ` [PATCH 02/16] sha1_file: refactor into `find_pack_object_pos` Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-25 14:03   ` Thomas Rast
                     ` (2 more replies)
  2013-06-24 23:23 ` [PATCH 04/16] pack-objects: make `pack_name_hash` global Vicent Marti
                   ` (13 subsequent siblings)
  16 siblings, 3 replies; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

In a normal pack-objects invocations runtime is dominated by the
counting objects phase and the time spent performing hash operations
with the `object_entry` structs is not particularly relevant.

If the Counting Objects phase however gets optimized to perform batch
insertions (like it is the goal of this patch series), the current
hash table implementation starts to bottleneck in both insertion times
and memory reallocations caused by the fast growth of the hash table.

For instance, these are function timings in a hotspot profiling of a
pack-objects run:

	locate_object_entry_hash :: 563.935ms
	hashcmp :: 195.991ms
	add_object_entry_1 :: 47.962ms

This commit brings `khash.h`, a header only hash table implementation
that while remaining rather simple (uses quadratic probing and a
standard hashing scheme) and self-contained, offers a significant
performance improvement in both insertion and lookup times.

`khash` is a generic hash table implementation that can be 'templated'
for any given type while maintaining good performance by using preprocessor
macros. This specific version has been modified to define by default a
`khash_sha1` type, a map of SHA1s (const unsigned char[20]) to void *
pointers.

When replacing the old hash table implementation in `pack-objects` with
the khash_sha1 table, the insertion time is greatly reduced:

	kh_put_sha1 :: 284.011ms
	add_object_entry_1 : 36.06ms
	hashcmp :: 24.045ms

This reduction of more than 50% in the insertion and lookup times,
although nice, is not particularly noticeable for normal `pack-objects`
operation: `pack-objects` performs massive batch insertions and
relatively few lookups, so `khash` doesn't get a chance to shine here.

The big win here, however, is in the massively reduced amount of hash
collisions (as you can see from the huge reduction of time spent in
`hashcmp` after the change). These greatly improved lookup times
will result critical once we implement the writing algorithm for bitmap
indxes in a later patch of this series.

The bitmap writing phase for a repository like `linux` requires several
million table lookups: using the new hash table saves 1min and 20s from
a `pack-objects` invocation that also writes out bitmaps.
---
 Makefile               |    1 +
 builtin/pack-objects.c |  210 ++++++++++++++++---------------
 khash.h                |  329 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 441 insertions(+), 99 deletions(-)
 create mode 100644 khash.h

diff --git a/Makefile b/Makefile
index 79f961e..e01506d 100644
--- a/Makefile
+++ b/Makefile
@@ -683,6 +683,7 @@ LIB_H += grep.h
 LIB_H += hash.h
 LIB_H += help.h
 LIB_H += http.h
+LIB_H += khash.h
 LIB_H += kwset.h
 LIB_H += levenshtein.h
 LIB_H += line-log.h
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f069462..fc12df8 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -18,6 +18,7 @@
 #include "refs.h"
 #include "streaming.h"
 #include "thread-utils.h"
+#include "khash.h"
 
 static const char *pack_usage[] = {
 	N_("git pack-objects --stdout [options...] [< ref-list | < object-list]"),
@@ -57,7 +58,7 @@ struct object_entry {
  * in the order we see -- typically rev-list --objects order that gives us
  * nice "minimum seek" order.
  */
-static struct object_entry *objects;
+static struct object_entry **objects;
 static struct pack_idx_entry **written_list;
 static uint32_t nr_objects, nr_alloc, nr_result, nr_written;
 
@@ -93,8 +94,8 @@ static unsigned long window_memory_limit = 0;
  * to help looking up the entry by object name.
  * This hashtable is built after all the objects are seen.
  */
-static int *object_ix;
-static int object_ix_hashsz;
+
+static khash_sha1 *packed_objects;
 static struct object_entry *locate_object_entry(const unsigned char *sha1);
 
 /*
@@ -104,6 +105,35 @@ static uint32_t written, written_delta;
 static uint32_t reused, reused_delta;
 
 
+static struct object_slab {
+	struct object_slab *next;
+	uint32_t count;
+	struct object_entry data[0];
+} *slab;
+
+#define OBJECTS_PER_SLAB 2048
+
+static void push_slab(void)
+{
+	static const size_t slab_size =
+		(OBJECTS_PER_SLAB * sizeof(struct object_entry)) + sizeof(struct object_slab);
+
+	struct object_slab *new_slab = calloc(1, slab_size);
+
+	new_slab->count = 0;
+	new_slab->next = slab;
+	slab = new_slab;
+}
+
+static struct object_entry *alloc_object_entry(void)
+{
+	if (!slab || slab->count == OBJECTS_PER_SLAB)
+		push_slab();
+
+	return &slab->data[slab->count++];
+}
+
+
 static void *get_delta(struct object_entry *entry)
 {
 	unsigned long size, base_size, delta_size;
@@ -635,10 +665,10 @@ static struct object_entry **compute_write_order(void)
 	struct object_entry **wo = xmalloc(nr_objects * sizeof(*wo));
 
 	for (i = 0; i < nr_objects; i++) {
-		objects[i].tagged = 0;
-		objects[i].filled = 0;
-		objects[i].delta_child = NULL;
-		objects[i].delta_sibling = NULL;
+		objects[i]->tagged = 0;
+		objects[i]->filled = 0;
+		objects[i]->delta_child = NULL;
+		objects[i]->delta_sibling = NULL;
 	}
 
 	/*
@@ -647,7 +677,7 @@ static struct object_entry **compute_write_order(void)
 	 * recency order.
 	 */
 	for (i = nr_objects; i > 0;) {
-		struct object_entry *e = &objects[--i];
+		struct object_entry *e = objects[--i];
 		if (!e->delta)
 			continue;
 		/* Mark me as the first child */
@@ -665,9 +695,9 @@ static struct object_entry **compute_write_order(void)
 	 * we see a tagged tip.
 	 */
 	for (i = wo_end = 0; i < nr_objects; i++) {
-		if (objects[i].tagged)
+		if (objects[i]->tagged)
 			break;
-		add_to_write_order(wo, &wo_end, &objects[i]);
+		add_to_write_order(wo, &wo_end, objects[i]);
 	}
 	last_untagged = i;
 
@@ -675,35 +705,35 @@ static struct object_entry **compute_write_order(void)
 	 * Then fill all the tagged tips.
 	 */
 	for (; i < nr_objects; i++) {
-		if (objects[i].tagged)
-			add_to_write_order(wo, &wo_end, &objects[i]);
+		if (objects[i]->tagged)
+			add_to_write_order(wo, &wo_end, objects[i]);
 	}
 
 	/*
 	 * And then all remaining commits and tags.
 	 */
 	for (i = last_untagged; i < nr_objects; i++) {
-		if (objects[i].type != OBJ_COMMIT &&
-		    objects[i].type != OBJ_TAG)
+		if (objects[i]->type != OBJ_COMMIT &&
+		    objects[i]->type != OBJ_TAG)
 			continue;
-		add_to_write_order(wo, &wo_end, &objects[i]);
+		add_to_write_order(wo, &wo_end, objects[i]);
 	}
 
 	/*
 	 * And then all the trees.
 	 */
 	for (i = last_untagged; i < nr_objects; i++) {
-		if (objects[i].type != OBJ_TREE)
+		if (objects[i]->type != OBJ_TREE)
 			continue;
-		add_to_write_order(wo, &wo_end, &objects[i]);
+		add_to_write_order(wo, &wo_end, objects[i]);
 	}
 
 	/*
 	 * Finally all the rest in really tight order
 	 */
 	for (i = last_untagged; i < nr_objects; i++) {
-		if (!objects[i].filled)
-			add_family_to_write_order(wo, &wo_end, &objects[i]);
+		if (!objects[i]->filled)
+			add_family_to_write_order(wo, &wo_end, objects[i]);
 	}
 
 	if (wo_end != nr_objects)
@@ -804,61 +834,26 @@ static void write_pack_file(void)
 		nr_remaining -= nr_written;
 	} while (nr_remaining && i < nr_objects);
 
+	stop_progress(&progress_state);
+
 	free(written_list);
 	free(write_order);
-	stop_progress(&progress_state);
 	if (written != nr_result)
 		die("wrote %"PRIu32" objects while expecting %"PRIu32,
 			written, nr_result);
 }
 
-static int locate_object_entry_hash(const unsigned char *sha1)
-{
-	int i;
-	unsigned int ui;
-	memcpy(&ui, sha1, sizeof(unsigned int));
-	i = ui % object_ix_hashsz;
-	while (0 < object_ix[i]) {
-		if (!hashcmp(sha1, objects[object_ix[i] - 1].idx.sha1))
-			return i;
-		if (++i == object_ix_hashsz)
-			i = 0;
-	}
-	return -1 - i;
-}
-
 static struct object_entry *locate_object_entry(const unsigned char *sha1)
 {
-	int i;
+	khiter_t pos = kh_get_sha1(packed_objects, sha1);
 
-	if (!object_ix_hashsz)
-		return NULL;
+	if (pos < kh_end(packed_objects)) {
+		return kh_value(packed_objects, pos);
+	}
 
-	i = locate_object_entry_hash(sha1);
-	if (0 <= i)
-		return &objects[object_ix[i]-1];
 	return NULL;
 }
 
-static void rehash_objects(void)
-{
-	uint32_t i;
-	struct object_entry *oe;
-
-	object_ix_hashsz = nr_objects * 3;
-	if (object_ix_hashsz < 1024)
-		object_ix_hashsz = 1024;
-	object_ix = xrealloc(object_ix, sizeof(int) * object_ix_hashsz);
-	memset(object_ix, 0, sizeof(int) * object_ix_hashsz);
-	for (i = 0, oe = objects; i < nr_objects; i++, oe++) {
-		int ix = locate_object_entry_hash(oe->idx.sha1);
-		if (0 <= ix)
-			continue;
-		ix = -1 - ix;
-		object_ix[ix] = i + 1;
-	}
-}
-
 static unsigned name_hash(const char *name)
 {
 	unsigned c, hash = 0;
@@ -901,19 +896,19 @@ static int no_try_delta(const char *path)
 	return 0;
 }
 
-static int add_object_entry(const unsigned char *sha1, enum object_type type,
-			    const char *name, int exclude)
+static int add_object_entry_1(const unsigned char *sha1, enum object_type type,
+			    uint32_t hash, int exclude, struct packed_git *found_pack,
+				off_t found_offset)
 {
 	struct object_entry *entry;
-	struct packed_git *p, *found_pack = NULL;
-	off_t found_offset = 0;
-	int ix;
-	unsigned hash = name_hash(name);
+	struct packed_git *p;
+	khiter_t ix;
+	int hash_ret;
 
-	ix = nr_objects ? locate_object_entry_hash(sha1) : -1;
-	if (ix >= 0) {
+	ix = kh_put_sha1(packed_objects, sha1, &hash_ret);
+	if (hash_ret == 0) {
 		if (exclude) {
-			entry = objects + object_ix[ix] - 1;
+			entry = kh_value(packed_objects, ix);
 			if (!entry->preferred_base)
 				nr_result--;
 			entry->preferred_base = 1;
@@ -921,38 +916,42 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 		return 0;
 	}
 
-	if (!exclude && local && has_loose_object_nonlocal(sha1))
+	if (!exclude && local && has_loose_object_nonlocal(sha1)) {
+		kh_del_sha1(packed_objects, ix);
 		return 0;
+	}
 
-	for (p = packed_git; p; p = p->next) {
-		off_t offset = find_pack_entry_one(sha1, p);
-		if (offset) {
-			if (!found_pack) {
-				if (!is_pack_valid(p)) {
-					warning("packfile %s cannot be accessed", p->pack_name);
-					continue;
+	if (!found_pack) {
+		for (p = packed_git; p; p = p->next) {
+			off_t offset = find_pack_entry_one(sha1, p);
+			if (offset) {
+				if (!found_pack) {
+					if (!is_pack_valid(p)) {
+						warning("packfile %s cannot be accessed", p->pack_name);
+						continue;
+					}
+					found_offset = offset;
+					found_pack = p;
+				}
+				if (exclude)
+					break;
+				if (incremental ||
+					(local && !p->pack_local) ||
+					(ignore_packed_keep && p->pack_local && p->pack_keep)) {
+					kh_del_sha1(packed_objects, ix);
+					return 0;
 				}
-				found_offset = offset;
-				found_pack = p;
 			}
-			if (exclude)
-				break;
-			if (incremental)
-				return 0;
-			if (local && !p->pack_local)
-				return 0;
-			if (ignore_packed_keep && p->pack_local && p->pack_keep)
-				return 0;
 		}
 	}
 
 	if (nr_objects >= nr_alloc) {
 		nr_alloc = (nr_alloc  + 1024) * 3 / 2;
-		objects = xrealloc(objects, nr_alloc * sizeof(*entry));
+		objects = xrealloc(objects, nr_alloc * sizeof(struct object_entry *));
 	}
 
-	entry = objects + nr_objects++;
-	memset(entry, 0, sizeof(*entry));
+	entry = alloc_object_entry();
+
 	hashcpy(entry->idx.sha1, sha1);
 	entry->hash = hash;
 	if (type)
@@ -966,19 +965,30 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 		entry->in_pack_offset = found_offset;
 	}
 
-	if (object_ix_hashsz * 3 <= nr_objects * 4)
-		rehash_objects();
-	else
-		object_ix[-1 - ix] = nr_objects;
+	kh_value(packed_objects, ix) = entry;
+	kh_key(packed_objects, ix) = entry->idx.sha1;
+	objects[nr_objects++] = entry;
 
 	display_progress(progress_state, nr_objects);
 
-	if (name && no_try_delta(name))
-		entry->no_try_delta = 1;
-
 	return 1;
 }
 
+static int add_object_entry(const unsigned char *sha1, enum object_type type,
+			    const char *name, int exclude)
+{
+	if (add_object_entry_1(sha1, type, name_hash(name), exclude, NULL, 0)) {
+		struct object_entry *entry = objects[nr_objects - 1];
+
+		if (name && no_try_delta(name))
+			entry->no_try_delta = 1;
+
+		return 1;
+	}
+
+	return 0;
+}
+
 struct pbase_tree_cache {
 	unsigned char sha1[20];
 	int ref;
@@ -1404,7 +1414,7 @@ static void get_object_details(void)
 
 	sorted_by_offset = xcalloc(nr_objects, sizeof(struct object_entry *));
 	for (i = 0; i < nr_objects; i++)
-		sorted_by_offset[i] = objects + i;
+		sorted_by_offset[i] = objects[i];
 	qsort(sorted_by_offset, nr_objects, sizeof(*sorted_by_offset), pack_offset_sort);
 
 	for (i = 0; i < nr_objects; i++) {
@@ -2063,7 +2073,7 @@ static void prepare_pack(int window, int depth)
 	nr_deltas = n = 0;
 
 	for (i = 0; i < nr_objects; i++) {
-		struct object_entry *entry = objects + i;
+		struct object_entry *entry = objects[i];
 
 		if (entry->delta)
 			/* This happens if we decided to reuse existing
@@ -2574,6 +2584,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (progress && all_progress_implied)
 		progress = 2;
 
+	packed_objects = kh_init_sha1();
 	prepare_packed_git();
 
 	if (progress)
@@ -2598,5 +2609,6 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		fprintf(stderr, "Total %"PRIu32" (delta %"PRIu32"),"
 			" reused %"PRIu32" (delta %"PRIu32")\n",
 			written, written_delta, reused, reused_delta);
+
 	return 0;
 }
diff --git a/khash.h b/khash.h
new file mode 100644
index 0000000..e6cdb38
--- /dev/null
+++ b/khash.h
@@ -0,0 +1,329 @@
+/* The MIT License
+
+   Copyright (c) 2008, 2009, 2011 by Attractive Chaos <attractor@live.co.uk>
+
+   Permission is hereby granted, free of charge, to any person obtaining
+   a copy of this software and associated documentation files (the
+   "Software"), to deal in the Software without restriction, including
+   without limitation the rights to use, copy, modify, merge, publish,
+   distribute, sublicense, and/or sell copies of the Software, and to
+   permit persons to whom the Software is furnished to do so, subject to
+   the following conditions:
+
+   The above copyright notice and this permission notice shall be
+   included in all copies or substantial portions of the Software.
+
+   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+   BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+   ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+   CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+   SOFTWARE.
+*/
+
+#ifndef __AC_KHASH_H
+#define __AC_KHASH_H
+
+#define AC_VERSION_KHASH_H "0.2.8"
+
+#include <stdlib.h>
+#include <string.h>
+#include <limits.h>
+
+typedef uint32_t khint32_t;
+typedef uint64_t khint64_t;
+
+typedef khint32_t khint_t;
+typedef khint_t khiter_t;
+
+#define __ac_isempty(flag, i) ((flag[i>>4]>>((i&0xfU)<<1))&2)
+#define __ac_isdel(flag, i) ((flag[i>>4]>>((i&0xfU)<<1))&1)
+#define __ac_iseither(flag, i) ((flag[i>>4]>>((i&0xfU)<<1))&3)
+#define __ac_set_isdel_false(flag, i) (flag[i>>4]&=~(1ul<<((i&0xfU)<<1)))
+#define __ac_set_isempty_false(flag, i) (flag[i>>4]&=~(2ul<<((i&0xfU)<<1)))
+#define __ac_set_isboth_false(flag, i) (flag[i>>4]&=~(3ul<<((i&0xfU)<<1)))
+#define __ac_set_isdel_true(flag, i) (flag[i>>4]|=1ul<<((i&0xfU)<<1))
+
+#define __ac_fsize(m) ((m) < 16? 1 : (m)>>4)
+
+#define kroundup32(x) (--(x), (x)|=(x)>>1, (x)|=(x)>>2, (x)|=(x)>>4, (x)|=(x)>>8, (x)|=(x)>>16, ++(x))
+
+static const double __ac_HASH_UPPER = 0.77;
+
+#define __KHASH_TYPE(name, khkey_t, khval_t) \
+	typedef struct { \
+		khint_t n_buckets, size, n_occupied, upper_bound; \
+		khint32_t *flags; \
+		khkey_t *keys; \
+		khval_t *vals; \
+	} kh_##name##_t;
+
+#define __KHASH_PROTOTYPES(name, khkey_t, khval_t)	 					\
+	extern kh_##name##_t *kh_init_##name(void);							\
+	extern void kh_destroy_##name(kh_##name##_t *h);					\
+	extern void kh_clear_##name(kh_##name##_t *h);						\
+	extern khint_t kh_get_##name(const kh_##name##_t *h, khkey_t key); 	\
+	extern int kh_resize_##name(kh_##name##_t *h, khint_t new_n_buckets); \
+	extern khint_t kh_put_##name(kh_##name##_t *h, khkey_t key, int *ret); \
+	extern void kh_del_##name(kh_##name##_t *h, khint_t x);
+
+#define __KHASH_IMPL(name, SCOPE, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal) \
+	SCOPE kh_##name##_t *kh_init_##name(void) {							\
+		return (kh_##name##_t*)xcalloc(1, sizeof(kh_##name##_t));		\
+	}																	\
+	SCOPE void kh_destroy_##name(kh_##name##_t *h)						\
+	{																	\
+		if (h) {														\
+			free((void *)h->keys); free(h->flags);					\
+			free((void *)h->vals);										\
+			free(h);													\
+		}																\
+	}																	\
+	SCOPE void kh_clear_##name(kh_##name##_t *h)						\
+	{																	\
+		if (h && h->flags) {											\
+			memset(h->flags, 0xaa, __ac_fsize(h->n_buckets) * sizeof(khint32_t)); \
+			h->size = h->n_occupied = 0;								\
+		}																\
+	}																	\
+	SCOPE khint_t kh_get_##name(const kh_##name##_t *h, khkey_t key) 	\
+	{																	\
+		if (h->n_buckets) {												\
+			khint_t k, i, last, mask, step = 0; \
+			mask = h->n_buckets - 1;									\
+			k = __hash_func(key); i = k & mask;							\
+			last = i; \
+			while (!__ac_isempty(h->flags, i) && (__ac_isdel(h->flags, i) || !__hash_equal(h->keys[i], key))) { \
+				i = (i + (++step)) & mask; \
+				if (i == last) return h->n_buckets;						\
+			}															\
+			return __ac_iseither(h->flags, i)? h->n_buckets : i;		\
+		} else return 0;												\
+	}																	\
+	SCOPE int kh_resize_##name(kh_##name##_t *h, khint_t new_n_buckets) \
+	{ /* This function uses 0.25*n_buckets bytes of working space instead of [sizeof(key_t+val_t)+.25]*n_buckets. */ \
+		khint32_t *new_flags = 0;										\
+		khint_t j = 1;													\
+		{																\
+			kroundup32(new_n_buckets); 									\
+			if (new_n_buckets < 4) new_n_buckets = 4;					\
+			if (h->size >= (khint_t)(new_n_buckets * __ac_HASH_UPPER + 0.5)) j = 0;	/* requested size is too small */ \
+			else { /* hash table size to be changed (shrink or expand); rehash */ \
+				new_flags = (khint32_t*)xmalloc(__ac_fsize(new_n_buckets) * sizeof(khint32_t));	\
+				if (!new_flags) return -1;								\
+				memset(new_flags, 0xaa, __ac_fsize(new_n_buckets) * sizeof(khint32_t)); \
+				if (h->n_buckets < new_n_buckets) {	/* expand */		\
+					khkey_t *new_keys = (khkey_t*)xrealloc((void *)h->keys, new_n_buckets * sizeof(khkey_t)); \
+					if (!new_keys) return -1;							\
+					h->keys = new_keys;									\
+					if (kh_is_map) {									\
+						khval_t *new_vals = (khval_t*)xrealloc((void *)h->vals, new_n_buckets * sizeof(khval_t)); \
+						if (!new_vals) return -1;						\
+						h->vals = new_vals;								\
+					}													\
+				} /* otherwise shrink */								\
+			}															\
+		}																\
+		if (j) { /* rehashing is needed */								\
+			for (j = 0; j != h->n_buckets; ++j) {						\
+				if (__ac_iseither(h->flags, j) == 0) {					\
+					khkey_t key = h->keys[j];							\
+					khval_t val;										\
+					khint_t new_mask;									\
+					new_mask = new_n_buckets - 1; 						\
+					if (kh_is_map) val = h->vals[j];					\
+					__ac_set_isdel_true(h->flags, j);					\
+					while (1) { /* kick-out process; sort of like in Cuckoo hashing */ \
+						khint_t k, i, step = 0; \
+						k = __hash_func(key);							\
+						i = k & new_mask;								\
+						while (!__ac_isempty(new_flags, i)) i = (i + (++step)) & new_mask; \
+						__ac_set_isempty_false(new_flags, i);			\
+						if (i < h->n_buckets && __ac_iseither(h->flags, i) == 0) { /* kick out the existing element */ \
+							{ khkey_t tmp = h->keys[i]; h->keys[i] = key; key = tmp; } \
+							if (kh_is_map) { khval_t tmp = h->vals[i]; h->vals[i] = val; val = tmp; } \
+							__ac_set_isdel_true(h->flags, i); /* mark it as deleted in the old hash table */ \
+						} else { /* write the element and jump out of the loop */ \
+							h->keys[i] = key;							\
+							if (kh_is_map) h->vals[i] = val;			\
+							break;										\
+						}												\
+					}													\
+				}														\
+			}															\
+			if (h->n_buckets > new_n_buckets) { /* shrink the hash table */ \
+				h->keys = (khkey_t*)xrealloc((void *)h->keys, new_n_buckets * sizeof(khkey_t)); \
+				if (kh_is_map) h->vals = (khval_t*)xrealloc((void *)h->vals, new_n_buckets * sizeof(khval_t)); \
+			}															\
+			free(h->flags); /* free the working space */				\
+			h->flags = new_flags;										\
+			h->n_buckets = new_n_buckets;								\
+			h->n_occupied = h->size;									\
+			h->upper_bound = (khint_t)(h->n_buckets * __ac_HASH_UPPER + 0.5); \
+		}																\
+		return 0;														\
+	}																	\
+	SCOPE khint_t kh_put_##name(kh_##name##_t *h, khkey_t key, int *ret) \
+	{																	\
+		khint_t x;														\
+		if (h->n_occupied >= h->upper_bound) { /* update the hash table */ \
+			if (h->n_buckets > (h->size<<1)) {							\
+				if (kh_resize_##name(h, h->n_buckets - 1) < 0) { /* clear "deleted" elements */ \
+					*ret = -1; return h->n_buckets;						\
+				}														\
+			} else if (kh_resize_##name(h, h->n_buckets + 1) < 0) { /* expand the hash table */ \
+				*ret = -1; return h->n_buckets;							\
+			}															\
+		} /* TODO: to implement automatically shrinking; resize() already support shrinking */ \
+		{																\
+			khint_t k, i, site, last, mask = h->n_buckets - 1, step = 0; \
+			x = site = h->n_buckets; k = __hash_func(key); i = k & mask; \
+			if (__ac_isempty(h->flags, i)) x = i; /* for speed up */	\
+			else {														\
+				last = i; \
+				while (!__ac_isempty(h->flags, i) && (__ac_isdel(h->flags, i) || !__hash_equal(h->keys[i], key))) { \
+					if (__ac_isdel(h->flags, i)) site = i;				\
+					i = (i + (++step)) & mask; \
+					if (i == last) { x = site; break; }					\
+				}														\
+				if (x == h->n_buckets) {								\
+					if (__ac_isempty(h->flags, i) && site != h->n_buckets) x = site; \
+					else x = i;											\
+				}														\
+			}															\
+		}																\
+		if (__ac_isempty(h->flags, x)) { /* not present at all */		\
+			h->keys[x] = key;											\
+			__ac_set_isboth_false(h->flags, x);							\
+			++h->size; ++h->n_occupied;									\
+			*ret = 1;													\
+		} else if (__ac_isdel(h->flags, x)) { /* deleted */				\
+			h->keys[x] = key;											\
+			__ac_set_isboth_false(h->flags, x);							\
+			++h->size;													\
+			*ret = 2;													\
+		} else *ret = 0; /* Don't touch h->keys[x] if present and not deleted */ \
+		return x;														\
+	}																	\
+	SCOPE void kh_del_##name(kh_##name##_t *h, khint_t x)				\
+	{																	\
+		if (x != h->n_buckets && !__ac_iseither(h->flags, x)) {			\
+			__ac_set_isdel_true(h->flags, x);							\
+			--h->size;													\
+		}																\
+	}
+
+#define KHASH_DECLARE(name, khkey_t, khval_t)		 					\
+	__KHASH_TYPE(name, khkey_t, khval_t) 								\
+	__KHASH_PROTOTYPES(name, khkey_t, khval_t)
+
+#define KHASH_INIT2(name, SCOPE, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal) \
+	__KHASH_TYPE(name, khkey_t, khval_t) 								\
+	__KHASH_IMPL(name, SCOPE, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal)
+
+#define KHASH_INIT(name, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal) \
+	KHASH_INIT2(name, static inline, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal)
+
+/* Other convenient macros... */
+
+/*! @function
+  @abstract     Test whether a bucket contains data.
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @param  x     Iterator to the bucket [khint_t]
+  @return       1 if containing data; 0 otherwise [int]
+ */
+#define kh_exist(h, x) (!__ac_iseither((h)->flags, (x)))
+
+/*! @function
+  @abstract     Get key given an iterator
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @param  x     Iterator to the bucket [khint_t]
+  @return       Key [type of keys]
+ */
+#define kh_key(h, x) ((h)->keys[x])
+
+/*! @function
+  @abstract     Get value given an iterator
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @param  x     Iterator to the bucket [khint_t]
+  @return       Value [type of values]
+  @discussion   For hash sets, calling this results in segfault.
+ */
+#define kh_val(h, x) ((h)->vals[x])
+
+/*! @function
+  @abstract     Alias of kh_val()
+ */
+#define kh_value(h, x) ((h)->vals[x])
+
+/*! @function
+  @abstract     Get the start iterator
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @return       The start iterator [khint_t]
+ */
+#define kh_begin(h) (khint_t)(0)
+
+/*! @function
+  @abstract     Get the end iterator
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @return       The end iterator [khint_t]
+ */
+#define kh_end(h) ((h)->n_buckets)
+
+/*! @function
+  @abstract     Get the number of elements in the hash table
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @return       Number of elements in the hash table [khint_t]
+ */
+#define kh_size(h) ((h)->size)
+
+/*! @function
+  @abstract     Get the number of buckets in the hash table
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @return       Number of buckets in the hash table [khint_t]
+ */
+#define kh_n_buckets(h) ((h)->n_buckets)
+
+/*! @function
+  @abstract     Iterate over the entries in the hash table
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @param  kvar  Variable to which key will be assigned
+  @param  vvar  Variable to which value will be assigned
+  @param  code  Block of code to execute
+ */
+#define kh_foreach(h, kvar, vvar, code) { khint_t __i;		\
+	for (__i = kh_begin(h); __i != kh_end(h); ++__i) {		\
+		if (!kh_exist(h,__i)) continue;						\
+		(kvar) = kh_key(h,__i);								\
+		(vvar) = kh_val(h,__i);								\
+		code;												\
+	} }
+
+/*! @function
+  @abstract     Iterate over the values in the hash table
+  @param  h     Pointer to the hash table [khash_t(name)*]
+  @param  vvar  Variable to which value will be assigned
+  @param  code  Block of code to execute
+ */
+#define kh_foreach_value(h, vvar, code) { khint_t __i;		\
+	for (__i = kh_begin(h); __i != kh_end(h); ++__i) {		\
+		if (!kh_exist(h,__i)) continue;						\
+		(vvar) = kh_val(h,__i);								\
+		code;												\
+	} }
+
+static inline khint_t __kh_oid_hash(const unsigned char *oid)
+{
+	khint_t hash;
+	memcpy(&hash, oid, sizeof(hash));
+	return hash;
+}
+
+#define __kh_oid_cmp(a, b) (hashcmp(a, b) == 0)
+
+KHASH_INIT(sha1, const unsigned char *, void *, 1, __kh_oid_hash, __kh_oid_cmp)
+typedef kh_sha1_t khash_sha1;
+
+#endif /* __AC_KHASH_H */
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 04/16] pack-objects: make `pack_name_hash` global
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (2 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 03/16] pack-objects: use a faster hash table Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-24 23:23 ` [PATCH 05/16] revision: allow setting custom limiter function Vicent Marti
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

The hash function used by `builtin/pack-objects.c` to efficiently find
delta bases when packing can be of interest for other parts of Git that
also have to deal with delta bases.
---
 builtin/pack-objects.c |   24 ++----------------------
 cache.h                |    2 ++
 sha1_file.c            |   20 ++++++++++++++++++++
 3 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index fc12df8..b7cab18 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -854,26 +854,6 @@ static struct object_entry *locate_object_entry(const unsigned char *sha1)
 	return NULL;
 }
 
-static unsigned name_hash(const char *name)
-{
-	unsigned c, hash = 0;
-
-	if (!name)
-		return 0;
-
-	/*
-	 * This effectively just creates a sortable number from the
-	 * last sixteen non-whitespace characters. Last characters
-	 * count "most", so things that end in ".c" sort together.
-	 */
-	while ((c = *name++) != 0) {
-		if (isspace(c))
-			continue;
-		hash = (hash >> 2) + (c << 24);
-	}
-	return hash;
-}
-
 static void setup_delta_attr_check(struct git_attr_check *check)
 {
 	static struct git_attr *attr_delta;
@@ -977,7 +957,7 @@ static int add_object_entry_1(const unsigned char *sha1, enum object_type type,
 static int add_object_entry(const unsigned char *sha1, enum object_type type,
 			    const char *name, int exclude)
 {
-	if (add_object_entry_1(sha1, type, name_hash(name), exclude, NULL, 0)) {
+	if (add_object_entry_1(sha1, type, pack_name_hash(name), exclude, NULL, 0)) {
 		struct object_entry *entry = objects[nr_objects - 1];
 
 		if (name && no_try_delta(name))
@@ -1186,7 +1166,7 @@ static void add_preferred_base_object(const char *name)
 {
 	struct pbase_tree *it;
 	int cmplen;
-	unsigned hash = name_hash(name);
+	unsigned hash = pack_name_hash(name);
 
 	if (!num_preferred_base || check_pbase_path(hash))
 		return;
diff --git a/cache.h b/cache.h
index a29645e..95ef14d 100644
--- a/cache.h
+++ b/cache.h
@@ -653,6 +653,8 @@ extern char *sha1_pack_index_name(const unsigned char *sha1);
 extern const char *find_unique_abbrev(const unsigned char *sha1, int);
 extern const unsigned char null_sha1[20];
 
+extern uint32_t pack_name_hash(const char *name);
+
 static inline int hashcmp(const unsigned char *sha1, const unsigned char *sha2)
 {
 	int i;
diff --git a/sha1_file.c b/sha1_file.c
index 371e295..44c7bca 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -60,6 +60,26 @@ static struct cached_object empty_tree = {
 	0
 };
 
+uint32_t pack_name_hash(const char *name)
+{
+	unsigned c, hash = 0;
+
+	if (!name)
+		return 0;
+
+	/*
+	 * This effectively just creates a sortable number from the
+	 * last sixteen non-whitespace characters. Last characters
+	 * count "most", so things that end in ".c" sort together.
+	 */
+	while ((c = *name++) != 0) {
+		if (isspace(c))
+			continue;
+		hash = (hash >> 2) + (c << 24);
+	}
+	return hash;
+}
+
 static struct packed_git *last_found_pack;
 
 static struct cached_object *find_cached_object(const unsigned char *sha1)
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 05/16] revision: allow setting custom limiter function
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (3 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 04/16] pack-objects: make `pack_name_hash` global Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-24 23:23 ` [PATCH 06/16] sha1_file: export `git_open_noatime` Vicent Marti
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

This commit enables users of `struct rev_info` to peform custom limiting
during a revision walk (i.e. `get_revision`).

If the field `include_check` has been set to a callback, this callback
will be issued once for each commit before it is added to the "pending"
list of the revwalk. If the include check returns 0, the commit will be
marked as added but won't be pushed to the pending list, effectively
limiting the walk.
---
 revision.c |    5 +++++
 revision.h |    2 ++
 2 files changed, 7 insertions(+)

diff --git a/revision.c b/revision.c
index f1bb731..fa78c65 100644
--- a/revision.c
+++ b/revision.c
@@ -777,8 +777,13 @@ static int add_parents_to_list(struct rev_info *revs, struct commit *commit,
 
 	if (commit->object.flags & ADDED)
 		return 0;
+
 	commit->object.flags |= ADDED;
 
+	if (revs->include_check &&
+		!revs->include_check(commit, revs->include_check_data))
+		return 0;
+
 	/*
 	 * If the commit is uninteresting, don't try to
 	 * prune parents - we want the maximal uninteresting
diff --git a/revision.h b/revision.h
index eeea6fb..997a093 100644
--- a/revision.h
+++ b/revision.h
@@ -162,6 +162,8 @@ struct rev_info {
 	unsigned long min_age;
 	int min_parents;
 	int max_parents;
+	int (*include_check)(struct commit *, void *);
+	void *include_check_data;
 
 	/* diff info for patches and for paths limiting */
 	struct diff_options diffopt;
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 06/16] sha1_file: export `git_open_noatime`
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (4 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 05/16] revision: allow setting custom limiter function Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-24 23:23 ` [PATCH 07/16] compat: add endinanness helpers Vicent Marti
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

The `git_open_noatime` helper can be of general interest for other
consumers of git's different on-disk formats.
---
 cache.h     |    1 +
 sha1_file.c |    4 +---
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/cache.h b/cache.h
index 95ef14d..bbe5e2a 100644
--- a/cache.h
+++ b/cache.h
@@ -769,6 +769,7 @@ extern int hash_sha1_file(const void *buf, unsigned long len, const char *type,
 extern int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *return_sha1);
 extern int pretend_sha1_file(void *, unsigned long, enum object_type, unsigned char *);
 extern int force_object_loose(const unsigned char *sha1, time_t mtime);
+extern int git_open_noatime(const char *name);
 extern void *map_sha1_file(const unsigned char *sha1, unsigned long *size);
 extern int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long mapsize, void *buffer, unsigned long bufsiz);
 extern int parse_sha1_header(const char *hdr, unsigned long *sizep);
diff --git a/sha1_file.c b/sha1_file.c
index 44c7bca..018a847 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -259,8 +259,6 @@ char *sha1_pack_index_name(const unsigned char *sha1)
 struct alternate_object_database *alt_odb_list;
 static struct alternate_object_database **alt_odb_tail;
 
-static int git_open_noatime(const char *name);
-
 /*
  * Prepare alternate object database registry.
  *
@@ -1307,7 +1305,7 @@ int check_sha1_signature(const unsigned char *sha1, void *map,
 	return hashcmp(sha1, real_sha1) ? -1 : 0;
 }
 
-static int git_open_noatime(const char *name)
+int git_open_noatime(const char *name)
 {
 	static int sha1_file_open_flag = O_NOATIME;
 
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 07/16] compat: add endinanness helpers
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (5 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 06/16] sha1_file: export `git_open_noatime` Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-25 13:08   ` Peter Krefting
  2013-06-24 23:23 ` [PATCH 08/16] ewah: compressed bitmap implementation Vicent Marti
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

The POSIX standard doesn't currently define a `nothll`/`htonll`
function pair to perform network-to-host and host-to-network
swaps of 64-bit data. These 64-bit swaps are necessary for the on-disk
storage of EWAH bitmaps if they are not in native byte order.
---
 git-compat-util.h |   28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/git-compat-util.h b/git-compat-util.h
index ff193f4..bc9b591 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -710,4 +710,32 @@ void warn_on_inaccessible(const char *path);
 /* Get the passwd entry for the UID of the current process. */
 struct passwd *xgetpwuid_self(void);
 
+#include <endian.h>
+
+#ifndef __BYTE_ORDER
+# if defined(BYTE_ORDER) && defined(LITTLE_ENDIAN) && defined(BIG_ENDIAN)
+#  define __BYTE_ORDER BYTE_ORDER
+#  define __LITTLE_ENDIAN LITTLE_ENDIAN
+#  define __BIG_ENDIAN BIG_ENDIAN
+# else
+#  error "Cannot determine endianness"
+# endif
+#endif
+
+#if __BYTE_ORDER == __BIG_ENDIAN
+# define ntohll(n) (n)
+# define htonll(n) (n)
+#elif __BYTE_ORDER == __LITTLE_ENDIAN
+# if defined(__GNUC__) && defined(__GLIBC__)
+#  include <byteswap.h>
+#  define ntohll(n) bswap_64(n)
+#  define htonll(n) bswap_64(n)
+# else /* GNUC & GLIBC */
+#  define ntohll(n) ( (((unsigned long long)ntohl(n)) << 32) + ntohl(n >> 32) )
+#  define htonll(n) ( (((unsigned long long)htonl(n)) << 32) + htonl(n >> 32) )
+# endif /* GNUC & GLIBC */
+#else /* __BYTE_ORDER */
+# error "Can't define htonll or ntohll!"
+#endif
+
 #endif
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 08/16] ewah: compressed bitmap implementation
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (6 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 07/16] compat: add endinanness helpers Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-25  1:10   ` Junio C Hamano
  2013-06-25 15:38   ` Thomas Rast
  2013-06-24 23:23 ` [PATCH 09/16] documentation: add documentation for the bitmap format Vicent Marti
                   ` (8 subsequent siblings)
  16 siblings, 2 replies; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

EWAH is a word-aligned compressed variant of a bitset (i.e. a data
structure that acts as a 0-indexed boolean array for many entries).

It uses a 64-bit run-length encoding (RLE) compression scheme,
trading some compression for better processing speed.

The goal of this word-aligned implementation is not to achieve
the best compression, but rather to improve query processing time.
As it stands right now, this EWAH implementation will always be more
efficient storage-wise than its uncompressed alternative.

EWAH arrays will be used as the on-disk format to store reachability
bitmaps for all objects in a repository while keeping reasonable sizes,
in the same way that JGit does.

This EWAH implementation is a mostly straightforward port of the
original `javaewah` library that JGit currently uses. The library is
self-contained and has been embedded whole (4 files) inside the `ewah`
folder to ease redistribution.

The library is re-licensed under the GPLv2 with the permission of Daniel
Lemire, the original author. The source code for the C version can
be found on GitHub:

	https://github.com/vmg/libewok

The original Java implementation can also be found on GitHub:

	https://github.com/lemire/javaewah
---
 Makefile           |    6 +
 ewah/bitmap.c      |  229 +++++++++++++++++
 ewah/ewah_bitmap.c |  703 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 ewah/ewah_io.c     |  199 +++++++++++++++
 ewah/ewah_rlw.c    |  124 +++++++++
 ewah/ewok.h        |  194 +++++++++++++++
 ewah/ewok_rlw.h    |  114 +++++++++
 7 files changed, 1569 insertions(+)
 create mode 100644 ewah/bitmap.c
 create mode 100644 ewah/ewah_bitmap.c
 create mode 100644 ewah/ewah_io.c
 create mode 100644 ewah/ewah_rlw.c
 create mode 100644 ewah/ewok.h
 create mode 100644 ewah/ewok_rlw.h

diff --git a/Makefile b/Makefile
index e01506d..e03c773 100644
--- a/Makefile
+++ b/Makefile
@@ -672,6 +672,8 @@ LIB_H += diff.h
 LIB_H += diffcore.h
 LIB_H += dir.h
 LIB_H += exec_cmd.h
+LIB_H += ewah/ewok.h
+LIB_H += ewah/ewok_rlw.h
 LIB_H += fetch-pack.h
 LIB_H += fmt-merge-msg.h
 LIB_H += fsck.h
@@ -802,6 +804,10 @@ LIB_OBJS += dir.o
 LIB_OBJS += editor.o
 LIB_OBJS += entry.o
 LIB_OBJS += environment.o
+LIB_OBJS += ewah/bitmap.o
+LIB_OBJS += ewah/ewah_bitmap.o
+LIB_OBJS += ewah/ewah_io.o
+LIB_OBJS += ewah/ewah_rlw.o
 LIB_OBJS += exec_cmd.o
 LIB_OBJS += fetch-pack.o
 LIB_OBJS += fsck.o
diff --git a/ewah/bitmap.c b/ewah/bitmap.c
new file mode 100644
index 0000000..75ca8fd
--- /dev/null
+++ b/ewah/bitmap.c
@@ -0,0 +1,229 @@
+/**
+ * Copyright 2013, GitHub, Inc
+ * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
+ *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+#include <assert.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "ewok.h"
+
+#define MASK(x) ((eword_t)1 << (x % BITS_IN_WORD))
+#define BLOCK(x) (x / BITS_IN_WORD)
+
+struct bitmap *bitmap_new(void)
+{
+	struct bitmap *bitmap = ewah_malloc(sizeof(struct bitmap));
+	bitmap->words = ewah_calloc(32, sizeof(eword_t));
+	bitmap->word_alloc = 32;
+	return bitmap;
+}
+
+void bitmap_set(struct bitmap *self, size_t pos)
+{
+	size_t block = BLOCK(pos);
+
+	if (block >= self->word_alloc) {
+		size_t old_size = self->word_alloc;
+		self->word_alloc = block * 2;
+		self->words = ewah_realloc(self->words, self->word_alloc * sizeof(eword_t));
+
+		memset(self->words + old_size, 0x0,
+			(self->word_alloc - old_size) * sizeof(eword_t));
+	}
+
+	self->words[block] |= MASK(pos);
+}
+
+void bitmap_clear(struct bitmap *self, size_t pos)
+{
+	size_t block = BLOCK(pos);
+
+	if (block < self->word_alloc)
+		self->words[block] &= ~MASK(pos);
+}
+
+bool bitmap_get(struct bitmap *self, size_t pos)
+{
+	size_t block = BLOCK(pos);
+	return block < self->word_alloc && (self->words[block] & MASK(pos)) != 0;
+}
+
+extern size_t ewah_add_empty_words(struct ewah_bitmap *self, bool v, size_t number);
+extern size_t ewah_add(struct ewah_bitmap *self, eword_t word);
+
+struct ewah_bitmap *bitmap_to_ewah(struct bitmap *bitmap)
+{
+	struct ewah_bitmap *ewah = ewah_new();
+	size_t i, running_empty_words = 0;
+	eword_t last_word = 0;
+
+	for (i = 0; i < bitmap->word_alloc; ++i) {
+		if (bitmap->words[i] == 0) {
+			running_empty_words++;
+			continue;
+		}
+
+		if (last_word != 0) {
+			ewah_add(ewah, last_word);
+		}
+
+		if (running_empty_words > 0) {
+			ewah_add_empty_words(ewah, false, running_empty_words);
+			running_empty_words = 0;
+		}
+
+		last_word = bitmap->words[i];
+	}
+
+	ewah_add(ewah, last_word);
+	return ewah;
+}
+
+struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah)
+{
+	struct bitmap *bitmap = bitmap_new();
+	struct ewah_iterator it;
+	eword_t blowup;
+	size_t i = 0;
+
+	ewah_iterator_init(&it, ewah);
+
+	while (ewah_iterator_next(&blowup, &it)) {
+		if (i >= bitmap->word_alloc) {
+			bitmap->word_alloc *= 1.5;
+			bitmap->words = ewah_realloc(
+				bitmap->words, bitmap->word_alloc * sizeof(eword_t));
+		}
+
+		bitmap->words[i++] = blowup;
+	}
+
+	bitmap->word_alloc = i;
+	return bitmap;
+}
+
+void bitmap_and_not_inplace(struct bitmap *self, struct bitmap *other)
+{
+	const size_t count = (self->word_alloc < other->word_alloc) ?
+		self->word_alloc : other->word_alloc;
+
+	size_t i;
+
+	for (i = 0; i < count; ++i) {
+		self->words[i] &= ~other->words[i];
+	}
+}
+
+void bitmap_or_inplace(struct bitmap *self, struct ewah_bitmap *other)
+{
+	size_t original_size = self->word_alloc;
+	size_t other_final = (other->bit_size / BITS_IN_WORD) + 1;
+	size_t i = 0;
+	struct ewah_iterator it;
+	eword_t word;
+
+	if (self->word_alloc < other_final) {
+		self->word_alloc = other_final;
+		self->words = ewah_realloc(self->words, self->word_alloc * sizeof(eword_t));
+		memset(self->words + original_size, 0x0,
+			(self->word_alloc - original_size) * sizeof(eword_t));
+	}
+
+	ewah_iterator_init(&it, other);
+
+	while (ewah_iterator_next(&word, &it)) {
+		self->words[i++] |= word;
+	}
+}
+
+void bitmap_each_bit(struct bitmap *self, ewah_callback callback, void *data)
+{
+	size_t pos = 0, i;
+
+	for (i = 0; i < self->word_alloc; ++i) {
+		eword_t word = self->words[i];
+		uint32_t offset;
+
+		if (word == (eword_t)~0) {
+			for (offset = 0; offset < BITS_IN_WORD; ++offset) {
+				callback(pos++, data);
+			}
+		} else {
+			for (offset = 0; offset < BITS_IN_WORD; ++offset) {
+				if ((word >> offset) == 0)
+					break;
+
+				offset += __builtin_ctzll(word >> offset);
+				callback(pos + offset, data);
+			}
+			pos += BITS_IN_WORD;
+		}
+	}
+}
+
+size_t bitmap_popcount(struct bitmap *self)
+{
+	size_t i, count = 0;
+
+	for (i = 0; i < self->word_alloc; ++i) {
+		count += __builtin_popcountll(self->words[i]);
+	}
+
+	return count;
+}
+
+bool bitmap_equals(struct bitmap *self, struct bitmap *other)
+{
+	struct bitmap *big, *small;
+	size_t i;
+
+	if (self->word_alloc < other->word_alloc) {
+		small = self;
+		big = other;
+	} else {
+		small = other;
+		big = self;
+	}
+
+	for (i = 0; i < small->word_alloc; ++i) {
+		if (small->words[i] != big->words[i])
+			return false;
+	}
+
+	for (; i < big->word_alloc; ++i) {
+		if (big->words[i] != 0)
+			return false;
+	}
+
+	return true;
+}
+
+void bitmap_reset(struct bitmap *bitmap)
+{
+	memset(bitmap->words, 0x0, bitmap->word_alloc * sizeof(eword_t));
+}
+
+void bitmap_free(struct bitmap *bitmap)
+{
+	if (bitmap == NULL)
+		return;
+
+	free(bitmap->words);
+	free(bitmap);
+}
diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
new file mode 100644
index 0000000..8a23494
--- /dev/null
+++ b/ewah/ewah_bitmap.c
@@ -0,0 +1,703 @@
+/**
+ * Copyright 2013, GitHub, Inc
+ * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
+ *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+#include <assert.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+#include <stdio.h>
+
+#include "ewok.h"
+#include "ewok_rlw.h"
+
+static inline size_t min_size(size_t a, size_t b)
+{
+	return a < b ? a : b;
+}
+
+static inline size_t max_size(size_t a, size_t b)
+{
+	return a < b ? a : b;
+}
+
+static inline void buffer_grow(struct ewah_bitmap *self, size_t new_size)
+{
+	size_t rlw_offset = (uint8_t *)self->rlw - (uint8_t *)self->buffer;
+
+	if (self->alloc_size >= new_size)
+		return;
+
+	self->alloc_size = new_size;
+	self->buffer = ewah_realloc(self->buffer, self->alloc_size * sizeof(eword_t));
+	self->rlw = self->buffer + (rlw_offset / sizeof(size_t));
+}
+
+static inline void buffer_push(struct ewah_bitmap *self, eword_t value)
+{
+	if (self->buffer_size + 1 >= self->alloc_size) {
+		buffer_grow(self, self->buffer_size * 1.5);
+	}
+
+	self->buffer[self->buffer_size++] = value;
+}
+
+static void buffer_push_rlw(struct ewah_bitmap *self, eword_t value)
+{
+	buffer_push(self, value);
+	self->rlw = self->buffer + self->buffer_size - 1;
+}
+
+static size_t add_empty_words(struct ewah_bitmap *self, bool v, size_t number)
+{
+	size_t added = 0;
+
+	if (rlw_get_run_bit(self->rlw) != v && rlw_size(self->rlw) == 0) {
+		rlw_set_run_bit(self->rlw, v);
+	}
+	else if (rlw_get_literal_words(self->rlw) != 0 || rlw_get_run_bit(self->rlw) != v) {
+		buffer_push_rlw(self, 0);
+		if (v) rlw_set_run_bit(self->rlw, v);
+		added++;
+	}
+
+	eword_t runlen = rlw_get_running_len(self->rlw);
+	eword_t can_add = min_size(number, RLW_LARGEST_RUNNING_COUNT - runlen);
+
+	rlw_set_running_len(self->rlw, runlen + can_add);
+	number -= can_add;
+
+	while (number >= RLW_LARGEST_RUNNING_COUNT) {
+		buffer_push_rlw(self, 0);
+		added++;
+
+		if (v) rlw_set_run_bit(self->rlw, v);
+		rlw_set_running_len(self->rlw, RLW_LARGEST_RUNNING_COUNT);
+
+		number -= RLW_LARGEST_RUNNING_COUNT;
+	}
+
+	if (number > 0) {
+		buffer_push_rlw(self, 0);
+		added++;
+
+		if (v) rlw_set_run_bit(self->rlw, v);
+		rlw_set_running_len(self->rlw, number);
+	}
+
+	return added;
+}
+
+size_t ewah_add_empty_words(struct ewah_bitmap *self, bool v, size_t number)
+{
+	if (number == 0)
+		return 0;
+
+	self->bit_size += number * BITS_IN_WORD;
+	return add_empty_words(self, v, number);
+}
+
+static size_t add_literal(struct ewah_bitmap *self, eword_t new_data)
+{
+	eword_t current_num = rlw_get_literal_words(self->rlw);
+
+	if (current_num >= RLW_LARGEST_LITERAL_COUNT) {
+		buffer_push_rlw(self, 0);
+
+		rlw_set_literal_words(self->rlw, 1);
+		buffer_push(self, new_data);
+		return 2;
+	}
+
+	rlw_set_literal_words(self->rlw, current_num + 1);
+
+	/* sanity check */
+	assert(rlw_get_literal_words(self->rlw) == current_num + 1);
+
+	buffer_push(self, new_data);
+	return 1;
+}
+
+void ewah_add_dirty_words(
+	struct ewah_bitmap *self, const eword_t *buffer, size_t number, bool negate)
+{
+	size_t literals, can_add;
+
+	while (1) {
+		literals = rlw_get_literal_words(self->rlw);
+		can_add = min_size(number, RLW_LARGEST_LITERAL_COUNT - literals);
+
+		rlw_set_literal_words(self->rlw, literals + can_add);
+
+		if (self->buffer_size + can_add >= self->alloc_size) {
+			buffer_grow(self, (self->buffer_size + can_add) * 1.5);
+		}
+
+		if (negate) {
+			size_t i;
+			for (i = 0; i < can_add; ++i)
+				self->buffer[self->buffer_size++] = ~buffer[i];
+		} else {
+			memcpy(self->buffer + self->buffer_size, buffer, can_add * sizeof(eword_t));
+			self->buffer_size += can_add;
+		}
+
+		self->bit_size += can_add * BITS_IN_WORD;
+
+		if (number - can_add == 0)
+			break;
+
+		buffer_push_rlw(self, 0);
+		buffer += can_add;
+		number -= can_add;
+	}
+}
+
+static size_t add_empty_word(struct ewah_bitmap *self, bool v)
+{
+	bool no_literal = (rlw_get_literal_words(self->rlw) == 0);
+	eword_t run_len = rlw_get_running_len(self->rlw);
+
+	if (no_literal && run_len == 0) {
+		rlw_set_run_bit(self->rlw, v);
+		assert(rlw_get_run_bit(self->rlw) == v);
+	}
+
+	if (no_literal && rlw_get_run_bit(self->rlw) == v &&
+		run_len < RLW_LARGEST_RUNNING_COUNT) {
+		rlw_set_running_len(self->rlw, run_len + 1);
+		assert(rlw_get_running_len(self->rlw) == run_len + 1);
+		return 0;
+	}
+
+	else {
+		buffer_push_rlw(self, 0);
+
+		assert(rlw_get_running_len(self->rlw) == 0);
+		assert(rlw_get_run_bit(self->rlw) == 0);
+		assert(rlw_get_literal_words(self->rlw) == 0);
+
+		rlw_set_run_bit(self->rlw, v);
+		assert(rlw_get_run_bit(self->rlw) == v);
+
+		rlw_set_running_len(self->rlw, 1);
+		assert(rlw_get_running_len(self->rlw) == 1);
+		assert(rlw_get_literal_words(self->rlw) == 0);
+		return 1;
+	}
+}
+
+size_t ewah_add(struct ewah_bitmap *self, eword_t word)
+{
+	self->bit_size += BITS_IN_WORD;
+
+	if (word == 0)
+		return add_empty_word(self, false);
+
+	if (word == (eword_t)(~0))
+		return add_empty_word(self, true);
+
+	return add_literal(self, word);
+}
+
+void ewah_set(struct ewah_bitmap *self, size_t i)
+{
+	const size_t dist =
+		(i + BITS_IN_WORD) / BITS_IN_WORD -
+		(self->bit_size + BITS_IN_WORD - 1) / BITS_IN_WORD;
+
+	assert(i >= self->bit_size);
+
+	self->bit_size = i + 1;
+
+	if (dist > 0) {
+		if (dist > 1)
+			add_empty_words(self, false, dist - 1);
+
+		add_literal(self, (eword_t)1 << (i % BITS_IN_WORD));
+		return;
+	}
+
+	if (rlw_get_literal_words(self->rlw) == 0) {
+		rlw_set_running_len(self->rlw, rlw_get_running_len(self->rlw) - 1);
+		add_literal(self, (eword_t)1 << (i % BITS_IN_WORD));
+		return;
+	}
+
+	self->buffer[self->buffer_size - 1] |= ((eword_t)1 << (i % BITS_IN_WORD));
+
+	/* check if we just completed a stream of 1s */
+	if (self->buffer[self->buffer_size - 1] == (eword_t)(~0)) {
+		self->buffer[--self->buffer_size] = 0;
+		rlw_set_literal_words(self->rlw, rlw_get_literal_words(self->rlw) - 1);
+		add_empty_word(self, true);
+	}
+}
+
+void ewah_each_bit(struct ewah_bitmap *self, void (*callback)(size_t, void*), void *payload)
+{
+	size_t pos = 0;
+	size_t pointer = 0;
+	size_t k;
+
+	while (pointer < self->buffer_size) {
+		eword_t *word = &self->buffer[pointer];
+
+		if (rlw_get_run_bit(word)) {
+			size_t len = rlw_get_running_len(word) * BITS_IN_WORD;
+			for (k = 0; k < len; ++k, ++pos) {
+				callback(pos, payload);
+			}
+		} else {
+			pos += rlw_get_running_len(word) * BITS_IN_WORD;
+		}
+
+		++pointer;
+
+		for (k = 0; k < rlw_get_literal_words(word); ++k) {
+			int c;
+
+			/* todo: zero count optimization */
+			for (c = 0; c < BITS_IN_WORD; ++c, ++pos) {
+				if ((self->buffer[pointer] & ((eword_t)1 << c)) != 0) {
+					callback(pos, payload);
+				}
+			}
+
+			++pointer;
+		}
+	}
+}
+
+struct ewah_bitmap *ewah_new(void)
+{
+	struct ewah_bitmap *bitmap;
+
+	bitmap = ewah_malloc(sizeof(struct ewah_bitmap));
+	if (bitmap == NULL)
+		return NULL;
+
+	bitmap->buffer = ewah_malloc(32 * sizeof(eword_t));
+	bitmap->alloc_size = 32;
+
+	ewah_clear(bitmap);
+
+	return bitmap;
+}
+
+void ewah_clear(struct ewah_bitmap *bitmap)
+{
+	bitmap->buffer_size = 1;
+	bitmap->buffer[0] = 0;
+	bitmap->bit_size = 0;
+	bitmap->rlw = bitmap->buffer;
+}
+
+void ewah_free(struct ewah_bitmap *bitmap)
+{
+	if (bitmap->alloc_size)
+		free(bitmap->buffer);
+
+	free(bitmap);
+}
+
+static void read_new_rlw(struct ewah_iterator *it)
+{
+	const eword_t *word = NULL;
+
+	it->literals = 0;
+	it->compressed = 0;
+
+	while (1) {
+		word = &it->buffer[it->pointer];
+
+		it->rl = rlw_get_running_len(word);
+		it->lw = rlw_get_literal_words(word);
+		it->b = rlw_get_run_bit(word);
+
+		if (it->rl || it->lw)
+			return;
+
+		if (it->pointer < it->buffer_size - 1) {
+			it->pointer++;
+		} else {
+			it->pointer = it->buffer_size;
+			return;
+		}
+	}
+}
+
+bool ewah_iterator_next(eword_t *next, struct ewah_iterator *it)
+{
+	if (it->pointer >= it->buffer_size)
+		return false;
+
+	if (it->compressed < it->rl) {
+		it->compressed++;
+		*next = it->b ? (eword_t)(~0) : 0;
+	} else {
+		assert(it->literals < it->lw);
+
+		it->literals++;
+		it->pointer++;
+
+		assert(it->pointer < it->buffer_size);
+
+		*next = it->buffer[it->pointer];
+	}
+
+	if (it->compressed == it->rl && it->literals == it->lw) {
+		if (++it->pointer < it->buffer_size)
+			read_new_rlw(it);
+	}
+
+	return true;
+}
+
+void ewah_iterator_init(struct ewah_iterator *it, struct ewah_bitmap *parent)
+{
+	it->buffer = parent->buffer;
+	it->buffer_size = parent->buffer_size;
+	it->pointer = 0;
+
+	it->lw = 0;
+	it->rl = 0;
+	it->compressed = 0;
+	it->literals = 0;
+	it->b = false;
+
+	if (it->pointer < it->buffer_size)
+		read_new_rlw(it);
+}
+
+void ewah_dump(struct ewah_bitmap *bitmap)
+{
+	size_t i;
+	fprintf(stderr, "%zu bits | %zu words | ", bitmap->bit_size, bitmap->buffer_size);
+
+	for (i = 0; i < bitmap->buffer_size; ++i)
+		fprintf(stderr, "%016llx ", (unsigned long long)bitmap->buffer[i]);
+
+	fprintf(stderr, "\n");
+}
+
+void ewah_not(struct ewah_bitmap *self)
+{
+	size_t pointer = 0;
+
+	while (pointer < self->buffer_size) {
+		eword_t *word = &self->buffer[pointer];
+		size_t literals, k;
+
+		rlw_xor_run_bit(word);
+		++pointer;
+
+		literals = rlw_get_literal_words(word);
+		for (k = 0; k < literals; ++k) {
+			self->buffer[pointer] = ~self->buffer[pointer];
+			++pointer;
+		}
+	}
+}
+
+void ewah_xor(
+	struct ewah_bitmap *bitmap_i,
+	struct ewah_bitmap *bitmap_j,
+	struct ewah_bitmap *out)
+{
+	struct rlw_iterator rlw_i;
+	struct rlw_iterator rlw_j;
+
+	rlwit_init(&rlw_i, bitmap_i);
+	rlwit_init(&rlw_j, bitmap_j);
+
+	while (rlwit_word_size(&rlw_i) > 0 && rlwit_word_size(&rlw_j) > 0) {
+		while (rlw_i.rlw.running_len > 0 || rlw_j.rlw.running_len > 0) {
+			struct rlw_iterator *prey, *predator;
+			size_t index;
+			bool negate_words;
+
+			if (rlw_i.rlw.running_len < rlw_j.rlw.running_len) {
+				prey = &rlw_i;
+				predator = &rlw_j;
+			} else {
+				prey = &rlw_j;
+				predator = &rlw_i;
+			}
+
+			negate_words = !!predator->rlw.running_bit;
+			index = rlwit_discharge(prey, out, predator->rlw.running_len, negate_words);
+
+			ewah_add_empty_words(out, negate_words, predator->rlw.running_len - index);
+			rlwit_discard_first_words(predator, predator->rlw.running_len);
+		}
+
+		size_t literals = min_size(rlw_i.rlw.literal_words, rlw_j.rlw.literal_words);
+
+		if (literals) {
+			size_t k;
+
+			for (k = 0; k < literals; ++k) {
+				ewah_add(out,
+					rlw_i.buffer[rlw_i.literal_word_start + k] ^
+					rlw_j.buffer[rlw_j.literal_word_start + k]
+				);
+			}
+
+			rlwit_discard_first_words(&rlw_i, literals);
+			rlwit_discard_first_words(&rlw_j, literals);
+		}
+	}
+
+	if (rlwit_word_size(&rlw_i) > 0) {
+		rlwit_discharge(&rlw_i, out, ~0, false);
+	} else {
+		rlwit_discharge(&rlw_j, out, ~0, false);
+	}
+
+	out->bit_size = max_size(bitmap_i->bit_size, bitmap_j->bit_size);
+}
+
+void ewah_and(
+	struct ewah_bitmap *bitmap_i,
+	struct ewah_bitmap *bitmap_j,
+	struct ewah_bitmap *out)
+{
+	struct rlw_iterator rlw_i;
+	struct rlw_iterator rlw_j;
+
+	rlwit_init(&rlw_i, bitmap_i);
+	rlwit_init(&rlw_j, bitmap_j);
+
+	while (rlwit_word_size(&rlw_i) > 0 && rlwit_word_size(&rlw_j) > 0) {
+		while (rlw_i.rlw.running_len > 0 || rlw_j.rlw.running_len > 0) {
+			struct rlw_iterator *prey, *predator;
+
+			if (rlw_i.rlw.running_len < rlw_j.rlw.running_len) {
+				prey = &rlw_i;
+				predator = &rlw_j;
+			} else {
+				prey = &rlw_j;
+				predator = &rlw_i;
+			}
+
+			if (predator->rlw.running_bit == 0) {
+				ewah_add_empty_words(out, false, predator->rlw.running_len);
+				rlwit_discard_first_words(prey, predator->rlw.running_len);
+				rlwit_discard_first_words(predator, predator->rlw.running_len);
+			} else {
+				size_t index;
+				index = rlwit_discharge(prey, out, predator->rlw.running_len, false);
+				ewah_add_empty_words(out, false, predator->rlw.running_len - index);
+				rlwit_discard_first_words(predator, predator->rlw.running_len);
+			}
+		}
+
+		size_t literals = min_size(rlw_i.rlw.literal_words, rlw_j.rlw.literal_words);
+
+		if (literals) {
+			size_t k;
+
+			for (k = 0; k < literals; ++k) {
+				ewah_add(out,
+					rlw_i.buffer[rlw_i.literal_word_start + k] &
+					rlw_j.buffer[rlw_j.literal_word_start + k]
+				);
+			}
+
+			rlwit_discard_first_words(&rlw_i, literals);
+			rlwit_discard_first_words(&rlw_j, literals);
+		}
+	}
+
+	if (rlwit_word_size(&rlw_i) > 0) {
+		rlwit_discharge_empty(&rlw_i, out);
+	} else {
+		rlwit_discharge_empty(&rlw_j, out);
+	}
+
+	out->bit_size = max_size(bitmap_i->bit_size, bitmap_j->bit_size);
+}
+
+void ewah_and_not(
+	struct ewah_bitmap *bitmap_i,
+	struct ewah_bitmap *bitmap_j,
+	struct ewah_bitmap *out)
+{
+	struct rlw_iterator rlw_i;
+	struct rlw_iterator rlw_j;
+
+	rlwit_init(&rlw_i, bitmap_i);
+	rlwit_init(&rlw_j, bitmap_j);
+
+	while (rlwit_word_size(&rlw_i) > 0 && rlwit_word_size(&rlw_j) > 0) {
+		while (rlw_i.rlw.running_len > 0 || rlw_j.rlw.running_len > 0) {
+			struct rlw_iterator *prey, *predator;
+
+			if (rlw_i.rlw.running_len < rlw_j.rlw.running_len) {
+				prey = &rlw_i;
+				predator = &rlw_j;
+			} else {
+				prey = &rlw_j;
+				predator = &rlw_i;
+			}
+
+			if ((predator->rlw.running_bit && prey == &rlw_i) ||
+				(!predator->rlw.running_bit && prey != &rlw_i)) {
+				ewah_add_empty_words(out, false, predator->rlw.running_len);
+				rlwit_discard_first_words(prey, predator->rlw.running_len);
+				rlwit_discard_first_words(predator, predator->rlw.running_len);
+			} else {
+				size_t index;
+				bool negate_words;
+
+				negate_words = (&rlw_i != prey);
+				index = rlwit_discharge(prey, out, predator->rlw.running_len, negate_words);
+				ewah_add_empty_words(out, negate_words, predator->rlw.running_len - index);
+				rlwit_discard_first_words(predator, predator->rlw.running_len);
+			}
+		}
+
+		size_t literals = min_size(rlw_i.rlw.literal_words, rlw_j.rlw.literal_words);
+
+		if (literals) {
+			size_t k;
+
+			for (k = 0; k < literals; ++k) {
+				ewah_add(out,
+					rlw_i.buffer[rlw_i.literal_word_start + k] &
+					~(rlw_j.buffer[rlw_j.literal_word_start + k])
+				);
+			}
+
+			rlwit_discard_first_words(&rlw_i, literals);
+			rlwit_discard_first_words(&rlw_j, literals);
+		}
+	}
+
+	if (rlwit_word_size(&rlw_i) > 0) {
+		rlwit_discharge(&rlw_i, out, ~0, false);
+	} else {
+		rlwit_discharge_empty(&rlw_j, out);
+	}
+
+	out->bit_size = max_size(bitmap_i->bit_size, bitmap_j->bit_size);
+}
+
+void ewah_or(
+	struct ewah_bitmap *bitmap_i,
+	struct ewah_bitmap *bitmap_j,
+	struct ewah_bitmap *out)
+{
+	struct rlw_iterator rlw_i;
+	struct rlw_iterator rlw_j;
+
+	rlwit_init(&rlw_i, bitmap_i);
+	rlwit_init(&rlw_j, bitmap_j);
+
+	while (rlwit_word_size(&rlw_i) > 0 && rlwit_word_size(&rlw_j) > 0) {
+		while (rlw_i.rlw.running_len > 0 || rlw_j.rlw.running_len > 0) {
+			struct rlw_iterator *prey, *predator;
+
+			if (rlw_i.rlw.running_len < rlw_j.rlw.running_len) {
+				prey = &rlw_i;
+				predator = &rlw_j;
+			} else {
+				prey = &rlw_j;
+				predator = &rlw_i;
+			}
+
+
+			if (predator->rlw.running_bit) {
+				ewah_add_empty_words(out, false, predator->rlw.running_len);
+				rlwit_discard_first_words(prey, predator->rlw.running_len);
+				rlwit_discard_first_words(predator, predator->rlw.running_len);
+			} else {
+				size_t index;
+				index = rlwit_discharge(prey, out, predator->rlw.running_len, false);
+				ewah_add_empty_words(out, false, predator->rlw.running_len - index);
+				rlwit_discard_first_words(predator, predator->rlw.running_len);
+			}
+		}
+
+		size_t literals = min_size(rlw_i.rlw.literal_words, rlw_j.rlw.literal_words);
+
+		if (literals) {
+			size_t k;
+
+			for (k = 0; k < literals; ++k) {
+				ewah_add(out,
+					rlw_i.buffer[rlw_i.literal_word_start + k] |
+					rlw_j.buffer[rlw_j.literal_word_start + k]
+				);
+			}
+
+			rlwit_discard_first_words(&rlw_i, literals);
+			rlwit_discard_first_words(&rlw_j, literals);
+		}
+	}
+
+	if (rlwit_word_size(&rlw_i) > 0) {
+		rlwit_discharge(&rlw_i, out, ~0, false);
+	} else {
+		rlwit_discharge(&rlw_j, out, ~0, false);
+	}
+
+	out->bit_size = max_size(bitmap_i->bit_size, bitmap_j->bit_size);
+}
+
+
+#define BITMAP_POOL_MAX 16
+static struct ewah_bitmap *bitmap_pool[BITMAP_POOL_MAX];
+static size_t bitmap_pool_size;
+
+struct ewah_bitmap *ewah_pool_new(void)
+{
+	if (bitmap_pool_size)
+		return bitmap_pool[--bitmap_pool_size];
+
+	return ewah_new();
+}
+
+void ewah_pool_free(struct ewah_bitmap *bitmap)
+{
+	if (bitmap == NULL)
+		return;
+
+	if (bitmap_pool_size == BITMAP_POOL_MAX ||
+		bitmap->alloc_size == 0) {
+		ewah_free(bitmap);
+		return;
+	}
+
+	ewah_clear(bitmap);
+	bitmap_pool[bitmap_pool_size++] = bitmap;
+}
+
+uint32_t
+ewah_checksum(struct ewah_bitmap *self)
+{
+	const uint8_t *p = (uint8_t *)self->buffer;
+	uint32_t crc = (uint32_t)self->bit_size;
+	size_t size = self->buffer_size * sizeof(eword_t);
+
+	while (size--)
+		crc = (crc << 5) - crc + (uint32_t)*p++;
+
+	return crc;
+}
diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
new file mode 100644
index 0000000..b44c90e
--- /dev/null
+++ b/ewah/ewah_io.c
@@ -0,0 +1,199 @@
+/**
+ * Copyright 2013, GitHub, Inc
+ * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
+ *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+#include <stdlib.h>
+#include <unistd.h>
+#include <stdio.h>
+
+#include "git-compat-util.h"
+#include "ewok.h"
+
+int ewah_serialize_native(struct ewah_bitmap *self, int fd)
+{
+	uint32_t write32;
+	size_t to_write = self->buffer_size * 8;
+
+	/* 32 bit -- bit size fr the map */
+	write32 = (uint32_t)self->bit_size;
+	if (write(fd, &write32, 4) != 4)
+		return -1;
+
+	/** 32 bit -- number of compressed 64-bit words */
+	write32 = (uint32_t)self->buffer_size;
+	if (write(fd, &write32, 4) != 4)
+		return -1;
+
+	if (write(fd, self->buffer, to_write) != to_write)
+		return -1;
+
+	/** 32 bit -- position for the RLW */
+	write32 = self->rlw - self->buffer;
+	if (write(fd, &write32, 4) != 4)
+		return -1;
+
+	return (3 * 4) + to_write;
+}
+
+int ewah_serialize(struct ewah_bitmap *self, int fd)
+{
+	size_t i;
+	eword_t dump[2048];
+	const size_t words_per_dump = sizeof(dump) / sizeof(eword_t);
+
+	/* 32 bit -- bit size fr the map */
+	uint32_t bitsize =  htonl((uint32_t)self->bit_size);
+	if (write(fd, &bitsize, 4) != 4)
+		return -1;
+
+	/** 32 bit -- number of compressed 64-bit words */
+	uint32_t word_count =  htonl((uint32_t)self->buffer_size);
+	if (write(fd, &word_count, 4) != 4)
+		return -1;
+
+	/** 64 bit x N -- compressed words */
+	const eword_t *buffer = self->buffer;
+	size_t words_left = self->buffer_size;
+
+	while (words_left >= words_per_dump) {
+		for (i = 0; i < words_per_dump; ++i, ++buffer)
+			dump[i] = htonll(*buffer);
+
+		if (write(fd, dump, sizeof(dump)) != sizeof(dump))
+			return -1;
+
+		words_left -= words_per_dump;
+	}
+
+	if (words_left) {
+		for (i = 0; i < words_left; ++i, ++buffer)
+			dump[i] = htonll(*buffer);
+
+		if (write(fd, dump, words_left * 8) != words_left * 8)
+			return -1;
+	}
+
+	/** 32 bit -- position for the RLW */
+	uint32_t rlw_pos = (uint8_t*)self->rlw - (uint8_t *)self->buffer;
+	rlw_pos = htonl(rlw_pos / sizeof(eword_t));
+
+	if (write(fd, &rlw_pos, 4) != 4)
+		return -1;
+
+	return 0;
+}
+
+int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len)
+{
+	uint32_t *read32 = map;
+	eword_t *read64;
+	size_t i;
+
+	self->bit_size = ntohl(*read32++);
+	self->buffer_size = self->alloc_size = ntohl(*read32++);
+	self->buffer = ewah_realloc(self->buffer, self->alloc_size * sizeof(eword_t));
+
+	if (!self->buffer)
+		return -1;
+
+	for (i = 0, read64 = (void *)read32; i < self->buffer_size; ++i) {
+		self->buffer[i] = ntohll(*read64++);
+	}
+
+	read32 = (void *)read64;
+	self->rlw = self->buffer + ntohl(*read32++);
+
+	return (char *)read32 - (char *)map;
+}
+
+int ewah_read_mmap_native(struct ewah_bitmap *self, void *map, size_t len)
+{
+	uint32_t *read32 = map;
+
+	self->bit_size = *read32++;
+	self->buffer_size = *read32++;
+
+	if (self->alloc_size)
+		free(self->buffer);
+
+	self->alloc_size = 0;
+	self->buffer = (eword_t *)read32;
+
+	read32 += self->buffer_size * 2;
+	self->rlw = self->buffer + *read32++;
+
+	return (char *)read32 - (char *)map;
+}
+
+int ewah_deserialize(struct ewah_bitmap *self, int fd)
+{
+	size_t i;
+	eword_t dump[2048];
+	const size_t words_per_dump = sizeof(dump) / sizeof(eword_t);
+
+	ewah_clear(self);
+
+	/* 32 bit -- bit size fr the map */
+	uint32_t bitsize;
+	if (read(fd, &bitsize, 4) != 4)
+		return -1;
+
+	self->bit_size = (size_t)ntohl(bitsize);
+
+	/** 32 bit -- number of compressed 64-bit words */
+	uint32_t word_count;
+	if (read(fd, &word_count, 4) != 4)
+		return -1;
+
+	self->buffer_size = self->alloc_size = (size_t)ntohl(word_count);
+	self->buffer = ewah_realloc(self->buffer, self->alloc_size * sizeof(eword_t));
+
+	if (!self->buffer)
+		return -1;
+
+	/** 64 bit x N -- compressed words */
+	eword_t *buffer = self->buffer;
+	size_t words_left = self->buffer_size;
+
+	while (words_left >= words_per_dump) {
+		if (read(fd, dump, sizeof(dump)) != sizeof(dump))
+			return -1;
+
+		for (i = 0; i < words_per_dump; ++i, ++buffer)
+			*buffer = ntohll(dump[i]);
+
+		words_left -= words_per_dump;
+	}
+
+	if (words_left) {
+		if (read(fd, dump, words_left * 8) != words_left * 8)
+			return -1;
+
+		for (i = 0; i < words_left; ++i, ++buffer)
+			*buffer = ntohll(dump[i]);
+	}
+
+	/** 32 bit -- position for the RLW */
+	uint32_t rlw_pos;
+	if (read(fd, &rlw_pos, 4) != 4)
+		return -1;
+
+	self->rlw = self->buffer + ntohl(rlw_pos);
+
+	return 0;
+}
diff --git a/ewah/ewah_rlw.c b/ewah/ewah_rlw.c
new file mode 100644
index 0000000..7e10fd4
--- /dev/null
+++ b/ewah/ewah_rlw.c
@@ -0,0 +1,124 @@
+/**
+ * Copyright 2013, GitHub, Inc
+ * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
+ *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+#include <assert.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <string.h>
+
+#include "ewok.h"
+#include "ewok_rlw.h"
+
+extern size_t ewah_add_empty_words(struct ewah_bitmap *self, bool v, size_t number);
+extern void ewah_add_dirty_words(
+	struct ewah_bitmap *self, const eword_t *buffer, size_t number, bool negate);
+
+static inline bool next_word(struct rlw_iterator *it)
+{
+	if (it->pointer >= it->size)
+		return false;
+
+	it->rlw.word = &it->buffer[it->pointer];
+	it->pointer += rlw_get_literal_words(it->rlw.word) + 1;
+
+	it->rlw.literal_words = rlw_get_literal_words(it->rlw.word);
+	it->rlw.running_len = rlw_get_running_len(it->rlw.word);
+	it->rlw.running_bit = rlw_get_run_bit(it->rlw.word);
+	it->rlw.literal_word_offset = 0;
+
+	return true;
+}
+
+void rlwit_init(struct rlw_iterator *it, struct ewah_bitmap *bitmap)
+{
+	it->buffer = bitmap->buffer;
+	it->size = bitmap->buffer_size;
+	it->pointer = 0;
+
+	next_word(it);
+
+	it->literal_word_start = rlwit_literal_words(it) + it->rlw.literal_word_offset;
+}
+
+void rlwit_discard_first_words(struct rlw_iterator *it, size_t x)
+{
+	while (x > 0) {
+		size_t discard;
+
+		if (it->rlw.running_len > x) {
+			it->rlw.running_len -= x;
+			return;
+		}
+
+		x -= it->rlw.running_len;
+		it->rlw.running_len = 0;
+
+		discard = (x > it->rlw.literal_words) ? it->rlw.literal_words : x;
+
+		it->literal_word_start += discard;
+		it->rlw.literal_words -= discard;
+		x -= discard;
+
+		if (x > 0 || rlwit_word_size(it) == 0) {
+			if (!next_word(it))
+				break;
+
+			it->literal_word_start =
+				rlwit_literal_words(it) + it->rlw.literal_word_offset;
+		}
+	}
+}
+
+size_t rlwit_discharge(
+	struct rlw_iterator *it, struct ewah_bitmap *out, size_t max, bool negate)
+{
+	size_t index = 0;
+
+	while (index < max && rlwit_word_size(it) > 0) {
+		size_t pd, pl = it->rlw.running_len;
+
+		if (index + pl > max) {
+			pl = max - index;
+		}
+
+		ewah_add_empty_words(out, it->rlw.running_bit ^ negate, pl);
+		index += pl;
+
+		pd = it->rlw.literal_words;
+		if (pd + index > max) {
+			pd = max - index;
+		}
+
+		ewah_add_dirty_words(out,
+			it->buffer + it->literal_word_start, pd, negate);
+
+		rlwit_discard_first_words(it, pd + pl);
+		index += pd;
+	}
+
+	return index;
+}
+
+void rlwit_discharge_empty(struct rlw_iterator *it, struct ewah_bitmap *out)
+{
+	while (rlwit_word_size(it) > 0) {
+		ewah_add_empty_words(out, false, rlwit_word_size(it));
+		rlwit_discard_first_words(it, rlwit_word_size(it));
+	}
+}
diff --git a/ewah/ewok.h b/ewah/ewok.h
new file mode 100644
index 0000000..691e21e
--- /dev/null
+++ b/ewah/ewok.h
@@ -0,0 +1,194 @@
+/**
+ * Copyright 2013, GitHub, Inc
+ * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
+ *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+#ifndef __EWOK_BITMAP_C__
+#define __EWOK_BITMAP_C__
+
+#include <stdbool.h>
+#include <stdint.h>
+
+#ifndef ewah_malloc
+#	define ewah_malloc malloc
+#endif
+#ifndef ewah_realloc
+#	define ewah_realloc realloc
+#endif
+#ifndef ewah_calloc
+#	define ewah_calloc calloc
+#endif
+
+typedef uint64_t eword_t;
+#define BITS_IN_WORD (sizeof(eword_t) * 8)
+
+struct ewah_bitmap {
+	eword_t *buffer;
+	size_t buffer_size;
+	size_t alloc_size;
+	size_t bit_size;
+	eword_t *rlw;
+};
+
+typedef void (*ewah_callback)(size_t pos, void *);
+
+struct ewah_bitmap *ewah_pool_new(void);
+void ewah_pool_free(struct ewah_bitmap *bitmap);
+
+/**
+ * Allocate a new EWAH Compressed bitmap
+ */
+struct ewah_bitmap *ewah_new(void);
+
+/**
+ * Clear all the bits in the bitmap. Does not free or resize
+ * memory.
+ */
+void ewah_clear(struct ewah_bitmap *bitmap);
+
+/**
+ * Free all the memory of the bitmap
+ */
+void ewah_free(struct ewah_bitmap *bitmap);
+
+int ewah_serialize(struct ewah_bitmap *self, int fd);
+int ewah_serialize_native(struct ewah_bitmap *self, int fd);
+
+int ewah_deserialize(struct ewah_bitmap *self, int fd);
+int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len);
+int ewah_read_mmap_native(struct ewah_bitmap *self, void *map, size_t len);
+
+uint32_t ewah_checksum(struct ewah_bitmap *self);
+
+/**
+ * Logical not (bitwise negation) in-place on the bitmap
+ *
+ * This operation is linear time based on the size of the bitmap.
+ */
+void ewah_not(struct ewah_bitmap *self);
+
+/**
+ * Call the given callback with the position of every single bit
+ * that has been set on the bitmap.
+ *
+ * This is an efficient operation that does not fully decompress
+ * the bitmap.
+ */
+void ewah_each_bit(struct ewah_bitmap *self, ewah_callback callback, void *payload);
+
+/**
+ * Set a given bit on the bitmap.
+ *
+ * The bit at position `pos` will be set to true. Because of the
+ * way that the bitmap is compressed, a set bit cannot be unset
+ * later on.
+ *
+ * Furthermore, since the bitmap uses streaming compression, bits
+ * can only set incrementally.
+ *
+ * E.g.
+ *		ewah_set(bitmap, 1); // ok
+ *		ewah_set(bitmap, 76); // ok
+ *		ewah_set(bitmap, 77); // ok
+ *		ewah_set(bitmap, 8712800127); // ok
+ *		ewah_set(bitmap, 25); // failed, assert raised
+ */
+void ewah_set(struct ewah_bitmap *self, size_t i);
+
+struct ewah_iterator {
+	const eword_t *buffer;
+	size_t buffer_size;
+
+	size_t pointer;
+	eword_t compressed, literals;
+	eword_t rl, lw;
+	bool b;
+};
+
+/**
+ * Initialize a new iterator to run through the bitmap in uncompressed form.
+ *
+ * The iterator can be stack allocated. The underlying bitmap must not be freed
+ * before the iteration is over.
+ *
+ * E.g.
+ *
+ *		struct ewah_bitmap *bitmap = ewah_new();
+ *		struct ewah_iterator it;
+ *
+ *		ewah_iterator_init(&it, bitmap);
+ */
+void ewah_iterator_init(struct ewah_iterator *it, struct ewah_bitmap *parent);
+
+/**
+ * Yield every single word in the bitmap in uncompressed form. This is:
+ * yield single words (32-64 bits) where each bit represents an actual
+ * bit from the bitmap.
+ *
+ * Return: true if a word was yield, false if there are no words left
+ */
+bool ewah_iterator_next(eword_t *next, struct ewah_iterator *it);
+
+void ewah_or(
+	struct ewah_bitmap *bitmap_i,
+	struct ewah_bitmap *bitmap_j,
+	struct ewah_bitmap *out);
+
+void ewah_and_not(
+	struct ewah_bitmap *bitmap_i,
+	struct ewah_bitmap *bitmap_j,
+	struct ewah_bitmap *out);
+
+void ewah_xor(
+	struct ewah_bitmap *bitmap_i,
+	struct ewah_bitmap *bitmap_j,
+	struct ewah_bitmap *out);
+
+void ewah_and(
+	struct ewah_bitmap *bitmap_i,
+	struct ewah_bitmap *bitmap_j,
+	struct ewah_bitmap *out);
+
+void ewah_dump(struct ewah_bitmap *bitmap);
+
+/**
+ * Uncompressed, old-school bitmap that can be efficiently compressed
+ * into an `ewah_bitmap`.
+ */
+struct bitmap {
+	eword_t *words;
+	size_t word_alloc;
+};
+
+struct bitmap *bitmap_new(void);
+void bitmap_set(struct bitmap *self, size_t pos);
+void bitmap_clear(struct bitmap *self, size_t pos);
+bool bitmap_get(struct bitmap *self, size_t pos);
+void bitmap_reset(struct bitmap *bitmap);
+void bitmap_free(struct bitmap *self);
+bool bitmap_equals(struct bitmap *self, struct bitmap *other);
+
+struct ewah_bitmap * bitmap_to_ewah(struct bitmap *bitmap);
+struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah);
+
+void bitmap_and_not_inplace(struct bitmap *self, struct bitmap *other);
+void bitmap_or_inplace(struct bitmap *self, struct ewah_bitmap *other);
+
+void bitmap_each_bit(struct bitmap *self, ewah_callback callback, void *data);
+size_t bitmap_popcount(struct bitmap *self);
+
+#endif
diff --git a/ewah/ewok_rlw.h b/ewah/ewok_rlw.h
new file mode 100644
index 0000000..2e31836
--- /dev/null
+++ b/ewah/ewok_rlw.h
@@ -0,0 +1,114 @@
+/**
+ * Copyright 2013, GitHub, Inc
+ * Copyright 2009-2013, Daniel Lemire, Cliff Moon,
+ *	David McIntosh, Robert Becho, Google Inc. and Veronika Zenz
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+#ifndef __EWOK_RLW_H__
+#define __EWOK_RLW_H__
+
+#define RLW_RUNNING_BITS (sizeof(eword_t) * 4)
+#define RLW_LITERAL_BITS (sizeof(eword_t) * 8 - 1 - RLW_RUNNING_BITS)
+
+#define RLW_LARGEST_RUNNING_COUNT (((eword_t)1 << RLW_RUNNING_BITS) - 1)
+#define RLW_LARGEST_LITERAL_COUNT (((eword_t)1 << RLW_LITERAL_BITS) - 1)
+
+#define RLW_LARGEST_RUNNING_COUNT_SHIFT (RLW_LARGEST_RUNNING_COUNT << 1)
+
+#define RLW_RUNNING_LEN_PLUS_BIT (((eword_t)1 << (RLW_RUNNING_BITS + 1)) - 1)
+
+static bool rlw_get_run_bit(const eword_t *word)
+{
+	return *word & (eword_t)1;
+}
+
+static inline void rlw_set_run_bit(eword_t *word, bool b)
+{
+	if (b) {
+		*word |= (eword_t)1;
+	} else {
+		*word &= (eword_t)(~1);
+	}
+}
+
+static inline void rlw_xor_run_bit(eword_t *word)
+{
+	if (*word & 1) {
+		*word &= (eword_t)(~1);
+	} else {
+		*word |= (eword_t)1;
+	}
+}
+
+static inline void rlw_set_running_len(eword_t *word, eword_t l)
+{
+	*word |= RLW_LARGEST_RUNNING_COUNT_SHIFT;
+	*word &= (l << 1) | (~RLW_LARGEST_RUNNING_COUNT_SHIFT);
+}
+
+static inline eword_t rlw_get_running_len(const eword_t *word)
+{
+	return (*word >> 1) & RLW_LARGEST_RUNNING_COUNT;
+}
+
+static inline eword_t rlw_get_literal_words(const eword_t *word)
+{
+	return *word >> (1 + RLW_RUNNING_BITS);
+}
+
+static inline void rlw_set_literal_words(eword_t *word, eword_t l)
+{
+	*word |= ~RLW_RUNNING_LEN_PLUS_BIT;
+	*word &= (l << (RLW_RUNNING_BITS + 1)) | RLW_RUNNING_LEN_PLUS_BIT;
+}
+
+static inline eword_t rlw_size(const eword_t *self)
+{
+	return rlw_get_running_len(self) + rlw_get_literal_words(self);
+}
+
+struct rlw_iterator {
+	const eword_t *buffer;
+	size_t size;
+	size_t pointer;
+	size_t literal_word_start;
+
+	struct {
+		const eword_t *word;
+		int literal_words;
+		int running_len;
+		int literal_word_offset;
+		int running_bit;
+	} rlw;
+};
+
+void rlwit_init(struct rlw_iterator *it, struct ewah_bitmap *bitmap);
+void rlwit_discard_first_words(struct rlw_iterator *it, size_t x);
+size_t rlwit_discharge(
+	struct rlw_iterator *it, struct ewah_bitmap *out, size_t max, bool negate);
+void rlwit_discharge_empty(struct rlw_iterator *it, struct ewah_bitmap *out);
+
+static inline size_t rlwit_word_size(struct rlw_iterator *it)
+{
+	return it->rlw.running_len + it->rlw.literal_words;
+}
+
+static inline size_t rlwit_literal_words(struct rlw_iterator *it)
+{
+	return it->pointer - it->rlw.literal_words;
+}
+
+#endif
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (7 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 08/16] ewah: compressed bitmap implementation Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-25  5:42   ` Shawn Pearce
  2013-06-25 15:58   ` Thomas Rast
  2013-06-24 23:23 ` [PATCH 10/16] pack-objects: use bitmaps when packing objects Vicent Marti
                   ` (7 subsequent siblings)
  16 siblings, 2 replies; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

This is the technical documentation and design rationale for the new
Bitmap v2 on-disk format.
---
 Documentation/technical/bitmap-format.txt |  235 +++++++++++++++++++++++++++++
 1 file changed, 235 insertions(+)
 create mode 100644 Documentation/technical/bitmap-format.txt

diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
new file mode 100644
index 0000000..5400082
--- /dev/null
+++ b/Documentation/technical/bitmap-format.txt
@@ -0,0 +1,235 @@
+GIT bitmap v2 format & rationale
+================================
+
+	- A header appears at the beginning, using the same format
+	as JGit's original bitmap indexes.
+
+		4-byte signature: {'B', 'I', 'T', 'M'}
+
+		2-byte version number (network byte order)
+			The current implementation only supports version 2
+			of the bitmap index. The rationale for this is explained
+			in this document.
+
+		2-byte flags (network byte order)
+
+			The folowing flags are supported:
+
+			- BITMAP_OPT_FULL_DAG (0x1) REQUIRED
+			This flag must always be present. It implies that the bitmap
+			index has been generated for a packfile with full closure
+			(i.e. where every single object in the packfile can find
+			 its parent links inside the same packfile). This is a
+			requirement for the bitmap index format, also present in JGit,
+			that greatly reduces the complexity of the implementation.
+
+			- BITMAP_OPT_LE_BITMAPS (0x2)
+			If present, this implies that that the EWAH bitmaps in this
+			index has been serialized to disk in little-endian byte order.
+			Note that this only applies to the actual bitmaps, not to the
+			Git data structures in the index, which are always in Network
+			Byte order as it's costumary.
+
+			- BITMAP_OPT_BE_BITMAPS (0x4)
+			If present, this implies that the EWAH bitmaps have been serialized
+			using big-endian byte order (NWO). If the flag is missing, **the
+			default is to assume that the bitmaps are in big-endian**.
+
+			- BITMAP_OPT_HASH_CACHE (0x8)
+			If present, a hash cache for finding delta bases will be available
+			right after the header block in this index. See the following
+			section for details.
+
+		4-byte entry count (network byte order)
+
+			The total count of entries (bitmapped commits) in this bitmap index.
+
+		20-byte checksum
+
+			The SHA1 checksum of the pack this bitmap index belongs to.
+
+	- An OPTIONAL delta cache follows the header.
+
+		The cache is formed by `n` 4-byte hashes in a row, where `n` is
+		the amount of objects in the indexed packfile. Note that this amount
+		is the **total number of objects** and is not related to the
+		number of commits that have been selected and indexed in the
+		bitmap index.
+
+		The hashes are stored in Network Byte Order and they are the same
+		values generated by a normal revision walk during the `pack-objects`
+		phase.
+
+		The `n`nth hash in the cache is the name hash for the `n`th object
+		in the index for the indexed packfile.
+
+		[RATIONALE]:
+
+		The bitmap index allows us to skip the Counting Objects phase
+		during `pack-objects` and yield all the OIDs that would be reachable
+		("WANTS") when generating the pack.
+
+		This optimization, however, means that we're adding objects to the
+		packfile straight from the packfile index, and hence we are lacking
+		path information for the objects that would normally be generated
+		during the "Counting Objects" phase.
+
+		This path information for each object is hashed and used as a very
+		effective way to find good delta bases when compressing the packfile;
+		without these hashes, the resulting packfiles are much less optimal.
+
+		By storing all the hashes in a cache together with the bitmapsin
+		the bitmap index, we can yield not only the SHA1 of all the reachable
+		objects, but also their hashes, and allow Git to be much smarter when
+		finding delta bases for packing.
+
+		If the delta cache is not available, the bitmap index will obviously
+		be smaller in disk, but the packfiles generated using this index will
+		be between 20% and 30% bigger, because of the lack of name/path
+		information when finding delta bases.
+
+	- 4 EWAH bitmaps that act as type indexes
+
+		Type indexes are serialized after the hash cache in the shape
+		of four EWAH bitmaps stored consecutively (see Appendix A for
+		the serialization format of an EWAH bitmap).
+
+		There is a bitmap for each Git object type, stored in the following
+		order:
+
+			- Commits
+			- Trees
+			- Blobs
+			- Tags
+
+		In each bitmap, the `n`th bit is set to true if the `n`th object
+		in the packfile index is of that type.
+
+		The obvious consequence is that the XOR of all 4 bitmaps will result
+		in a full set (all bits sets), and the AND of all 4 bitmaps will
+		result in an empty bitmap (no bits set).
+
+	- N EWAH bitmaps, one for each indexed commit
+
+		Where `N` is the total amount of entries in this bitmap index.
+		See Appendix A for the serialization format of an EWAH bitmap.
+
+	- An entry index with `N` entries for the indexed commits
+
+		Index entries are stored consecutively, and each entry has the
+		following format:
+
+		- 20-byte SHA1
+			The SHA1 of the commit that this bitmap indexes
+
+		- 4-byte offset (Network Byte Order)
+			The offset **from the beginning of the file** where the
+			bitmap for this commit is stored.
+
+		- 1-byte XOR-offset
+			The xor offset used to compress this bitmap. For an entry
+			in position `x`, a XOR offset of `y` means that the actual
+			bitmap representing for this commit is composed by XORing the
+			bitmap for this entry with the bitmap in entry `x-y` (i.e.
+			the bitmap `y` entries before this one).
+
+			Note that this compression can be recursive. In order to
+			XOR this entry with a previous one, the previous entry needs
+			to be decompressed first, and so on.
+
+			The hard-limit for this offset is 160 (an entry can only be
+			xor'ed against one of the 160 entries preceding it). This
+			number is always positivea, and hence entries are always xor'ed
+			with **previous** bitmaps, not bitmaps that will come afterwards
+			in the index.
+
+		- 1-byte flags for this bitmap
+			At the moment the only available flag is `0x1`, which hints
+			that this bitmap can be re-used when rebuilding bitmap indexes
+			for the repository.
+
+		- 2 bytes of RESERVED data (used right now for better packing).
+
+== Rationale for changes from the Bitmap Format v1
+
+- Serialized EWAH bitmaps can be stored in Little-Endian byte order,
+  if defined by the BITMAP_OPT_LE_BITMAPS flag in the header.
+
+  The original JGit implementation stored bitmaps in Big-Endian byte
+  order (NWO) because it was unable to `mmap` the serialized format,
+  and hence always required a full parse of the bitmap index to memory,
+  where the BE->LE conversion could be performed.
+
+  This full parse, however, requires prohibitive loading times in LE
+  machines (i.e. all modern server hardware): a repository like
+  `torvalds/linux` can have about 8mb of bitmap indexes, resulting
+  in roughly 400ms of parse time.
+
+  This is not an issue in JGit, which is capable of serving repositories
+  from a single-process daemon running on the JVM, but `git-daemon` in
+  git has been implemented with a process-based design (a new
+  `pack-objects` is spawned for each request), and the boot times
+  of parsing the bitmap index every time `pack-objects` is spawned can
+  seriously slow down requests (particularly for small fetches, where we'd
+  spend about 1.5s booting up and 300ms performing the Counting Objects
+  phase).
+
+  By storing the bitmaps in Little-Endian, we're able to `mmap` their
+  compressed data straight in memory without parsing it beforehand, and
+  since most queries don't require accessing all the serialized bitmaps,
+  we'll only page in the minimal amount of bitmaps necessary to perform
+  the reachability analysis as they are accessed.
+
+- An index of all the bitmapped commits is written at the end of the packfile,
+  instead of interpersed with the serialized bitmaps in the middle of the
+  file.
+
+  Again, the old design implied a full parse of the whole bitmap index
+  (which JGit can afford because its daemon is single-process), but it made
+  impossible `mmaping` the bitmap index file and accessing only the parts
+  required to actually solve the query.
+
+  With an index at the end of the file, we can load only this index in memory,
+  allowing for very efficient access to all the available bitmaps lazily (we
+  have their offsets in the mmaped file).
+
+- The ordering of the objects in each bitmap has changed from
+  packfile-order (the nth bit in the bitmap is the nth object in the
+  packfile) to index-order (the nth bit in the bitmap is the nth object
+  in the INDEX of the packfile).
+
+  There is not a noticeable performance difference when actually converting
+  from bitmap position to SHA1 and from SHA1 to bitmap position, but when
+  using packfile ordering like JGit does, queries need to go through the
+  reverse index (pack-revindex.c).
+
+  Generating this reverse index at runtime is **not** free (around 900ms
+  generation time for a repository like `torvalds/linux`), and once again,
+  this generation time needs to happen every time `pack-objects` is
+  spawned.
+
+  With index-ordering, the only requirement for SHA1 -> Bitmap conversions
+  is the packfile index, which we essentially load for free.
+
+
+== Appendix A: Serialization format for an EWAH bitmap
+
+Ewah bitmaps are serialized in the protocol as the JAVAEWAH
+library, making them backwards compatible with the JGit
+implementation:
+
+	- 4-byte number of bits of the resulting UNCOMPRESSED bitmap
+
+	- 4-byte number of words of the COMPRESSED bitmap, when stored
+
+	- N x 8-byte words, as specified by the previous field
+
+		This is the actual content of the compressed bitmap.
+
+	- 4-byte position of the current RLW for the compressed
+		bitmap
+
+Note that the byte order for this serialization is not defined by
+default. The byte order for all the content in a serialized EWAH
+bitmap can be known by the byte order flags in the header of the
+bitmap index file.
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 10/16] pack-objects: use bitmaps when packing objects
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (8 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 09/16] documentation: add documentation for the bitmap format Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-25 12:48   ` Ramkumar Ramachandra
                     ` (2 more replies)
  2013-06-24 23:23 ` [PATCH 11/16] rev-list: add bitmap mode to speed up lists Vicent Marti
                   ` (6 subsequent siblings)
  16 siblings, 3 replies; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

A bitmap index is used, if available, to speed up the Counting Objects
phase during `pack-objects`.

The bitmap index is a `.bitmap` file that can be found inside
`$GIT_DIR/objects/pack/`, next to its corresponding packfile, and
contains precalculated reachability information for selected commits.
The full specification of the format for these bitmap indexes can be found
in `Documentation/technical/bitmap-format.txt`.

For a given commit SHA1, if it happens to be available in the bitmap
index, its bitmap will represent every single object that is reachable
from the commit itself. The nth bit in the bitmap is the nth object in
the index of the packfile; if it's set to 1, the object is reachable.

By using the bitmaps available in the index, this commit implements a new
pair of functions:

	- `prepare_bitmap_walk`
	- `traverse_bitmap_commit_list`

This first function tries to build a bitmap of all the objects that can be
reached from the commit roots of a given `rev_info` struct by using
the following algorithm:

- If all the interesting commits for a revision walk are available in
the index, the resulting reachability bitmap is the bitwise OR of all
the individual bitmaps.

- When the full set of WANTs is not available in the index, we perform a
partial revision walk using the commits that don't have bitmaps as
roots, and limiting the revision walk as soon as we reach a commit that
has a corresponding bitmap. The earlier OR'ed bitmap with all the
indexed commits can now be completed as this walk progresses, so the end
result is the full reachability list.

- For revision walks with a HAVEs set (a set of commits that are deemed
uninteresting), first we perform the same method as for the WANTs, but
using our HAVEs as roots, in order to obtain a full reachability bitmap
of all the uninteresting commits. This bitmap then can be used to:

	a) limit the subsequent walk when building the WANTs bitmap
	b) finding the final set of interesting commits by performing an
	   AND-NOT of the WANTs and the HAVEs.

If `prepare_bitmap_walk` runs successfully, the resulting bitmap is
stored and the equivalent of a `traverse_commit_list` call can be
performed by using `traverse_bitmap_commit_list`; the bitmap version
of this call yields the objects straight from the packfile index
(without having to look them up or parse them) and hence is several
orders of magnitude faster.

If the `prepare_bitmap_walk` call fails (e.g. because no bitmap files
are available), the `rev_info` struct is left untouched, and can be used
to perform a manual rev-walk using `traverse_commit_list`.

Hence, this new pair of functions are a generic API that allows to
perform the equivalent of

	git rev-list --objects [roots...] [^uninteresting...]

for any set of commits, even if they don't have specific bitmaps
generated for them.

In this specific commit, we use the API to perform the
`Counting Objects` phase in `builtin/pack-objects.c`, although it could
be used to speed up other parts of Git that use the same mechanism.

If the pack-objects invocation is being piped to `stdout` (like a normal
`pack-objects` from `upload-pack` would be used) and bitmaps are
enabled, the new `bitmap_walk` API will be used instead of
`traverse_commit_list`.

There are two ways to enable bitmaps for pack-objecs:

	- Pass the `--use-bitmaps` flag when calling `pack-objects`
	- Set `pack.usebitmaps` to `true` in the git config for the
	repository.

Of course, simply enabling the bitmaps is not enought to perform the
optimization: a bitmap index must be available on disk. If no bitmap
index can be found, we'll silently fall back to the slow counting
objects phase.

The point of speeding up the Counting Objects phase of `pack-objects` is
to reduce fetch and clone times for big repositories, which right now
are definitely dominated by the rev-walk algorithm during the Counting
Objects phase.

Here are some sample timings from a full pack of `torvalds/linux` (i.e.
something very similar to what would be generated for a clone of the
repository):

	$ time ../git/git pack-objects --all --stdout
	Counting objects: 3053537, done.
	Compressing objects: 100% (495706/495706), done.
	Total 3053537 (delta 2529614), reused 3053537 (delta 2529614)

	real    0m36.686s
	user    0m34.440s
	sys     0m2.184s

	$ time ../git/git pack-objects --all --stdout
	Counting objects: 3053537, done.
	Compressing objects: 100% (495706/495706), done.
	Total 3053537 (delta 2529614), reused 3053537 (delta 2529614)

	real    0m7.255s
	user    0m6.892s
	sys     0m0.444s

>From a hotspot profiling run, we can see how the counting
objects phase has been reduced to about 400ms (down from 28s).
The remaining time is spent finding deltas and writing the packfile, the
optimization of which is out of the scope of this topic.
---
 Makefile               |    2 +
 builtin/pack-objects.c |   31 ++
 pack-bitmap.c          |  818 ++++++++++++++++++++++++++++++++++++++++++++++++
 pack-bitmap.h          |   53 ++++
 4 files changed, 904 insertions(+)
 create mode 100644 pack-bitmap.c
 create mode 100644 pack-bitmap.h

diff --git a/Makefile b/Makefile
index e03c773..0f2e72b 100644
--- a/Makefile
+++ b/Makefile
@@ -703,6 +703,7 @@ LIB_H += notes.h
 LIB_H += object.h
 LIB_H += pack-revindex.h
 LIB_H += pack.h
+LIB_H += pack-bitmap.h
 LIB_H += parse-options.h
 LIB_H += patch-ids.h
 LIB_H += pathspec.h
@@ -838,6 +839,7 @@ LIB_OBJS += notes.o
 LIB_OBJS += notes-cache.o
 LIB_OBJS += notes-merge.o
 LIB_OBJS += object.o
+LIB_OBJS += pack-bitmap.o
 LIB_OBJS += pack-check.o
 LIB_OBJS += pack-revindex.o
 LIB_OBJS += pack-write.o
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index b7cab18..469b8da 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -19,6 +19,7 @@
 #include "streaming.h"
 #include "thread-utils.h"
 #include "khash.h"
+#include "pack-bitmap.h"
 
 static const char *pack_usage[] = {
 	N_("git pack-objects --stdout [options...] [< ref-list | < object-list]"),
@@ -83,6 +84,9 @@ static struct progress *progress_state;
 static int pack_compression_level = Z_DEFAULT_COMPRESSION;
 static int pack_compression_seen;
 
+static int bitmap_support;
+static int use_bitmap_index;
+
 static unsigned long delta_cache_size = 0;
 static unsigned long max_delta_cache_size = 256 * 1024 * 1024;
 static unsigned long cache_max_small_delta_size = 1000;
@@ -2131,6 +2135,10 @@ static int git_pack_config(const char *k, const char *v, void *cb)
 		cache_max_small_delta_size = git_config_int(k, v);
 		return 0;
 	}
+	if (!strcmp(k, "pack.usebitmaps")) {
+		bitmap_support = git_config_bool(k, v);
+		return 0;
+	}
 	if (!strcmp(k, "pack.threads")) {
 		delta_search_threads = git_config_int(k, v);
 		if (delta_search_threads < 0)
@@ -2366,8 +2374,24 @@ static void get_object_list(int ac, const char **av)
 			die("bad revision '%s'", line);
 	}
 
+	if (use_bitmap_index) {
+		uint32_t size_hint;
+
+		if (!prepare_bitmap_walk(&revs, &size_hint)) {
+			khint_t new_hash_size = (size_hint * (1.0 / __ac_HASH_UPPER)) + 0.5;
+			kh_resize_sha1(packed_objects, new_hash_size);
+
+			nr_alloc = (size_hint + 63) & ~63;
+			objects = xrealloc(objects, nr_alloc * sizeof(struct object_entry *));
+
+			traverse_bitmap_commit_list(&add_object_entry_1);
+			return;
+		}
+	}
+
 	if (prepare_revision_walk(&revs))
 		die("revision walk setup failed");
+
 	mark_edges_uninteresting(revs.commits, &revs, show_edge);
 	traverse_commit_list(&revs, show_commit, show_object, NULL);
 
@@ -2495,6 +2519,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			    N_("pack compression level")),
 		OPT_SET_INT(0, "keep-true-parents", &grafts_replace_parents,
 			    N_("do not hide commits by grafts"), 0),
+		OPT_BOOL(0, "bitmaps", &bitmap_support,
+			 N_("enable support for bitmap optimizations")),
 		OPT_END(),
 	};
 
@@ -2561,6 +2587,11 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	if (keep_unreachable && unpack_unreachable)
 		die("--keep-unreachable and --unpack-unreachable are incompatible.");
 
+	if (bitmap_support) {
+		if (use_internal_rev_list && pack_to_stdout)
+			use_bitmap_index = 1;
+	}
+
 	if (progress && all_progress_implied)
 		progress = 2;
 
diff --git a/pack-bitmap.c b/pack-bitmap.c
new file mode 100644
index 0000000..090db15
--- /dev/null
+++ b/pack-bitmap.c
@@ -0,0 +1,818 @@
+#include <stdlib.h>
+
+#include "cache.h"
+#include "commit.h"
+#include "tag.h"
+#include "diff.h"
+#include "revision.h"
+#include "progress.h"
+#include "list-objects.h"
+#include "pack.h"
+#include "pack-bitmap.h"
+
+struct stored_bitmap {
+	unsigned char sha1[20];
+	struct ewah_bitmap *root;
+	struct stored_bitmap *xor;
+	int flags;
+};
+
+struct bitmap_index {
+	struct ewah_bitmap *commits;
+	struct ewah_bitmap *trees;
+	struct ewah_bitmap *blobs;
+	struct ewah_bitmap *tags;
+
+	khash_sha1 *bitmaps;
+
+	struct packed_git *pack;
+
+	struct {
+		struct object_array entries;
+		khash_sha1 *map;
+	} fake_index;
+
+	struct bitmap *result;
+
+	int entry_count;
+	char pack_checksum[20];
+
+	int version;
+	unsigned loaded : 1,
+			 native_bitmaps : 1,
+			 has_hash_cache : 1;
+
+	struct ewah_bitmap *(*read_bitmap)(struct bitmap_index *index);
+
+	void *map;
+	size_t map_size, map_pos;
+
+	uint32_t *delta_hashes;
+};
+
+static struct bitmap_index bitmap_git;
+
+static struct ewah_bitmap *
+lookup_stored_bitmap(struct stored_bitmap *st)
+{
+	struct ewah_bitmap *parent;
+	struct ewah_bitmap *composed;
+
+	if (st->xor == NULL)
+		return st->root;
+
+	composed = ewah_pool_new();
+	parent = lookup_stored_bitmap(st->xor);
+	ewah_xor(st->root, parent, composed);
+
+	ewah_pool_free(st->root);
+	st->root = composed;
+	st->xor = NULL;
+
+	return composed;
+}
+
+static struct ewah_bitmap *
+_read_bitmap(struct bitmap_index *index)
+{
+	struct ewah_bitmap *b = ewah_pool_new();
+	int bitmap_size;
+
+	bitmap_size = ewah_read_mmap(b,
+		index->map + index->map_pos,
+		index->map_size - index->map_pos);
+
+	if (bitmap_size < 0) {
+		error("Failed to load bitmap index (corruped?)");
+		ewah_pool_free(b);
+		return NULL;
+	}
+
+	index->map_pos += bitmap_size;
+	return b;
+}
+
+static struct ewah_bitmap *
+_read_bitmap_native(struct bitmap_index *index)
+{
+	struct ewah_bitmap *b = calloc(1, sizeof(struct ewah_bitmap));
+	int bitmap_size;
+
+	bitmap_size = ewah_read_mmap_native(b,
+		index->map + index->map_pos,
+		index->map_size - index->map_pos);
+
+	if (bitmap_size < 0) {
+		error("Failed to load bitmap index (corruped?)");
+		free(b);
+		return NULL;
+	}
+
+	index->map_pos += bitmap_size;
+	return b;
+}
+
+static int load_bitmap_header(struct bitmap_index *index)
+{
+	struct bitmap_disk_header *header = (void *)index->map;
+
+	if (index->map_size < sizeof(*header))
+		return error("Corrupted bitmap index (missing header data)");
+
+	if (memcmp(header->magic, BITMAP_MAGIC_PREFIX, sizeof(BITMAP_MAGIC_PREFIX)) != 0)
+		return error("Corrupted bitmap index file (wrong header)");
+
+	index->version = (int)ntohs(header->version);
+	if (index->version != 2)
+		return error("Unsupported version for bitmap index file (%d)", index->version);
+
+	/* Parse known bitmap format options */
+	{
+		uint32_t flags = ntohs(header->options);
+
+		if ((flags & BITMAP_OPT_FULL_DAG) == 0) {
+			return error("Unsupported options for bitmap index file "
+				"(Git requires BITMAP_OPT_FULL_DAG)");
+		}
+
+		if (flags & BITMAP_OPT_HASH_CACHE)
+			index->has_hash_cache = 1;
+
+		index->read_bitmap = &_read_bitmap;
+
+		/*
+		 * If we are in a little endian machine and the bitmap
+		 * was written in LE, we can mmap it straight into memory
+		 * without having to parse it
+		 */
+		if ((flags & BITMAP_OPT_LE_BITMAPS)) {
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+			index->native_bitmaps = 1;
+			index->read_bitmap = &_read_bitmap_native;
+#else
+			die("The existing bitmap index is written in little-endian "
+				"byte order and cannot be read in this machine.\n"
+				"Please re-build the bitmap indexes locally.");
+#endif
+		}
+	}
+
+	index->entry_count = ntohl(header->entry_count);
+	memcpy(index->pack_checksum, header->checksum, sizeof(header->checksum));
+	index->map_pos += sizeof(*header);
+
+	return 0;
+}
+
+static struct stored_bitmap *
+store_bitmap(struct bitmap_index *index,
+	const unsigned char *sha1,
+	struct ewah_bitmap *bitmap,
+	struct stored_bitmap *xor_with, int flags)
+{
+	struct stored_bitmap *stored;
+	khiter_t hash_pos;
+	int ret;
+
+	stored = xmalloc(sizeof(struct stored_bitmap));
+	stored->root = bitmap;
+	stored->xor = xor_with;
+	stored->flags = flags;
+	memcpy(stored->sha1, sha1, 20);
+
+	hash_pos = kh_put_sha1(index->bitmaps, stored->sha1, &ret);
+	if (ret == 0) {
+		error("Duplicate entry in bitmap index: %s", sha1_to_hex(sha1));
+		return NULL;
+	}
+
+	kh_value(index->bitmaps, hash_pos) = stored;
+	return stored;
+}
+
+static int
+load_bitmap_entries_v2(struct bitmap_index *index)
+{
+	static const int MAX_XOR_OFFSET = 16;
+
+	int i;
+	struct stored_bitmap *recent_bitmaps[16];
+	struct bitmap_disk_entry_v2 *entry;
+
+	void *index_pos = index->map + index->map_size -
+		(index->entry_count * sizeof(struct bitmap_disk_entry_v2));
+
+	for (i = 0; i < index->entry_count; ++i) {
+		int xor_offset, flags, ret;
+		struct stored_bitmap *xor_bitmap = NULL;
+		struct ewah_bitmap *bitmap = NULL;
+		uint32_t bitmap_pos;
+
+		entry = index_pos;
+		index_pos += sizeof(struct bitmap_disk_entry_v2);
+
+		bitmap_pos = ntohl(entry->bitmap_pos);
+		xor_offset = (int)entry->xor_offset;
+		flags = (int)entry->flags;
+
+		if (index->native_bitmaps) {
+			bitmap = calloc(1, sizeof(struct ewah_bitmap));
+			ret = ewah_read_mmap_native(bitmap,
+				index->map + bitmap_pos,
+				index->map_size - bitmap_pos);
+		} else {
+			bitmap = ewah_pool_new();
+			ret = ewah_read_mmap(bitmap,
+				index->map + bitmap_pos,
+				index->map_size - bitmap_pos);
+		}
+
+		if (ret < 0 || xor_offset > MAX_XOR_OFFSET || xor_offset > i) {
+			return error("Corrupted bitmap pack index");
+		}
+
+		if (xor_offset > 0) {
+			xor_bitmap = recent_bitmaps[(i - xor_offset) % MAX_XOR_OFFSET];
+
+			if (xor_bitmap == NULL)
+				return error("Invalid XOR offset in bitmap pack index");
+		}
+
+		recent_bitmaps[i % MAX_XOR_OFFSET] = store_bitmap(
+			index, entry->sha1, bitmap, xor_bitmap, flags);
+	}
+
+	return 0;
+}
+
+static int load_bitmap_index(
+	struct bitmap_index *index,
+	const char *path,
+	struct packed_git *packfile)
+{
+	int fd = git_open_noatime(path);
+	struct stat st;
+
+	if (fd < 0) {
+		return -1;
+	}
+
+	if (fstat(fd, &st)) {
+		close(fd);
+		return -1;
+	}
+
+	index->map_size = xsize_t(st.st_size);
+	index->map = xmmap(NULL, index->map_size, PROT_READ, MAP_PRIVATE, fd, 0);
+	close(fd);
+
+	index->bitmaps = kh_init_sha1();
+	index->pack = packfile;
+	index->fake_index.map = kh_init_sha1();
+
+	if (load_bitmap_header(index) < 0)
+		return -1;
+
+	if (index->has_hash_cache) {
+		index->delta_hashes = index->map + index->map_pos;
+		index->map_pos += (packfile->num_objects * sizeof(uint32_t));
+	}
+
+	if ((index->commits = index->read_bitmap(index)) == NULL ||
+		(index->trees = index->read_bitmap(index)) == NULL ||
+		(index->blobs = index->read_bitmap(index)) == NULL ||
+		(index->tags = index->read_bitmap(index)) == NULL)
+		return -1;
+
+	if (load_bitmap_entries_v2(index) < 0)
+		return -1;
+
+	index->loaded = true;
+	return 0;
+}
+
+char *pack_bitmap_filename(struct packed_git *p)
+{
+	char *idx_name;
+	int len;
+
+	len = strlen(p->pack_name) - strlen(".pack");
+	idx_name = xmalloc(len + strlen(".bitmap") + 1);
+
+	memcpy(idx_name, p->pack_name, len);
+	memcpy(idx_name + len, ".bitmap", strlen(".bitmap") + 1);
+
+	return idx_name;
+}
+
+int open_pack_bitmap(struct packed_git *p)
+{
+	char *idx_name;
+	int ret;
+
+	if (open_pack_index(p))
+		die("failed to open pack %s", p->pack_name);
+
+	idx_name = pack_bitmap_filename(p);
+	ret = load_bitmap_index(&bitmap_git, idx_name, p);
+	free(idx_name);
+
+	return ret;
+}
+
+void prepare_bitmap_git(void)
+{
+	struct packed_git *p;
+
+	if (bitmap_git.loaded)
+		return;
+
+	for (p = packed_git; p; p = p->next) {
+		if (open_pack_bitmap(p) == 0)
+			return;
+	}
+}
+
+struct include_data {
+	struct bitmap *base;
+	struct bitmap *seen;
+};
+
+static inline int bitmap_position_extended(const unsigned char *sha1)
+{
+	struct object_array *array = &bitmap_git.fake_index.entries;
+	struct object_array_entry *entry ;
+	int bitmap_pos;
+
+	khiter_t pos = kh_get_sha1(bitmap_git.fake_index.map, sha1);
+
+	if (pos < kh_end(bitmap_git.fake_index.map)) {
+		entry = kh_value(bitmap_git.fake_index.map, pos);
+
+		bitmap_pos = (entry - array->objects);
+		bitmap_pos += bitmap_git.pack->num_objects;
+
+		return bitmap_pos;
+	}
+
+	return -1;
+}
+
+static int bitmap_position(const unsigned char *sha1)
+{
+	int pos = find_pack_entry_pos(sha1, bitmap_git.pack);
+	return (pos >= 0) ? pos : bitmap_position_extended(sha1);
+}
+
+static int fake_index_add_object(struct object *object, const char *name)
+{
+	khiter_t hash_pos;
+	int hash_ret;
+	int bitmap_pos;
+
+	struct object_array *array = &bitmap_git.fake_index.entries;
+	struct object_array_entry *entry;
+
+	hash_pos = kh_put_sha1(bitmap_git.fake_index.map, object->sha1, &hash_ret);
+	if (hash_ret > 0) {
+		add_object_array(object, name, array);
+		entry = &array->objects[array->nr - 1];
+		kh_value(bitmap_git.fake_index.map, hash_pos) = entry;
+	} else {
+		entry = kh_value(bitmap_git.fake_index.map, hash_pos);
+	}
+
+	bitmap_pos = (entry - array->objects);
+	bitmap_pos += bitmap_git.pack->num_objects;
+
+	return bitmap_pos;
+}
+
+static void show_object(struct object *object,
+	const struct name_path *path, const char *last, void *data)
+{
+	struct bitmap *base = data;
+	int bitmap_pos;
+
+	bitmap_pos = bitmap_position(object->sha1);
+	if (bitmap_pos < 0) {
+		bitmap_pos = fake_index_add_object(object, path_name(path, last));
+	}
+
+	bitmap_set(base, bitmap_pos);
+}
+
+static void show_commit(struct commit *commit, void *data)
+{
+	/* Nothing to do here */
+}
+
+static int
+add_to_include_set(struct include_data *data, const unsigned char *sha1, int bitmap_pos)
+{
+	khiter_t hash_pos;
+
+	if (data->seen && bitmap_get(data->seen, bitmap_pos))
+		return 0;
+
+	if (bitmap_get(data->base, bitmap_pos))
+		return 0;
+
+	hash_pos = kh_get_sha1(bitmap_git.bitmaps, sha1);
+	if (hash_pos < kh_end(bitmap_git.bitmaps)) {
+		struct stored_bitmap *st = kh_value(bitmap_git.bitmaps, hash_pos);
+		bitmap_or_inplace(data->base, lookup_stored_bitmap(st));
+		return 0;
+	}
+
+	bitmap_set(data->base, bitmap_pos);
+	return 1;
+}
+
+static int
+should_include(struct commit *commit, void *_data)
+{
+	struct include_data *data = _data;
+	int bitmap_pos;
+
+	bitmap_pos = bitmap_position(commit->object.sha1);
+	if (bitmap_pos < 0) {
+		bitmap_pos = fake_index_add_object((struct object *)commit, "");
+	}
+
+	if (!add_to_include_set(data, commit->object.sha1, bitmap_pos)) {
+		struct commit_list *parent = commit->parents;
+
+		while (parent) {
+			parent->item->object.flags |= SEEN;
+			parent = parent->next;
+		}
+
+		return 0;
+	}
+
+	return 1;
+}
+
+static struct bitmap *
+find_objects(
+	struct rev_info *revs,
+	struct object_list *roots,
+	struct bitmap *seen)
+{
+	struct bitmap *base = NULL;
+	bool needs_walk = false;
+
+	struct object_list *not_mapped = NULL;
+
+	/**
+	 * Go through all the roots for the walk. The ones that have bitmaps
+	 * on the bitmap index will be `or`ed together to form an initial
+	 * global reachability analysis.
+	 *
+	 * The ones without bitmaps in the index will be stored in the
+	 * `not_mapped_list` for further processing.
+	 */
+	while (roots) {
+		struct object *object = roots->item;
+		roots = roots->next;
+
+		if (object->type == OBJ_COMMIT) {
+			khiter_t pos = kh_get_sha1(bitmap_git.bitmaps, object->sha1);
+
+			if (pos < kh_end(bitmap_git.bitmaps)) {
+				struct stored_bitmap *st = kh_value(bitmap_git.bitmaps, pos);
+				struct ewah_bitmap *or_with = lookup_stored_bitmap(st);
+
+				if (base == NULL)
+					base = ewah_to_bitmap(or_with);
+				else
+					bitmap_or_inplace(base, or_with);
+
+				object->flags |= SEEN;
+				continue;
+			}
+		}
+
+		object_list_insert(object, &not_mapped);
+	}
+
+	/**
+	 * Best case scenario: We found bitmaps for all the roots,
+	 * so the resulting `or` bitmap has the full reachability analysis
+	 */
+	if (not_mapped == NULL)
+		return base;
+
+	roots = not_mapped;
+
+	/**
+	 * Let's iterate through all the roots that don't have bitmaps to
+	 * check we can determine them to be reachable from the existing
+	 * global bitmap.
+	 *
+	 * If we cannot find them in the existing global bitmap, we'll need
+	 * to push them to an actual walk and run it until we can confirm
+	 * they are reachable
+	 */
+	while (roots) {
+		struct object *object = roots->item;
+		int pos;
+
+		roots = roots->next;
+		pos = bitmap_position(object->sha1);
+
+		if (pos < 0 || base == NULL || !bitmap_get(base, pos)) {
+			object->flags &= ~UNINTERESTING;
+			add_pending_object(revs, object, "");
+			needs_walk = true;
+		} else {
+			object->flags |= SEEN;
+		}
+	}
+
+	if (needs_walk) {
+		struct include_data incdata;
+
+		if (base == NULL)
+			base = bitmap_new();
+
+		incdata.base = base;
+		incdata.seen = seen;
+
+		revs->include_check = should_include;
+		revs->include_check_data = &incdata;
+
+		if (prepare_revision_walk(revs))
+			die("revision walk setup failed");
+
+		traverse_commit_list(revs, show_commit, show_object, base);
+	}
+
+	return base;
+}
+
+static void show_extended_objects(
+	struct bitmap *objects,
+	show_reachable_fn show_reach)
+{
+	struct object_array_entry *entries = bitmap_git.fake_index.entries.objects;
+	unsigned int nr = bitmap_git.fake_index.entries.nr;
+	unsigned int i;
+
+	for (i = 0; i < nr; ++i) {
+		struct object *obj;
+
+		if (!bitmap_get(objects, bitmap_git.pack->num_objects + i))
+			continue;
+
+		obj = entries[i].item;
+		show_reach(obj->sha1, obj->type, pack_name_hash(entries[i].name), 0, NULL, 0);
+	}
+}
+
+static void show_objects_for_type(
+	struct bitmap *objects,
+	struct ewah_bitmap *type_filter,
+	enum object_type object_type,
+	show_reachable_fn show_reach)
+{
+	size_t pos = 0, i = 0;
+	uint32_t offset;
+
+	struct ewah_iterator it;
+	eword_t filter;
+
+	ewah_iterator_init(&it, type_filter);
+
+	while (i < objects->word_alloc && ewah_iterator_next(&filter, &it)) {
+		eword_t word = objects->words[i] & filter;
+
+		for (offset = 0; offset < BITS_IN_WORD; ++offset) {
+			const unsigned char *sha1;
+			off_t pack_off;
+			uint32_t hash = 0;
+
+			if ((word >> offset) == 0)
+				break;
+
+			offset += __builtin_ctzll(word >> offset);
+
+			sha1 = nth_packed_object_sha1(bitmap_git.pack, pos + offset);
+			pack_off = nth_packed_object_offset(bitmap_git.pack, pos + offset);
+
+			if (bitmap_git.delta_hashes)
+				hash = ntohl(bitmap_git.delta_hashes[pos + offset]);
+
+			show_reach(sha1, object_type, hash, 0, bitmap_git.pack, pack_off);
+		}
+
+		pos += BITS_IN_WORD;
+		i++;
+	}
+}
+
+int prepare_bitmap_walk(struct rev_info *revs, uint32_t *result_size)
+{
+	unsigned int i;
+	unsigned int pending_nr = revs->pending.nr;
+	unsigned int pending_alloc = revs->pending.alloc;
+	struct object_array_entry *pending_e = revs->pending.objects;
+
+	struct object_list *wants = NULL;
+	struct object_list *haves = NULL;
+
+	struct bitmap *wants_bitmap = NULL;
+	struct bitmap *haves_bitmap = NULL;
+
+	prepare_bitmap_git();
+
+	if (!bitmap_git.loaded)
+		return -1;
+
+	revs->pending.nr = 0;
+	revs->pending.alloc = 0;
+	revs->pending.objects = NULL;
+
+	for (i = 0; i < pending_nr; ++i) {
+		struct object *object = pending_e[i].item;
+
+		if (object->type == OBJ_NONE)
+			parse_object(object->sha1);
+
+		while (object->type == OBJ_TAG) {
+			struct tag *tag = (struct tag *) object;
+
+			if (object->flags & UNINTERESTING) {
+				object_list_insert(object, &haves);
+			} else {
+				object_list_insert(object, &wants);
+			}
+
+			if (!tag->tagged)
+				die("bad tag");
+			object = parse_object(tag->tagged->sha1);
+			if (!object)
+				die("bad object %s", sha1_to_hex(tag->tagged->sha1));
+		}
+
+		if (object->flags & UNINTERESTING) {
+			object_list_insert(object, &haves);
+		} else {
+			object_list_insert(object, &wants);
+		}
+	}
+
+	if (wants == NULL) {
+		/* we don't want anything! we're done! */
+		return 0;
+	}
+
+	if (haves != NULL) {
+		haves_bitmap = find_objects(revs, haves, NULL);
+		reset_revision_walk();
+
+		if (haves_bitmap == NULL)
+			goto restore_revs;
+	}
+
+	wants_bitmap = find_objects(revs, wants, haves_bitmap);
+
+	if (wants_bitmap == NULL) {
+		bitmap_free(haves_bitmap);
+		reset_revision_walk();
+		goto restore_revs;
+	}
+
+	if (haves_bitmap) {
+		bitmap_and_not_inplace(wants_bitmap, haves_bitmap);
+	}
+
+	bitmap_git.result = wants_bitmap;
+
+	if (result_size) {
+		*result_size = bitmap_popcount(wants_bitmap);
+	}
+
+	bitmap_free(haves_bitmap);
+	return 0;
+
+restore_revs:
+	revs->pending.nr = pending_nr;
+	revs->pending.alloc = pending_alloc;
+	revs->pending.objects = pending_e;
+	return -1;
+}
+
+void traverse_bitmap_commit_list(show_reachable_fn show_reachable)
+{
+	if (!bitmap_git.result)
+		die("Tried to traverse bitmap commit without setting it up first");
+
+	show_objects_for_type(bitmap_git.result, bitmap_git.commits, OBJ_COMMIT, show_reachable);
+	show_objects_for_type(bitmap_git.result, bitmap_git.trees, OBJ_TREE, show_reachable);
+	show_objects_for_type(bitmap_git.result, bitmap_git.blobs, OBJ_BLOB, show_reachable);
+	show_objects_for_type(bitmap_git.result, bitmap_git.tags, OBJ_TAG, show_reachable);
+
+	show_extended_objects(bitmap_git.result, show_reachable);
+
+	bitmap_free(bitmap_git.result);
+	bitmap_git.result = NULL;
+}
+
+struct bitmap_test_data {
+	struct bitmap *base;
+	struct progress *prg;
+	size_t seen;
+};
+
+static void test_show_object(struct object *object,
+	const struct name_path *path, const char *last, void *data)
+{
+	struct bitmap_test_data *tdata = data;
+	int bitmap_pos;
+
+	bitmap_pos = bitmap_position(object->sha1);
+	if (bitmap_pos < 0) {
+		die("Object not in bitmap: %s\n", sha1_to_hex(object->sha1));
+	}
+
+	bitmap_set(tdata->base, bitmap_pos);
+	display_progress(tdata->prg, ++tdata->seen);
+}
+
+static void test_show_commit(struct commit *commit, void *data)
+{
+	struct bitmap_test_data *tdata = data;
+	int bitmap_pos;
+
+	bitmap_pos = bitmap_position(commit->object.sha1);
+	if (bitmap_pos < 0) {
+		die("Object not in bitmap: %s\n", sha1_to_hex(commit->object.sha1));
+	}
+
+	bitmap_set(tdata->base, bitmap_pos);
+	display_progress(tdata->prg, ++tdata->seen);
+}
+
+void test_bitmap_walk(struct rev_info *revs)
+{
+	struct object *root;
+	struct bitmap *result = NULL;
+	khiter_t pos;
+	size_t result_popcnt;
+	struct bitmap_test_data tdata;
+
+	prepare_bitmap_git();
+
+	if (!bitmap_git.loaded) {
+		die("failed to load bitmap indexes");
+	}
+
+	if (revs->pending.nr != 1) {
+		die("only one bitmap can be tested at a time");
+	}
+
+	fprintf(stderr, "Bitmap v%d test (%d entries loaded)\n",
+		bitmap_git.version, bitmap_git.entry_count);
+
+	root = revs->pending.objects[0].item;
+	pos = kh_get_sha1(bitmap_git.bitmaps, root->sha1);
+
+	if (pos < kh_end(bitmap_git.bitmaps)) {
+		struct stored_bitmap *st = kh_value(bitmap_git.bitmaps, pos);
+		struct ewah_bitmap *bm = lookup_stored_bitmap(st);
+
+		fprintf(stderr, "Found bitmap for %s. %d bits / %08x checksum\n",
+			sha1_to_hex(root->sha1), (int)bm->bit_size, ewah_checksum(bm));
+
+		result = ewah_to_bitmap(bm);
+	}
+
+	if (result == NULL) {
+		die("Commit %s doesn't have an indexed bitmap", sha1_to_hex(root->sha1));
+	}
+
+	revs->tag_objects = 1;
+	revs->tree_objects = 1;
+	revs->blob_objects = 1;
+
+	result_popcnt = bitmap_popcount(result);
+
+	if (prepare_revision_walk(revs))
+		die("revision walk setup failed");
+
+	tdata.base = bitmap_new();
+	tdata.prg = start_progress("Verifying bitmap entries", result_popcnt);
+	tdata.seen = 0;
+
+	traverse_commit_list(revs, &test_show_commit, &test_show_object, &tdata);
+
+	stop_progress(&tdata.prg);
+
+	if (bitmap_equals(result, tdata.base)) {
+		fprintf(stderr, "OK!\n");
+	} else {
+		fprintf(stderr, "Mismatch!\n");
+	}
+}
diff --git a/pack-bitmap.h b/pack-bitmap.h
new file mode 100644
index 0000000..b97bd46
--- /dev/null
+++ b/pack-bitmap.h
@@ -0,0 +1,53 @@
+#ifndef PACK_BITMAP_H
+#define PACK_BITMAP_H
+
+#define ewah_malloc xmalloc
+#define ewah_calloc xcalloc
+#define ewah_realloc xrealloc
+#include "ewah/ewok.h"
+#include "khash.h"
+
+struct bitmap_disk_entry {
+	uint32_t object_pos;
+	uint8_t xor_offset;
+	uint8_t flags;
+};
+
+struct bitmap_disk_entry_v2 {
+	unsigned char sha1[20];
+	uint32_t bitmap_pos;
+	uint8_t xor_offset;
+	uint8_t flags;
+	uint8_t __pad[2];
+};
+
+struct bitmap_disk_header {
+	char magic[4];
+	uint16_t version;
+	uint16_t options;
+	uint32_t entry_count;
+	char checksum[20];
+};
+
+static const char BITMAP_MAGIC_PREFIX[] = {'B', 'I', 'T', 'M'};;
+
+enum pack_bitmap_opts {
+	BITMAP_OPT_FULL_DAG = 1,
+	BITMAP_OPT_LE_BITMAPS = 2,
+	BITMAP_OPT_BE_BITMAPS = 4,
+	BITMAP_OPT_HASH_CACHE = 8
+};
+
+typedef int (*show_reachable_fn)(
+	const unsigned char *sha1,
+	enum object_type type,
+	uint32_t hash, int exclude,
+	struct packed_git *found_pack,
+	off_t found_offset);
+
+void traverse_bitmap_commit_list(show_reachable_fn show_reachable);
+int prepare_bitmap_walk(struct rev_info *revs, uint32_t *result_size);
+void test_bitmap_walk(struct rev_info *revs);
+char *pack_bitmap_filename(struct packed_git *p);
+
+#endif
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 11/16] rev-list: add bitmap mode to speed up lists
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (9 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 10/16] pack-objects: use bitmaps when packing objects Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-25 16:22   ` Thomas Rast
  2013-06-24 23:23 ` [PATCH 12/16] pack-objects: implement bitmap writing Vicent Marti
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

The bitmap reachability index used to speed up the counting objects
phase during `pack-objects` can also be used to optimize a normal
rev-list if the only thing required are the SHA1s of the objects during
the list.

Calling `git rev-list --use-bitmaps [committish]` is the equivalent
of `git rev-list --objects`, but the rev list is performed based on
a bitmap result instead of using a manual counting objects phase.

These are some example timings for `torvalds/linux`:

	$ time ../git/git rev-list --objects master > /dev/null

	real    0m25.567s
	user    0m25.148s
	sys     0m0.384s

	$ time ../git/git rev-list --use-bitmaps master > /dev/null

	real    0m0.393s
	user    0m0.356s
	sys     0m0.036s

Additionally, a `--test-bitmap` flag has been added that will perform
the same rev-list manually (i.e. using a normal revwalk) and using
bitmaps, and verify that the results are the same.
---
 builtin/rev-list.c |   28 +++++++++++++++++++++++++++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 67701be..905ed08 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -3,6 +3,8 @@
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
+#include "pack.h"
+#include "pack-bitmap.h"
 #include "builtin.h"
 #include "log-tree.h"
 #include "graph.h"
@@ -256,6 +258,17 @@ static int show_bisect_vars(struct rev_list_info *info, int reaches, int all)
 	return 0;
 }
 
+static int show_object_fast(
+	const unsigned char *sha1,
+	enum object_type type,
+	uint32_t hash, int exclude,
+	struct packed_git *found_pack,
+	off_t found_offset)
+{
+	fprintf(stdout, "%ss\n", sha1_to_hex(sha1));
+	return 1;
+}
+
 int cmd_rev_list(int argc, const char **argv, const char *prefix)
 {
 	struct rev_info revs;
@@ -264,6 +277,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	int bisect_list = 0;
 	int bisect_show_vars = 0;
 	int bisect_find_all = 0;
+	int use_bitmaps = 0;
 
 	git_config(git_default_config, NULL);
 	init_revisions(&revs, prefix);
@@ -305,8 +319,15 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 			bisect_show_vars = 1;
 			continue;
 		}
+		if (!strcmp(arg, "--use-bitmaps")) {
+			use_bitmaps = 1;
+			continue;
+		}
+		if (!strcmp(arg, "--test-bitmap")) {
+			test_bitmap_walk(&revs);
+			return 0;
+		}
 		usage(rev_list_usage);
-
 	}
 	if (revs.commit_format != CMIT_FMT_UNSPECIFIED) {
 		/* The command line has a --pretty  */
@@ -332,6 +353,11 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	if (bisect_list)
 		revs.limited = 1;
 
+	if (use_bitmaps && !prepare_bitmap_walk(&revs, NULL)) {
+		traverse_bitmap_commit_list(&show_object_fast);
+		return 0;
+	}
+
 	if (prepare_revision_walk(&revs))
 		die("revision walk setup failed");
 	if (revs.tree_objects)
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 12/16] pack-objects: implement bitmap writing
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (10 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 11/16] rev-list: add bitmap mode to speed up lists Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-24 23:23 ` [PATCH 13/16] repack: consider bitmaps when performing repacks Vicent Marti
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

This commit extends more the functionality of `pack-objects` by allowing
it to write out a `.bitmap` index next to any written packs, together
with the `.idx` index that currently gets written.

If bitmaps are enabled for a given repository (either by calling
`pack-objects` with the `--use-bitmaps` flag or by having
`pack.usebitmaps` set to `true` in the config) and pack-objects is
writing a packfile that would normally be indexed (i.e. not piping to
stdout), we will attempt to write the corresponding bitmap index for the
packfile.

Bitmap index writing happens after the packfile and its index has been
successfully written to disk (`finish_tmp_packfile`). The process is
performed in several steps:

	1. `bitmap_writer_build_type_index`: this call uses the array of
	`struct object_entry`es that has just been sorted when writing out
	the actual packfile index to disk to generate 4 type-index bitmaps
	(one for each object type).

	These bitmaps have their nth bit set if the given object is of the
	bitmap's type. E.g. the nth bit of the Commits bitmap will be 1 if
	the nth object in the packfile index is a commit.

	This is a very cheap operation because the bitmap writing code has
	access to the metadata stored in the `struct object_entry` array,
	and hence the real type for each object in the packfile.

	2. `bitmap_writer_select_commits`: if bitmap writing is enabled for
	a given `pack-objects` run, the sequence of commits generated during
	the Counting Objects phase will be stored in an array.

	We then use that array to build up the list of selected commits.
	Writing a bitmap in the index for each object in the repository
	would be cost-prohibitive, so we use a simple heuristic to pick the
	commits that will be indexed with bitmaps.

	The current heuristics are a simplified version of JGit's original
	implementation. We select a higher density of commits depending on
	their age: the 100 most recent commits are always selected, after
	that we pick 1 commit of each 100, and the gap increases as the
	commits grow older. On top of that, we make sure that every single
	branch that has not been merged (all the tips that would be required
	from a clone) gets their own bitmap, and when selecting commits
	between a gap, we tend to prioritize the commit with the most
	parents.

	Do note that there is no right/wrong way to perform commit selection;
	different selection algorithms will result in different commits
	being selected, but there's no such thing as "missing a commit". The
	bitmap walker algorithm implemented in `prepare_bitmap_walk` is able
	to adapt to missing bitmaps by performing manual walks that complete
	the bitmap: the ideal selection algorithm, however, would select
	the commits that are more likely to be used as roots for a walk in
	the future (e.g. the tips of each branch, and so on) to ensure a
	bitmap for them is always available.

	3. `bitmap_writer_build`: this is the computationally expensive part
	of bitmap generation. Based on the list of commits that were
	selected in the previous step, we perform several incremental walks
	to generate the bitmap for each commit.

	The walks begin from the oldest commit, and are built up
	incrementally for each branch. E.g. consider this dag where A, B, C,
	D, E, F are the selected commits, and a, b, c, e are a chunk of
	simplified history that will not receive bitmaps.

		A---a---B--b--C--c--D
		         \
		          E--e--F

	We start by building the bitmap for A, using A as the root for a
	revision walk and marking all the objects that are reachable until
	the walk is over. Once this bitmap is stored, we reuse the bitmap
	walker to perform the walk for B, assuming that once we reach A
	again, the walk will be terminated because A has already been SEEN
	on the previous walk.

	This process is repeated for C, and D, but when we try to generate
	the bitmaps for E, we cannot reuse neither the current walk nor the
	bitmap we have generated so far.

	What we do now is resetting both the walk and clearing the bitmap,
	and performing the walk from scratch using E as the origin. This new
	walk, however, does not need to be completed. Once we hit B, we can
	lookup the bitmap we have already stored for that commit and OR it
	with the existing bitmap we've composed so far, allowing us to limit
	the walk early.

	After all the bitmaps have been generated, another iteration through
	the list of commits is performed to find the best XOR offsets for
	compression before writing them to disk. Because of the incremental
	nature of these bitmaps, XORing one of them with its predecesor
	results in a minimal "bitmap delta" most of the time. We can write
	this delta to the on-disk bitmap index, and then re-compose the
	original bitmaps by XORing them again when loaded.

	This is a phase very similar to pack-object's `find_delta` (using
	bitmaps instead of objects, of course), except the heuristics have
	been greatly simplified: we only check the 10 bitmaps before any
	given one to find best compressing one. This operation gives optimal
	results (again, because of the incremental nature of the bitmaps)
	and has a very good runtime performance because of the way EWAH
	bitmaps are implemented.

	3. `bitmap_writer_finish`: the last step in the process is
	serializing to disk all the bitmap data that has been generated in
	the two previous steps.

	The bitmap is written to a tmp file and then moved atomically to its
	final destination, using the same process as `pack-write.c:write_idx_file`.
---
 Makefile               |    1 +
 builtin/pack-objects.c |  117 +++++++----
 builtin/pack-objects.h |   33 +++
 pack-bitmap-write.c    |  520 ++++++++++++++++++++++++++++++++++++++++++++++++
 pack-bitmap.h          |    9 +
 pack-write.c           |    2 +
 6 files changed, 646 insertions(+), 36 deletions(-)
 create mode 100644 builtin/pack-objects.h
 create mode 100644 pack-bitmap-write.c

diff --git a/Makefile b/Makefile
index 0f2e72b..599aa59 100644
--- a/Makefile
+++ b/Makefile
@@ -840,6 +840,7 @@ LIB_OBJS += notes-cache.o
 LIB_OBJS += notes-merge.o
 LIB_OBJS += object.o
 LIB_OBJS += pack-bitmap.o
+LIB_OBJS += pack-bitmap-write.o
 LIB_OBJS += pack-check.o
 LIB_OBJS += pack-revindex.o
 LIB_OBJS += pack-write.o
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 469b8da..58003ec 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -20,6 +20,7 @@
 #include "thread-utils.h"
 #include "khash.h"
 #include "pack-bitmap.h"
+#include "builtin/pack-objects.h"
 
 static const char *pack_usage[] = {
 	N_("git pack-objects --stdout [options...] [< ref-list | < object-list]"),
@@ -27,32 +28,6 @@ static const char *pack_usage[] = {
 	NULL
 };
 
-struct object_entry {
-	struct pack_idx_entry idx;
-	unsigned long size;	/* uncompressed size */
-	struct packed_git *in_pack; 	/* already in pack */
-	off_t in_pack_offset;
-	struct object_entry *delta;	/* delta base object */
-	struct object_entry *delta_child; /* deltified objects who bases me */
-	struct object_entry *delta_sibling; /* other deltified objects who
-					     * uses the same base as me
-					     */
-	void *delta_data;	/* cached delta (uncompressed) */
-	unsigned long delta_size;	/* delta data size (uncompressed) */
-	unsigned long z_delta_size;	/* delta data size (compressed) */
-	unsigned int hash;	/* name hint hash */
-	enum object_type type;
-	enum object_type in_pack_type;	/* could be delta */
-	unsigned char in_pack_header_size;
-	unsigned char preferred_base; /* we do not pack this, but is available
-				       * to be used as the base object to delta
-				       * objects against.
-				       */
-	unsigned char no_try_delta;
-	unsigned char tagged; /* near the very tip of refs */
-	unsigned char filled; /* assigned write-order */
-};
-
 /*
  * Objects we are going to pack are collected in objects array (dynamically
  * expanded).  nr_objects & nr_alloc controls this array.  They are stored
@@ -86,6 +61,7 @@ static int pack_compression_seen;
 
 static int bitmap_support;
 static int use_bitmap_index;
+static int write_bitmap_index;
 
 static unsigned long delta_cache_size = 0;
 static unsigned long max_delta_cache_size = 256 * 1024 * 1024;
@@ -108,6 +84,12 @@ static struct object_entry *locate_object_entry(const unsigned char *sha1);
 static uint32_t written, written_delta;
 static uint32_t reused, reused_delta;
 
+/*
+ * Indexed commits
+ */
+struct commit **indexed_commits;
+unsigned int indexed_commits_nr;
+unsigned int indexed_commits_alloc;
 
 static struct object_slab {
 	struct object_slab *next;
@@ -137,6 +119,16 @@ static struct object_entry *alloc_object_entry(void)
 	return &slab->data[slab->count++];
 }
 
+static void index_commit_for_bitmap(struct commit *commit)
+{
+	if (indexed_commits_nr >= indexed_commits_alloc) {
+		indexed_commits_alloc = (indexed_commits_alloc + 32) * 2;
+		indexed_commits = xrealloc(indexed_commits,
+			indexed_commits_alloc * sizeof(struct commit *));
+	}
+
+	indexed_commits[indexed_commits_nr++] = commit;
+}
 
 static void *get_delta(struct object_entry *entry)
 {
@@ -746,6 +738,29 @@ static struct object_entry **compute_write_order(void)
 	return wo;
 }
 
+static void resolve_real_types(
+	 struct pack_idx_entry **index, uint32_t index_nr)
+{
+	uint32_t i;
+
+	for (i = 0; i < index_nr; ++i) {
+		struct object_entry *entry = (struct object_entry *)index[i];
+
+		switch (entry->type) {
+		case OBJ_COMMIT:
+		case OBJ_TREE:
+		case OBJ_BLOB:
+		case OBJ_TAG:
+			entry->real_type = entry->type;
+			break;
+
+		default:
+			entry->real_type = sha1_object_info(entry->idx.sha1, NULL);
+			break;
+		}
+	}
+}
+
 static void write_pack_file(void)
 {
 	uint32_t i = 0, j;
@@ -824,9 +839,27 @@ static void write_pack_file(void)
 			if (sizeof(tmpname) <= strlen(base_name) + 50)
 				die("pack base name '%s' too long", base_name);
 			snprintf(tmpname, sizeof(tmpname), "%s-", base_name);
+
+			if (write_bitmap_index)
+				resolve_real_types(written_list, nr_written);
+
 			finish_tmp_packfile(tmpname, pack_tmp_name,
 					    written_list, nr_written,
 					    &pack_idx_opts, sha1);
+
+			if (write_bitmap_index && nr_remaining == nr_written) {
+				char *end_of_name_prefix = strrchr(tmpname, 0);
+				sprintf(end_of_name_prefix, "%s.bitmap", sha1_to_hex(sha1));
+
+				stop_progress(&progress_state);
+
+				bitmap_writer_show_progress(progress);
+				bitmap_writer_build_type_index(written_list, nr_written);
+				bitmap_writer_select_commits(indexed_commits, indexed_commits_nr, -1);
+				bitmap_writer_build(packed_objects);
+				bitmap_writer_finish(tmpname, sha1, BITMAP_OPT_HASH_CACHE);
+			}
+
 			free(pack_tmp_name);
 			puts(sha1_to_hex(sha1));
 		}
@@ -900,10 +933,8 @@ static int add_object_entry_1(const unsigned char *sha1, enum object_type type,
 		return 0;
 	}
 
-	if (!exclude && local && has_loose_object_nonlocal(sha1)) {
-		kh_del_sha1(packed_objects, ix);
-		return 0;
-	}
+	if (!exclude && local && has_loose_object_nonlocal(sha1))
+		goto skip_entry;
 
 	if (!found_pack) {
 		for (p = packed_git; p; p = p->next) {
@@ -919,12 +950,12 @@ static int add_object_entry_1(const unsigned char *sha1, enum object_type type,
 				}
 				if (exclude)
 					break;
-				if (incremental ||
-					(local && !p->pack_local) ||
-					(ignore_packed_keep && p->pack_local && p->pack_keep)) {
-					kh_del_sha1(packed_objects, ix);
-					return 0;
-				}
+				if (incremental)
+					goto skip_entry;
+				if (local && !p->pack_local)
+					goto skip_entry;
+				if (ignore_packed_keep && p->pack_local && p->pack_keep)
+					goto skip_entry;
 			}
 		}
 	}
@@ -956,6 +987,11 @@ static int add_object_entry_1(const unsigned char *sha1, enum object_type type,
 	display_progress(progress_state, nr_objects);
 
 	return 1;
+
+skip_entry:
+	kh_del_sha1(packed_objects, ix);
+	write_bitmap_index = 0;
+	return 0;
 }
 
 static int add_object_entry(const unsigned char *sha1, enum object_type type,
@@ -1266,6 +1302,7 @@ static void check_object(struct object_entry *entry)
 		used = unpack_object_header_buffer(buf, avail,
 						   &entry->in_pack_type,
 						   &entry->size);
+
 		if (used == 0)
 			goto give_up;
 
@@ -2197,6 +2234,10 @@ static void show_commit(struct commit *commit, void *data)
 {
 	add_object_entry(commit->object.sha1, OBJ_COMMIT, NULL, 0);
 	commit->object.flags |= OBJECT_ADDED;
+
+	if (write_bitmap_index) {
+		index_commit_for_bitmap(commit);
+	}
 }
 
 static void show_object(struct object *obj,
@@ -2366,6 +2407,7 @@ static void get_object_list(int ac, const char **av)
 		if (*line == '-') {
 			if (!strcmp(line, "--not")) {
 				flags ^= UNINTERESTING;
+				write_bitmap_index = 0;
 				continue;
 			}
 			die("not a rev '%s'", line);
@@ -2588,6 +2630,9 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		die("--keep-unreachable and --unpack-unreachable are incompatible.");
 
 	if (bitmap_support) {
+		if (!pack_to_stdout && rev_list_all)
+			write_bitmap_index = 1;
+
 		if (use_internal_rev_list && pack_to_stdout)
 			use_bitmap_index = 1;
 	}
diff --git a/builtin/pack-objects.h b/builtin/pack-objects.h
new file mode 100644
index 0000000..e186161
--- /dev/null
+++ b/builtin/pack-objects.h
@@ -0,0 +1,33 @@
+#ifndef BUILTIN_PACK_OBJECTS_H
+#define BUILTIN_PACK_OBJECTS_H
+
+struct object_entry {
+	struct pack_idx_entry idx;
+	unsigned long size;	/* uncompressed size */
+	struct packed_git *in_pack; 	/* already in pack */
+	off_t in_pack_offset;
+	struct object_entry *delta;	/* delta base object */
+	struct object_entry *delta_child; /* deltified objects who bases me */
+	struct object_entry *delta_sibling; /* other deltified objects who
+					     * uses the same base as me
+					     */
+	void *delta_data;	/* cached delta (uncompressed) */
+	unsigned long delta_size;	/* delta data size (uncompressed) */
+	unsigned long z_delta_size;	/* delta data size (compressed) */
+	unsigned int hash;	/* name hint hash */
+
+	enum object_type type;
+	enum object_type in_pack_type;
+	enum object_type real_type;
+
+	unsigned int index_pos;
+
+	unsigned char in_pack_header_size;
+	unsigned char preferred_base;
+	unsigned char no_try_delta;
+	unsigned char tagged;
+	unsigned char filled;
+	unsigned char refered;
+};
+
+#endif
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
new file mode 100644
index 0000000..d232545
--- /dev/null
+++ b/pack-bitmap-write.c
@@ -0,0 +1,520 @@
+#include <stdlib.h>
+
+#include "cache.h"
+#include "commit.h"
+#include "tag.h"
+#include "diff.h"
+#include "revision.h"
+#include "list-objects.h"
+#include "progress.h"
+#include "pack-revindex.h"
+#include "pack.h"
+#include "pack-bitmap.h"
+#include "builtin/pack-objects.h"
+
+struct bitmapped_commit {
+	struct commit *commit;
+	struct ewah_bitmap *bitmap;
+	struct ewah_bitmap *write_as;
+	int flags;
+	int xor_offset;
+	uint32_t write_pos;
+};
+
+struct bitmap_writer {
+	struct ewah_bitmap *commits;
+	struct ewah_bitmap *trees;
+	struct ewah_bitmap *blobs;
+	struct ewah_bitmap *tags;
+
+	khash_sha1 *bitmaps;
+	khash_sha1 *packed_objects;
+
+	struct bitmapped_commit *selected;
+	unsigned int selected_nr, selected_alloc;
+
+	struct object_entry **index;
+	uint32_t index_nr;
+
+	int fd;
+	uint32_t written;
+
+	struct progress *progress;
+	int show_progress;
+};
+
+static struct bitmap_writer writer;
+
+void bitmap_writer_show_progress(int show)
+{
+	writer.show_progress = show;
+}
+
+/**
+ * Build the initial type index for the packfile
+ */
+void bitmap_writer_build_type_index(
+	 struct pack_idx_entry **index, uint32_t index_nr)
+{
+	uint32_t i = 0;
+
+	if (writer.show_progress)
+		writer.progress = start_progress("Building bitmap type index", index_nr);
+
+	writer.commits = ewah_new();
+	writer.trees = ewah_new();
+	writer.blobs = ewah_new();
+	writer.tags = ewah_new();
+
+	writer.index = (struct object_entry **)index;
+	writer.index_nr = index_nr;
+
+	while (i < index_nr) {
+		struct object_entry *entry = (struct object_entry *)index[i];
+		entry->index_pos = i;
+
+		switch (entry->real_type) {
+		case OBJ_COMMIT:
+			ewah_set(writer.commits, i);
+			break;
+
+		case OBJ_TREE:
+			ewah_set(writer.trees, i);
+			break;
+
+		case OBJ_BLOB:
+			ewah_set(writer.blobs, i);
+			break;
+
+		case OBJ_TAG:
+			ewah_set(writer.tags, i);
+			break;
+
+		default:
+			die("Missing type information for %s (%d/%d)",
+					sha1_to_hex(entry->idx.sha1), entry->real_type, entry->type);
+		}
+
+		i++;
+		display_progress(writer.progress, i);
+	}
+
+	stop_progress(&writer.progress);
+}
+
+/**
+ * Compute the actual bitmaps
+ */
+static struct object **seen_objects;
+static unsigned int seen_objects_nr, seen_objects_alloc;
+
+static inline void push_bitmapped_commit(struct commit *commit)
+{
+	if (writer.selected_nr >= writer.selected_alloc) {
+		writer.selected_alloc = (writer.selected_alloc + 32) * 2;
+		writer.selected = xrealloc(writer.selected,
+			writer.selected_alloc * sizeof(struct bitmapped_commit));
+	}
+
+	writer.selected[writer.selected_nr].commit = commit;
+	writer.selected[writer.selected_nr].bitmap = NULL;
+	writer.selected[writer.selected_nr].flags = 0;
+
+	writer.selected_nr++;
+}
+
+static inline void mark_as_seen(struct object *object)
+{
+	if (seen_objects_nr >= seen_objects_alloc) {
+		seen_objects_alloc = (seen_objects_alloc + 32) * 2;
+		seen_objects = xrealloc(seen_objects,
+			seen_objects_alloc * sizeof(struct object*));
+	}
+
+	seen_objects[seen_objects_nr++] = object;
+}
+
+static inline void reset_all_seen(void)
+{
+	unsigned int i;
+	for (i = 0; i < seen_objects_nr; ++i) {
+		seen_objects[i]->flags &= ~(SEEN | ADDED | SHOWN);
+	}
+	seen_objects_nr = 0;
+}
+
+static uint32_t find_object_pos(const unsigned char *sha1)
+{
+	khiter_t pos = kh_get_sha1(writer.packed_objects, sha1);
+
+	if (pos < kh_end(writer.packed_objects)) {
+		struct object_entry *entry = kh_value(writer.packed_objects, pos);
+		return entry->index_pos;
+	}
+
+	die("Failed to write bitmap index. Packfile doesn't have full closure "
+		"(object %s is missing)", sha1_to_hex(sha1));
+}
+
+static void show_object(struct object *object,
+	const struct name_path *path, const char *last, void *data)
+{
+	struct bitmap *base = data;
+	bitmap_set(base, find_object_pos(object->sha1));
+	mark_as_seen(object);
+}
+
+static void show_commit(struct commit *commit, void *data)
+{
+	mark_as_seen((struct object *)commit);
+}
+
+static int
+add_to_include_set(struct bitmap *base, struct commit *commit)
+{
+	khiter_t hash_pos;
+	uint32_t bitmap_pos = find_object_pos(commit->object.sha1);
+
+	if (bitmap_get(base, bitmap_pos))
+		return 0;
+
+	hash_pos = kh_get_sha1(writer.bitmaps, commit->object.sha1);
+	if (hash_pos < kh_end(writer.bitmaps)) {
+		struct bitmapped_commit *bc = kh_value(writer.bitmaps, hash_pos);
+		bitmap_or_inplace(base, bc->bitmap);
+		return 0;
+	}
+
+	bitmap_set(base, bitmap_pos);
+	return 1;
+}
+
+static int
+should_include(struct commit *commit, void *_data)
+{
+	struct bitmap *base = _data;
+
+	if (!add_to_include_set(base, commit)) {
+		struct commit_list *parent = commit->parents;
+
+		mark_as_seen((struct object *)commit);
+
+		while (parent) {
+			parent->item->object.flags |= SEEN;
+			mark_as_seen((struct object *)parent->item);
+			parent = parent->next;
+		}
+
+		return 0;
+	}
+
+	return 1;
+}
+
+static void
+compute_xor_offsets(void)
+{
+	static const int MAX_XOR_OFFSET_SEARCH = 10;
+
+	int i, next = 0;
+
+	while (next < writer.selected_nr) {
+		struct bitmapped_commit *stored = &writer.selected[next];
+
+		int best_offset = 0;
+		struct ewah_bitmap *best_bitmap = stored->bitmap;
+		struct ewah_bitmap *test_xor;
+
+		for (i = 1; i <= MAX_XOR_OFFSET_SEARCH; ++i) {
+			int curr = next - i;
+
+			if (curr < 0)
+				break;
+
+			test_xor = ewah_pool_new();
+			ewah_xor(writer.selected[curr].bitmap, stored->bitmap, test_xor);
+
+			if (test_xor->buffer_size < best_bitmap->buffer_size) {
+				if (best_bitmap != stored->bitmap)
+					ewah_pool_free(best_bitmap);
+
+				best_bitmap = test_xor;
+				best_offset = i;
+			} else {
+				ewah_pool_free(test_xor);
+			}
+		}
+
+		stored->xor_offset = best_offset;
+		stored->write_as = best_bitmap;
+
+		next++;
+	}
+}
+
+void
+bitmap_writer_build(khash_sha1 *packed_objects)
+{
+	int i;
+	struct bitmap *base = bitmap_new();
+	struct rev_info revs;
+
+	writer.bitmaps = kh_init_sha1();
+	writer.packed_objects = packed_objects;
+
+	if (writer.show_progress)
+		writer.progress = start_progress("Building bitmaps", writer.selected_nr);
+
+	init_revisions(&revs, NULL);
+	revs.tag_objects = 1;
+	revs.tree_objects = 1;
+	revs.blob_objects = 1;
+	revs.no_walk = 0;
+
+	revs.include_check = should_include;
+	reset_revision_walk();
+
+	for (i = writer.selected_nr - 1; i >= 0; --i) {
+		struct bitmapped_commit *stored;
+		struct object *object;
+
+		khiter_t hash_pos;
+		int hash_ret;
+
+		stored = &writer.selected[i];
+		object = (struct object *)stored->commit;
+
+		if (i < writer.selected_nr - 1) {
+			if (!in_merge_bases(writer.selected[i + 1].commit, stored->commit)) {
+				bitmap_reset(base);
+				reset_all_seen();
+			}
+		}
+
+		add_pending_object(&revs, object, "");
+		revs.include_check_data = base;
+
+		if (prepare_revision_walk(&revs))
+			die("revision walk setup failed");
+
+		traverse_commit_list(&revs, show_commit, show_object, base);
+
+		revs.pending.nr = 0;
+		revs.pending.alloc = 0;
+		revs.pending.objects = NULL;
+
+		stored->bitmap = bitmap_to_ewah(base);
+		stored->flags = object->flags;
+
+		hash_pos = kh_put_sha1(writer.bitmaps, object->sha1, &hash_ret);
+		if (hash_ret == 0)
+			die("Duplicate entry when writing index: %s",
+				sha1_to_hex(object->sha1));
+
+		kh_value(writer.bitmaps, hash_pos) = stored;
+
+		display_progress(writer.progress, writer.selected_nr - i);
+	}
+
+	bitmap_free(base);
+	stop_progress(&writer.progress);
+
+	compute_xor_offsets();
+}
+
+/**
+ * Select the commits that will be bitmapped
+ */
+static inline unsigned int next_commit_index(unsigned int idx)
+{
+	static const unsigned int MIN_COMMITS = 100;
+	static const unsigned int MAX_COMMITS = 5000;
+
+	static const unsigned int MUST_REGION = 100;
+	static const unsigned int MIN_REGION = 20000;
+
+	unsigned int offset, next;
+
+	if (idx <= MUST_REGION)
+		return 0;
+
+	if (idx <= MIN_REGION) {
+		offset = idx - MUST_REGION;
+		return (offset < MIN_COMMITS) ? offset : MIN_COMMITS;
+	}
+
+	offset = idx - MIN_REGION;
+	next = (offset < MAX_COMMITS) ? offset : MAX_COMMITS;
+
+	return (next > MIN_COMMITS) ? next : MIN_COMMITS;
+}
+
+void bitmap_writer_select_commits(
+		struct commit **indexed_commits,
+		unsigned int indexed_commits_nr,
+		int max_bitmaps)
+{
+	unsigned int i = 0, next;
+
+	if (writer.show_progress)
+		writer.progress = start_progress("Selecting bitmap commits", 0);
+
+	if (indexed_commits_nr < 100) {
+		for (i = 0; i < indexed_commits_nr; ++i) {
+			push_bitmapped_commit(indexed_commits[i]);
+		}
+		return;
+	}
+
+	for (;;) {
+		next = next_commit_index(i);
+
+		if (i + next >= indexed_commits_nr)
+			break;
+
+		if (max_bitmaps > 0 && writer.selected_nr >= max_bitmaps) {
+			writer.selected_nr = max_bitmaps;
+			break;
+		}
+
+		if (next == 0) {
+			push_bitmapped_commit(indexed_commits[i]);
+		} else {
+			unsigned int j;
+			struct commit *chosen = indexed_commits[i + next];
+
+			for (j = 0; j <= next; ++j) {
+				struct commit *cm = indexed_commits[i + j];
+				if (cm->parents && cm->parents->next)
+					chosen = cm;
+			}
+
+			push_bitmapped_commit(chosen);
+		}
+
+		i += next + 1;
+		display_progress(writer.progress, i);
+	}
+
+	stop_progress(&writer.progress);
+}
+
+/**
+ * Write the bitmap index to disk
+ */
+static void write_hash_table(
+	 struct object_entry **index, uint32_t index_nr)
+{
+	uint32_t i, j = 0;
+	uint32_t buffer[1024];
+
+	for (i = 0; i < index_nr; ++i) {
+		struct object_entry *entry = index[i];
+
+		buffer[j++] = htonl(entry->hash);
+		if (j == 1024) {
+			write_or_die(writer.fd, buffer, sizeof(buffer));
+			j = 0;
+		}
+	}
+
+	if (j > 0) {
+		write_or_die(writer.fd, buffer, j * sizeof(uint32_t));
+	}
+
+	writer.written += (index_nr * sizeof(uint32_t));
+}
+
+static void dump_bitmap(struct ewah_bitmap *bitmap)
+{
+	int written;
+
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+	written = ewah_serialize_native(bitmap, writer.fd);
+#else
+	written = ewah_serialize(bitmap, writer.fd);
+#endif
+
+	if (written < 0)
+		die("Failed to write bitmap index");
+
+	writer.written += written;
+}
+
+static void
+write_selected_commits_v2(void)
+{
+	int i;
+
+	for (i = 0; i < writer.selected_nr; ++i) {
+		struct bitmapped_commit *stored = &writer.selected[i];
+		stored->write_pos = writer.written;
+		dump_bitmap(stored->write_as);
+	}
+
+	for (i = 0; i < writer.selected_nr; ++i) {
+		struct bitmapped_commit *stored = &writer.selected[i];
+		struct bitmap_disk_entry_v2 on_disk;
+
+		memcpy(on_disk.sha1, stored->commit->object.sha1, 20);
+		on_disk.bitmap_pos = htonl(stored->write_pos);
+		on_disk.xor_offset = stored->xor_offset;
+		on_disk.flags = stored->flags;
+
+		write_or_die(writer.fd, &on_disk, sizeof(on_disk));
+		writer.written += sizeof(on_disk);
+	}
+}
+
+void bitmap_writer_finish(
+	 const char *filename, unsigned char sha1[], uint16_t flags)
+{
+	static char tmp_file[PATH_MAX];
+	static uint16_t default_version = 2;
+
+	struct bitmap_disk_header header;
+
+	flags |= BITMAP_OPT_FULL_DAG;
+
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+	/*
+	 * In little endian machines (i.e. most of them) we're
+	 * going to dump the bitmaps straight from memory into
+	 * disk, and tag the bitmap index as having LE bitmaps
+	 */
+	flags |= BITMAP_OPT_LE_BITMAPS;
+#else
+	flags |= BITMAP_OPT_BE_BITMAPS;
+#endif
+
+	writer.fd = odb_mkstemp(tmp_file, sizeof(tmp_file), "pack/tmp_bitmap_XXXXXX");
+
+	if (writer.fd < 0)
+		die_errno("unable to create '%s'", tmp_file);
+
+	memcpy(header.magic, BITMAP_MAGIC_PREFIX, sizeof(BITMAP_MAGIC_PREFIX));
+	header.version = htons(default_version);
+	header.options = htons(flags);
+	header.entry_count = htonl(writer.selected_nr);
+	memcpy(header.checksum, sha1, 20);
+
+	write_or_die(writer.fd, &header, sizeof(header));
+	writer.written += sizeof(header);
+
+	if (flags & BITMAP_OPT_HASH_CACHE)
+		write_hash_table(writer.index, writer.index_nr);
+
+	dump_bitmap(writer.commits);
+	dump_bitmap(writer.trees);
+	dump_bitmap(writer.blobs);
+	dump_bitmap(writer.tags);
+	write_selected_commits_v2();
+
+	close(writer.fd);
+
+	if (adjust_shared_perm(tmp_file))
+		die_errno("unable to make temporary bitmap file readable");
+
+	if (rename(tmp_file, filename))
+		die_errno("unable to rename temporary bitmap file to '%s'", filename);
+}
diff --git a/pack-bitmap.h b/pack-bitmap.h
index b97bd46..8e7e3dc 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -31,6 +31,8 @@ struct bitmap_disk_header {
 
 static const char BITMAP_MAGIC_PREFIX[] = {'B', 'I', 'T', 'M'};;
 
+#define NEEDS_BITMAP (1u<<22)
+
 enum pack_bitmap_opts {
 	BITMAP_OPT_FULL_DAG = 1,
 	BITMAP_OPT_LE_BITMAPS = 2,
@@ -50,4 +52,11 @@ int prepare_bitmap_walk(struct rev_info *revs, uint32_t *result_size);
 void test_bitmap_walk(struct rev_info *revs);
 char *pack_bitmap_filename(struct packed_git *p);
 
+void bitmap_writer_show_progress(int show);
+void bitmap_writer_build_type_index(struct pack_idx_entry **index, uint32_t index_nr);
+void bitmap_writer_select_commits(struct commit **indexed_commits,
+		unsigned int indexed_commits_nr, int max_bitmaps);
+void bitmap_writer_build(khash_sha1 *packed_objects);
+void bitmap_writer_finish(const char *filename, unsigned char sha1[], uint16_t flags);
+
 #endif
diff --git a/pack-write.c b/pack-write.c
index ca9e63b..6203d37 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -371,5 +371,7 @@ void finish_tmp_packfile(char *name_buffer,
 	if (rename(idx_tmp_name, name_buffer))
 		die_errno("unable to rename temporary index file");
 
+	*end_of_name_prefix = '\0';
+
 	free((void *)idx_tmp_name);
 }
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 13/16] repack: consider bitmaps when performing repacks
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (11 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 12/16] pack-objects: implement bitmap writing Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-25 23:00   ` Junio C Hamano
  2013-06-24 23:23 ` [PATCH 14/16] sha1_file: implement `nth_packed_object_info` Vicent Marti
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

Since `pack-objects` will write a `.bitmap` file next to the `.pack` and
`.idx` files, this commit teaches `git-repack` to consider the new
bitmap indexes (if they exist) when performing repack operations.

This implies moving old bitmap indexes out of the way if we are
repacking a repository that already has them, and moving the newly
generated bitmap indexes into the `objects/pack` directory, next to
their corresponding packfiles.

Since `git repack` is now capable of handling these `.bitmap` files,
a normal `git gc` run on a repository that has `pack.usebitmaps` set
to true in its config file will generate bitmap indexes as part of the
garbage collection process.
---
 git-repack.sh |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/git-repack.sh b/git-repack.sh
index 7579331..d5355ae 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -108,7 +108,7 @@ rollback=
 failed=
 for name in $names
 do
-	for sfx in pack idx
+	for sfx in pack idx bitmap
 	do
 		file=pack-$name.$sfx
 		test -f "$PACKDIR/$file" || continue
@@ -156,6 +156,11 @@ do
 	fullbases="$fullbases pack-$name"
 	chmod a-w "$PACKTMP-$name.pack"
 	chmod a-w "$PACKTMP-$name.idx"
+
+	test -f "$PACKTMP-$name.bitmap" &&
+	chmod a-w "$PACKTMP-$name.bitmap" &&
+	mv -f "$PACKTMP-$name.bitmap" "$PACKDIR/pack-$name.bitmap"
+
 	mv -f "$PACKTMP-$name.pack" "$PACKDIR/pack-$name.pack" &&
 	mv -f "$PACKTMP-$name.idx"  "$PACKDIR/pack-$name.idx" ||
 	exit
@@ -166,6 +171,7 @@ for name in $names
 do
 	rm -f "$PACKDIR/old-pack-$name.idx"
 	rm -f "$PACKDIR/old-pack-$name.pack"
+	rm -f "$PACKDIR/old-pack-$name.bitmap"
 done
 
 # End of pack replacement.
@@ -180,7 +186,7 @@ then
 		  do
 			case " $fullbases " in
 			*" $e "*) ;;
-			*)	rm -f "$e.pack" "$e.idx" "$e.keep" ;;
+			*)	rm -f "$e.pack" "$e.idx" "$e.keep" "$e.bitmap" ;;
 			esac
 		  done
 		)
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 14/16] sha1_file: implement `nth_packed_object_info`
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (12 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 13/16] repack: consider bitmaps when performing repacks Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-24 23:23 ` [PATCH 15/16] write-bitmap: implement new git command to write bitmaps Vicent Marti
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

A new helper function allows to efficiently query the size and real type
of an object in a packfile based on its position on the packfile index.

This is particularly useful when trying to parse all the information of
an index in memory.
---
 cache.h     |    1 +
 sha1_file.c |    6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/cache.h b/cache.h
index bbe5e2a..26e4567 100644
--- a/cache.h
+++ b/cache.h
@@ -1104,6 +1104,7 @@ extern void clear_delta_base_cache(void);
 extern struct packed_git *add_packed_git(const char *, int, int);
 extern const unsigned char *nth_packed_object_sha1(struct packed_git *, uint32_t);
 extern off_t nth_packed_object_offset(const struct packed_git *, uint32_t);
+extern int nth_packed_object_info(struct packed_git *p, uint32_t n, unsigned long *sizep);
 extern int find_pack_entry_pos(const unsigned char *sha1, struct packed_git *p);
 extern off_t find_pack_entry_one(const unsigned char *, struct packed_git *);
 extern int is_pack_valid(struct packed_git *);
diff --git a/sha1_file.c b/sha1_file.c
index 018a847..fd5bd01 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -2223,6 +2223,12 @@ off_t nth_packed_object_offset(const struct packed_git *p, uint32_t n)
 	}
 }
 
+int nth_packed_object_info(struct packed_git *p, uint32_t n, unsigned long *sizep)
+{
+	off_t offset = nth_packed_object_offset(p, n);
+	return packed_object_info(p, offset, sizep, NULL);
+}
+
 int find_pack_entry_pos(const unsigned char *sha1, struct packed_git *p)
 {
 	const uint32_t *level1_ofs = p->index_data;
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 15/16] write-bitmap: implement new git command to write bitmaps
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (13 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 14/16] sha1_file: implement `nth_packed_object_info` Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-24 23:23 ` [PATCH 16/16] rev-list: Optimize --count using bitmaps too Vicent Marti
  2013-06-25 16:05 ` [PATCH 00/16] Speed up Counting Objects with bitmap data Thomas Rast
  16 siblings, 0 replies; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

The `pack-objects` builtin is capable of writing out bitmap indexes
(.bitmap) next to the their corresponding packfile, as part of the
process of actually generating the packfile.

This is a very efficient operation because all the required data for
writing the bitmap index (commit traversal list, list of all objects in
a packfile, sorted index for the packfile, and types for all objects in
the packfile) is readily available in memory as part of the process of
building the packfile itself.

There are however cases when we want to generate a bitmap index for a
packfile that already exists on disk (i.e. one we're not writing from
scratch). This new git builtin implements the bitmap index equivalent of
`git index-pack`: it writes a `.bitmap` file given a pair of existing
`.pack` and `.idx` files.

	NOTE that `write-bitmap` requires the packfile to have been indexed
	beforehand. If the packfile doesn't have its corresponding `.idx`
	file, `git index-pack` must be called before `write-bitmap` can
	work.

The process of generating bitmaps for an existing packfile is as
follows:

	1. Load the existing pack index in memory. The `.idx` for the
	packfile is loaded into a hash table in mememory so it can be
	efficiently queried. As part of this loading process, the real type
	for each object in the packfile is resolved (this implies resolving
	deltas, which can make this process rather expensive).

	2. Find the full closure for the packfile. All the objects from the
	packfile that have been loaded in memory are iterated, looking for
	commits. These commits are parsed, and their parents are marked as
	such to ensure that

		a) there is a full closure in the packfile, and no commit has a
		dangling parent pointer

		b) we can find the set of "tips" for the packfile, i.e., the set
		of commits that don't have any commits pointing to them

	3. The "tips" of the packfile are then used as the roots to perform a
	normal revision walk. The result of this revision walk is the list
	of commits that will be used by `bitmap_writer_select_commits` when
	selecting which commits are going to be bitmapped.

	4. We build and write the bitmap index in the same way that
	`pack-objects` does, given that we have all the required metadata:

		- an array of all the objects in the packfile, in index order
		- the types of all these objects
		- an array with a walk-ordering of all the commits in the
			packfile, which will be used for selection
		- a hash table to efficiently look up objects in the index

	See the previous patch "pack-objects: implement bitmap writing" for
	details on how the bitmap computation happens.
---
 Makefile               |    1 +
 builtin.h              |    1 +
 builtin/write-bitmap.c |  256 ++++++++++++++++++++++++++++++++++++++++++++++++
 git.c                  |    1 +
 4 files changed, 259 insertions(+)
 create mode 100644 builtin/write-bitmap.c

diff --git a/Makefile b/Makefile
index 599aa59..4a0a7dd 100644
--- a/Makefile
+++ b/Makefile
@@ -1000,6 +1000,7 @@ BUILTIN_OBJS += builtin/upload-archive.o
 BUILTIN_OBJS += builtin/var.o
 BUILTIN_OBJS += builtin/verify-pack.o
 BUILTIN_OBJS += builtin/verify-tag.o
+BUILTIN_OBJS += builtin/write-bitmap.o
 BUILTIN_OBJS += builtin/write-tree.o
 
 GITLIBS = $(LIB_FILE) $(XDIFF_LIB)
diff --git a/builtin.h b/builtin.h
index 64bab6b..e39685f 100644
--- a/builtin.h
+++ b/builtin.h
@@ -144,6 +144,7 @@ extern int cmd_var(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_tag(int argc, const char **argv, const char *prefix);
 extern int cmd_version(int argc, const char **argv, const char *prefix);
 extern int cmd_whatchanged(int argc, const char **argv, const char *prefix);
+extern int cmd_write_bitmap(int argc, const char **argv, const char *prefix);
 extern int cmd_write_tree(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_show_ref(int argc, const char **argv, const char *prefix);
diff --git a/builtin/write-bitmap.c b/builtin/write-bitmap.c
new file mode 100644
index 0000000..0cc1c9e
--- /dev/null
+++ b/builtin/write-bitmap.c
@@ -0,0 +1,256 @@
+#include <stdlib.h>
+
+#include "cache.h"
+#include "commit.h"
+#include "tag.h"
+#include "diff.h"
+#include "revision.h"
+#include "progress.h"
+#include "list-objects.h"
+#include "pack.h"
+#include "refs.h"
+#include "pack-bitmap.h"
+
+#include "builtin/pack-objects.h"
+
+static int progress = 1;
+static struct progress *progress_state;
+static int write_hash_cache;
+
+static struct object_entry **objects;
+static uint32_t nr_objects;
+
+static struct commit **walked_commits;
+static uint32_t nr_commits;
+
+static khash_sha1 *packed_objects;
+
+static struct object_entry *
+allocate_entry(const unsigned char *sha1)
+{
+	struct object_entry *entry;
+	khiter_t pos;
+	int hash_ret;
+
+	entry = calloc(1, sizeof(struct object_entry));
+	hashcpy(entry->idx.sha1, sha1);
+
+	pos = kh_put_sha1(packed_objects, entry->idx.sha1, &hash_ret);
+	if (hash_ret == 0) {
+		die("BUG: duplicate entry in packfile");
+	}
+
+	kh_value(packed_objects, pos) = entry;
+	objects[nr_objects++] = entry;
+
+	return entry;
+}
+
+static void
+load_pack_index(struct packed_git *pack)
+{
+	uint32_t i, commits_found = 0;
+	khint_t new_hash_size, nr_alloc;
+
+	if (open_pack_index(pack))
+		die("Failed to load packfile");
+
+	new_hash_size = (pack->num_objects * (1.0 / __ac_HASH_UPPER)) + 0.5;
+	kh_resize_sha1(packed_objects, new_hash_size);
+
+	nr_alloc = (pack->num_objects + 63) & ~63;
+	objects = xmalloc(nr_alloc * sizeof(struct object_entry *));
+
+	if (progress)
+		progress_state = start_progress("Loading existing index", pack->num_objects);
+
+	for (i = 0; i < pack->num_objects; ++i) {
+		struct object_entry *entry;
+		const unsigned char *sha1;
+
+		sha1 = nth_packed_object_sha1(pack, i);
+		entry = allocate_entry(sha1);
+
+		entry->in_pack = pack;
+		entry->type = entry->real_type = nth_packed_object_info(pack, i, NULL);
+		entry->index_pos = i;
+
+		display_progress(progress_state, i + 1);
+	}
+
+	stop_progress(&progress_state);
+	if (progress)
+		progress_state = start_progress("Finding pack closure", 0);
+
+	for (i = 0; i < nr_objects; ++i) {
+		struct commit *commit;
+		struct commit_list *parent;
+
+		if (objects[i]->type != OBJ_COMMIT)
+			continue;
+
+		commit = lookup_commit(objects[i]->idx.sha1);
+		if (parse_commit(commit)) {
+			die("Bad commit: %s\n", sha1_to_hex(objects[i]->idx.sha1));
+		}
+
+		parent = commit->parents;
+
+		while (parent) {
+			khiter_t pos = kh_get_sha1(packed_objects, parent->item->object.sha1);
+
+			if (pos < kh_end(packed_objects)) {
+				struct object_entry *entry = kh_value(packed_objects, pos);
+				entry->refered = 1;
+			} else {
+				die("Failed to write bitmaps for packfile: No closure");
+			}
+
+			parent = parent->next;
+		}
+
+		display_progress(progress_state, ++commits_found);
+	}
+
+	stop_progress(&progress_state);
+}
+
+static void show_object(struct object *object,
+	const struct name_path *path, const char *last, void *data)
+{
+	char *name = path_name(path, last);
+	khiter_t pos = kh_get_sha1(packed_objects, object->sha1);
+
+	if (pos < kh_end(packed_objects)) {
+		struct object_entry *entry = kh_value(packed_objects, pos);
+		entry->hash = pack_name_hash(name);
+	}
+
+	free(name);
+}
+
+static void show_commit(struct commit *commit, void *data)
+{
+	walked_commits[nr_commits++] = commit;
+	display_progress(progress_state, nr_commits);
+}
+
+static void
+find_all_objects(struct packed_git *pack)
+{
+	struct rev_info revs;
+	uint32_t i, found_commits = 0;
+
+	init_revisions(&revs, NULL);
+	if (write_hash_cache) {
+		revs.tag_objects = 1;
+		revs.tree_objects = 1;
+		revs.blob_objects = 1;
+	}
+	revs.no_walk = 0;
+
+	for (i = 0; i < nr_objects; ++i) {
+		if (objects[i]->type == OBJ_COMMIT) {
+			if (!objects[i]->refered) {
+				struct object *object = parse_object(objects[i]->idx.sha1);
+				add_pending_object(&revs, object, "");
+			}
+
+			found_commits++;
+		}
+	}
+
+	if (progress)
+		progress_state = start_progress("Computing walk order", found_commits);
+
+	walked_commits = xmalloc(found_commits * sizeof(struct commit *));
+
+	if (prepare_revision_walk(&revs))
+		die("revision walk setup failed");
+
+	traverse_commit_list(&revs, show_commit, show_object, NULL);
+	stop_progress(&progress_state);
+
+	if (found_commits != nr_commits)
+		die("Missing commits in the walk? Got %d, expected %d", i, nr_commits);
+}
+
+static const char *write_bitmaps_usage[] = {
+	N_("git write-bitmap --hash-cache [options...] [pack-sha1]"),
+	NULL
+};
+
+int cmd_write_bitmap(int argc, const char **argv, const char *prefix)
+{
+	int max_bitmaps = 0;
+
+	struct option write_bitmaps_options[] = {
+		OPT_SET_INT('q', "quiet", &progress,
+			    N_("do not show progress meter"), 0),
+		OPT_SET_INT(0, "progress", &progress,
+			    N_("show progress meter"), 1),
+		OPT_BOOL(0, "hash-cache", &write_hash_cache,
+			 N_("Write a cache of hashes for delta resolution")),
+		OPT_INTEGER(0, "max", &max_bitmaps,
+			    N_("max number of bitmaps to generate")),
+		OPT_END(),
+	};
+
+	struct packed_git *p;
+	struct packed_git *pack_to_index = NULL;
+	char *bitmap_filename;
+	uint16_t write_flags;
+
+	progress = isatty(2);
+	argc = parse_options(argc, argv, prefix,
+			write_bitmaps_options, write_bitmaps_usage, 0);
+
+	packed_objects = kh_init_sha1();
+	prepare_packed_git();
+
+	if (argc) {
+		unsigned char pack_sha[20];
+
+		if (get_sha1_hex(argv[0], pack_sha))
+			die("Invalid SHA1 for packfile");
+
+		for (p = packed_git; p; p = p->next) {
+			if (hashcmp(p->sha1, pack_sha) == 0) {
+				pack_to_index = p;
+				break;
+			}
+		}
+	} else {
+		pack_to_index = packed_git;
+
+		for (p = packed_git; p; p = p->next) {
+			if (p->pack_size > pack_to_index->pack_size)
+				pack_to_index = p;
+		}
+	}
+
+	if (!pack_to_index)
+		die("No packs found for indexing");
+
+	if (progress)
+		fprintf(stderr, "Indexing 'pack-%s.pack'\n",
+			sha1_to_hex(pack_to_index->sha1));
+
+	load_pack_index(pack_to_index);
+	find_all_objects(pack_to_index);
+
+	bitmap_filename = pack_bitmap_filename(pack_to_index);
+	write_flags = 0;
+
+	if (write_hash_cache)
+		write_flags |= BITMAP_OPT_HASH_CACHE;
+
+	bitmap_writer_show_progress(progress);
+	bitmap_writer_build_type_index((struct pack_idx_entry **)objects, nr_objects);
+	bitmap_writer_select_commits(walked_commits, nr_commits, max_bitmaps);
+	bitmap_writer_build(packed_objects);
+	bitmap_writer_finish(bitmap_filename, pack_to_index->sha1, write_flags);
+
+	free(bitmap_filename);
+	return 0;
+}
diff --git a/git.c b/git.c
index 4359086..66ceb2c 100644
--- a/git.c
+++ b/git.c
@@ -426,6 +426,7 @@ static void handle_internal_command(int argc, const char **argv)
 		{ "verify-tag", cmd_verify_tag, RUN_SETUP },
 		{ "version", cmd_version },
 		{ "whatchanged", cmd_whatchanged, RUN_SETUP },
+		{ "write-bitmap", cmd_write_bitmap, RUN_SETUP },
 		{ "write-tree", cmd_write_tree, RUN_SETUP },
 	};
 	int i;
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* [PATCH 16/16] rev-list: Optimize --count using bitmaps too
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (14 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 15/16] write-bitmap: implement new git command to write bitmaps Vicent Marti
@ 2013-06-24 23:23 ` Vicent Marti
  2013-06-25 16:05 ` [PATCH 00/16] Speed up Counting Objects with bitmap data Thomas Rast
  16 siblings, 0 replies; 64+ messages in thread
From: Vicent Marti @ 2013-06-24 23:23 UTC (permalink / raw)
  To: git; +Cc: Vicent Marti

If bitmap indexes are available, the process of counting reachable
commits with `git rev-list --count` can be greatly sped up. Instead of
having to use callbacks that yield each object in the revision list, we
can build the reachable bitmap for the list and then use an efficient
popcount to find the number of bits set in the bitmap.

This commit implements a `count_bitmap_commit_list` that can be used
after `prepare_bitmap_walk` has returned successfully to return the
number of commits, trees, blobs or tags that have been found to be
reachable during the walk.

`git rev-list` is taught to use this function call when bitmaps are
enabled instead of going through the old rev-list machinery. Do note,
however, that counts with `left_right` and `cherry_mark` are not
optimized by this patch.

Here are some sample timings of different ways to count commits in
`torvalds/linux`:

	$ time ../git/git rev-list master | wc -l
	376549

	real    0m6.973s
	user    0m3.216s
	sys     0m5.316s

	$ time ../git/git rev-list --count master
	376549

	real    0m1.933s
	user    0m1.744s
	sys     0m0.188s

	$ time ../git/git rev-list --use-bitmaps --count master
	376549

	real    0m0.005s
	user    0m0.000s
	sys     0m0.004s

Note that the time in the `--use-bitmaps` invocation is basically noise.
In my machine it ranges from 2ms to 6ms.
---
 builtin/rev-list.c |   11 +++++++++--
 pack-bitmap.c      |   37 +++++++++++++++++++++++++++++++++++++
 pack-bitmap.h      |    2 ++
 3 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 905ed08..097adb8 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -354,8 +354,15 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 		revs.limited = 1;
 
 	if (use_bitmaps && !prepare_bitmap_walk(&revs, NULL)) {
-		traverse_bitmap_commit_list(&show_object_fast);
-		return 0;
+		if (revs.count && !revs.left_right && !revs.cherry_mark) {
+			uint32_t commit_count;
+			count_bitmap_commit_list(&commit_count, NULL, NULL, NULL);
+			printf("%d\n", commit_count);
+			return 0;
+		} else {
+			traverse_bitmap_commit_list(&show_object_fast);
+			return 0;
+		}
 	}
 
 	if (prepare_revision_walk(&revs))
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 090db15..65fdce7 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -720,6 +720,43 @@ void traverse_bitmap_commit_list(show_reachable_fn show_reachable)
 	bitmap_git.result = NULL;
 }
 
+static uint32_t count_object_type(
+	struct bitmap *objects,
+	struct ewah_bitmap *type_filter)
+{
+	size_t i = 0, count = 0;
+	struct ewah_iterator it;
+	eword_t filter;
+
+	ewah_iterator_init(&it, type_filter);
+
+	while (i < objects->word_alloc && ewah_iterator_next(&filter, &it)) {
+		eword_t word = objects->words[i++] & filter;
+		count += __builtin_popcountll(word);
+	}
+
+	return count;
+}
+
+void count_bitmap_commit_list(
+	uint32_t *commits, uint32_t *trees, uint32_t *blobs, uint32_t *tags)
+{
+	if (!bitmap_git.result)
+		die("Tried to count bitmap without setting it up first");
+
+	if (commits)
+		*commits = count_object_type(bitmap_git.result, bitmap_git.commits);
+
+	if (trees)
+		*trees = count_object_type(bitmap_git.result, bitmap_git.trees);
+
+	if (blobs)
+		*blobs = count_object_type(bitmap_git.result, bitmap_git.blobs);
+
+	if (tags)
+		*tags = count_object_type(bitmap_git.result, bitmap_git.tags);
+}
+
 struct bitmap_test_data {
 	struct bitmap *base;
 	struct progress *prg;
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 8e7e3dc..816da6d 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -47,6 +47,8 @@ typedef int (*show_reachable_fn)(
 	struct packed_git *found_pack,
 	off_t found_offset);
 
+void count_bitmap_commit_list(
+	uint32_t *commits, uint32_t *trees, uint32_t *blobs, uint32_t *tags);
 void traverse_bitmap_commit_list(show_reachable_fn show_reachable);
 int prepare_bitmap_walk(struct rev_info *revs, uint32_t *result_size);
 void test_bitmap_walk(struct rev_info *revs);
-- 
1.7.9.5

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 08/16] ewah: compressed bitmap implementation
  2013-06-24 23:23 ` [PATCH 08/16] ewah: compressed bitmap implementation Vicent Marti
@ 2013-06-25  1:10   ` Junio C Hamano
  2013-06-25 22:51     ` Junio C Hamano
  2013-06-25 15:38   ` Thomas Rast
  1 sibling, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2013-06-25  1:10 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Vicent Marti <tanoku@gmail.com> writes:

> The library is re-licensed under the GPLv2 with the permission of Daniel
> Lemire, the original author. The source code for the C version can
> be found on GitHub:
>
> 	https://github.com/vmg/libewok
>
> The original Java implementation can also be found on GitHub:
>
> 	https://github.com/lemire/javaewah
> ---

Please make sure that all patches are properly signed off.

>  Makefile           |    6 +
>  ewah/bitmap.c      |  229 +++++++++++++++++
>  ewah/ewah_bitmap.c |  703 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  ewah/ewah_io.c     |  199 +++++++++++++++
>  ewah/ewah_rlw.c    |  124 +++++++++
>  ewah/ewok.h        |  194 +++++++++++++++
>  ewah/ewok_rlw.h    |  114 +++++++++

This is lovely.  A few comments after an initial quick scan-through.

 - The code and the headers are well commented, which is good.

 - What's __builtin_popcountll() doing there in a presumably generic
   codepath?

 - Two variants of "bitmap" are given different and easy to
   understand type names (vanilla one is "bitmap", the clever one is
   "ewah_bitmap"), but at many places, a pointer to ewah_bitmap is
   simply called "bitmap" or "bitmap_i" without "ewah" anywhere,
   which waas confusing to read.  Especially, the "NAND" operation
   for bitmap takes two bitmaps, while "OR" takes one bitmap and
   ewah_bitmap.  That is fine as long as the combination is
   convenient for callers, but I wished the ewah variables be called
   with "ewah" somewhere in their names.

 - I compile with "-Werror -Wdeclaration-after-statement"; some
   places seem to trigger it.

 - Some "extern" declarations in *.c sources were irritating;
   shouldn't they be declared in *.h file and included?

 - There are some instances of "if (condition) stmt;" on a single
   line; looked irritating.   

 - "bool" is not a C type we use (and not a particularly good type
   in C++, either).

That is it for now. I am looking forward to read through the users
of the library ;-)

Thanks for working on this.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-24 23:23 ` [PATCH 09/16] documentation: add documentation for the bitmap format Vicent Marti
@ 2013-06-25  5:42   ` Shawn Pearce
  2013-06-25 19:33     ` Vicent Martí
  2013-06-25 15:58   ` Thomas Rast
  1 sibling, 1 reply; 64+ messages in thread
From: Shawn Pearce @ 2013-06-25  5:42 UTC (permalink / raw)
  To: Vicent Marti, Colby Ranger; +Cc: git

On Mon, Jun 24, 2013 at 5:23 PM, Vicent Marti <tanoku@gmail.com> wrote:
> This is the technical documentation and design rationale for the new
> Bitmap v2 on-disk format.
> ---
>  Documentation/technical/bitmap-format.txt |  235 +++++++++++++++++++++++++++++
>  1 file changed, 235 insertions(+)
>  create mode 100644 Documentation/technical/bitmap-format.txt
>
> diff --git a/Documentation/technical/bitmap-format.txt b/Documentation/technical/bitmap-format.txt
> new file mode 100644
> index 0000000..5400082
> --- /dev/null
> +++ b/Documentation/technical/bitmap-format.txt
> @@ -0,0 +1,235 @@
> +GIT bitmap v2 format & rationale
> +================================
> +
> +       - A header appears at the beginning, using the same format
> +       as JGit's original bitmap indexes.
> +
> +               4-byte signature: {'B', 'I', 'T', 'M'}
> +
> +               2-byte version number (network byte order)
> +                       The current implementation only supports version 2
> +                       of the bitmap index. The rationale for this is explained
> +                       in this document.
> +
> +               2-byte flags (network byte order)
> +
> +                       The folowing flags are supported:
> +
> +                       - BITMAP_OPT_FULL_DAG (0x1) REQUIRED
> +                       This flag must always be present. It implies that the bitmap
> +                       index has been generated for a packfile with full closure
> +                       (i.e. where every single object in the packfile can find
> +                        its parent links inside the same packfile). This is a
> +                       requirement for the bitmap index format, also present in JGit,
> +                       that greatly reduces the complexity of the implementation.
> +
> +                       - BITMAP_OPT_LE_BITMAPS (0x2)
> +                       If present, this implies that that the EWAH bitmaps in this
> +                       index has been serialized to disk in little-endian byte order.
> +                       Note that this only applies to the actual bitmaps, not to the
> +                       Git data structures in the index, which are always in Network
> +                       Byte order as it's costumary.
> +
> +                       - BITMAP_OPT_BE_BITMAPS (0x4)
> +                       If present, this implies that the EWAH bitmaps have been serialized
> +                       using big-endian byte order (NWO). If the flag is missing, **the
> +                       default is to assume that the bitmaps are in big-endian**.

I very much hate seeing a file format that is supposed to be portable
that supports both big-endian and little-endian encoding. Such a
specification forces everyone to implement two code paths to handle
reading data from the file, on the off-chance they are on the wrong
platform. Or it forces one platform to be unable to use the file. In
which case a repository may want to build two files, one for each
platform, but you blocked that off by allowing only one file.

What is wrong with picking one encoding and sticking to it? The .idx
file and the dircache use big-endian format. Why not just use
big-endian here too and convert words on demand as they are accessed
from the mmap region? That is what the .idx format does when accessing
an offset.

> +                       - BITMAP_OPT_HASH_CACHE (0x8)
> +                       If present, a hash cache for finding delta bases will be available
> +                       right after the header block in this index. See the following
> +                       section for details.
> +
> +               4-byte entry count (network byte order)
> +
> +                       The total count of entries (bitmapped commits) in this bitmap index.
> +
> +               20-byte checksum
> +
> +                       The SHA1 checksum of the pack this bitmap index belongs to.
> +
> +       - An OPTIONAL delta cache follows the header.

Some may find the name "delta cache" confusing as it does not cache
deltas of objects. May I suggest "path hash cache" as an alternative
name?

> +               The cache is formed by `n` 4-byte hashes in a row, where `n` is
> +               the amount of objects in the indexed packfile. Note that this amount
> +               is the **total number of objects** and is not related to the
> +               number of commits that have been selected and indexed in the
> +               bitmap index.
> +
> +               The hashes are stored in Network Byte Order and they are the same
> +               values generated by a normal revision walk during the `pack-objects`
> +               phase.

I find it interesting this is network byte order and not big-endian or
little-endian based on the flag in the header.

> +               The `n`nth hash in the cache is the name hash for the `n`th object
> +               in the index for the indexed packfile.
> +
> +               [RATIONALE]:
> +
> +               The bitmap index allows us to skip the Counting Objects phase
> +               during `pack-objects` and yield all the OIDs that would be reachable
> +               ("WANTS") when generating the pack.
> +
> +               This optimization, however, means that we're adding objects to the
> +               packfile straight from the packfile index, and hence we are lacking
> +               path information for the objects that would normally be generated
> +               during the "Counting Objects" phase.
> +
> +               This path information for each object is hashed and used as a very
> +               effective way to find good delta bases when compressing the packfile;
> +               without these hashes, the resulting packfiles are much less optimal.
> +
> +               By storing all the hashes in a cache together with the bitmapsin
> +               the bitmap index, we can yield not only the SHA1 of all the reachable
> +               objects, but also their hashes, and allow Git to be much smarter when
> +               finding delta bases for packing.
> +
> +               If the delta cache is not available, the bitmap index will obviously
> +               be smaller in disk, but the packfiles generated using this index will
> +               be between 20% and 30% bigger, because of the lack of name/path
> +               information when finding delta bases.

JGit does not encode this because we were afraid of freezing the hash
function into the file format. Indeed we are not certain JGit even
uses the same path hash function as C Git does, because C Git's
implementation is covered by the GPL and JGit prefers to license its
work under BSD.

If the path hash is going to become part of the format, the algorithm
for computing the hash should also be specified in the format so that
non-GPL implementations have an opportunity to be compatible.

One way we side-stepped the size inflation problem in JGit was to only
use the bitmap index information when sending data on the wire to a
client. Here delta reuse plays a significant factor in building the
pack, and we don't have to be as accurate on matching deltas. During
the equivalent of `git repack` bitmaps are not used, allowing the
traditional graph enumeration algorithm to generate path hash
information.

> +       - 4 EWAH bitmaps that act as type indexes
> +
> +               Type indexes are serialized after the hash cache in the shape
> +               of four EWAH bitmaps stored consecutively (see Appendix A for
> +               the serialization format of an EWAH bitmap).
> +
> +               There is a bitmap for each Git object type, stored in the following
> +               order:
> +
> +                       - Commits
> +                       - Trees
> +                       - Blobs
> +                       - Tags
> +
> +               In each bitmap, the `n`th bit is set to true if the `n`th object
> +               in the packfile index is of that type.
> +
> +               The obvious consequence is that the XOR of all 4 bitmaps will result
> +               in a full set (all bits sets), and the AND of all 4 bitmaps will
> +               result in an empty bitmap (no bits set).

Instead of XOR did you mean OR here?

> +       - N EWAH bitmaps, one for each indexed commit
> +
> +               Where `N` is the total amount of entries in this bitmap index.
> +               See Appendix A for the serialization format of an EWAH bitmap.
> +
> +       - An entry index with `N` entries for the indexed commits
> +
> +               Index entries are stored consecutively, and each entry has the
> +               following format:
> +
> +               - 20-byte SHA1
> +                       The SHA1 of the commit that this bitmap indexes
> +
> +               - 4-byte offset (Network Byte Order)
> +                       The offset **from the beginning of the file** where the
> +                       bitmap for this commit is stored.

Eh, another network byte order field in a file that also has selective
ordering. *sigh*

> +               - 1-byte XOR-offset
> +                       The xor offset used to compress this bitmap. For an entry
> +                       in position `x`, a XOR offset of `y` means that the actual
> +                       bitmap representing for this commit is composed by XORing the
> +                       bitmap for this entry with the bitmap in entry `x-y` (i.e.
> +                       the bitmap `y` entries before this one).
> +
> +                       Note that this compression can be recursive. In order to
> +                       XOR this entry with a previous one, the previous entry needs
> +                       to be decompressed first, and so on.
> +
> +                       The hard-limit for this offset is 160 (an entry can only be
> +                       xor'ed against one of the 160 entries preceding it). This
> +                       number is always positivea, and hence entries are always xor'ed
> +                       with **previous** bitmaps, not bitmaps that will come afterwards
> +                       in the index.

What order are these entries in? Sorted by SHA-1 or random?

Colby found that doing an XOR against the descendant commit yielded
very small bitmaps, so JGit tries to XOR-compress bitmaps along common
linear slices of history. This is trivial in Linus' kernel tree where
there is effectively only one history, but its more relevant with
long-running side branches that have release tags that may not have
fully merged into "master".

> +               - 1-byte flags for this bitmap
> +                       At the moment the only available flag is `0x1`, which hints
> +                       that this bitmap can be re-used when rebuilding bitmap indexes
> +                       for the repository.
> +
> +               - 2 bytes of RESERVED data (used right now for better packing).
> +
> +== Rationale for changes from the Bitmap Format v1
> +
> +- Serialized EWAH bitmaps can be stored in Little-Endian byte order,
> +  if defined by the BITMAP_OPT_LE_BITMAPS flag in the header.
> +
> +  The original JGit implementation stored bitmaps in Big-Endian byte
> +  order (NWO) because it was unable to `mmap` the serialized format,
> +  and hence always required a full parse of the bitmap index to memory,
> +  where the BE->LE conversion could be performed.

You can mmap a file and convert each word on access if your machine is
not using the same byte order. It is not necessary to convert the
entire file before using it through a mmap region.

> +  This full parse, however, requires prohibitive loading times in LE
> +  machines (i.e. all modern server hardware): a repository like
> +  `torvalds/linux` can have about 8mb of bitmap indexes, resulting
> +  in roughly 400ms of parse time.

This makes me wonder what the JGit parse time is. It is ugly if we are
spending 400ms to load the bitmap index for the kernel repository.

> +  This is not an issue in JGit, which is capable of serving repositories
> +  from a single-process daemon running on the JVM, but `git-daemon` in
> +  git has been implemented with a process-based design (a new
> +  `pack-objects` is spawned for each request), and the boot times
> +  of parsing the bitmap index every time `pack-objects` is spawned can
> +  seriously slow down requests (particularly for small fetches, where we'd
> +  spend about 1.5s booting up and 300ms performing the Counting Objects
> +  phase).

There are other strategies that Git could use to handle request
processing at scale. But I guess its reasonable to assume these aren't
viable for Git for a number of reasons. E.g. "long tail" access effect
that many servers have, where most requests are to a large number of
repositories that themselves receive very few requests, an environment
that does not lend itself to caching.

> +  By storing the bitmaps in Little-Endian, we're able to `mmap` their
> +  compressed data straight in memory without parsing it beforehand, and
> +  since most queries don't require accessing all the serialized bitmaps,
> +  we'll only page in the minimal amount of bitmaps necessary to perform
> +  the reachability analysis as they are accessed.

FWIW the .idx and .pack file formats `mmap` the compressed data
straight into memory without parsing it beforehand, and do not use
little-endian byte order. It is possible to have a single compressed
file format definition that is portable to all architectures, and is
accessed by mmap, at scale, with reasonable efficiency.

> +- An index of all the bitmapped commits is written at the end of the packfile,
> +  instead of interpersed with the serialized bitmaps in the middle of the
> +  file.

This is probably a mistake in the JGit design. Your approach is
slightly more complex, but in general I agree with having a table of
the SHA-1s isolated from the bitmaps themselves so that a reader can
access specific bitmaps at random without needing to wade through all
compressed bitmaps.

I would have proposed putting the table at the start of the file, not
the end. The writer making the file can completely serialize the
bitmaps into memory before writing them to disk, and thus knows the
full layout of the resulting file. If the bitmaps don't fit in RAM at
writing time, game over, the optimization of having a very compact
representation of the graph is no longer helping you.

> +  Again, the old design implied a full parse of the whole bitmap index
> +  (which JGit can afford because its daemon is single-process), but it made
> +  impossible `mmaping` the bitmap index file and accessing only the parts
> +  required to actually solve the query.
> +
> +  With an index at the end of the file, we can load only this index in memory,
> +  allowing for very efficient access to all the available bitmaps lazily (we
> +  have their offsets in the mmaped file).
> +
> +- The ordering of the objects in each bitmap has changed from
> +  packfile-order (the nth bit in the bitmap is the nth object in the
> +  packfile) to index-order (the nth bit in the bitmap is the nth object
> +  in the INDEX of the packfile).

Did you notice an increase in bitmap size when you did this? Colby
tested both orderings and we observed the bitmaps were quantifiably
smaller when using the pack file ordering, due to the pack file
locality rules and the EWAH compression. Using the pack file ordering
was a very conscious design decision.

> +  There is not a noticeable performance difference when actually converting
> +  from bitmap position to SHA1 and from SHA1 to bitmap position, but when
> +  using packfile ordering like JGit does, queries need to go through the
> +  reverse index (pack-revindex.c).
> +
> +  Generating this reverse index at runtime is **not** free (around 900ms
> +  generation time for a repository like `torvalds/linux`), and once again,
> +  this generation time needs to happen every time `pack-objects` is
> +  spawned.

Did you know the packer needs the reverse index in order to compute
the end offset of an object it will copy as-is during delta reuse? How
have you avoided making the reverse index?

Again this is why we chose to pin the JGit bitmap on the reverse index
being present. It already had to be present to support as-is reuse.
Once we knew we had to have that reverse index it was OK to rely on it
to get better compression on the bitmaps, and thus make them take up
less memory when loaded into a server. Even if you mmap a file you
want it to be small so it is more likely to retain in the kernel
buffer cache across process invocations.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 10/16] pack-objects: use bitmaps when packing objects
  2013-06-24 23:23 ` [PATCH 10/16] pack-objects: use bitmaps when packing objects Vicent Marti
@ 2013-06-25 12:48   ` Ramkumar Ramachandra
  2013-06-25 15:58   ` Thomas Rast
  2013-06-25 23:06   ` Junio C Hamano
  2 siblings, 0 replies; 64+ messages in thread
From: Ramkumar Ramachandra @ 2013-06-25 12:48 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Vicent Marti wrote:
>         $ time ../git/git pack-objects --all --stdout
>         Counting objects: 3053537, done.
>         Compressing objects: 100% (495706/495706), done.
>         Total 3053537 (delta 2529614), reused 3053537 (delta 2529614)
>
>         real    0m36.686s
>         user    0m34.440s
>         sys     0m2.184s
>
>         $ time ../git/git pack-objects --all --stdout
>         Counting objects: 3053537, done.
>         Compressing objects: 100% (495706/495706), done.
>         Total 3053537 (delta 2529614), reused 3053537 (delta 2529614)
>
>         real    0m7.255s
>         user    0m6.892s
>         sys     0m0.444s

Awesome work!  Can you put up this series on gh:vmg so I can try it
out for myself?

> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index b7cab18..469b8da 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> +       if (!strcmp(k, "pack.usebitmaps")) {
> +               bitmap_support = git_config_bool(k, v);
> +               return 0;
> +       }

Not using config_error_nonbool() to indicate an error?

> +       if (use_bitmap_index) {
> +               uint32_t size_hint;
> +
> +               if (!prepare_bitmap_walk(&revs, &size_hint)) {
> +                       khint_t new_hash_size = (size_hint * (1.0 / __ac_HASH_UPPER)) + 0.5;

How does this work?  You've taken the inverse of __ac_HASH_UPPER,
multiplied it by the size_hint you get from prepare_bitmap_walk(), and
add 0.5?

> +                       kh_resize_sha1(packed_objects, new_hash_size);

So packed_objects is a hashtable of type kh_sha1_t * (you introduced
in [03/16]) that you're now resizing to new_hash_size.  To find out
what the significance of this new_hash_size is, it looks like I have
to read prepare_bitmap_walk().

> +                       nr_alloc = (size_hint + 63) & ~63;
> +                       objects = xrealloc(objects, nr_alloc * sizeof(struct object_entry *));

Interesting.  The only other place where we realloc the objects in
this file is in pack-objects.c:949, and we do that because nr_object
>= nr_alloc.  What is this 63 magic?

>         if (prepare_revision_walk(&revs))
>                 die("revision walk setup failed");
> +

Stray newline.

> +       if (bitmap_support) {
> +               if (use_internal_rev_list && pack_to_stdout)
> +                       use_bitmap_index = 1;
> +       }
> +

Wait, what does pack_to_stdout have to do with deciding whether or not
to walk the bitmap?

> diff --git a/pack-bitmap.c b/pack-bitmap.c
> new file mode 100644
> index 0000000..090db15
> --- /dev/null
> +++ b/pack-bitmap.c
> +struct stored_bitmap {
> +       unsigned char sha1[20];
> +       struct ewah_bitmap *root;
> +       struct stored_bitmap *xor;
> +       int flags;
> +};

What exactly is this?  What is stored_bitmap *xor?  It looks like some
sort of next-pointer, but why is it named xor?

> +struct bitmap_index {

Okay, the bitmap index.

> +       struct ewah_bitmap *commits;
> +       struct ewah_bitmap *trees;
> +       struct ewah_bitmap *blobs;
> +       struct ewah_bitmap *tags;

I might be asking a really stupid question here, but why do you have
different bitmaps for different object types?  Unless I'm mistaken,
the packfile index doesn't make this differentiation: it sorts and
stores the SHA-1s of the various objects; you request a SHA-1, it does
a binary search and returns the object.

> +       khash_sha1 *bitmaps;

A hashmap keyed with the SHA-1, I presume.

> +       struct packed_git *pack;

You're defining which pack this bitmap index is for, right?

> +       struct {
> +               struct object_array entries;
> +               khash_sha1 *map;
> +       } fake_index;

What is this?

> +       struct bitmap *result;

No clue what this is about.

> +       int entry_count;

No clue what this is, but I'm assuming it can't be important because
it's an int.

> +       char pack_checksum[20];
> +
> +       int version;

Use something invariant like uint32_t?  Also, there is no clear
indication about where this information is going to go (header,
presumably?).  Look at pack.h:

struct pack_idx_header {
	uint32_t idx_signature;
	uint32_t idx_version;
};

> +       unsigned loaded : 1,
> +                        native_bitmaps : 1,
> +                        has_hash_cache : 1;

Booleans, but I don't know what they're doing here even after reading
your bitmap-format.txt.

> +       struct ewah_bitmap *(*read_bitmap)(struct bitmap_index *index);

I'm very confused now.  Each bitmap_index has a specialized read_bitmap()?

> +       void *map;
> +       size_t map_size, map_pos;
> +
> +       uint32_t *delta_hashes;

I'll give up on the rest.

> +static struct bitmap_index bitmap_git;

You could have made the struct static to begin with and ended it with
a } bitmap_git;

> +static struct ewah_bitmap *
> +lookup_stored_bitmap(struct stored_bitmap *st)

Please conform to Linux style and make it easier for us to grep by
putting this in one line?

> +{
> +       struct ewah_bitmap *parent;
> +       struct ewah_bitmap *composed;
> +
> +       if (st->xor == NULL)

if (!st->xor)

> +               return st->root;

Okay, st->xor needs to be set to something for lookup_stored_bitmap()
to do something useful.

> +       composed = ewah_pool_new();
> +       parent = lookup_stored_bitmap(st->xor);

So st->xor is a parent-pointer?  Still doesn't answer my question
about why it is named xor.

> +       ewah_xor(st->root, parent, composed);

I would have loved it if the prototype of this function made it clear
what it was writing like: ewah_xor(st->root, parent, &composed); but
then it expects the caller to do memory allocation, so never mind.

> +       ewah_pool_free(st->root);
> +       st->root = composed;
> +       st->xor = NULL;
> +
> +       return composed;

So lookup_stored_bitmap() just xors st->root with parent (determined
by recursively looking up st->xor)?

> +}
> +
> +static struct ewah_bitmap *
> +_read_bitmap(struct bitmap_index *index)

We usually use the _1 suffix convention for internal functions, not _ prefix.

> +       int bitmap_size;
> +
> +       bitmap_size = ewah_read_mmap(b,
> +               index->map + index->map_pos,
> +               index->map_size - index->map_pos);

Make this a single statement.  Also, why can't I see this on
gh:vmg/libework?  Your [08/16] has diverged from there :/

> +       return b;

So _read_bitmap() mmaps the bitmap and returns it.

> +static struct ewah_bitmap *
> +_read_bitmap_native(struct bitmap_index *index)

The counterpart that calls *_mmap_native() in ewah.  Have to look at
the difference.  In any case, I hope you've used xmmap().

> +static int load_bitmap_header(struct bitmap_index *index)
> +{

I'm going to compare this to sha1_file.c:check_packed_git_idx().

> +       struct bitmap_disk_header *header = (void *)index->map;
> +
> +       if (index->map_size < sizeof(*header))
> +               return error("Corrupted bitmap index (missing header data)");

No munmap()?  Was abstracting out the mmap detail a good idea?

> +       if (memcmp(header->magic, BITMAP_MAGIC_PREFIX, sizeof(BITMAP_MAGIC_PREFIX)) != 0)
> +               return error("Corrupted bitmap index file (wrong header)");

PACK_SIGNATURE, PACK_IDX_SIGNATURE.  Name this BITMAP_IDX_SIGNATURE?

> +       index->version = (int)ntohs(header->version);

You wouldn't have to coerce to int if version were a uint32_t in the
first place.

> +       /* Parse known bitmap format options */
> +       {
> +               uint32_t flags = ntohs(header->options);

Okay.

> +               if ((flags & BITMAP_OPT_FULL_DAG) == 0) {
> +                       return error("Unsupported options for bitmap index file "
> +                               "(Git requires BITMAP_OPT_FULL_DAG)");
> +               }

Unnecessary braces for single statement.

> +               if (flags & BITMAP_OPT_HASH_CACHE)
> +                       index->has_hash_cache = 1;
> +
> +               index->read_bitmap = &_read_bitmap;

So you've set the read_bitmap() function to _read_bitmap().  Let's see why.

> +               /*
> +                * If we are in a little endian machine and the bitmap
> +                * was written in LE, we can mmap it straight into memory
> +                * without having to parse it
> +                */
> +               if ((flags & BITMAP_OPT_LE_BITMAPS)) {
> +#if __BYTE_ORDER == __LITTLE_ENDIAN
> +                       index->native_bitmaps = 1;
> +                       index->read_bitmap = &_read_bitmap_native;
> +#else
> +                       die("The existing bitmap index is written in little-endian "
> +                               "byte order and cannot be read in this machine.\n"
> +                               "Please re-build the bitmap indexes locally.");
> +#endif
> +               }
> +       }

Okay.

> +       index->entry_count = ntohl(header->entry_count);
> +       memcpy(index->pack_checksum, header->checksum, sizeof(header->checksum));

I might be asking (yet another) really stupid question here, but why
isn't the checksum an unsigned char[20] (i.e. a simple 20-byte SHA-1)?
 We already have infrastructure to deal with SHA-1s, so might as well
reuse it, right?

> +       index->map_pos += sizeof(*header);

You've read the header successfully, and incremented map_pos for other callers.

> +static struct stored_bitmap *
> +store_bitmap(struct bitmap_index *index,
> +       const unsigned char *sha1,
> +       struct ewah_bitmap *bitmap,
> +       struct stored_bitmap *xor_with, int flags)

Why don't you just prepare a struct and send it to this function to
write instead of so many arguments?

> +       stored = xmalloc(sizeof(struct stored_bitmap));
> +       stored->root = bitmap;
> +       stored->xor = xor_with;
> +       stored->flags = flags;
> +       memcpy(stored->sha1, sha1, 20);

Use hashcpy().  You would have had to do none of this if the caller
had passed a readymade struct, no?

> +       hash_pos = kh_put_sha1(index->bitmaps, stored->sha1, &ret);

Okay, you store a SHA-1.

> +       if (ret == 0) {
> +               error("Duplicate entry in bitmap index: %s", sha1_to_hex(sha1));
> +               return NULL;
> +       }

0 is success by convention!

> +       kh_value(index->bitmaps, hash_pos) = stored;

Okay.

> +       return stored;

You're returning allocated memory, that the caller must remember to free.

> +static int
> +load_bitmap_entries_v2(struct bitmap_index *index)

I'm not sure it's a great idea to put a volatile version number in the
function name.

> +{
> +       static const int MAX_XOR_OFFSET = 16;
> +
> +       int i;
> +       struct stored_bitmap *recent_bitmaps[16];

Does this 16 have anything to do with MAX_XOR_OFFSET?

> +       struct bitmap_disk_entry_v2 *entry;
> +
> +       void *index_pos = index->map + index->map_size -
> +               (index->entry_count * sizeof(struct bitmap_disk_entry_v2));

Wait, why did we set map_pos earlier if you're recomputing it here?
And why is this a void *?

> +       for (i = 0; i < index->entry_count; ++i) {
> +               int xor_offset, flags, ret;
> +               struct stored_bitmap *xor_bitmap = NULL;
> +               struct ewah_bitmap *bitmap = NULL;
> +               uint32_t bitmap_pos;
> +
> +               entry = index_pos;
> +               index_pos += sizeof(struct bitmap_disk_entry_v2);

Okay, so I understand that you're parsing one bitmap_disk_entry_v2
struct at a time.

> +               bitmap_pos = ntohl(entry->bitmap_pos);
> +               xor_offset = (int)entry->xor_offset;
> +               flags = (int)entry->flags;

I have no clue why you're casting like this.

> +               if (index->native_bitmaps) {

What is this native versus non-native bitmaps?  Your
bitmap-formats.txt has nothing to say on the matter.

> +                       bitmap = calloc(1, sizeof(struct ewah_bitmap));
> +                       ret = ewah_read_mmap_native(bitmap,
> +                               index->map + bitmap_pos,
> +                               index->map_size - bitmap_pos);

Wait a minute.  Isn't this what you wrapped in _read_bitmap_native()?
Totally confused.

> +               } else {
> +                       bitmap = ewah_pool_new();
> +                       ret = ewah_read_mmap(bitmap,
> +                               index->map + bitmap_pos,
> +                               index->map_size - bitmap_pos);

Did you forget about _read_bitmap()?

> +               if (ret < 0 || xor_offset > MAX_XOR_OFFSET || xor_offset > i) {
> +                       return error("Corrupted bitmap pack index");
> +               }

Unnecessary braces.

> +               if (xor_offset > 0) {
> +                       xor_bitmap = recent_bitmaps[(i - xor_offset) % MAX_XOR_OFFSET];
> +
> +                       if (xor_bitmap == NULL)

if (!xor_bitmap)

> +                               return error("Invalid XOR offset in bitmap pack index");

I haven't seen a single die() until now, and that's a Good sign.

> +               recent_bitmaps[i % MAX_XOR_OFFSET] = store_bitmap(
> +                       index, entry->sha1, bitmap, xor_bitmap, flags);

So you fill in the 16 recent bitmaps in this function?

> +static int load_bitmap_index(
> +       struct bitmap_index *index,
> +       const char *path,
> +       struct packed_git *packfile)
> +{
> +       int fd = git_open_noatime(path);

I assume you exposed this static defined in sha1_file.c in an earlier
patch, but I didn't check.

> +       struct stat st;
> +
> +       if (fd < 0) {
> +               return -1;
> +       }

Unnecessary braces.

> +       index->map_size = xsize_t(st.st_size);
> +       index->map = xmmap(NULL, index->map_size, PROT_READ, MAP_PRIVATE, fd, 0);
> +       close(fd);

I like how similar this is to check_packed_git_idx().  What happened
to your ewah mapping abstractions though?

> +       index->bitmaps = kh_init_sha1();
> +       index->pack = packfile;
> +       index->fake_index.map = kh_init_sha1();

I'll hopefully get to find out what fake_index is here.

> +       if (load_bitmap_header(index) < 0)
> +               return -1;

Okay.  Notice how the format we're parsing is documented tersely as
inline comments in sha1_name.c: you might like to do that too.

> +       if (index->has_hash_cache) {
> +               index->delta_hashes = index->map + index->map_pos;
> +               index->map_pos += (packfile->num_objects * sizeof(uint32_t));
> +       }

Okay.

> +       if ((index->commits = index->read_bitmap(index)) == NULL ||
> +               (index->trees = index->read_bitmap(index)) == NULL ||
> +               (index->blobs = index->read_bitmap(index)) == NULL ||
> +               (index->tags = index->read_bitmap(index)) == NULL)
> +               return -1;

As usual, please use !() instead of explicitly comparing with NULL.
It looks like I'll get to find out why you have four different bitmaps
set to the same thing (?) soon; exciting!

> +       if (load_bitmap_entries_v2(index) < 0)
> +               return -1;
> +
> +       index->loaded = true;

Fine.  This function calls out to various little parsing functions and
sets index->loaded.  It returns -1 instead of error(), because those
little functions report the errors.

> +char *pack_bitmap_filename(struct packed_git *p)

Compare with sha1_name.c:open_pack_index().

> +int open_pack_bitmap(struct packed_git *p)

Okay.

> +void prepare_bitmap_git(void)

Okay.

> +struct include_data {
> +       struct bitmap *base;
> +       struct bitmap *seen;
> +};

I wonder what this is.

> +static inline int bitmap_position_extended(const unsigned char *sha1)

Sorry, I'm stopping here.  It's impossible to review this gigantic
patch in one sitting.

Thanks.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/16] compat: add endinanness helpers
  2013-06-24 23:23 ` [PATCH 07/16] compat: add endinanness helpers Vicent Marti
@ 2013-06-25 13:08   ` Peter Krefting
  2013-06-25 13:25     ` Vicent Martí
  0 siblings, 1 reply; 64+ messages in thread
From: Peter Krefting @ 2013-06-25 13:08 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Vicent Marti:

> The POSIX standard doesn't currently define a `nothll`/`htonll` 
> function pair to perform network-to-host and host-to-network swaps 
> of 64-bit data. These 64-bit swaps are necessary for the on-disk 
> storage of EWAH bitmaps if they are not in native byte order.

endian(3) claims that glibc 2.9+ define be64toh() and htobe64() which 
should do what you are looking for. The manual page does mention them 
being named differently across OSes, though, so you may need to be 
careful with that.

-- 
\\// Peter - http://www.softwolves.pp.se/

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/16] compat: add endinanness helpers
  2013-06-25 13:08   ` Peter Krefting
@ 2013-06-25 13:25     ` Vicent Martí
  2013-06-27  5:56       ` Peter Krefting
  0 siblings, 1 reply; 64+ messages in thread
From: Vicent Martí @ 2013-06-25 13:25 UTC (permalink / raw)
  To: Peter Krefting; +Cc: git

On Tue, Jun 25, 2013 at 3:08 PM, Peter Krefting <peter@softwolves.pp.se> wrote:
> endian(3) claims that glibc 2.9+ define be64toh() and htobe64() which should
> do what you are looking for. The manual page does mention them being named
> differently across OSes, though, so you may need to be careful with that.

I'm aware of that, but Git needs to build with glibc 2.7+ (or was it
2.6?), hence the need for this compat layer.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 02/16] sha1_file: refactor into `find_pack_object_pos`
  2013-06-24 23:22 ` [PATCH 02/16] sha1_file: refactor into `find_pack_object_pos` Vicent Marti
@ 2013-06-25 13:59   ` Thomas Rast
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Rast @ 2013-06-25 13:59 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Vicent Marti <tanoku@gmail.com> writes:

>  	if (use_lookup) {
> -		int pos = sha1_entry_pos(index, stride, 0,
> -					 lo, hi, p->num_objects, sha1);
> -		if (pos < 0)
> -			return 0;
> -		return nth_packed_object_offset(p, pos);
> +		return sha1_entry_pos(index, stride, 0, lo, hi, p->num_objects, sha1);
>  	}

Our house style prefers not having the braces in a single-line conditional.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/16] pack-objects: use a faster hash table
  2013-06-24 23:23 ` [PATCH 03/16] pack-objects: use a faster hash table Vicent Marti
@ 2013-06-25 14:03   ` Thomas Rast
  2013-06-26  2:14     ` Jeff King
  2013-06-25 17:58   ` Ramkumar Ramachandra
  2013-06-25 22:48   ` Junio C Hamano
  2 siblings, 1 reply; 64+ messages in thread
From: Thomas Rast @ 2013-06-25 14:03 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git, Jeff King

Vicent Marti <tanoku@gmail.com> writes:

> This commit brings `khash.h`, a header only hash table implementation
> that while remaining rather simple (uses quadratic probing and a
> standard hashing scheme) and self-contained, offers a significant
> performance improvement in both insertion and lookup times.
>
> `khash` is a generic hash table implementation that can be 'templated'
> for any given type while maintaining good performance by using preprocessor
> macros. This specific version has been modified to define by default a
> `khash_sha1` type, a map of SHA1s (const unsigned char[20]) to void *
> pointers.
>
> When replacing the old hash table implementation in `pack-objects` with
> the khash_sha1 table, the insertion time is greatly reduced:
>
> 	kh_put_sha1 :: 284.011ms
> 	add_object_entry_1 : 36.06ms
> 	hashcmp :: 24.045ms
>
> This reduction of more than 50% in the insertion and lookup times,
> although nice, is not particularly noticeable for normal `pack-objects`
> operation: `pack-objects` performs massive batch insertions and
> relatively few lookups, so `khash` doesn't get a chance to shine here.
>
> The big win here, however, is in the massively reduced amount of hash
> collisions (as you can see from the huge reduction of time spent in
> `hashcmp` after the change). These greatly improved lookup times
> will result critical once we implement the writing algorithm for bitmap
> indxes in a later patch of this series.

Is that reduction in collisions purely because it uses quadratic
probing, or is there some other magic trick involved?  Is the same also
applicable to the other users of the "big" object hash table?  (I assume
Peff has already tried applying it there, but I'm still curious...)

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 08/16] ewah: compressed bitmap implementation
  2013-06-24 23:23 ` [PATCH 08/16] ewah: compressed bitmap implementation Vicent Marti
  2013-06-25  1:10   ` Junio C Hamano
@ 2013-06-25 15:38   ` Thomas Rast
  1 sibling, 0 replies; 64+ messages in thread
From: Thomas Rast @ 2013-06-25 15:38 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Vicent Marti <tanoku@gmail.com> writes:

> The library is re-licensed under the GPLv2 with the permission of Daniel
> Lemire, the original author.

This says "GPLv2", but the license blurbs all say "or (at your option)
any later version".  IANAL, does this cause any problems?  If so, can
they be GPLv2-only instead?

>  Makefile           |    6 +
>  ewah/bitmap.c      |  229 +++++++++++++++++
>  ewah/ewah_bitmap.c |  703 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  ewah/ewah_io.c     |  199 +++++++++++++++
>  ewah/ewah_rlw.c    |  124 +++++++++
>  ewah/ewok.h        |  194 +++++++++++++++
>  ewah/ewok_rlw.h    |  114 +++++++++

Can we have a Documentation/technical/api-ewah.txt?

(Maybe if you insert all the comments I ask for in the below, it's not
necessary, but it would still be nice to have some central place where
the formats are documented.)

[...]
> +struct ewah_bitmap *bitmap_to_ewah(struct bitmap *bitmap)
> +{
> +	struct ewah_bitmap *ewah = ewah_new();
> +	size_t i, running_empty_words = 0;
> +	eword_t last_word = 0;
> +
> +	for (i = 0; i < bitmap->word_alloc; ++i) {
> +		if (bitmap->words[i] == 0) {
> +			running_empty_words++;
> +			continue;
> +		}
> +
> +		if (last_word != 0) {
> +			ewah_add(ewah, last_word);
> +		}

There are a lot of "noisy" braces -- like in this instance -- if you
apply the git style to the files in ewah/.  I assume we'll give the
directory its own style, so that it should always use braces even on
one-line blocks.

[...]
> +	ewah_add(ewah, last_word);
> +	return ewah;
> +}
> +
> +struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah)
> +{
> +	struct bitmap *bitmap = bitmap_new();
> +	struct ewah_iterator it;
> +	eword_t blowup;
> +	size_t i = 0;
> +
> +	ewah_iterator_init(&it, ewah);
> +
> +	while (ewah_iterator_next(&blowup, &it)) {
> +		if (i >= bitmap->word_alloc) {
> +			bitmap->word_alloc *= 1.5;

Any reason that this uses a scale factor of 1.5, while the bitmap_set
operation above uses 2?

> +			bitmap->words = ewah_realloc(
> +				bitmap->words, bitmap->word_alloc * sizeof(eword_t));
> +		}
[...]
> +
> +void bitmap_each_bit(struct bitmap *self, ewah_callback callback, void *data)
> +{
[...]
> +			for (offset = 0; offset < BITS_IN_WORD; ++offset) {
> +				if ((word >> offset) == 0)
> +					break;
> +
> +				offset += __builtin_ctzll(word >> offset);

Here and in the rest, you use __builtin_* within the code.  This needs
to be either in a separate helper that reimplements the function in
terms of C if it is not available (i.e. you don't use GCC).
(Alternatively, the whole series could be conditional on some
HAVE_GCC_BUILTINS macro.  I'd think that would be a bad tradeoff
though.)

> +				callback(pos + offset, data);
> +			}
> +			pos += BITS_IN_WORD;
> +		}
> +	}
> +}
[...]

> diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
[...]
> +void ewah_free(struct ewah_bitmap *bitmap)
> +{
> +	if (bitmap->alloc_size)
> +		free(bitmap->buffer);
> +
> +	free(bitmap);
> +}

Maybe first if (!bitmap) return, so that it behaves like other free()s?

> diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
[...]
> +int ewah_serialize_native(struct ewah_bitmap *self, int fd)
> +{
> +	uint32_t write32;
> +	size_t to_write = self->buffer_size * 8;
> +
> +	/* 32 bit -- bit size fr the map */

You cut&pasted the typo ("for") throughout the file :-)

[...]
> +	/** 32 bit -- number of compressed 64-bit words */
> +	write32 = (uint32_t)self->buffer_size;
> +	if (write(fd, &write32, 4) != 4)
> +		return -1;
> +
> +	if (write(fd, self->buffer, to_write) != to_write)
> +		return -1;

Shouldn't you use our neat write_in_full() and read_in_full() helpers,
throughout the file?

[...]
> diff --git a/ewah/ewok.h b/ewah/ewok.h
[...]
> +#ifndef __EWOK_BITMAP_C__
> +#define __EWOK_BITMAP_C__

_H_?

> +#ifndef ewah_malloc
> +#	define ewah_malloc malloc
> +#endif
> +#ifndef ewah_realloc
> +#	define ewah_realloc realloc
> +#endif
> +#ifndef ewah_calloc
> +#	define ewah_calloc calloc
> +#endif

I see you later #define them to the corresponding x*alloc version in
pack-bitmap.h.  Good.

> +
> +typedef uint64_t eword_t;

I assume this isn't ifdef'd to help 32bit platforms because the on-disk
format depends on it?

> +#define BITS_IN_WORD (sizeof(eword_t) * 8)
> +
> +struct ewah_bitmap {
> +	eword_t *buffer;
> +	size_t buffer_size;
> +	size_t alloc_size;
> +	size_t bit_size;
> +	eword_t *rlw;
> +};
> +
> +typedef void (*ewah_callback)(size_t pos, void *);
> +
> +struct ewah_bitmap *ewah_pool_new(void);
> +void ewah_pool_free(struct ewah_bitmap *bitmap);

How do the pool versions differ from the non-pool versions below?  I
would have expected a memory pool argument somewhere.

> +
> +/**
> + * Allocate a new EWAH Compressed bitmap
> + */
> +struct ewah_bitmap *ewah_new(void);
> +
> +/**
> + * Clear all the bits in the bitmap. Does not free or resize
> + * memory.
> + */
> +void ewah_clear(struct ewah_bitmap *bitmap);
> +
> +/**
> + * Free all the memory of the bitmap
> + */
> +void ewah_free(struct ewah_bitmap *bitmap);
> +
> +int ewah_serialize(struct ewah_bitmap *self, int fd);
> +int ewah_serialize_native(struct ewah_bitmap *self, int fd);
> +
> +int ewah_deserialize(struct ewah_bitmap *self, int fd);
> +int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len);
> +int ewah_read_mmap_native(struct ewah_bitmap *self, void *map, size_t len);

The whole file is so neatly commented, and then you skimp on these? :-)

In particular, it would be nice to have a comment here on what the
_native distinction means, and what (if any) the constraints are if you
want to use _mmap.  Also, if you read or deserialize, does the 'self'
have to be initialized first?

[...]
> +/**
> + * Set a given bit on the bitmap.
> + *
> + * The bit at position `pos` will be set to true. Because of the
> + * way that the bitmap is compressed, a set bit cannot be unset
> + * later on.
> + *
> + * Furthermore, since the bitmap uses streaming compression, bits
> + * can only set incrementally.

I'm not a native speaker, but does 'incrementally' also mean 'in order
of increasing indexes'?  That's what the example seems to say.

> + *
> + * E.g.
> + *		ewah_set(bitmap, 1); // ok
> + *		ewah_set(bitmap, 76); // ok
> + *		ewah_set(bitmap, 77); // ok
> + *		ewah_set(bitmap, 8712800127); // ok
> + *		ewah_set(bitmap, 25); // failed, assert raised
> + */
> +void ewah_set(struct ewah_bitmap *self, size_t i);
> +
[...]
> +struct ewah_bitmap * bitmap_to_ewah(struct bitmap *bitmap);

Style (around the *).

> +struct bitmap *ewah_to_bitmap(struct ewah_bitmap *ewah);
> +
> +void bitmap_and_not_inplace(struct bitmap *self, struct bitmap *other);
> +void bitmap_or_inplace(struct bitmap *self, struct ewah_bitmap *other);

Why does one of them take an ewah_bitmap for 'other', but the other
takes a straight 'bitmap'?

> +
> +void bitmap_each_bit(struct bitmap *self, ewah_callback callback, void *data);
> +size_t bitmap_popcount(struct bitmap *self);
> +
> +#endif
> diff --git a/ewah/ewok_rlw.h b/ewah/ewok_rlw.h
> new file mode 100644
> index 0000000..2e31836
> --- /dev/null
> +++ b/ewah/ewok_rlw.h
> @@ -0,0 +1,114 @@
[...]
> +#define RLW_RUNNING_BITS (sizeof(eword_t) * 4)
> +#define RLW_LITERAL_BITS (sizeof(eword_t) * 8 - 1 - RLW_RUNNING_BITS)

It would be nice to have some minimal documentation of the word format
here (or in ewok.h), in particular because you snip off 1 bit here for a
reason that is not immediately obvious.

> +#define RLW_LARGEST_RUNNING_COUNT (((eword_t)1 << RLW_RUNNING_BITS) - 1)
> +#define RLW_LARGEST_LITERAL_COUNT (((eword_t)1 << RLW_LITERAL_BITS) - 1)
> +
> +#define RLW_LARGEST_RUNNING_COUNT_SHIFT (RLW_LARGEST_RUNNING_COUNT << 1)
> +
> +#define RLW_RUNNING_LEN_PLUS_BIT (((eword_t)1 << (RLW_RUNNING_BITS + 1)) - 1)

This one is doubly strange.  The name claims it's a bit(?), but the
definition (if you expand the preceding macros) effectively makes it
0x1ffffffff, i.e., a mask for RLW_RUNNING_BITS+1 number of bits.

> +static inline void rlw_xor_run_bit(eword_t *word)
> +{
> +	if (*word & 1) {
> +		*word &= (eword_t)(~1);
> +	} else {
> +		*word |= (eword_t)1;
> +	}
> +}

Why is this called xor?  Looks a lot like a negation to me.

> +static bool rlw_get_run_bit(const eword_t *word)
> +{
> +	return *word & (eword_t)1;
> +}
[...]
> +static inline void rlw_set_running_len(eword_t *word, eword_t l)
> +{
> +	*word |= RLW_LARGEST_RUNNING_COUNT_SHIFT;
> +	*word &= (l << 1) | (~RLW_LARGEST_RUNNING_COUNT_SHIFT);
> +}
> +
> +static inline void rlw_set_literal_words(eword_t *word, eword_t l)
> +{
> +	*word |= ~RLW_RUNNING_LEN_PLUS_BIT;
> +	*word &= (l << (RLW_RUNNING_BITS + 1)) | RLW_RUNNING_LEN_PLUS_BIT;
> +}

>From these I gather that the layout is, LSB first:

  1 bit:   bit that will be repeated
  32 bits: length of the run
  31 bits: number of literal words to be read after the run

Is that correct?  This took some figuring out for me, please add a
comment.

And then from there I would extrapolate that the data format requires
one such "specifier" word in between of chunks of stuff, but it's not
clear how exactly.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 10/16] pack-objects: use bitmaps when packing objects
  2013-06-24 23:23 ` [PATCH 10/16] pack-objects: use bitmaps when packing objects Vicent Marti
  2013-06-25 12:48   ` Ramkumar Ramachandra
@ 2013-06-25 15:58   ` Thomas Rast
  2013-06-25 23:06   ` Junio C Hamano
  2 siblings, 0 replies; 64+ messages in thread
From: Thomas Rast @ 2013-06-25 15:58 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Vicent Marti <tanoku@gmail.com> writes:

> diff --git a/Makefile b/Makefile
> index e03c773..0f2e72b 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -703,6 +703,7 @@ LIB_H += notes.h
>  LIB_H += object.h
>  LIB_H += pack-revindex.h
>  LIB_H += pack.h
> +LIB_H += pack-bitmap.h
>  LIB_H += parse-options.h
>  LIB_H += patch-ids.h
>  LIB_H += pathspec.h
> @@ -838,6 +839,7 @@ LIB_OBJS += notes.o
>  LIB_OBJS += notes-cache.o
>  LIB_OBJS += notes-merge.o
>  LIB_OBJS += object.o
> +LIB_OBJS += pack-bitmap.o
>  LIB_OBJS += pack-check.o
>  LIB_OBJS += pack-revindex.o
>  LIB_OBJS += pack-write.o

What does this apply on?  When starting with the series from
origin/master, git-am fails, and 'git am -3' tells me I don't have the
necessary blobs (from the 'index' line above).

Not that it's super hard to fix this up as long as it's in the Makefile
only, but still.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-24 23:23 ` [PATCH 09/16] documentation: add documentation for the bitmap format Vicent Marti
  2013-06-25  5:42   ` Shawn Pearce
@ 2013-06-25 15:58   ` Thomas Rast
  2013-06-25 22:30     ` Vicent Martí
  1 sibling, 1 reply; 64+ messages in thread
From: Thomas Rast @ 2013-06-25 15:58 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Vicent Marti <tanoku@gmail.com> writes:

> This is the technical documentation and design rationale for the new
> Bitmap v2 on-disk format.

Hrmpf, that's what I get for reading the series in order...

> +			The folowing flags are supported:
                              ^^

typos marked by ^

> +		By storing all the hashes in a cache together with the bitmapsin
                                                                             ^^

> +		The obvious consequence is that the XOR of all 4 bitmaps will result
> +		in a full set (all bits sets), and the AND of all 4 bitmaps will
                                           ^

> +		- 1-byte XOR-offset
> +			The xor offset used to compress this bitmap. For an entry
> +			in position `x`, a XOR offset of `y` means that the actual
> +			bitmap representing for this commit is composed by XORing the
> +			bitmap for this entry with the bitmap in entry `x-y` (i.e.
> +			the bitmap `y` entries before this one).
> +
> +			Note that this compression can be recursive. In order to
> +			XOR this entry with a previous one, the previous entry needs
> +			to be decompressed first, and so on.
> +
> +			The hard-limit for this offset is 160 (an entry can only be
> +			xor'ed against one of the 160 entries preceding it). This
> +			number is always positivea, and hence entries are always xor'ed
                                                 ^

> +			with **previous** bitmaps, not bitmaps that will come afterwards
> +			in the index.

Clever.  Why 160 though?

> +		- 2 bytes of RESERVED data (used right now for better packing).

What do they mean?

> +  With an index at the end of the file, we can load only this index in memory,
> +  allowing for very efficient access to all the available bitmaps lazily (we
> +  have their offsets in the mmaped file).

Is there anything preventing you from mmap()ing the index also?

> +== Appendix A: Serialization format for an EWAH bitmap
> +
> +Ewah bitmaps are serialized in the protocol as the JAVAEWAH
> +library, making them backwards compatible with the JGit
> +implementation:
> +
> +	- 4-byte number of bits of the resulting UNCOMPRESSED bitmap
> +
> +	- 4-byte number of words of the COMPRESSED bitmap, when stored
> +
> +	- N x 8-byte words, as specified by the previous field
> +
> +		This is the actual content of the compressed bitmap.
> +
> +	- 4-byte position of the current RLW for the compressed
> +		bitmap
> +
> +Note that the byte order for this serialization is not defined by
> +default. The byte order for all the content in a serialized EWAH
> +bitmap can be known by the byte order flags in the header of the
> +bitmap index file.

Please document the RLW format here.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 00/16] Speed up Counting Objects with bitmap data
  2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
                   ` (15 preceding siblings ...)
  2013-06-24 23:23 ` [PATCH 16/16] rev-list: Optimize --count using bitmaps too Vicent Marti
@ 2013-06-25 16:05 ` Thomas Rast
  16 siblings, 0 replies; 64+ messages in thread
From: Thomas Rast @ 2013-06-25 16:05 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Vicent Marti <tanoku@gmail.com> writes:

> Like with every other patch that offers performance improvements,
> sample benchmarks are provided (spoiler: they are pretty fucking
> cool).

Great stuff.

I read the first half, and skimmed the second half.  See the individual
replies for comments.

However:

>  Documentation/technical/bitmap-format.txt |  235 ++++++++
>  Makefile                                  |   11 +
>  builtin.h                                 |    1 +
>  builtin/pack-objects.c                    |  362 +++++++-----
>  builtin/pack-objects.h                    |   33 ++
>  builtin/rev-list.c                        |   35 +-
>  builtin/write-bitmap.c                    |  256 +++++++++
>  cache.h                                   |    5 +
>  ewah/bitmap.c                             |  229 ++++++++
>  ewah/ewah_bitmap.c                        |  703 ++++++++++++++++++++++++
>  ewah/ewah_io.c                            |  199 +++++++
>  ewah/ewah_rlw.c                           |  124 +++++
>  ewah/ewok.h                               |  194 +++++++
>  ewah/ewok_rlw.h                           |  114 ++++
>  git-compat-util.h                         |   28 +
>  git-repack.sh                             |   10 +-
>  git.c                                     |    1 +
>  khash.h                                   |  329 +++++++++++
>  list-objects.c                            |    1 +
>  pack-bitmap-write.c                       |  520 ++++++++++++++++++
>  pack-bitmap.c                             |  855 +++++++++++++++++++++++++++++
>  pack-bitmap.h                             |   64 +++
>  pack-write.c                              |    2 +
>  revision.c                                |    5 +
>  revision.h                                |    2 +
>  sha1_file.c                               |   57 +-

It's pretty hard to miss that there isn't a single test in the entire
series.  It seems that the features you add depend on pack.usebitmaps,
and since the tests run with empty config (unless of course they set
their own) your feature is completely untested -- unless I'm missing
something.

I imagine the tests would be of the format

test_expect_success 'do <stuff> without bitmaps' '
	git ... >expect
'

test_expect_success 'do <stuff> with bitmaps' '
	test_config pack.usebitmaps true &&
	# do something to ensure that we have bitmaps
	git ... >actual &&
	test_cmp expect actual
'

or some such.

For bonus points, you could also add some light performance tests in
t/perf/, just to show off ;-)

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 11/16] rev-list: add bitmap mode to speed up lists
  2013-06-24 23:23 ` [PATCH 11/16] rev-list: add bitmap mode to speed up lists Vicent Marti
@ 2013-06-25 16:22   ` Thomas Rast
  2013-06-26  1:45     ` Vicent Martí
  2013-06-26  5:22     ` Jeff King
  0 siblings, 2 replies; 64+ messages in thread
From: Thomas Rast @ 2013-06-25 16:22 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Vicent Marti <tanoku@gmail.com> writes:

> Calling `git rev-list --use-bitmaps [committish]` is the equivalent
> of `git rev-list --objects`, but the rev list is performed based on
> a bitmap result instead of using a manual counting objects phase.

Why would we ever want to not --use-bitmaps, once it actually works?
I.e., shouldn't this be the default if pack.usebitmaps is set (or
possibly even core.usebitmaps for these things)?

> These are some example timings for `torvalds/linux`:
>
> 	$ time ../git/git rev-list --objects master > /dev/null
>
> 	real    0m25.567s
> 	user    0m25.148s
> 	sys     0m0.384s
>
> 	$ time ../git/git rev-list --use-bitmaps master > /dev/null
>
> 	real    0m0.393s
> 	user    0m0.356s
> 	sys     0m0.036s

I see your badass numbers, and raise you a critical issue:

  $ time git rev-list --use-bitmaps --count --left-right origin/pu...origin/next
  Segmentation fault

  real    0m0.408s
  user    0m0.383s
  sys     0m0.022s

It actually seems to be related solely to having negated commits in the
walk:

  thomas@linux-k42r:~/g(next u+65)$ time git rev-list --use-bitmaps --count origin/pu
  32315

  real    0m0.041s
  user    0m0.034s
  sys     0m0.006s
  thomas@linux-k42r:~/g(next u+65)$ time git rev-list --use-bitmaps --count origin/pu ^origin/next
  Segmentation fault

  real    0m0.460s
  user    0m0.214s
  sys     0m0.244s

I also can't help noticing that the time spent generating the segfault
would have sufficed to generate the answer "the old way" as well:

  $ time git rev-list --count --left-right origin/pu...origin/next
  189     125

  real    0m0.409s
  user    0m0.386s
  sys     0m0.022s

Can we use the same trick to speed up merge base computation and then
--left-right?  The latter is a component of __git_ps1 and can get
somewhat slow in some cases, so it would be nice to make it really fast,
too.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/16] pack-objects: use a faster hash table
  2013-06-24 23:23 ` [PATCH 03/16] pack-objects: use a faster hash table Vicent Marti
  2013-06-25 14:03   ` Thomas Rast
@ 2013-06-25 17:58   ` Ramkumar Ramachandra
  2013-06-25 22:48   ` Junio C Hamano
  2 siblings, 0 replies; 64+ messages in thread
From: Ramkumar Ramachandra @ 2013-06-25 17:58 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Vicent Marti wrote:
> When replacing the old hash table implementation in `pack-objects` with
> the khash_sha1 table, the insertion time is greatly reduced:

Why?  What is the exact change?

> The big win here, however, is in the massively reduced amount of hash
> collisions

Okay, so there seems to be some problem with how collisions are
handled in the hashtable.

> -static int locate_object_entry_hash(const unsigned char *sha1)
> -{
> -       int i;
> -       unsigned int ui;
> -       memcpy(&ui, sha1, sizeof(unsigned int));
> -       i = ui % object_ix_hashsz;
> -       while (0 < object_ix[i]) {
> -               if (!hashcmp(sha1, objects[object_ix[i] - 1].idx.sha1))
> -                       return i;
> -               if (++i == object_ix_hashsz)
> -                       i = 0;
> -       }
> -       return -1 - i;
> -}

Classical chaining to handle collisions: very naive.  Deserves to be thrown out.

> -static void rehash_objects(void)
> -{
> -       uint32_t i;
> -       struct object_entry *oe;
> -
> -       object_ix_hashsz = nr_objects * 3;
> -       if (object_ix_hashsz < 1024)
> -               object_ix_hashsz = 1024;
> -       object_ix = xrealloc(object_ix, sizeof(int) * object_ix_hashsz);
> -       memset(object_ix, 0, sizeof(int) * object_ix_hashsz);
> -       for (i = 0, oe = objects; i < nr_objects; i++, oe++) {
> -               int ix = locate_object_entry_hash(oe->idx.sha1);
> -               if (0 <= ix)
> -                       continue;
> -               ix = -1 - ix;
> -               object_ix[ix] = i + 1;
> -       }
> -}

This is called when the hashtable runs out of space.  It didn't appear
in your profiler because it doesn't appear to be a bottleneck, right?
Growing aggressively to 3x times the number of objects probably
explains it.  Just for comparison, how does khash grow?

>  static struct object_entry *locate_object_entry(const unsigned char *sha1)
>  {
> -       int i;
> +       khiter_t pos = kh_get_sha1(packed_objects, sha1);
>
> -       if (!object_ix_hashsz)
> -               return NULL;
> +       if (pos < kh_end(packed_objects)) {

Wait, why is this required?  When will kh_get_sha1() return a position
beyond kh_end()?  What does that mean?

> +               return kh_value(packed_objects, pos);
> +       }
>
> -       i = locate_object_entry_hash(sha1);
> -       if (0 <= i)
> -               return &objects[object_ix[i]-1];
>         return NULL;
>  }

Overall, replaced call to locate_object_entry_hash() with a call to
kh_get_sha1().  Okay.

> -static int add_object_entry(const unsigned char *sha1, enum object_type type,
> -                           const char *name, int exclude)
> +static int add_object_entry_1(const unsigned char *sha1, enum object_type type,
> +                           uint32_t hash, int exclude, struct packed_git *found_pack,
> +                               off_t found_offset)
>  {
>         struct object_entry *entry;
> -       struct packed_git *p, *found_pack = NULL;
> -       off_t found_offset = 0;
> -       int ix;
> -       unsigned hash = name_hash(name);
> +       struct packed_git *p;
> +       khiter_t ix;
> +       int hash_ret;
>
> -       ix = nr_objects ? locate_object_entry_hash(sha1) : -1;
> -       if (ix >= 0) {
> +       ix = kh_put_sha1(packed_objects, sha1, &hash_ret);

You don't need to call locate_object_entry() to check for collisions
because kh_put_sha1() takes care of that?

> +       if (hash_ret == 0) {
>                 if (exclude) {
> -                       entry = objects + object_ix[ix] - 1;
> +                       entry = kh_value(packed_objects, ix);

Superficial change: using kh_value(), because we stripped out the
chaining logic.

> @@ -966,19 +965,30 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
>                 entry->in_pack_offset = found_offset;
>         }
>
> -       if (object_ix_hashsz * 3 <= nr_objects * 4)
> -               rehash_objects();
> -       else
> -               object_ix[-1 - ix] = nr_objects;
> +       kh_value(packed_objects, ix) = entry;
> +       kh_key(packed_objects, ix) = entry->idx.sha1;
> +       objects[nr_objects++] = entry;

Wait, what?  Why didn't you use kh_put_sha1()?

I didn't look very carefully, but the patch seems to be okay overall.
On the issue of which hashtable replacement to use (why khash, and not
something else?), I briefly looked at linux.git's linux/hashtable.h
and git.git's hash.h; both of them are chaining hashes.  From a brief
look at khash.h, it seems to be somewhat less naive and sane: my only
concern is that it is written entirely in using CPP macros which is a
great for syntax/performance, but not-so-great for debugging.  I don't
know if there's a better off-the-shelf implementation out there, but I
haven't been looking for one either.  By the way, it's MIT license
authored by an anonymous person (sources at:
https://github.com/attractivechaos/klib/blob/master/khash.h), but I
don't know if that's a problem.

Thanks.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-25  5:42   ` Shawn Pearce
@ 2013-06-25 19:33     ` Vicent Martí
  2013-06-25 21:17       ` Junio C Hamano
  2013-06-26  5:11       ` Jeff King
  0 siblings, 2 replies; 64+ messages in thread
From: Vicent Martí @ 2013-06-25 19:33 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Colby Ranger, git

On Tue, Jun 25, 2013 at 7:42 AM, Shawn Pearce <spearce@spearce.org> wrote:
> I very much hate seeing a file format that is supposed to be portable
> that supports both big-endian and little-endian encoding.

Well, the bitmap index is not supposed to be portable, as it doesn't
get sent over the wire in any situation. Regardless, the format is
portable because it supports both encodings and clearly defines which
one the current file is using. I think that's a good tradeoff!

> Such a specification forces everyone to implement two code paths to handle
> reading data from the file, on the off-chance they are on the wrong
> platform.

Extra code paths have never been an issue in the
JGitAbstractFactoryGeneratorOfGit, har har har. Ah. I'm such a funny
guy when it comes to Java.

Anyway, I designed this keeping JGit in mind. In this specific case,
it doesn't force you to add any new code paths. The endianness changes
only affect the serialization format of the bitmaps, which is not part
of Git or JGit itself but of the Javaewah/libewok library. The
interface for reading on that library has already been wisely
abstracted on JGit
(https://github.com/eclipse/jgit/blob/master/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/file/PackBitmapIndexV1.java#L133),
so changing the byte order simply means changing the SimpleDataInput
to a LE one.

I agree this is not ideal, or elegant, but I'm having a hard time
making an argument for sacrificing objective speed for the sake of
subjective "simplicity".

> What is wrong with picking one encoding and sticking to it?

It prevents us from making this optimally fast on the machines where
it needs to be.

Regardless, I must admit I haven't generated numbers for this in a
while (the BE->LE switch is one of the first optimizations I did). I'm
going to try to re-implement full NWO loading and see how much slower
it is, before I continue arguing for/against it.

If I can get it within a reasonable margin (say, 15%) of the current
implementation, I'd definitely be in favor of sticking to only NWO on
the whole file. If it's slower than that, well, Git has never
compromised on speed, and I don't think there's a point to be made for
starting to do that now.

>> +                       - BITMAP_OPT_HASH_CACHE (0x8)
>> +                       If present, a hash cache for finding delta bases will be available
>> +                       right after the header block in this index. See the following
>> +                       section for details.
>> +
>> +               4-byte entry count (network byte order)
>> +
>> +                       The total count of entries (bitmapped commits) in this bitmap index.
>> +
>> +               20-byte checksum
>> +
>> +                       The SHA1 checksum of the pack this bitmap index belongs to.
>> +
>> +       - An OPTIONAL delta cache follows the header.
>
> Some may find the name "delta cache" confusing as it does not cache
> deltas of objects. May I suggest "path hash cache" as an alternative
> name?

Definitely, this is a typo.

>> +               The cache is formed by `n` 4-byte hashes in a row, where `n` is
>> +               the amount of objects in the indexed packfile. Note that this amount
>> +               is the **total number of objects** and is not related to the
>> +               number of commits that have been selected and indexed in the
>> +               bitmap index.
>> +
>> +               The hashes are stored in Network Byte Order and they are the same
>> +               values generated by a normal revision walk during the `pack-objects`
>> +               phase.
>
> I find it interesting this is network byte order and not big-endian or
> little-endian based on the flag in the header.

As stated before, the flag in the header only affects the
Javaewah/libewok interface. Everything Git-related in the bitmap index
is in NWO, like it is customary in Git.

>
>> +               The `n`nth hash in the cache is the name hash for the `n`th object
>> +               in the index for the indexed packfile.
>> +
>> +               [RATIONALE]:
>> +
>> +               The bitmap index allows us to skip the Counting Objects phase
>> +               during `pack-objects` and yield all the OIDs that would be reachable
>> +               ("WANTS") when generating the pack.
>> +
>> +               This optimization, however, means that we're adding objects to the
>> +               packfile straight from the packfile index, and hence we are lacking
>> +               path information for the objects that would normally be generated
>> +               during the "Counting Objects" phase.
>> +
>> +               This path information for each object is hashed and used as a very
>> +               effective way to find good delta bases when compressing the packfile;
>> +               without these hashes, the resulting packfiles are much less optimal.
>> +
>> +               By storing all the hashes in a cache together with the bitmapsin
>> +               the bitmap index, we can yield not only the SHA1 of all the reachable
>> +               objects, but also their hashes, and allow Git to be much smarter when
>> +               finding delta bases for packing.
>> +
>> +               If the delta cache is not available, the bitmap index will obviously
>> +               be smaller in disk, but the packfiles generated using this index will
>> +               be between 20% and 30% bigger, because of the lack of name/path
>> +               information when finding delta bases.
>
> JGit does not encode this because we were afraid of freezing the hash
> function into the file format. Indeed we are not certain JGit even
> uses the same path hash function as C Git does, because C Git's
> implementation is covered by the GPL and JGit prefers to license its
> work under BSD.
>
> If the path hash is going to become part of the format, the algorithm
> for computing the hash should also be specified in the format so that
> non-GPL implementations have an opportunity to be compatible.

Very valid point. I would hope that whoever wrote the original hash
will let us re-license it under BSD. It would be nice to define this
hash function as part of the format.

> One way we side-stepped the size inflation problem in JGit was to only
> use the bitmap index information when sending data on the wire to a
> client. Here delta reuse plays a significant factor in building the
> pack, and we don't have to be as accurate on matching deltas. During
> the equivalent of `git repack` bitmaps are not used, allowing the
> traditional graph enumeration algorithm to generate path hash
> information.

OH BOY HERE WE GO. This is worth its own thread, lots to discuss here.
I think peff will have a patchset regarding this to upstream soon,
we'll get back to it later.

>
>> +       - 4 EWAH bitmaps that act as type indexes
>> +
>> +               Type indexes are serialized after the hash cache in the shape
>> +               of four EWAH bitmaps stored consecutively (see Appendix A for
>> +               the serialization format of an EWAH bitmap).
>> +
>> +               There is a bitmap for each Git object type, stored in the following
>> +               order:
>> +
>> +                       - Commits
>> +                       - Trees
>> +                       - Blobs
>> +                       - Tags
>> +
>> +               In each bitmap, the `n`th bit is set to true if the `n`th object
>> +               in the packfile index is of that type.
>> +
>> +               The obvious consequence is that the XOR of all 4 bitmaps will result
>> +               in a full set (all bits sets), and the AND of all 4 bitmaps will
>> +               result in an empty bitmap (no bits set).
>
> Instead of XOR did you mean OR here?

Nope, I think XOR makes it more obvious: if the same bit is set on two
bitmaps, it would be cleared when XORed together, and hence all the
bits wouldn't be set. An OR would hide this case.

>
>> +       - N EWAH bitmaps, one for each indexed commit
>> +
>> +               Where `N` is the total amount of entries in this bitmap index.
>> +               See Appendix A for the serialization format of an EWAH bitmap.
>> +
>> +       - An entry index with `N` entries for the indexed commits
>> +
>> +               Index entries are stored consecutively, and each entry has the
>> +               following format:
>> +
>> +               - 20-byte SHA1
>> +                       The SHA1 of the commit that this bitmap indexes
>> +
>> +               - 4-byte offset (Network Byte Order)
>> +                       The offset **from the beginning of the file** where the
>> +                       bitmap for this commit is stored.
>
> Eh, another network byte order field in a file that also has selective
> ordering. *sigh*

:D :D :D again only the Javaewah interface.

>
>> +               - 1-byte XOR-offset
>> +                       The xor offset used to compress this bitmap. For an entry
>> +                       in position `x`, a XOR offset of `y` means that the actual
>> +                       bitmap representing for this commit is composed by XORing the
>> +                       bitmap for this entry with the bitmap in entry `x-y` (i.e.
>> +                       the bitmap `y` entries before this one).
>> +
>> +                       Note that this compression can be recursive. In order to
>> +                       XOR this entry with a previous one, the previous entry needs
>> +                       to be decompressed first, and so on.
>> +
>> +                       The hard-limit for this offset is 160 (an entry can only be
>> +                       xor'ed against one of the 160 entries preceding it). This
>> +                       number is always positivea, and hence entries are always xor'ed
>> +                       with **previous** bitmaps, not bitmaps that will come afterwards
>> +                       in the index.
>
> What order are these entries in? Sorted by SHA-1 or random?

Not-specified, since the whole index will be loaded in a hash table
like JGit does. In practice, it's toposorted because that makes for
better XOR bases.

> Colby found that doing an XOR against the descendant commit yielded
> very small bitmaps, so JGit tries to XOR-compress bitmaps along common
> linear slices of history. This is trivial in Linus' kernel tree where
> there is effectively only one history, but its more relevant with
> long-running side branches that have release tags that may not have
> fully merged into "master".

Indeed, indeed. We try to do the same.

>> +  This full parse, however, requires prohibitive loading times in LE
>> +  machines (i.e. all modern server hardware): a repository like
>> +  `torvalds/linux` can have about 8mb of bitmap indexes, resulting
>> +  in roughly 400ms of parse time.
>
> This makes me wonder what the JGit parse time is. It is ugly if we are
> spending 400ms to load the bitmap index for the kernel repository.

It's very bad. Of course you don't notice it because you're running in
`daemon` mode. It takes about 3s to load indexes for the
`torvalds/linux` network in my machine.

>
>> +  This is not an issue in JGit, which is capable of serving repositories
>> +  from a single-process daemon running on the JVM, but `git-daemon` in
>> +  git has been implemented with a process-based design (a new
>> +  `pack-objects` is spawned for each request), and the boot times
>> +  of parsing the bitmap index every time `pack-objects` is spawned can
>> +  seriously slow down requests (particularly for small fetches, where we'd
>> +  spend about 1.5s booting up and 300ms performing the Counting Objects
>> +  phase).
>
> There are other strategies that Git could use to handle request
> processing at scale. But I guess its reasonable to assume these aren't
> viable for Git for a number of reasons. E.g. "long tail" access effect
> that many servers have, where most requests are to a large number of
> repositories that themselves receive very few requests, an environment
> that does not lend itself to caching.
>
>> +  By storing the bitmaps in Little-Endian, we're able to `mmap` their
>> +  compressed data straight in memory without parsing it beforehand, and
>> +  since most queries don't require accessing all the serialized bitmaps,
>> +  we'll only page in the minimal amount of bitmaps necessary to perform
>> +  the reachability analysis as they are accessed.
>
> FWIW the .idx and .pack file formats `mmap` the compressed data
> straight into memory without parsing it beforehand, and do not use
> little-endian byte order. It is possible to have a single compressed
> file format definition that is portable to all architectures, and is
> accessed by mmap, at scale, with reasonable efficiency.
>
>> +- An index of all the bitmapped commits is written at the end of the packfile,
>> +  instead of interpersed with the serialized bitmaps in the middle of the
>> +  file.
>
> This is probably a mistake in the JGit design. Your approach is
> slightly more complex, but in general I agree with having a table of
> the SHA-1s isolated from the bitmaps themselves so that a reader can
> access specific bitmaps at random without needing to wade through all
> compressed bitmaps.
>
> I would have proposed putting the table at the start of the file, not
> the end. The writer making the file can completely serialize the
> bitmaps into memory before writing them to disk, and thus knows the
> full layout of the resulting file. If the bitmaps don't fit in RAM at
> writing time, game over, the optimization of having a very compact
> representation of the graph is no longer helping you.
>
>> +  Again, the old design implied a full parse of the whole bitmap index
>> +  (which JGit can afford because its daemon is single-process), but it made
>> +  impossible `mmaping` the bitmap index file and accessing only the parts
>> +  required to actually solve the query.
>> +
>> +  With an index at the end of the file, we can load only this index in memory,
>> +  allowing for very efficient access to all the available bitmaps lazily (we
>> +  have their offsets in the mmaped file).
>> +
>> +- The ordering of the objects in each bitmap has changed from
>> +  packfile-order (the nth bit in the bitmap is the nth object in the
>> +  packfile) to index-order (the nth bit in the bitmap is the nth object
>> +  in the INDEX of the packfile).
>
> Did you notice an increase in bitmap size when you did this? Colby
> tested both orderings and we observed the bitmaps were quantifiably
> smaller when using the pack file ordering, due to the pack file
> locality rules and the EWAH compression. Using the pack file ordering
> was a very conscious design decision.
>
>> +  There is not a noticeable performance difference when actually converting
>> +  from bitmap position to SHA1 and from SHA1 to bitmap position, but when
>> +  using packfile ordering like JGit does, queries need to go through the
>> +  reverse index (pack-revindex.c).
>> +
>> +  Generating this reverse index at runtime is **not** free (around 900ms
>> +  generation time for a repository like `torvalds/linux`), and once again,
>> +  this generation time needs to happen every time `pack-objects` is
>> +  spawned.
>
> Did you know the packer needs the reverse index in order to compute
> the end offset of an object it will copy as-is during delta reuse? How
> have you avoided making the reverse index?

I'm aware of that, but there are lots of other operations that can be
optimized with bitmaps that don't require a reverse index.

> Again this is why we chose to pin the JGit bitmap on the reverse index
> being present. It already had to be present to support as-is reuse.
> Once we knew we had to have that reverse index it was OK to rely on it
> to get better compression on the bitmaps, and thus make them take up
> less memory when loaded into a server. Even if you mmap a file you
> want it to be small so it is more likely to retain in the kernel
> buffer cache across process invocations.

Maybe this applies to the JVM (where you have to load the whole
index), but bitmap indexes are consistently one order of magnitude
smaller than packfile indexes (regardless of whether they use index
ordering or packfile ordering), and the packfile indexes are always
mapped on memory: This has never been an issue in Git, and I don't see
why it would become an issue now.

Pinning the bitmap index on the reverse index adds complexity (lookups
are two-step: first find the entry in the reverse index, and then find
the SHA1 in the index) and is measurably slower, in both loading and
lookup times. Since Git doesn't have a memory problem, it's very hard
to make an argument for design that is more complex and runs slower to
save memory.

To sum it up: I'd like to see this format be strictly in Network Byte
Order, and I'm going to try to make it run fast enough in that
encoding. Having the entry index at the end of the file and having the
bitmaps in index-order are Good Ideas (TM) because they are measurably
simpler and faster than their counterpoints. Do show code & benchmarks
if you think otherwise, though.

strawberry and watermelon kisses,
vmg

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-25 19:33     ` Vicent Martí
@ 2013-06-25 21:17       ` Junio C Hamano
  2013-06-25 22:08         ` Vicent Martí
  2013-06-26  5:11       ` Jeff King
  1 sibling, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2013-06-25 21:17 UTC (permalink / raw)
  To: Vicent Martí; +Cc: Shawn Pearce, Colby Ranger, git

Vicent Martí <tanoku@gmail.com> writes:

>>> +               There is a bitmap for each Git object type, stored in the following
>>> +               order:
>>> +
>>> +                       - Commits
>>> +                       - Trees
>>> +                       - Blobs
>>> +                       - Tags
>>> +
>>> +               In each bitmap, the `n`th bit is set to true if the `n`th object
>>> +               in the packfile index is of that type.
>>> +
>>> +               The obvious consequence is that the XOR of all 4 bitmaps will result
>>> +               in a full set (all bits sets), and the AND of all 4 bitmaps will
>>> +               result in an empty bitmap (no bits set).
>>
>> Instead of XOR did you mean OR here?
>
> Nope, I think XOR makes it more obvious: if the same bit is set on two
> bitmaps, it would be cleared when XORed together, and hence all the
> bits wouldn't be set. An OR would hide this case.

What case are you talking about?

The n-th object must be one of these four types and can never be of
more than one type at the same time, so a natural expectation from
the reader is "If you OR them together, you will get the same set".
If you say "If you XOR them", that forces the reader to wonder when
these bitmaps ever can overlap at the same bit position.

> To sum it up: I'd like to see this format be strictly in Network Byte
> Order,

Good.

I've been wondering what you meant by "cannot be mmap-ed" from the
very beginning.  We mmapped the index for a long time, and it is
defined in terms of network byte order.  Of course, pack .idx files
are in network byte order, too, and we mmap them without problems.
It seems that it primarily came from your fear that using network
byte order may be unnecessarily hard to perform well, and it would
be a good thing to do to try to do so first instead of punting from
the beginning.

> and I'm going to try to make it run fast enough in that
> encoding.

Hmph.  Is it an option to start from what JGit does, so that people
can use both JGit and your code on the same repository?  And then if
you do not succeed, after trying to optimize in-core processing
using that on-disk format to make it fast enough, start thinking
about tweaking the on-disk format?

Thanks.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-25 21:17       ` Junio C Hamano
@ 2013-06-25 22:08         ` Vicent Martí
  2013-06-27  1:11           ` Shawn Pearce
  0 siblings, 1 reply; 64+ messages in thread
From: Vicent Martí @ 2013-06-25 22:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn Pearce, Colby Ranger, git

On Tue, Jun 25, 2013 at 11:17 PM, Junio C Hamano <gitster@pobox.com> wrote:
> What case are you talking about?
>
> The n-th object must be one of these four types and can never be of
> more than one type at the same time, so a natural expectation from
> the reader is "If you OR them together, you will get the same set".
> If you say "If you XOR them", that forces the reader to wonder when
> these bitmaps ever can overlap at the same bit position.

I guess this is just wording. I don't particularly care about the
distinction, but I'll change it to OR.

>
>> To sum it up: I'd like to see this format be strictly in Network Byte
>> Order,
>
> Good.
>
> I've been wondering what you meant by "cannot be mmap-ed" from the
> very beginning.  We mmapped the index for a long time, and it is
> defined in terms of network byte order.  Of course, pack .idx files
> are in network byte order, too, and we mmap them without problems.
> It seems that it primarily came from your fear that using network
> byte order may be unnecessarily hard to perform well, and it would
> be a good thing to do to try to do so first instead of punting from
> the beginning.

It cannot be mmapped not particularly because of endianness issues,
but because the original format is not indexed and requires a full
parse of the whole index before it can be accessed programatically.
The wrong endianness just increases the parse time.

>
>> and I'm going to try to make it run fast enough in that
>> encoding.
>
> Hmph.  Is it an option to start from what JGit does, so that people
> can use both JGit and your code on the same repository?  And then if
> you do not succeed, after trying to optimize in-core processing
> using that on-disk format to make it fast enough, start thinking
> about tweaking the on-disk format?

I'm afraid this is not an option. I have an old patchset that
implements JGit v1 bitmap loading (and in fact that's how I initially
developed these series -- by loading the bitmaps from JGit for
debugging), but I discarded it because it simply doesn't pan out in
production. ~3 seconds time to spawn `upload-pack` is not an option
for us. I did not develop a tweaked on-disk format out of boredom.

I could dig up the patch if you're particularly interested in
backwards compatibility, but since it was several times slower than
the current iteration, I have no interest (time, actually) to maintain
it, brush it up, and so on. I have already offered myself to port the
v2 format to JGit as soon as it's settled. It sounds like a better
investment of all our times.

Following up on Shawn's comments, I removed the little-endian support
from the on-disk format and implemented lazy loading of the bitmaps to
make up for it. The result is decent (slowed down from 250ms to 300ms)
and it lets us keep the whole format as NWO on disk. I think it's a
good tradeback.

The relevant commits are available on my fork of Git (I'll be sending
v2 of the patchset once I finish tackling the other reviews):

    https://github.com/vmg/git/commit/d6cdd4329a547580bbc0143764c726c48b887271
    https://github.com/vmg/git/commit/d8ec342fee87425e05c0db1e1630db8424612c71

As it stands right now, the only two changes from v1 of the on-disk format are:

- There is an index at the end. This is a good idea.
- The bitmaps are sorted in packfile-index order, not in packfile
order. This is a good idea.

As always, all your feedback is appreciated, but please keep in mind I
have strict performance concerns.

German kisses,
vmg

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-25 15:58   ` Thomas Rast
@ 2013-06-25 22:30     ` Vicent Martí
  2013-06-26 23:12       ` Thomas Rast
  0 siblings, 1 reply; 64+ messages in thread
From: Vicent Martí @ 2013-06-25 22:30 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git

On Tue, Jun 25, 2013 at 5:58 PM, Thomas Rast <trast@inf.ethz.ch> wrote:
>
>> This is the technical documentation and design rationale for the new
>> Bitmap v2 on-disk format.
>
> Hrmpf, that's what I get for reading the series in order...
>
>> +                     The folowing flags are supported:
>                               ^^
>
> typos marked by ^
>
>> +             By storing all the hashes in a cache together with the bitmapsin
>                                                                              ^^
>
>> +             The obvious consequence is that the XOR of all 4 bitmaps will result
>> +             in a full set (all bits sets), and the AND of all 4 bitmaps will
>                                            ^
>
>> +             - 1-byte XOR-offset
>> +                     The xor offset used to compress this bitmap. For an entry
>> +                     in position `x`, a XOR offset of `y` means that the actual
>> +                     bitmap representing for this commit is composed by XORing the
>> +                     bitmap for this entry with the bitmap in entry `x-y` (i.e.
>> +                     the bitmap `y` entries before this one).
>> +
>> +                     Note that this compression can be recursive. In order to
>> +                     XOR this entry with a previous one, the previous entry needs
>> +                     to be decompressed first, and so on.
>> +
>> +                     The hard-limit for this offset is 160 (an entry can only be
>> +                     xor'ed against one of the 160 entries preceding it). This
>> +                     number is always positivea, and hence entries are always xor'ed
>                                                  ^
>
>> +                     with **previous** bitmaps, not bitmaps that will come afterwards
>> +                     in the index.
>
> Clever.  Why 160 though?

JGit implementation detail. It's the equivalent of the delta-window in
`pack-objects` for example.

HINT HINT: in practice, JGit only looks 16 positions behind to find
deltas, and we do the same. So the practical limit is 16. harhar

>
>> +             - 2 bytes of RESERVED data (used right now for better packing).
>
> What do they mean?
>
>> +  With an index at the end of the file, we can load only this index in memory,
>> +  allowing for very efficient access to all the available bitmaps lazily (we
>> +  have their offsets in the mmaped file).
>
> Is there anything preventing you from mmap()ing the index also?

Yeah, this format allows you to easily do a SHA1 bsearch with custom
step to lookup entries on the bitmap index, except for the fact that
the index is not sorted by SHA1, so you'd need a linear search
instead. :)

I decided against it because during most complex invocations of
`pack-objects`, we perform a couple thousand commit lookups to see if
they have a bitmap in the index, so it makes a lot of sense to load
the index tightly in a hash table before hand (which takes very little
time, to be fair). We more-than-make up for the loading time by having
much much faster lookups. I felt it was the right tradeoff (JGit does
the same, but in their case, because they cannot mmap. :p)

>> +== Appendix A: Serialization format for an EWAH bitmap
>> +
>> +Ewah bitmaps are serialized in the protocol as the JAVAEWAH
>> +library, making them backwards compatible with the JGit
>> +implementation:
>> +
>> +     - 4-byte number of bits of the resulting UNCOMPRESSED bitmap
>> +
>> +     - 4-byte number of words of the COMPRESSED bitmap, when stored
>> +
>> +     - N x 8-byte words, as specified by the previous field
>> +
>> +             This is the actual content of the compressed bitmap.
>> +
>> +     - 4-byte position of the current RLW for the compressed
>> +             bitmap
>> +
>> +Note that the byte order for this serialization is not defined by
>> +default. The byte order for all the content in a serialized EWAH
>> +bitmap can be known by the byte order flags in the header of the
>> +bitmap index file.
>
> Please document the RLW format here.

Har har. I was going to comment on your review of the Ewah patchset,
but might as well do it here: the only thing I know about Ewah bitmaps
is that they work. And I know this because I did extensive fuzz
testing of my C port. Unfortunately, the original Java code I ported
from has 0 comments, so any documentation here would have to be
reverse-engineered.

Personally, I'd lean towards considering Ewah an external dependency
(black box); the headers for the library are commented accordingly,
clearly explaining the interfaces while hiding implementation details.
Of course, you're welcome to help me reverse engineer the
implementation, but I'm not sure this would be of much value. It'd be
better to make sure it passes the extensive test suite of the Java
version, and assume that Mr Lemire designed a sound format for the
bitmaps.

Swiss kisses,
vmg

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/16] pack-objects: use a faster hash table
  2013-06-24 23:23 ` [PATCH 03/16] pack-objects: use a faster hash table Vicent Marti
  2013-06-25 14:03   ` Thomas Rast
  2013-06-25 17:58   ` Ramkumar Ramachandra
@ 2013-06-25 22:48   ` Junio C Hamano
  2013-06-25 23:09     ` Vicent Martí
  2 siblings, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2013-06-25 22:48 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Vicent Marti <tanoku@gmail.com> writes:

> @@ -901,19 +896,19 @@ static int no_try_delta(const char *path)
>  	return 0;
>  }
>  
> -static int add_object_entry(const unsigned char *sha1, enum object_type type,
> -			    const char *name, int exclude)
> +static int add_object_entry_1(const unsigned char *sha1, enum object_type type,
> +			    uint32_t hash, int exclude, struct packed_git *found_pack,
> +				off_t found_offset)
>  {
>  	struct object_entry *entry;
> -	struct packed_git *p, *found_pack = NULL;
> -	off_t found_offset = 0;
> -	int ix;
> -	unsigned hash = name_hash(name);
> +	struct packed_git *p;
> +	khiter_t ix;
> +	int hash_ret;
>  
> -	ix = nr_objects ? locate_object_entry_hash(sha1) : -1;
> -	if (ix >= 0) {
> +	ix = kh_put_sha1(packed_objects, sha1, &hash_ret);
> +	if (hash_ret == 0) {
>  		if (exclude) {
> -			entry = objects + object_ix[ix] - 1;
> +			entry = kh_value(packed_objects, ix);
>  			if (!entry->preferred_base)
>  				nr_result--;
>  			entry->preferred_base = 1;

After this, the function returns.  The original did not add to the
table the object name we are looking at, but the new code first adds
it to the table with the unconditional kh_put_sha1() above.  Is a
call to kh_del_sha1() missing here ...

> @@ -921,38 +916,42 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
>  		return 0;
>  	}
>  
> -	if (!exclude && local && has_loose_object_nonlocal(sha1))
> +	if (!exclude && local && has_loose_object_nonlocal(sha1)) {
> +		kh_del_sha1(packed_objects, ix);
>  		return 0;

... like this one, which seems to compensate for "ahh, after all we
realize we do not want to add this one to the table"?

> @@ -966,19 +965,30 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
>  		entry->in_pack_offset = found_offset;
>  	}
>  
> -	if (object_ix_hashsz * 3 <= nr_objects * 4)
> -		rehash_objects();
> -	else
> -		object_ix[-1 - ix] = nr_objects;
> +	kh_value(packed_objects, ix) = entry;
> +	kh_key(packed_objects, ix) = entry->idx.sha1;
> +	objects[nr_objects++] = entry;
>  
>  	display_progress(progress_state, nr_objects);
>  
> -	if (name && no_try_delta(name))
> -		entry->no_try_delta = 1;
> -
>  	return 1;
>  }
>  
> +static int add_object_entry(const unsigned char *sha1, enum object_type type,
> +			    const char *name, int exclude)
> +{
> +	if (add_object_entry_1(sha1, type, name_hash(name), exclude, NULL, 0)) {
> +		struct object_entry *entry = objects[nr_objects - 1];
> +
> +		if (name && no_try_delta(name))
> +			entry->no_try_delta = 1;
> +
> +		return 1;
> +	}
> +
> +	return 0;
> +}

It is somewhat unclear what we are getting from the split of the
main part of this function into *_1(), other than the *_1() function
now has a very deep indentation inside "if (!found_pack)", which is
always true because the caller always passes NULL to found_pack.
Perhaps this is an unrelated refactoring that is needed for later
steps and does not have anything to do with the use of new hash
function?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 08/16] ewah: compressed bitmap implementation
  2013-06-25  1:10   ` Junio C Hamano
@ 2013-06-25 22:51     ` Junio C Hamano
  0 siblings, 0 replies; 64+ messages in thread
From: Junio C Hamano @ 2013-06-25 22:51 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Vicent Marti <tanoku@gmail.com> writes:
>
>> The library is re-licensed under the GPLv2 with the permission of Daniel
>> Lemire, the original author. The source code for the C version can
>> be found on GitHub:
>>
>> 	https://github.com/vmg/libewok
>>
>> The original Java implementation can also be found on GitHub:
>>
>> 	https://github.com/lemire/javaewah
>> ---
>
> Please make sure that all patches are properly signed off.
>
>>  Makefile           |    6 +
>>  ewah/bitmap.c      |  229 +++++++++++++++++
>>  ewah/ewah_bitmap.c |  703 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  ewah/ewah_io.c     |  199 +++++++++++++++
>>  ewah/ewah_rlw.c    |  124 +++++++++
>>  ewah/ewok.h        |  194 +++++++++++++++
>>  ewah/ewok_rlw.h    |  114 +++++++++
>
> This is lovely.  A few comments after an initial quick scan-through.
>
>  - The code and the headers are well commented, which is good.
>
>  - What's __builtin_popcountll() doing there in a presumably generic
>    codepath?
>
>  - Two variants of "bitmap" are given different and easy to
>    understand type names (vanilla one is "bitmap", the clever one is
>    "ewah_bitmap"), but at many places, a pointer to ewah_bitmap is
>    simply called "bitmap" or "bitmap_i" without "ewah" anywhere,
>    which waas confusing to read.  Especially, the "NAND" operation
>    for bitmap takes two bitmaps, while "OR" takes one bitmap and
>    ewah_bitmap.  That is fine as long as the combination is
>    convenient for callers, but I wished the ewah variables be called
>    with "ewah" somewhere in their names.
>
>  - I compile with "-Werror -Wdeclaration-after-statement"; some
>    places seem to trigger it.
>
>  - Some "extern" declarations in *.c sources were irritating;
>    shouldn't they be declared in *.h file and included?
>
>  - There are some instances of "if (condition) stmt;" on a single
>    line; looked irritating.   
>
>  - "bool" is not a C type we use (and not a particularly good type
>    in C++, either).

One more.

  - Use of unnecessary float (e.g. "oldval *= 1.5") were moderately
    annoying.


> That is it for now. I am looking forward to read through the users
> of the library ;-)
>
> Thanks for working on this.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 13/16] repack: consider bitmaps when performing repacks
  2013-06-24 23:23 ` [PATCH 13/16] repack: consider bitmaps when performing repacks Vicent Marti
@ 2013-06-25 23:00   ` Junio C Hamano
  2013-06-25 23:16     ` Vicent Martí
  0 siblings, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2013-06-25 23:00 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Vicent Marti <tanoku@gmail.com> writes:

> @@ -156,6 +156,11 @@ do
>  	fullbases="$fullbases pack-$name"
>  	chmod a-w "$PACKTMP-$name.pack"
>  	chmod a-w "$PACKTMP-$name.idx"
> +
> +	test -f "$PACKTMP-$name.bitmap" &&
> +	chmod a-w "$PACKTMP-$name.bitmap" &&
> +	mv -f "$PACKTMP-$name.bitmap" "$PACKDIR/pack-$name.bitmap"

If we see a temporary bitmap but somehow failed to move it to the
final name, should we _ignore_ that error, or should we die, like
the next two lines do?

>  	mv -f "$PACKTMP-$name.pack" "$PACKDIR/pack-$name.pack" &&
>  	mv -f "$PACKTMP-$name.idx"  "$PACKDIR/pack-$name.idx" ||
>  	exit

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 10/16] pack-objects: use bitmaps when packing objects
  2013-06-24 23:23 ` [PATCH 10/16] pack-objects: use bitmaps when packing objects Vicent Marti
  2013-06-25 12:48   ` Ramkumar Ramachandra
  2013-06-25 15:58   ` Thomas Rast
@ 2013-06-25 23:06   ` Junio C Hamano
  2013-06-25 23:14     ` Vicent Martí
  2 siblings, 1 reply; 64+ messages in thread
From: Junio C Hamano @ 2013-06-25 23:06 UTC (permalink / raw)
  To: Vicent Marti; +Cc: git

Vicent Marti <tanoku@gmail.com> writes:

> @@ -83,6 +84,9 @@ static struct progress *progress_state;
>  static int pack_compression_level = Z_DEFAULT_COMPRESSION;
>  static int pack_compression_seen;
>  
> +static int bitmap_support;
> +static int use_bitmap_index;

OK.

> @@ -2131,6 +2135,10 @@ static int git_pack_config(const char *k, const char *v, void *cb)
>  		cache_max_small_delta_size = git_config_int(k, v);
>  		return 0;
>  	}
> +	if (!strcmp(k, "pack.usebitmaps")) {
> +		bitmap_support = git_config_bool(k, v);
> +		return 0;
> +	}

Hmph, so bitmap_support, not use_bitmap_index, keeps track of the
user request?  Somewhat confusing.

>  	if (!strcmp(k, "pack.threads")) {
>  		delta_search_threads = git_config_int(k, v);
>  		if (delta_search_threads < 0)
> @@ -2366,8 +2374,24 @@ static void get_object_list(int ac, const char **av)
>  			die("bad revision '%s'", line);
>  	}
>  
> +	if (use_bitmap_index) {
> +		uint32_t size_hint;
> +
> +		if (!prepare_bitmap_walk(&revs, &size_hint)) {
> +			khint_t new_hash_size = (size_hint * (1.0 / __ac_HASH_UPPER)) + 0.5;

What is __ac_HASH_UPPER?  That is a very unusual name for a variable
or a constant.  Also it is mildly annoying to see unnecessary use of
float like this.

> +			kh_resize_sha1(packed_objects, new_hash_size);
> +
> +			nr_alloc = (size_hint + 63) & ~63;
> +			objects = xrealloc(objects, nr_alloc * sizeof(struct object_entry *));
> +
> +			traverse_bitmap_commit_list(&add_object_entry_1);
> +			return;
> +		}
> +	}
> +
>  	if (prepare_revision_walk(&revs))
>  		die("revision walk setup failed");
> +
>  	mark_edges_uninteresting(revs.commits, &revs, show_edge);
>  	traverse_commit_list(&revs, show_commit, show_object, NULL);
>  
> @@ -2495,6 +2519,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
>  			    N_("pack compression level")),
>  		OPT_SET_INT(0, "keep-true-parents", &grafts_replace_parents,
>  			    N_("do not hide commits by grafts"), 0),
> +		OPT_BOOL(0, "bitmaps", &bitmap_support,
> +			 N_("enable support for bitmap optimizations")),

Please match this with the name of configuration variable, i.e. --use-bitmaps

>  		OPT_END(),
>  	};
>  
> @@ -2561,6 +2587,11 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
>  	if (keep_unreachable && unpack_unreachable)
>  		die("--keep-unreachable and --unpack-unreachable are incompatible.");
>  
> +	if (bitmap_support) {
> +		if (use_internal_rev_list && pack_to_stdout)
> +			use_bitmap_index = 1;

OK, so only when some internal condition is met, the user request to
use bitmap is honored and the deision is kept in use_bitmap_index.

It may be easier to read if you get rid of bitmap_support, set
user_bitmap_index directly from the command line and config, and did
this here instead:

	if (!(use_internal_rev_list && pack_to_stdout))
		use_bitmap_index = 0;

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/16] pack-objects: use a faster hash table
  2013-06-25 22:48   ` Junio C Hamano
@ 2013-06-25 23:09     ` Vicent Martí
  0 siblings, 0 replies; 64+ messages in thread
From: Vicent Martí @ 2013-06-25 23:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Jun 26, 2013 at 12:48 AM, Junio C Hamano <gitster@pobox.com> wrote:
> After this, the function returns.  The original did not add to the
> table the object name we are looking at, but the new code first adds
> it to the table with the unconditional kh_put_sha1() above.  Is a
> call to kh_del_sha1() missing here ...

No, this is not the case. That's the return case for when *the object
was found because it already existed in the hash table* (hence we
access it if we're excluding it, to tag it as excluded). We don't want
to remove it from the hash table because we're not the ones we
inserted it.

We only call `kh_del_sha1` in the cases where:

    1. The object wasn't found.
    2. We inserted its key on the hash table.
    3. We later learnt that we don't really want to pack this object.

>
>> @@ -921,38 +916,42 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
>>               return 0;
>>       }
>>
>> -     if (!exclude && local && has_loose_object_nonlocal(sha1))
>> +     if (!exclude && local && has_loose_object_nonlocal(sha1)) {
>> +             kh_del_sha1(packed_objects, ix);
>>               return 0;
>
> ... like this one, which seems to compensate for "ahh, after all we
> realize we do not want to add this one to the table"?
>
>> @@ -966,19 +965,30 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
>>               entry->in_pack_offset = found_offset;
>>       }
>>
>> -     if (object_ix_hashsz * 3 <= nr_objects * 4)
>> -             rehash_objects();
>> -     else
>> -             object_ix[-1 - ix] = nr_objects;
>> +     kh_value(packed_objects, ix) = entry;
>> +     kh_key(packed_objects, ix) = entry->idx.sha1;
>> +     objects[nr_objects++] = entry;
>>
>>       display_progress(progress_state, nr_objects);
>>
>> -     if (name && no_try_delta(name))
>> -             entry->no_try_delta = 1;
>> -
>>       return 1;
>>  }
>>
>> +static int add_object_entry(const unsigned char *sha1, enum object_type type,
>> +                         const char *name, int exclude)
>> +{
>> +     if (add_object_entry_1(sha1, type, name_hash(name), exclude, NULL, 0)) {
>> +             struct object_entry *entry = objects[nr_objects - 1];
>> +
>> +             if (name && no_try_delta(name))
>> +                     entry->no_try_delta = 1;
>> +
>> +             return 1;
>> +     }
>> +
>> +     return 0;
>> +}
>
> It is somewhat unclear what we are getting from the split of the
> main part of this function into *_1(), other than the *_1() function
> now has a very deep indentation inside "if (!found_pack)", which is
> always true because the caller always passes NULL to found_pack.
> Perhaps this is an unrelated refactoring that is needed for later
> steps and does not have anything to do with the use of new hash
> function?

Yes, apologies for not making this clear. By refactoring into `_1`,
you can see how `traverse_bitmap_commit_list` can use the `_1` version
directly as a callback, to insert objects straight into the packing
list without looking them up. This is very efficient because we can
pass the whole API straight from the bitmap code:

1. The SHA1: we find it by simply looking up the `nth` sha1 on the
pack index (if we are yielding bit `n`)
2. The object type: we find it because we have type indexes that let
us know the type of any given bit in the bitmap by and-ing it with the
index.
3. The hash for its name: we can look it up from the name hash cache
in the new bitmap format.
4. Exclude flag: we never exclude when working with bitmaps
5. found_pack: all the bitmapped objects come from the same pack!
6. found_offset: we find it by simply looking up the `nth` offset on
the pack index (if we are yielding bit `n`)

Boom! We filled the callback just from the data in a bitmap. Ain't that nice?

Let me amend the commit message.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 10/16] pack-objects: use bitmaps when packing objects
  2013-06-25 23:06   ` Junio C Hamano
@ 2013-06-25 23:14     ` Vicent Martí
  0 siblings, 0 replies; 64+ messages in thread
From: Vicent Martí @ 2013-06-25 23:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Jun 26, 2013 at 1:06 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> @@ -83,6 +84,9 @@ static struct progress *progress_state;
>>  static int pack_compression_level = Z_DEFAULT_COMPRESSION;
>>  static int pack_compression_seen;
>>
>> +static int bitmap_support;
>> +static int use_bitmap_index;
>
> OK.
>
>> @@ -2131,6 +2135,10 @@ static int git_pack_config(const char *k, const char *v, void *cb)
>>               cache_max_small_delta_size = git_config_int(k, v);
>>               return 0;
>>       }
>> +     if (!strcmp(k, "pack.usebitmaps")) {
>> +             bitmap_support = git_config_bool(k, v);
>> +             return 0;
>> +     }
>
> Hmph, so bitmap_support, not use_bitmap_index, keeps track of the
> user request?  Somewhat confusing.
>
>>       if (!strcmp(k, "pack.threads")) {
>>               delta_search_threads = git_config_int(k, v);
>>               if (delta_search_threads < 0)
>> @@ -2366,8 +2374,24 @@ static void get_object_list(int ac, const char **av)
>>                       die("bad revision '%s'", line);
>>       }
>>
>> +     if (use_bitmap_index) {
>> +             uint32_t size_hint;
>> +
>> +             if (!prepare_bitmap_walk(&revs, &size_hint)) {
>> +                     khint_t new_hash_size = (size_hint * (1.0 / __ac_HASH_UPPER)) + 0.5;
>
> What is __ac_HASH_UPPER?  That is a very unusual name for a variable
> or a constant.  Also it is mildly annoying to see unnecessary use of
> float like this.

See the updated patch at:

https://github.com/vmg/git/blob/vmg/bitmaps-master/builtin/pack-objects.c#L2422

>
>> +                     kh_resize_sha1(packed_objects, new_hash_size);
>> +
>> +                     nr_alloc = (size_hint + 63) & ~63;
>> +                     objects = xrealloc(objects, nr_alloc * sizeof(struct object_entry *));
>> +
>> +                     traverse_bitmap_commit_list(&add_object_entry_1);
>> +                     return;
>> +             }
>> +     }
>> +
>>       if (prepare_revision_walk(&revs))
>>               die("revision walk setup failed");
>> +
>>       mark_edges_uninteresting(revs.commits, &revs, show_edge);
>>       traverse_commit_list(&revs, show_commit, show_object, NULL);
>>
>> @@ -2495,6 +2519,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
>>                           N_("pack compression level")),
>>               OPT_SET_INT(0, "keep-true-parents", &grafts_replace_parents,
>>                           N_("do not hide commits by grafts"), 0),
>> +             OPT_BOOL(0, "bitmaps", &bitmap_support,
>> +                      N_("enable support for bitmap optimizations")),
>
> Please match this with the name of configuration variable, i.e. --use-bitmaps
>
>>               OPT_END(),
>>       };
>>
>> @@ -2561,6 +2587,11 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
>>       if (keep_unreachable && unpack_unreachable)
>>               die("--keep-unreachable and --unpack-unreachable are incompatible.");
>>
>> +     if (bitmap_support) {
>> +             if (use_internal_rev_list && pack_to_stdout)
>> +                     use_bitmap_index = 1;
>
> OK, so only when some internal condition is met, the user request to
> use bitmap is honored and the deision is kept in use_bitmap_index.
>
> It may be easier to read if you get rid of bitmap_support, set
> user_bitmap_index directly from the command line and config, and did
> this here instead:
>
>         if (!(use_internal_rev_list && pack_to_stdout))
>                 use_bitmap_index = 0;

Yeah, I'm not particularly happy with the way these flags are
implemented. I'll update this.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 13/16] repack: consider bitmaps when performing repacks
  2013-06-25 23:00   ` Junio C Hamano
@ 2013-06-25 23:16     ` Vicent Martí
  0 siblings, 0 replies; 64+ messages in thread
From: Vicent Martí @ 2013-06-25 23:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, Jun 26, 2013 at 1:00 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> @@ -156,6 +156,11 @@ do
>>       fullbases="$fullbases pack-$name"
>>       chmod a-w "$PACKTMP-$name.pack"
>>       chmod a-w "$PACKTMP-$name.idx"
>> +
>> +     test -f "$PACKTMP-$name.bitmap" &&
>> +     chmod a-w "$PACKTMP-$name.bitmap" &&
>> +     mv -f "$PACKTMP-$name.bitmap" "$PACKDIR/pack-$name.bitmap"
>
> If we see a temporary bitmap but somehow failed to move it to the
> final name, should we _ignore_ that error, or should we die, like
> the next two lines do?

I obviously decided against dying (as you can see on the patch, har
har), because the bitmap is not required for the proper operation of
the Git repository, unlike the packfile and the index.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 11/16] rev-list: add bitmap mode to speed up lists
  2013-06-25 16:22   ` Thomas Rast
@ 2013-06-26  1:45     ` Vicent Martí
  2013-06-26 23:13       ` Thomas Rast
  2013-06-26  5:22     ` Jeff King
  1 sibling, 1 reply; 64+ messages in thread
From: Vicent Martí @ 2013-06-26  1:45 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git

I'm afraid I cannot reproduce the segfault locally (assuming you're
performing the rev-list on the git/git repository). Could you please
send me more information, and a core dump if possible?

On Tue, Jun 25, 2013 at 6:22 PM, Thomas Rast <trast@inf.ethz.ch> wrote:
> Vicent Marti <tanoku@gmail.com> writes:
>
>> Calling `git rev-list --use-bitmaps [committish]` is the equivalent
>> of `git rev-list --objects`, but the rev list is performed based on
>> a bitmap result instead of using a manual counting objects phase.
>
> Why would we ever want to not --use-bitmaps, once it actually works?
> I.e., shouldn't this be the default if pack.usebitmaps is set (or
> possibly even core.usebitmaps for these things)?
>
>> These are some example timings for `torvalds/linux`:
>>
>>       $ time ../git/git rev-list --objects master > /dev/null
>>
>>       real    0m25.567s
>>       user    0m25.148s
>>       sys     0m0.384s
>>
>>       $ time ../git/git rev-list --use-bitmaps master > /dev/null
>>
>>       real    0m0.393s
>>       user    0m0.356s
>>       sys     0m0.036s
>
> I see your badass numbers, and raise you a critical issue:
>
>   $ time git rev-list --use-bitmaps --count --left-right origin/pu...origin/next
>   Segmentation fault
>
>   real    0m0.408s
>   user    0m0.383s
>   sys     0m0.022s
>
> It actually seems to be related solely to having negated commits in the
> walk:
>
>   thomas@linux-k42r:~/g(next u+65)$ time git rev-list --use-bitmaps --count origin/pu
>   32315
>
>   real    0m0.041s
>   user    0m0.034s
>   sys     0m0.006s
>   thomas@linux-k42r:~/g(next u+65)$ time git rev-list --use-bitmaps --count origin/pu ^origin/next
>   Segmentation fault
>
>   real    0m0.460s
>   user    0m0.214s
>   sys     0m0.244s
>
> I also can't help noticing that the time spent generating the segfault
> would have sufficed to generate the answer "the old way" as well:
>
>   $ time git rev-list --count --left-right origin/pu...origin/next
>   189     125
>
>   real    0m0.409s
>   user    0m0.386s
>   sys     0m0.022s
>
> Can we use the same trick to speed up merge base computation and then
> --left-right?  The latter is a component of __git_ps1 and can get
> somewhat slow in some cases, so it would be nice to make it really fast,
> too.
>
> --
> Thomas Rast
> trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/16] pack-objects: use a faster hash table
  2013-06-25 14:03   ` Thomas Rast
@ 2013-06-26  2:14     ` Jeff King
  2013-06-26  4:47       ` Jeff King
  0 siblings, 1 reply; 64+ messages in thread
From: Jeff King @ 2013-06-26  2:14 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Vicent Marti, git

On Tue, Jun 25, 2013 at 04:03:22PM +0200, Thomas Rast wrote:

> > The big win here, however, is in the massively reduced amount of hash
> > collisions (as you can see from the huge reduction of time spent in
> > `hashcmp` after the change). These greatly improved lookup times
> > will result critical once we implement the writing algorithm for bitmap
> > indxes in a later patch of this series.
> 
> Is that reduction in collisions purely because it uses quadratic
> probing, or is there some other magic trick involved?  Is the same also
> applicable to the other users of the "big" object hash table?  (I assume
> Peff has already tried applying it there, but I'm still curious...)

I haven't done any actual timings yet.

The general code is quite similar to our object.c hash table, with the
exception that it does quadratic probing.  I did try quadratic probing
on our object.c hash once and didn't see much improvement (similarly,
Junio tried cuckoo hashing, but the numbers were not that exciting).

It's possible that the hash table in pack-objects did not behave as well
as the one in object.c. It looks like we grow it when the table is 3/4
full, which is a little high (we grow at 1/2 in object.c).  Quadratic
probing should help when the hash table is close to full, so it would
probably help. However, I also note that khash keeps its hash tables
only half full, so that may be the real source of the performance
improvement.

So I suspect two things (but as I said, haven't verified):

  1. You could speed up pack-objects just by keeping the table half full
     rather than 3/4 full.

  2. You would see little to no speedup by moving object.c to khash, as
     it is adding only quadratic probing. With quadratic probing, you
     could potentially tweak the kh_put_* to resize less aggressively
     (say, 2/3) and save some memory without loss of performance.

-Peff

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 03/16] pack-objects: use a faster hash table
  2013-06-26  2:14     ` Jeff King
@ 2013-06-26  4:47       ` Jeff King
  0 siblings, 0 replies; 64+ messages in thread
From: Jeff King @ 2013-06-26  4:47 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Vicent Marti, git

On Tue, Jun 25, 2013 at 10:14:17PM -0400, Jeff King wrote:

> So I suspect two things (but as I said, haven't verified):
> 
>   1. You could speed up pack-objects just by keeping the table half full
>      rather than 3/4 full.

I wasn't able to show any measurable speedup with this. I tried to make
as specific a measurement as I could, by adding a "counting only" option
like this:

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index fc12df8..a0438d0 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2452,6 +2452,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	const char *rp_av[6];
 	int rp_ac = 0;
 	int rev_list_unpacked = 0, rev_list_all = 0, rev_list_reflog = 0;
+	int counting_only = 0;
 	struct option pack_objects_options[] = {
 		OPT_SET_INT('q', "quiet", &progress,
 			    N_("do not show progress meter"), 0),
@@ -2515,6 +2516,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			    N_("pack compression level")),
 		OPT_SET_INT(0, "keep-true-parents", &grafts_replace_parents,
 			    N_("do not hide commits by grafts"), 0),
+		OPT_BOOL(0, "counting-only", &counting_only,
+			 N_("exit after counting objects phase")),
 		OPT_END(),
 	};
 
@@ -2600,6 +2603,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 		for_each_ref(add_ref_tag, NULL);
 	stop_progress(&progress_state);
 
+	if (counting_only)
+		return 0;
 	if (non_empty && !nr_result)
 		return 0;
 	if (nr_result)

and even doing the whole object traversal ahead of time to just focus on
the object-entry hash, like this:

  git rev-list --objects --all >objects.out
  time git pack-objects --counting-only --stdout <objects.out

Tweaking the hash size didn't have any effect, but using Vicent's khash
patch actually made it about 5% slower. So I wonder if I'm even
measuring the right thing. Vicent, how did you get the timings you
showed in the commit message?

-Peff

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-25 19:33     ` Vicent Martí
  2013-06-25 21:17       ` Junio C Hamano
@ 2013-06-26  5:11       ` Jeff King
  2013-06-26 18:41         ` Colby Ranger
  2013-06-27  1:29         ` Shawn Pearce
  1 sibling, 2 replies; 64+ messages in thread
From: Jeff King @ 2013-06-26  5:11 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Vicent Martí, Colby Ranger, git

On Tue, Jun 25, 2013 at 09:33:11PM +0200, Vicent Martí wrote:

> > One way we side-stepped the size inflation problem in JGit was to only
> > use the bitmap index information when sending data on the wire to a
> > client. Here delta reuse plays a significant factor in building the
> > pack, and we don't have to be as accurate on matching deltas. During
> > the equivalent of `git repack` bitmaps are not used, allowing the
> > traditional graph enumeration algorithm to generate path hash
> > information.
> 
> OH BOY HERE WE GO. This is worth its own thread, lots to discuss here.
> I think peff will have a patchset regarding this to upstream soon,
> we'll get back to it later.

We do the same thing (only use bitmaps during on-the-wire fetches).  But
there a few problems with assuming delta reuse.

For us (GitHub), the foremost one is that we pack many "forks" of a
repository together into a single packfile. That means when you clone
torvalds/linux, an object you want may be stored in the on-disk pack
with a delta against an object that you are not going to get. So we have
to throw out that delta and find a new one.

I'm dealing with that by adding an option to respect "islands" during
packing, where an island is a set of common objects (we split it by
fork, since we expect those objects to be fetched together, but you
could use other criteria). The rule is that an object cannot delta
against another object that is not in all of its islands. So everybody
can delta against shared history, but objects in your fork can only
delta against other objects in the fork.  You are guaranteed to be able
to reuse such deltas during a full clone of a fork, and the on-disk pack
size does not suffer all that much (because there is usually a good
alternate delta base within your reachable history).

So with that series, we can get good reuse for clones. But there are
still two cases worth considering:

  1. When you fetch a subset of the commits, git marks only the edges as
     preferred bases, and does not walk the full object graph down to
     the roots. So any object you want that is delta'd against something
     older will not get reused. If you have reachability bitmaps, I
     don't think there is any reason that we cannot use the entire
     object graph (starting at the "have" tips, of course) as preferred
     bases.

  2. The server is not necessarily fully packed. In an active repo, you
     may have a large "base" pack with bitmaps, with several recently
     pushed packs on top. You still need to delta the recently pushed
     objects against the base objects.

I don't have measurements on how much the deltas suffer in those two
cases. I know they suffered quite badly for clones without the name
hashes in our alternates repos, but that part should go away with my
patch series.

-Peff

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 11/16] rev-list: add bitmap mode to speed up lists
  2013-06-25 16:22   ` Thomas Rast
  2013-06-26  1:45     ` Vicent Martí
@ 2013-06-26  5:22     ` Jeff King
  1 sibling, 0 replies; 64+ messages in thread
From: Jeff King @ 2013-06-26  5:22 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Vicent Marti, git

On Tue, Jun 25, 2013 at 09:22:28AM -0700, Thomas Rast wrote:

> Vicent Marti <tanoku@gmail.com> writes:
> 
> > Calling `git rev-list --use-bitmaps [committish]` is the equivalent
> > of `git rev-list --objects`, but the rev list is performed based on
> > a bitmap result instead of using a manual counting objects phase.
> 
> Why would we ever want to not --use-bitmaps, once it actually works?
> I.e., shouldn't this be the default if pack.usebitmaps is set (or
> possibly even core.usebitmaps for these things)?

If you are using bitmaps, you cannot produce the same output as
"--objects"; the latter prints the path at which each object is found.
In the JGit bitmap format, we have no information at all; in Vicent's
"v2", we have only a hash of that pathname.

-Peff

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-26  5:11       ` Jeff King
@ 2013-06-26 18:41         ` Colby Ranger
  2013-06-26 22:33           ` Colby Ranger
  2013-06-27  1:29         ` Shawn Pearce
  1 sibling, 1 reply; 64+ messages in thread
From: Colby Ranger @ 2013-06-26 18:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn Pearce, Vicent Martí, git

> Pinning the bitmap index on the reverse index adds complexity (lookups
> are two-step: first find the entry in the reverse index, and then find
> the SHA1 in the index) and is measurably slower, in both loading and
> lookup times. Since Git doesn't have a memory problem, it's very hard
> to make an argument for design that is more complex and runs slower to
> save memory.

Sorting by SHA1 will generate a random distribution. This will require
you to inflate the entire bitmap on every fetch request, in order to
do the "contains" operation.  Sorting by pack offset allows us to
inflate only the bits we need as we are walking the graph, since they
are usually at the start of the bitmap.

What is the general size in bytes of the SHA1 sorted bitmaps?  If they
are much larger, the size of the bitmap has an impact on how fast you
can perform bitwise operations on them, which is important for fetch
when doing wants AND NOT haves.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-26 18:41         ` Colby Ranger
@ 2013-06-26 22:33           ` Colby Ranger
  2013-06-27  0:53             ` Colby Ranger
  0 siblings, 1 reply; 64+ messages in thread
From: Colby Ranger @ 2013-06-26 22:33 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn Pearce, Vicent Martí, git

>> Pinning the bitmap index on the reverse index adds complexity (lookups
>> are two-step: first find the entry in the reverse index, and then find
>> the SHA1 in the index) and is measurably slower, in both loading and
>> lookup times. Since Git doesn't have a memory problem, it's very hard
>> to make an argument for design that is more complex and runs slower to
>> save memory.
>
> Sorting by SHA1 will generate a random distribution. This will require
> you to inflate the entire bitmap on every fetch request, in order to
> do the "contains" operation.  Sorting by pack offset allows us to
> inflate only the bits we need as we are walking the graph, since they
> are usually at the start of the bitmap.
>
> What is the general size in bytes of the SHA1 sorted bitmaps?  If they
> are much larger, the size of the bitmap has an impact on how fast you
> can perform bitwise operations on them, which is important for fetch
> when doing wants AND NOT haves.

Furthermore, JGit primarily operates on the bitmap representation,
rarely converting bitmap id -> SHA1 during clone. When the bitmap of
objects to include in the output pack contains all of the objects in
the bitmap'd pack, we only do the translation of the bitmap ids of new
objects, not in the bitmap index, and it is just a lookup in an array.
Those objects are put at the front of the stream. The rest of the
objects are streamed directly from the pack, with some header munging,
since it is guaranteed to be a fully connected pack. Most of the time
this works because JGit creates 2 packs during GC: a heads pack, which
is bitmap'd, and an everything else pack.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-25 22:30     ` Vicent Martí
@ 2013-06-26 23:12       ` Thomas Rast
  2013-06-26 23:19         ` Thomas Rast
  0 siblings, 1 reply; 64+ messages in thread
From: Thomas Rast @ 2013-06-26 23:12 UTC (permalink / raw)
  To: Vicent Martí; +Cc: git, Colby Ranger

Vicent Martí <tanoku@gmail.com> writes:

> On Tue, Jun 25, 2013 at 5:58 PM, Thomas Rast <trast@inf.ethz.ch> wrote:
>>
>> Please document the RLW format here.
>
> Har har. I was going to comment on your review of the Ewah patchset,
> but might as well do it here: the only thing I know about Ewah bitmaps
> is that they work. And I know this because I did extensive fuzz
> testing of my C port. Unfortunately, the original Java code I ported
> from has 0 comments, so any documentation here would have to be
> reverse-engineered.

I think the below would be a reasonable documentation, to be appended
after your description of the EWAH format.  Maybe Colby can correct me
if I got anything wrong.  You can basically read this off from the
implementation of ewah_each_bit() and the helper functions it uses.

-- 8< --
The compressed bitmap is stored in a form of run-length encoding, as
follows.  It consists of a concatenation of an arbitrary number of
chunks.  Each chunk consists of one or more 64-bit words

     H  L_1  L_2  L_3 .... L_M

H is called RLW (run length word).  It consists of (from lower to higher
order bits):

     - 1 bit: the repeated bit B

     - 32 bits: repetition count K (unsigned)

     - 31 bits: literal word count M (unsigned)

The bitstream represented by the above chunk is then:

     - K repetitions of B

     - The bits stored in `L_1` through `L_M`.  Within a word, bits at
       lower order come earlier in the stream than those at higher
       order.

The next word after `L_M` (if any) must again be a RLW, for the next
chunk.  For efficient appending to the bitstream, the EWAH stores a
format to the last RLW in the stream.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 11/16] rev-list: add bitmap mode to speed up lists
  2013-06-26  1:45     ` Vicent Martí
@ 2013-06-26 23:13       ` Thomas Rast
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Rast @ 2013-06-26 23:13 UTC (permalink / raw)
  To: Vicent Martí; +Cc: git

Vicent Martí <tanoku@gmail.com> writes:

> I'm afraid I cannot reproduce the segfault locally (assuming you're
> performing the rev-list on the git/git repository). Could you please
> send me more information, and a core dump if possible?

Sure, but isn't the core dump useless if you don't have the same
executable?  And since I'm building "custom" git, you won't have that.

Here's a semi-full backtrace (I left out the spammy output in the
outermost frames).  Some variables in #2 and #3 seem to have gone off
the rails.

#0  0x00007ffff72b06fb in __memset_sse2 () from /lib64/libc.so.6
No symbol table info available.
#1  0x000000000054c31c in bitmap_set (self=0x89c360, pos=18446744072278122040) at ewah/bitmap.c:46
        old_size = 7666
        block = 288230376129345656
#2  0x00000000004e6c70 in add_to_include_set (data=0x7fffffffcd00, sha1=0x85c014 "\230\062˝M\311i\373\372\317\321\370\224\017\313\336\301\213\271\060", bitmap_pos=-1431429576) at pack-bitmap.c:428
        hash_pos = 512
#3  0x00000000004e6cd6 in should_include (commit=0x85c010, _data=0x7fffffffcd00) at pack-bitmap.c:443
        data = 0x7fffffffcd00
        bitmap_pos = -1431429576
#4  0x000000000050cf1d in add_parents_to_list (revs=0x7fffffffce30, commit=0x85c010, list=0x7fffffffce30, cache_ptr=0x0) at revision.c:784
        parent = 0x88c260
        left_flag = 32767
        cached_base = 0x0
#5  0x0000000000512b66 in get_revision_1 (revs=0x7fffffffce30) at revision.c:2857
        entry = 0x8f9ce0
        commit = 0x85c010
#6  0x0000000000512dcf in get_revision_internal (revs=0x7fffffffce30) at revision.c:2964
        c = 0x0
        l = 0x1000
#7  0x0000000000512fe1 in get_revision (revs=0x7fffffffce30) at revision.c:3040
        c = 0xb92608
        reversed = 0x89c360
#8  0x00000000004d2a24 in traverse_commit_list (revs=0x7fffffffce30, show_commit=0x4e6b72 <show_commit>, show_object=0x4e6afa <show_object>, data=0x89c360) at list-objects.c:179
        i = -1
        commit = 0xb92608
        base = {
          alloc = 4097, 
          len = 0, 
          buf = 0x87bbe0 ""
        }
#9  0x00000000004e6fa4 in find_objects (revs=0x7fffffffce30, roots=0x0, seen=0x85b760) at pack-bitmap.c:549
        incdata = {
          base = 0x89c360, 
          seen = 0x85b760
        }
        base = 0x89c360
        needs_walk = true
        not_mapped = 0x8f9dc0
#10 0x00000000004e747b in prepare_bitmap_walk (revs=0x7fffffffce30, result_size=0x0) at pack-bitmap.c:679
        i = 2
        pending_nr = 2
        pending_alloc = 64
        pending_e = 0x853e10
        wants = 0x8545b0
        haves = 0x854820
        wants_bitmap = 0x0
        haves_bitmap = 0x85b760
#11 0x0000000000474bb3 in cmd_rev_list (argc=2, argv=0x7fffffffd6e8, prefix=0x0) at builtin/rev-list.c:356
#12 0x0000000000405820 in run_builtin (p=0x7c3ef8 <commands.20770+2040>, argc=4, argv=0x7fffffffd6e8) at git.c:291
#13 0x00000000004059b3 in handle_internal_command (argc=4, argv=0x7fffffffd6e8) at git.c:454
#14 0x0000000000405b87 in main (argc=4, av=0x7fffffffd6e8) at git.c:544


This is with a version of your series that you can find at

  https://github.com/trast/git.git vm/ewah

I am'd your patches on top of Junio's master at the time, except for the
parts to the Makefile that did not apply, which I fixed up manually.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-26 23:12       ` Thomas Rast
@ 2013-06-26 23:19         ` Thomas Rast
  0 siblings, 0 replies; 64+ messages in thread
From: Thomas Rast @ 2013-06-26 23:19 UTC (permalink / raw)
  To: Vicent Martí; +Cc: git, Colby Ranger

Thomas Rast <trast@inf.ethz.ch> writes:

[...]
> The next word after `L_M` (if any) must again be a RLW, for the next
> chunk.  For efficient appending to the bitstream, the EWAH stores a
> format to the last RLW in the stream.
  ^^^^^^

I have no idea what Freud did there, but "pointer" or some such is
probably a saner choice.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-26 22:33           ` Colby Ranger
@ 2013-06-27  0:53             ` Colby Ranger
  2013-06-27  1:32               ` Shawn Pearce
  0 siblings, 1 reply; 64+ messages in thread
From: Colby Ranger @ 2013-06-27  0:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn Pearce, Vicent Martí, git

> +  Generating this reverse index at runtime is **not** free (around 900ms
> +  generation time for a repository like `torvalds/linux`), and once again,
> +  this generation time needs to happen every time `pack-objects` is
> +  spawned.

If generating the reverse index is expensive, it is probably
worthwhile to create a ".revidx" or extend the ".idx" with the
information sorted by offset.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-25 22:08         ` Vicent Martí
@ 2013-06-27  1:11           ` Shawn Pearce
  2013-06-27  2:36             ` Vicent Martí
  0 siblings, 1 reply; 64+ messages in thread
From: Shawn Pearce @ 2013-06-27  1:11 UTC (permalink / raw)
  To: Vicent Martí; +Cc: Junio C Hamano, Colby Ranger, git

On Tue, Jun 25, 2013 at 4:08 PM, Vicent Martí <tanoku@gmail.com> wrote:
> On Tue, Jun 25, 2013 at 11:17 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> What case are you talking about?
>>
>> The n-th object must be one of these four types and can never be of
>> more than one type at the same time, so a natural expectation from
>> the reader is "If you OR them together, you will get the same set".
>> If you say "If you XOR them", that forces the reader to wonder when
>> these bitmaps ever can overlap at the same bit position.
>
> I guess this is just wording. I don't particularly care about the
> distinction, but I'll change it to OR.

Hmm, OK. If you think XOR and OR are the same operation, I also have a
bridge to sell you. Its in Brooklyn. Its a great value.

The correct operation is OR. Not XOR. OR. Drop the X.

> It cannot be mmapped not particularly because of endianness issues,
> but because the original format is not indexed and requires a full
> parse of the whole index before it can be accessed programatically.
> The wrong endianness just increases the parse time.

Wrong endianness has nothing to do with the parse time. Modern CPUs
can flip a word around very quickly. In JGit we chose to parse the
file at load time because its simpler than having an additional index
segment, and we do what you did which is to toss the object SHA-1s
into a hashtable for fast lookup. By the time we look for the SHA-1s
and toss them into a hashtable we can stride through the file and find
the bitmap regions. Simple.

In other words, the least complex solution possible that still
provides good performance. I'd say we have pretty good performance.

>>> and I'm going to try to make it run fast enough in that
>>> encoding.
>>
>> Hmph.  Is it an option to start from what JGit does, so that people
>> can use both JGit and your code on the same repository?

I'm afraid I agree here with Junio. The JGit format is already
shipping in JGit 3.0, Gerrit Code Review 2.6, and in heavy production
use for almost a year on android.googlesource.com, and Google's own
internal Git trees.

I would prefer to see a series adding bitmap support to C Git start
with the existing format, make it run, taking advantage of the
optimizations JGit uses (many of which you ignored and tried to "fix"
in other ways), and then look at improving the file format itself if
load time is still the largest low hanging fruit in upload-pack. I'm
guessing its not. You futzed around with the object table, but JGit
sped itself up considerably by simply not using the object table when
the bitmap is used. I think there are several such optimizations you
missed in your rush to redefine the file format.

>>  And then if
>> you do not succeed, after trying to optimize in-core processing
>> using that on-disk format to make it fast enough, start thinking
>> about tweaking the on-disk format?
>
> I'm afraid this is not an option. I have an old patchset that
> implements JGit v1 bitmap loading (and in fact that's how I initially
> developed these series -- by loading the bitmaps from JGit for
> debugging), but I discarded it because it simply doesn't pan out in
> production. ~3 seconds time to spawn `upload-pack` is not an option
> for us. I did not develop a tweaked on-disk format out of boredom.

I think your code or experiments are bogus. Even on our systems with
JGit a cold start for the Linux kernel doesn't take 3s. And this is
JGit where Java is slow because "Jesus it has a lot of factories", and
without mmap'ing the file into the server's address space. Hell the
file has to come over the network from a remote disk array.

> I could dig up the patch if you're particularly interested in
> backwards compatibility, but since it was several times slower than
> the current iteration, I have no interest (time, actually) to maintain
> it, brush it up, and so on. I have already offered myself to port the
> v2 format to JGit as soon as it's settled. It sounds like a better
> investment of all our times.

Actually, I think the format you propose here is inferior to the JGit
format. In particular the idx-ordering means the EWAH code is useless.
You might as well not use the EWAH format and just store 2.6M bits per
commit. The idx-ordering also makes *much* harder to emit a pack file
a reasonable order for the client. Colby and I tried idx-ordering and
discarded it when it didn't perform as well as the pack-ordering that
JGit uses.

> Following up on Shawn's comments, I removed the little-endian support
> from the on-disk format and implemented lazy loading of the bitmaps to
> make up for it. The result is decent (slowed down from 250ms to 300ms)
> and it lets us keep the whole format as NWO on disk. I think it's a
> good tradeback.

The maintenance burden of two endian formats in a single file is too
high to justify. I'm glad to see you saw that.

> As it stands right now, the only two changes from v1 of the on-disk format are:
>
> - There is an index at the end. This is a good idea.

I don't think the index is necessary if you plan to build a hashtable
at runtime anyway. If you mmap the file you can quickly skip over a
bitmap and find the next SHA-1 using this thing called "pointer
arithmetic". I am not sure if you are familiar with the term, perhaps
you could search the web for it.

> - The bitmaps are sorted in packfile-index order, not in packfile
> order. This is a good idea.

As Colby and I have repeatedly tried to explain, this is not a good idea.

> German kisses,

Strawberry and now German kisses? What's next, Mango kisses?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-26  5:11       ` Jeff King
  2013-06-26 18:41         ` Colby Ranger
@ 2013-06-27  1:29         ` Shawn Pearce
  1 sibling, 0 replies; 64+ messages in thread
From: Shawn Pearce @ 2013-06-27  1:29 UTC (permalink / raw)
  To: Jeff King; +Cc: Vicent Martí, Colby Ranger, git

On Tue, Jun 25, 2013 at 11:11 PM, Jeff King <peff@peff.net> wrote:
> On Tue, Jun 25, 2013 at 09:33:11PM +0200, Vicent Martí wrote:
>
>> > One way we side-stepped the size inflation problem in JGit was to only
>> > use the bitmap index information when sending data on the wire to a
>> > client. Here delta reuse plays a significant factor in building the
>> > pack, and we don't have to be as accurate on matching deltas. During
>> > the equivalent of `git repack` bitmaps are not used, allowing the
>> > traditional graph enumeration algorithm to generate path hash
>> > information.
>>
>> OH BOY HERE WE GO. This is worth its own thread, lots to discuss here.
>> I think peff will have a patchset regarding this to upstream soon,
>> we'll get back to it later.
>
> We do the same thing (only use bitmaps during on-the-wire fetches).  But
> there a few problems with assuming delta reuse.
>
> For us (GitHub), the foremost one is that we pack many "forks" of a
> repository together into a single packfile. That means when you clone
> torvalds/linux, an object you want may be stored in the on-disk pack
> with a delta against an object that you are not going to get. So we have
> to throw out that delta and find a new one.

Gerrit Code Review ran into the same problem a few years ago with the
refs/changes namespace. Objects reachable from a branch were often
delta compressed against dropped code review revisions, making for
some slow transfers. We fixed this by creating a pack of everything
reachable from refs/heads/* and then another pack of the other stuff.

I would encourage you to do what you suggest...

> I'm dealing with that by adding an option to respect "islands" during
> packing, where an island is a set of common objects (we split it by
> fork, since we expect those objects to be fetched together, but you
> could use other criteria). The rule is that an object cannot delta
> against another object that is not in all of its islands. So everybody
> can delta against shared history, but objects in your fork can only
> delta against other objects in the fork.  You are guaranteed to be able
> to reuse such deltas during a full clone of a fork, and the on-disk pack
> size does not suffer all that much (because there is usually a good
> alternate delta base within your reachable history).

Yes, exactly. I want to do the same thing on our servers, as we have
many forks of some popular open source repositories that are also not
small (Linux kernel, WebKit). Unfortunately Google has not had the
time to develop the necessary support into JGit.

> So with that series, we can get good reuse for clones. But there are
> still two cases worth considering:
>
>   1. When you fetch a subset of the commits, git marks only the edges as
>      preferred bases, and does not walk the full object graph down to
>      the roots. So any object you want that is delta'd against something
>      older will not get reused. If you have reachability bitmaps, I
>      don't think there is any reason that we cannot use the entire
>      object graph (starting at the "have" tips, of course) as preferred
>      bases.

In JGit we use the reachability bitmap to provide proof a client has
an object. Even if its not in the edges. This allows us much better
delta reuse, as often frequently deltas will be available pointing to
something behind the edge, but that the client certainly has given the
edges we know about.

We also use the reachability bitmap to provide proof a client does not
need an object. We found a reduction in number of objects transferred
because the "want AND NOT have" subtracted out a number of objects not
in the edge. Apparently merges, reverts and cherry-picks happen often
enough in the repositories we host that this particular optimization
helps reduce data transfer, and work at both server and client ends of
the connection. Its a nice freebie the bitmap algorithm gives us.

>   2. The server is not necessarily fully packed. In an active repo, you
>      may have a large "base" pack with bitmaps, with several recently
>      pushed packs on top. You still need to delta the recently pushed
>      objects against the base objects.

Yes, this is unfortunate. One way we avoid this in JGit is to keep
everything in pack files, rather than exploding loose. The
reachability bitmap often proves the client has the delta base the
pusher used to make the object, allowing us to reuse the delta. It may
not be the absolute best delta in the world, but reuse is faster than
inflate()+delta()+deflate(), and the delta is probably "good enough"
until the server can do a real GC in the background.

We combine small packs from pushes together by almost literally just
concat'ing the packs together and creating a new .idx. Newer pushed
data is put in front of the older data, the pack is clustered by
"commit, tree, blob" ordering, duplicates are removed, and its written
back to disk. Typically we complete this "pack concat" operation mere
seconds after a push finishes, so readers have very few packs to deal
with.

> I don't have measurements on how much the deltas suffer in those two
> cases. I know they suffered quite badly for clones without the name
> hashes in our alternates repos, but that part should go away with my
> patch series.

JGit doesn't poke objects into the object table (or even the object
list) when a bitmap is used. We spool the bits out of the bitmap in
bitmap order and write them to the wire in that order. Its way faster,
but depends on the bitmap being in pack-ordering. So clones are crazy
fast even though we don't have the path-hash table.

JGit also has another optimization where we figure out based on the
bitmap if the client needs *everything* in this pack. Which given a
pack created only for refs/heads/* is the common case for a clone. If
the client is getting all objects we essentially just do a sendfile()
for the region starting at offset 12 through end-20. It can't be a
sendfile() syscall because it has to be computed into the trailer
SHA-1 the client sees, but its a crazy tight IO copy loop with no Git
smarts beyond the SHA-1 updating.

Like I said, there are a ton of optimizations you guys missed. And we
think they make a bigger difference than screwing around with
little-endian format to favor x86 CPUs.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-27  0:53             ` Colby Ranger
@ 2013-06-27  1:32               ` Shawn Pearce
  0 siblings, 0 replies; 64+ messages in thread
From: Shawn Pearce @ 2013-06-27  1:32 UTC (permalink / raw)
  To: Colby Ranger; +Cc: Jeff King, Vicent Martí, git

On Wed, Jun 26, 2013 at 6:53 PM, Colby Ranger <cranger@google.com> wrote:
>> +  Generating this reverse index at runtime is **not** free (around 900ms
>> +  generation time for a repository like `torvalds/linux`), and once again,
>> +  this generation time needs to happen every time `pack-objects` is
>> +  spawned.

900ms is fishy. Creating the revidx should not take that long. But if it is...

> If generating the reverse index is expensive, it is probably
> worthwhile to create a ".revidx" or extend the ".idx" with the
> information sorted by offset.

Colby is probably right that a cached copy of the revidx would help.
Or update the .idx format to have two additional sections that stores
the length of each packed object and the delta base of each packed
object, allowing pack-objects to avoid creating the revidx. This would
be an additional ~8 bytes per object, so ~19.8M for the Linux kernel
(given ~2.6M objects).

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-27  1:11           ` Shawn Pearce
@ 2013-06-27  2:36             ` Vicent Martí
  2013-06-27  2:45               ` Jeff King
  0 siblings, 1 reply; 64+ messages in thread
From: Vicent Martí @ 2013-06-27  2:36 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Junio C Hamano, Colby Ranger, git

That was a very rude reply. :(

Please refrain from interacting with me in the ML in the future. I'l
do accordingly.

Thanks!
vmg

On Thu, Jun 27, 2013 at 3:11 AM, Shawn Pearce <spearce@spearce.org> wrote:
> On Tue, Jun 25, 2013 at 4:08 PM, Vicent Martí <tanoku@gmail.com> wrote:
>> On Tue, Jun 25, 2013 at 11:17 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>> What case are you talking about?
>>>
>>> The n-th object must be one of these four types and can never be of
>>> more than one type at the same time, so a natural expectation from
>>> the reader is "If you OR them together, you will get the same set".
>>> If you say "If you XOR them", that forces the reader to wonder when
>>> these bitmaps ever can overlap at the same bit position.
>>
>> I guess this is just wording. I don't particularly care about the
>> distinction, but I'll change it to OR.
>
> Hmm, OK. If you think XOR and OR are the same operation, I also have a
> bridge to sell you. Its in Brooklyn. Its a great value.
>
> The correct operation is OR. Not XOR. OR. Drop the X.
>
>> It cannot be mmapped not particularly because of endianness issues,
>> but because the original format is not indexed and requires a full
>> parse of the whole index before it can be accessed programatically.
>> The wrong endianness just increases the parse time.
>
> Wrong endianness has nothing to do with the parse time. Modern CPUs
> can flip a word around very quickly. In JGit we chose to parse the
> file at load time because its simpler than having an additional index
> segment, and we do what you did which is to toss the object SHA-1s
> into a hashtable for fast lookup. By the time we look for the SHA-1s
> and toss them into a hashtable we can stride through the file and find
> the bitmap regions. Simple.
>
> In other words, the least complex solution possible that still
> provides good performance. I'd say we have pretty good performance.
>
>>>> and I'm going to try to make it run fast enough in that
>>>> encoding.
>>>
>>> Hmph.  Is it an option to start from what JGit does, so that people
>>> can use both JGit and your code on the same repository?
>
> I'm afraid I agree here with Junio. The JGit format is already
> shipping in JGit 3.0, Gerrit Code Review 2.6, and in heavy production
> use for almost a year on android.googlesource.com, and Google's own
> internal Git trees.
>
> I would prefer to see a series adding bitmap support to C Git start
> with the existing format, make it run, taking advantage of the
> optimizations JGit uses (many of which you ignored and tried to "fix"
> in other ways), and then look at improving the file format itself if
> load time is still the largest low hanging fruit in upload-pack. I'm
> guessing its not. You futzed around with the object table, but JGit
> sped itself up considerably by simply not using the object table when
> the bitmap is used. I think there are several such optimizations you
> missed in your rush to redefine the file format.
>
>>>  And then if
>>> you do not succeed, after trying to optimize in-core processing
>>> using that on-disk format to make it fast enough, start thinking
>>> about tweaking the on-disk format?
>>
>> I'm afraid this is not an option. I have an old patchset that
>> implements JGit v1 bitmap loading (and in fact that's how I initially
>> developed these series -- by loading the bitmaps from JGit for
>> debugging), but I discarded it because it simply doesn't pan out in
>> production. ~3 seconds time to spawn `upload-pack` is not an option
>> for us. I did not develop a tweaked on-disk format out of boredom.
>
> I think your code or experiments are bogus. Even on our systems with
> JGit a cold start for the Linux kernel doesn't take 3s. And this is
> JGit where Java is slow because "Jesus it has a lot of factories", and
> without mmap'ing the file into the server's address space. Hell the
> file has to come over the network from a remote disk array.
>
>> I could dig up the patch if you're particularly interested in
>> backwards compatibility, but since it was several times slower than
>> the current iteration, I have no interest (time, actually) to maintain
>> it, brush it up, and so on. I have already offered myself to port the
>> v2 format to JGit as soon as it's settled. It sounds like a better
>> investment of all our times.
>
> Actually, I think the format you propose here is inferior to the JGit
> format. In particular the idx-ordering means the EWAH code is useless.
> You might as well not use the EWAH format and just store 2.6M bits per
> commit. The idx-ordering also makes *much* harder to emit a pack file
> a reasonable order for the client. Colby and I tried idx-ordering and
> discarded it when it didn't perform as well as the pack-ordering that
> JGit uses.
>
>> Following up on Shawn's comments, I removed the little-endian support
>> from the on-disk format and implemented lazy loading of the bitmaps to
>> make up for it. The result is decent (slowed down from 250ms to 300ms)
>> and it lets us keep the whole format as NWO on disk. I think it's a
>> good tradeback.
>
> The maintenance burden of two endian formats in a single file is too
> high to justify. I'm glad to see you saw that.
>
>> As it stands right now, the only two changes from v1 of the on-disk format are:
>>
>> - There is an index at the end. This is a good idea.
>
> I don't think the index is necessary if you plan to build a hashtable
> at runtime anyway. If you mmap the file you can quickly skip over a
> bitmap and find the next SHA-1 using this thing called "pointer
> arithmetic". I am not sure if you are familiar with the term, perhaps
> you could search the web for it.
>
>> - The bitmaps are sorted in packfile-index order, not in packfile
>> order. This is a good idea.
>
> As Colby and I have repeatedly tried to explain, this is not a good idea.
>
>> German kisses,
>
> Strawberry and now German kisses? What's next, Mango kisses?

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-27  2:36             ` Vicent Martí
@ 2013-06-27  2:45               ` Jeff King
  2013-06-27 16:07                 ` Shawn Pearce
  0 siblings, 1 reply; 64+ messages in thread
From: Jeff King @ 2013-06-27  2:45 UTC (permalink / raw)
  To: Vicent Martí; +Cc: Shawn Pearce, Junio C Hamano, Colby Ranger, git

On Thu, Jun 27, 2013 at 04:36:54AM +0200, Vicent Martí wrote:

> That was a very rude reply. :(
> 
> Please refrain from interacting with me in the ML in the future. I'l
> do accordingly.

I agree that the pointer arithmetic thing may have been a little much,
but I think there are some points we need to address in Shawn's email.

In particular, it seems like the slowness we saw with the v1 bitmap
format is not what Shawn and Colby have experienced. So it's possible
that our test setup is bad or different. Or maybe the C v1 reading
implementation had some problems that are fixable. It's hard to say
because we haven't shown any code that can be timed and compared.

And the pack-order versus idx-order for the bitmaps is still up in the
air. Do we have numbers on the on-disk sizes of the resulting EWAHs? The
pack-order ones should be more amenable to run-length encoding,
especially as you get further down into history (the tip ones would
mostly be 1's, no matter how you order them).

-Peff

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 07/16] compat: add endinanness helpers
  2013-06-25 13:25     ` Vicent Martí
@ 2013-06-27  5:56       ` Peter Krefting
  0 siblings, 0 replies; 64+ messages in thread
From: Peter Krefting @ 2013-06-27  5:56 UTC (permalink / raw)
  To: Vicent Martí; +Cc: Git Mailing List

Vicent Martí:

> I'm aware of that, but Git needs to build with glibc 2.7+ (or was it 
> 2.6?), hence the need for this compat layer.

Right. But perhaps the compatibility layer could provide the 
functionality with the names available in the later glibc versions 
(and on *BSD)? That would make it easier to read the code that is 
using it.

-- 
\\// Peter - http://www.softwolves.pp.se/

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-27  2:45               ` Jeff King
@ 2013-06-27 16:07                 ` Shawn Pearce
  2013-06-27 17:17                   ` Jeff King
  2013-07-01 18:47                   ` Colby Ranger
  0 siblings, 2 replies; 64+ messages in thread
From: Shawn Pearce @ 2013-06-27 16:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Vicent Martí, Junio C Hamano, Colby Ranger, git

On Wed, Jun 26, 2013 at 7:45 PM, Jeff King <peff@peff.net> wrote:
>
> In particular, it seems like the slowness we saw with the v1 bitmap
> format is not what Shawn and Colby have experienced. So it's possible
> that our test setup is bad or different. Or maybe the C v1 reading
> implementation had some problems that are fixable. It's hard to say
> because we haven't shown any code that can be timed and compared.

Right, the format and implementation in JGit can do "Counting objects"
in 87ms for the Linux kernel history. But I think we are comparing
apples to steaks here, Vincent is (rightfully) concerned about process
startup performance, whereas our timings were assuming the process was
already running.

It would help everyone to understand the issues involved if we are at
least looking at the same file format.

> And the pack-order versus idx-order for the bitmaps is still up in the
> air. Do we have numbers on the on-disk sizes of the resulting EWAHs?

I did not see any presented in this thread, and I am very interested
in this aspect of the series. The path hash cache should be taking
about 9.9M of disk space, but I recall reading the bitmap file is 8M.
I don't understand.

Colby and I were very concerned about the size of the EWAH compressed
bitmaps because we wanted hundreds of them for a large history like
the kernel, and we wanted to minimize the amount of memory consumed by
the bitmap index when loaded into the process.

In the JGit implementation our copy of Linus' tree has 3.1M objects,
an 81.5 MiB idx file, and a 3.8 MiB bitmap file. We were trying to
keep the overhead below 10% of the idx file, and I think we have
succeeded on that. With 3.1M objects the v2 bitmap proposed in this
thread needs at least 11.8M, or 14+% overhead just for the path hash
cache.

The path hash cache may still be required, Colby and I have been
debating the merits of having the data available for delta compression
vs. the increase in memory required to hold it.

> The
> pack-order ones should be more amenable to run-length encoding,
> especially as you get further down into history (the tip ones would
> mostly be 1's, no matter how you order them).

This is also true for the type bitmaps, but especially so in the JGit
file ordering where we always write all trees before any blobs. The
type bitmaps are very compact and basically amount to defining a
single range in the file. This takes only a few words in the EWAH
compressed format.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-27 16:07                 ` Shawn Pearce
@ 2013-06-27 17:17                   ` Jeff King
  2013-07-01 18:47                   ` Colby Ranger
  1 sibling, 0 replies; 64+ messages in thread
From: Jeff King @ 2013-06-27 17:17 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Vicent Martí, Junio C Hamano, Colby Ranger, git

On Thu, Jun 27, 2013 at 09:07:38AM -0700, Shawn O. Pearce wrote:

> > And the pack-order versus idx-order for the bitmaps is still up in the
> > air. Do we have numbers on the on-disk sizes of the resulting EWAHs?
> 
> I did not see any presented in this thread, and I am very interested
> in this aspect of the series. The path hash cache should be taking
> about 9.9M of disk space, but I recall reading the bitmap file is 8M.
> I don't understand.

I don't know there the 8M number came from, or if it was on the kernel
repo. My bitmap-enabled pack of linux-2.6 (about 3.2M objects) using
Vicent's patches looks like:

  $ du -sh *
  42M     pack-9ea76831aec6c49c5ff42509a2a2ce97da13c5ad.bitmap
  87M     pack-9ea76831aec6c49c5ff42509a2a2ce97da13c5ad.idx
  630M    pack-9ea76831aec6c49c5ff42509a2a2ce97da13c5ad.pack

Packing the same repo with "jgit debug-gc" (jgit 3.0.0) yields:

  $ du -sh *
  3.0M    pack-2478783825733a1f1012f0087a0b5a92aa7437d8.bitmap
  82M     pack-2478783825733a1f1012f0087a0b5a92aa7437d8.idx
  585M    pack-2478783825733a1f1012f0087a0b5a92aa7437d8.pack
  4.8M    pack-f61fb76112372288923be7a0464476892dfebe3e.idx
  97M     pack-f61fb76112372288923be7a0464476892dfebe3e.pack

If we assume that 12M of that is name-hash, that's still an order of
magnitude larger. For reference, jgit created 327 bitmaps (according to
its progress eye candy), and Vicent's patches generated 385. So that
explains some of the increase, but the per-bitmap size is still much
larger.

> The path hash cache may still be required, Colby and I have been
> debating the merits of having the data available for delta compression
> vs. the increase in memory required to hold it.

I guess this is not an option for JGit, but for C git, an mmap-able
name-hash file means we can just fault in the pages mentioning objects
we actually need it for. And its use can be completely optional; in
fact, it doesn't even need to be inside the .bitmap file (though I
cannot think of a reason it would be useful outside of having bitmaps).

-Peff

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-06-27 16:07                 ` Shawn Pearce
  2013-06-27 17:17                   ` Jeff King
@ 2013-07-01 18:47                   ` Colby Ranger
  2013-07-01 19:13                     ` Shawn Pearce
  2013-07-07  9:46                     ` Jeff King
  1 sibling, 2 replies; 64+ messages in thread
From: Colby Ranger @ 2013-07-01 18:47 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Jeff King, Vicent Martí, Junio C Hamano, git

> Right, the format and implementation in JGit can do "Counting objects"
> in 87ms for the Linux kernel history.

Actually, that was the timing when I first pushed the change. With the
improvements submitted throughout the year, we can do counting in
50ms, on my same machine.

> But I think we are comparing
> apples to steaks here, Vincent is (rightfully) concerned about process
> startup performance, whereas our timings were assuming the process was
> already running.
>

I did some timing on loading the reverse index for the kernel and it
is pretty slow (~1200ms). I just submitted a fix to do a bucket sort
and reduced that to ~450ms, which is still slow but much better:
https://eclipse.googlesource.com/jgit/jgit/+/6cc532a43cf28403cb623d3df8600a2542a40a43%5E%21/

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-07-01 18:47                   ` Colby Ranger
@ 2013-07-01 19:13                     ` Shawn Pearce
  2013-07-07  9:46                     ` Jeff King
  1 sibling, 0 replies; 64+ messages in thread
From: Shawn Pearce @ 2013-07-01 19:13 UTC (permalink / raw)
  To: Colby Ranger; +Cc: Jeff King, Vicent Martí, Junio C Hamano, git

On Mon, Jul 1, 2013 at 11:47 AM, Colby Ranger <cranger@google.com> wrote:
>> But I think we are comparing
>> apples to steaks here, Vincent is (rightfully) concerned about process
>> startup performance, whereas our timings were assuming the process was
>> already running.
>>
>
> I did some timing on loading the reverse index for the kernel and it
> is pretty slow (~1200ms). I just submitted a fix to do a bucket sort
> and reduced that to ~450ms, which is still slow but much better:
> https://eclipse.googlesource.com/jgit/jgit/+/6cc532a43cf28403cb623d3df8600a2542a40a43%5E%21/

A reverse index that is hot in RAM would obviously load in about 0ms.
But a cold load of a reverse index that uses only 4 bytes per object
(as Colby did here) for 3.1M objects could take ~590ms to read from
disk, assuming spinning media moving 20 MiB/s. If 8 byte offsets were
also stored this could be more like 1700ms.

Numbers obviously get better if the spinning media can transfer at 40
MiB/s, now its more like 295ms for 4 bytes/object and 885ms for 12
bytes/object.

I think its still reasonable to compute the reverse index on the fly.
But JGit certainly does have the benefit of reusing it across requests
by relying on process memory based caches. C Git needs to rely on the
kernel buffer cache, which requires this data be written out to a file
to be shared.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-07-01 18:47                   ` Colby Ranger
  2013-07-01 19:13                     ` Shawn Pearce
@ 2013-07-07  9:46                     ` Jeff King
  2013-07-07 17:27                       ` Shawn Pearce
  1 sibling, 1 reply; 64+ messages in thread
From: Jeff King @ 2013-07-07  9:46 UTC (permalink / raw)
  To: Colby Ranger; +Cc: Shawn Pearce, Vicent Martí, Junio C Hamano, git

On Mon, Jul 01, 2013 at 11:47:32AM -0700, Colby Ranger wrote:

> > But I think we are comparing
> > apples to steaks here, Vincent is (rightfully) concerned about process
> > startup performance, whereas our timings were assuming the process was
> > already running.
> >
> 
> I did some timing on loading the reverse index for the kernel and it
> is pretty slow (~1200ms). I just submitted a fix to do a bucket sort
> and reduced that to ~450ms, which is still slow but much better:

On my machine, loading the kernel revidx in C git is about ~830ms. I
switched the qsort() call to a radix/bucket sort, and have it down to
~200ms. So definitely much better, though that still leaves a bit to be
desired for quick commands. E.g., "git rev-list --count A..B" should
become fairly instantaneous with bitmaps, but in many cases the revindex
loading will take longer than it would have to simply do the actual
traversal.

-Peff

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH 09/16] documentation: add documentation for the bitmap format
  2013-07-07  9:46                     ` Jeff King
@ 2013-07-07 17:27                       ` Shawn Pearce
  0 siblings, 0 replies; 64+ messages in thread
From: Shawn Pearce @ 2013-07-07 17:27 UTC (permalink / raw)
  To: Jeff King; +Cc: Colby Ranger, Vicent Martí, Junio C Hamano, git

On Sun, Jul 7, 2013 at 2:46 AM, Jeff King <peff@peff.net> wrote:
> On Mon, Jul 01, 2013 at 11:47:32AM -0700, Colby Ranger wrote:
>
>> > But I think we are comparing
>> > apples to steaks here, Vincent is (rightfully) concerned about process
>> > startup performance, whereas our timings were assuming the process was
>> > already running.
>> >
>>
>> I did some timing on loading the reverse index for the kernel and it
>> is pretty slow (~1200ms). I just submitted a fix to do a bucket sort
>> and reduced that to ~450ms, which is still slow but much better:
>
> On my machine, loading the kernel revidx in C git is about ~830ms. I
> switched the qsort() call to a radix/bucket sort, and have it down to
> ~200ms. So definitely much better,

This is a very nice reduction. pack-objects would benefit from it even
without bitmaps. Since it doesn't require a data format change this is
a pretty harmless patch to include in Git. We may later conclude
caching the revidx is worthwhile, but until then a bucket sort doesn't
hurt. :-)

> though that still leaves a bit to be
> desired for quick commands. E.g., "git rev-list --count A..B" should
> become fairly instantaneous with bitmaps, but in many cases the revindex
> loading will take longer than it would have to simply do the actual
> traversal.

Yea, we don't know of a way around this. In a few cases the bitmap
code in JGit is slower than the naive traversal, but these are only on
small segments of history. I wonder if you could guess which algorithm
to use by looking at the offsets of A and B using the idx file. If
they are near each other in the pack, run the naive algorithm without
bitmaps and revidx. If they are farther apart assume the bitmap would
help more than traversal and use bitmap+revidx.

Working out what the correct "distance" should be before switching
algorithms is hard. A and B could be megabytes apart in the pack but A
could be B's grandparent and traversed in milliseconds. I wonder how
often that is in practice, certainly if A and B are within a few
hundred kilobytes of each other the naive traversal should be almost
instant.

^ permalink raw reply	[flat|nested] 64+ messages in thread

end of thread, other threads:[~2013-07-07 17:27 UTC | newest]

Thread overview: 64+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-24 23:22 [PATCH 00/16] Speed up Counting Objects with bitmap data Vicent Marti
2013-06-24 23:22 ` [PATCH 01/16] list-objects: mark tree as unparsed when we free its buffer Vicent Marti
2013-06-24 23:22 ` [PATCH 02/16] sha1_file: refactor into `find_pack_object_pos` Vicent Marti
2013-06-25 13:59   ` Thomas Rast
2013-06-24 23:23 ` [PATCH 03/16] pack-objects: use a faster hash table Vicent Marti
2013-06-25 14:03   ` Thomas Rast
2013-06-26  2:14     ` Jeff King
2013-06-26  4:47       ` Jeff King
2013-06-25 17:58   ` Ramkumar Ramachandra
2013-06-25 22:48   ` Junio C Hamano
2013-06-25 23:09     ` Vicent Martí
2013-06-24 23:23 ` [PATCH 04/16] pack-objects: make `pack_name_hash` global Vicent Marti
2013-06-24 23:23 ` [PATCH 05/16] revision: allow setting custom limiter function Vicent Marti
2013-06-24 23:23 ` [PATCH 06/16] sha1_file: export `git_open_noatime` Vicent Marti
2013-06-24 23:23 ` [PATCH 07/16] compat: add endinanness helpers Vicent Marti
2013-06-25 13:08   ` Peter Krefting
2013-06-25 13:25     ` Vicent Martí
2013-06-27  5:56       ` Peter Krefting
2013-06-24 23:23 ` [PATCH 08/16] ewah: compressed bitmap implementation Vicent Marti
2013-06-25  1:10   ` Junio C Hamano
2013-06-25 22:51     ` Junio C Hamano
2013-06-25 15:38   ` Thomas Rast
2013-06-24 23:23 ` [PATCH 09/16] documentation: add documentation for the bitmap format Vicent Marti
2013-06-25  5:42   ` Shawn Pearce
2013-06-25 19:33     ` Vicent Martí
2013-06-25 21:17       ` Junio C Hamano
2013-06-25 22:08         ` Vicent Martí
2013-06-27  1:11           ` Shawn Pearce
2013-06-27  2:36             ` Vicent Martí
2013-06-27  2:45               ` Jeff King
2013-06-27 16:07                 ` Shawn Pearce
2013-06-27 17:17                   ` Jeff King
2013-07-01 18:47                   ` Colby Ranger
2013-07-01 19:13                     ` Shawn Pearce
2013-07-07  9:46                     ` Jeff King
2013-07-07 17:27                       ` Shawn Pearce
2013-06-26  5:11       ` Jeff King
2013-06-26 18:41         ` Colby Ranger
2013-06-26 22:33           ` Colby Ranger
2013-06-27  0:53             ` Colby Ranger
2013-06-27  1:32               ` Shawn Pearce
2013-06-27  1:29         ` Shawn Pearce
2013-06-25 15:58   ` Thomas Rast
2013-06-25 22:30     ` Vicent Martí
2013-06-26 23:12       ` Thomas Rast
2013-06-26 23:19         ` Thomas Rast
2013-06-24 23:23 ` [PATCH 10/16] pack-objects: use bitmaps when packing objects Vicent Marti
2013-06-25 12:48   ` Ramkumar Ramachandra
2013-06-25 15:58   ` Thomas Rast
2013-06-25 23:06   ` Junio C Hamano
2013-06-25 23:14     ` Vicent Martí
2013-06-24 23:23 ` [PATCH 11/16] rev-list: add bitmap mode to speed up lists Vicent Marti
2013-06-25 16:22   ` Thomas Rast
2013-06-26  1:45     ` Vicent Martí
2013-06-26 23:13       ` Thomas Rast
2013-06-26  5:22     ` Jeff King
2013-06-24 23:23 ` [PATCH 12/16] pack-objects: implement bitmap writing Vicent Marti
2013-06-24 23:23 ` [PATCH 13/16] repack: consider bitmaps when performing repacks Vicent Marti
2013-06-25 23:00   ` Junio C Hamano
2013-06-25 23:16     ` Vicent Martí
2013-06-24 23:23 ` [PATCH 14/16] sha1_file: implement `nth_packed_object_info` Vicent Marti
2013-06-24 23:23 ` [PATCH 15/16] write-bitmap: implement new git command to write bitmaps Vicent Marti
2013-06-24 23:23 ` [PATCH 16/16] rev-list: Optimize --count using bitmaps too Vicent Marti
2013-06-25 16:05 ` [PATCH 00/16] Speed up Counting Objects with bitmap data Thomas Rast

git@vger.kernel.org list mirror (unofficial, one of many)

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 git git/ https://public-inbox.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for the project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git