git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/3] Clean up and optimize tree walking some more
@ 2007-03-21 17:07 Linus Torvalds
  2007-03-21 17:07 ` [PATCH 1/3] Remove "pathlen" from "struct name_entry" Linus Torvalds
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Linus Torvalds @ 2007-03-21 17:07 UTC (permalink / raw
  To: Junio C Hamano, Git Mailing List


This series of three patches improves blame performance on the eclipse 
tree by about 15% in my tests by improving the tree walk a bit.

  [ Before-best-of-five: 11.649s
    After-best-of-five:   9.675s ]

The first two patches are just boring cleanups: the first removes an 
unnecessary field from the "name_entry" structure, because I wanted to 
embed it into the "tree_desc" one and it was just totally redundant and I 
felt bad about growing tree_desc more than necessary. No real code 
changes, just replacing the use of "pathlen" with the helper function we 
introduced ealier ("tree_entry_len()").

The second one just makes sure that we always initialize the tree_desc 
structure with a helper function rather than doing it by hand. Again, this 
doesn't actually change any code, although I changed the name of the "buf" 
entry to "buffer", and the type of "size", so that we get nice compiler 
warnings if they are used the old way by mistake.

The second patch is the largest of the lot, but really doesn't do 
anything interesting, just preparatory cleanup.

The third patch is the one that actually changes any code, and is fairly 
straightforward: it just switches around where we actually do the tree 
entry parsing, which is now possible thanks to patch #2. By doing it 
up-front, we only need to do it once (we used to have to do it both when 
doing the "extract" *and* when doing the "update" op - now we do it only 
once per entry, and "extract" is just about looking at the cached 
contents).

The resulting diffstat of the tree patches ends up removing a few more 
lines than it adds (not by a lot), but perhaps more importantly (even more 
than the performance advantage) the code looks nicer, I think.

 builtin-fsck.c         |    5 +-
 builtin-grep.c         |   13 +++--
 builtin-pack-objects.c |    8 +--
 builtin-read-tree.c    |    3 +-
 builtin-reflog.c       |   10 ++--
 fetch.c                |    3 +-
 http-push.c            |    3 +-
 list-objects.c         |    3 +-
 merge-tree.c           |    9 ++--
 reachable.c            |    3 +-
 revision.c             |   12 ++---
 tree-diff.c            |   22 +++++----
 tree-walk.c            |  123 ++++++++++++++++++++++--------------------------
 tree-walk.h            |   20 ++++++--
 tree.c                 |   18 +++----
 unpack-trees.c         |    3 +-
 16 files changed, 125 insertions(+), 133 deletions(-)

I'm pretty sure this is all good, and it obviously passes all the tests, 
but more importantly none of the changes were really very complicated, and 
patch#2 (which is the big one) was set up so that the compiler would not 
even compile code that wasn't properly converted, so it should all be 
good.

			Linus

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH 1/3] Remove "pathlen" from "struct name_entry"
  2007-03-21 17:07 [PATCH 0/3] Clean up and optimize tree walking some more Linus Torvalds
@ 2007-03-21 17:07 ` Linus Torvalds
  2007-03-21 17:08 ` [PATCH 2/3] Initialize tree descriptors with a helper function rather than by hand Linus Torvalds
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2007-03-21 17:07 UTC (permalink / raw
  To: Junio C Hamano, Git Mailing List


Since we have the "tree_entry_len()" helper function these days, and
don't need to do a full strlen(), there's no point in saving the path
length - it's just redundant information.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 builtin-grep.c         |    2 +-
 builtin-pack-objects.c |    2 +-
 merge-tree.c           |    9 +++++----
 tree-walk.c            |    6 ++----
 tree-walk.h            |    1 -
 tree.c                 |    9 +++++----
 6 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/builtin-grep.c b/builtin-grep.c
index 4510d35..1348cc9 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -378,7 +378,7 @@ static int grep_tree(struct grep_opt *opt, const char **paths,
 			 * decide if we want to descend into "abc"
 			 * directory.
 			 */
-			strcpy(path_buf + len + entry.pathlen, "/");
+			strcpy(path_buf + len + tree_entry_len(entry.path, entry.sha1), "/");
 
 		if (!pathspec_matches(paths, down))
 			;
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 73d448b..9231b65 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -854,7 +854,7 @@ static void add_pbase_object(struct tree_desc *tree,
 		unsigned long size;
 		enum object_type type;
 
-		if (entry.pathlen != cmplen ||
+		if (tree_entry_len(entry.path, entry.sha1) != cmplen ||
 		    memcmp(entry.path, name, cmplen) ||
 		    !has_sha1_file(entry.sha1) ||
 		    (type = sha1_object_info(entry.sha1, &size)) < 0)
diff --git a/merge-tree.c b/merge-tree.c
index b2867ba..3b8d9e6 100644
--- a/merge-tree.c
+++ b/merge-tree.c
@@ -188,7 +188,7 @@ static void resolve(const char *base, struct name_entry *branch1, struct name_en
 
 static int unresolved_directory(const char *base, struct name_entry n[3])
 {
-	int baselen;
+	int baselen, pathlen;
 	char *newbase;
 	struct name_entry *p;
 	struct tree_desc t[3];
@@ -205,10 +205,11 @@ static int unresolved_directory(const char *base, struct name_entry n[3])
 	if (!S_ISDIR(p->mode))
 		return 0;
 	baselen = strlen(base);
-	newbase = xmalloc(baselen + p->pathlen + 2);
+	pathlen = tree_entry_len(p->path, p->sha1);
+	newbase = xmalloc(baselen + pathlen + 2);
 	memcpy(newbase, base, baselen);
-	memcpy(newbase + baselen, p->path, p->pathlen);
-	memcpy(newbase + baselen + p->pathlen, "/", 2);
+	memcpy(newbase + baselen, p->path, pathlen);
+	memcpy(newbase + baselen + pathlen, "/", 2);
 
 	buf0 = fill_tree_descriptor(t+0, n[0].sha1);
 	buf1 = fill_tree_descriptor(t+1, n[1].sha1);
diff --git a/tree-walk.c b/tree-walk.c
index a4a4e2a..1869bae 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -20,8 +20,8 @@ void *fill_tree_descriptor(struct tree_desc *desc, const unsigned char *sha1)
 static int entry_compare(struct name_entry *a, struct name_entry *b)
 {
 	return base_name_compare(
-			a->path, a->pathlen, a->mode,
-			b->path, b->pathlen, b->mode);
+			a->path, tree_entry_len(a->path, a->sha1), a->mode,
+			b->path, tree_entry_len(b->path, b->sha1), b->mode);
 }
 
 static void entry_clear(struct name_entry *a)
@@ -32,7 +32,6 @@ static void entry_clear(struct name_entry *a)
 static void entry_extract(struct tree_desc *t, struct name_entry *a)
 {
 	a->sha1 = tree_entry_extract(t, &a->path, &a->mode);
-	a->pathlen = tree_entry_len(a->path, a->sha1);
 }
 
 void update_tree_entry(struct tree_desc *desc)
@@ -93,7 +92,6 @@ int tree_entry(struct tree_desc *desc, struct name_entry *entry)
 
 	entry->path = path;
 	len = strlen(path);
-	entry->pathlen = len;
 
 	path += len + 1;
 	entry->sha1 = (const unsigned char *) path;
diff --git a/tree-walk.h b/tree-walk.h
index a0d7afd..149393a 100644
--- a/tree-walk.h
+++ b/tree-walk.h
@@ -10,7 +10,6 @@ struct name_entry {
 	const unsigned char *sha1;
 	const char *path;
 	unsigned int mode;
-	int pathlen;
 };
 
 static inline int tree_entry_len(const char *name, const unsigned char *sha1)
diff --git a/tree.c b/tree.c
index 24f8fb6..705a481 100644
--- a/tree.c
+++ b/tree.c
@@ -101,14 +101,15 @@ int read_tree_recursive(struct tree *tree,
 		if (S_ISDIR(entry.mode)) {
 			int retval;
 			char *newbase;
+			unsigned int pathlen = tree_entry_len(entry.path, entry.sha1);
 
-			newbase = xmalloc(baselen + 1 + entry.pathlen);
+			newbase = xmalloc(baselen + 1 + pathlen);
 			memcpy(newbase, base, baselen);
-			memcpy(newbase + baselen, entry.path, entry.pathlen);
-			newbase[baselen + entry.pathlen] = '/';
+			memcpy(newbase + baselen, entry.path, pathlen);
+			newbase[baselen + pathlen] = '/';
 			retval = read_tree_recursive(lookup_tree(entry.sha1),
 						     newbase,
-						     baselen + entry.pathlen + 1,
+						     baselen + pathlen + 1,
 						     stage, match, fn);
 			free(newbase);
 			if (retval)
-- 
1.5.1.rc1.13.g0872-dirty

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 2/3] Initialize tree descriptors with a helper function rather than by hand.
  2007-03-21 17:07 [PATCH 0/3] Clean up and optimize tree walking some more Linus Torvalds
  2007-03-21 17:07 ` [PATCH 1/3] Remove "pathlen" from "struct name_entry" Linus Torvalds
@ 2007-03-21 17:08 ` Linus Torvalds
  2007-03-21 17:09 ` [PATCH 3/3] Switch over tree descriptors to contain a pre-parsed entry Linus Torvalds
  2007-03-21 18:32 ` Resend: [PATCH 0/3] Clean up and optimize tree walking some more Linus Torvalds
  3 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2007-03-21 17:08 UTC (permalink / raw
  To: Junio C Hamano, Git Mailing List


This removes slightly more lines than it adds, but the real reason for
doing this is that future optimizations will require more setup of the
tree descriptor, and so we want to do it in one place.

Also renamed the "desc.buf" field to "desc.buffer" just to trigger
compiler errors for old-style manual initializations, making sure I
didn't miss anything.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 builtin-fsck.c         |    5 ++---
 builtin-grep.c         |   11 +++++++----
 builtin-pack-objects.c |    6 ++----
 builtin-read-tree.c    |    3 +--
 builtin-reflog.c       |   10 +++++-----
 fetch.c                |    3 +--
 http-push.c            |    3 +--
 list-objects.c         |    3 +--
 reachable.c            |    3 +--
 revision.c             |   12 +++++-------
 tree-diff.c            |   22 ++++++++++++----------
 tree-walk.c            |   24 +++++++++++++++---------
 tree-walk.h            |    5 +++--
 tree.c                 |    9 +++------
 unpack-trees.c         |    3 +--
 15 files changed, 60 insertions(+), 62 deletions(-)

diff --git a/builtin-fsck.c b/builtin-fsck.c
index b8e71b6..21f1f9e 100644
--- a/builtin-fsck.c
+++ b/builtin-fsck.c
@@ -227,8 +227,7 @@ static int fsck_tree(struct tree *item)
 	const char *o_name;
 	const unsigned char *o_sha1;
 
-	desc.buf = item->buffer;
-	desc.size = item->size;
+	init_tree_desc(&desc, item->buffer, item->size);
 
 	o_mode = 0;
 	o_name = NULL;
@@ -242,7 +241,7 @@ static int fsck_tree(struct tree *item)
 
 		if (strchr(name, '/'))
 			has_full_path = 1;
-		has_zero_pad |= *(char *)desc.buf == '0';
+		has_zero_pad |= *(char *)desc.buffer == '0';
 		update_tree_entry(&desc);
 
 		switch (mode) {
diff --git a/builtin-grep.c b/builtin-grep.c
index 1348cc9..981f3d4 100644
--- a/builtin-grep.c
+++ b/builtin-grep.c
@@ -388,11 +388,13 @@ static int grep_tree(struct grep_opt *opt, const char **paths,
 			enum object_type type;
 			struct tree_desc sub;
 			void *data;
-			data = read_sha1_file(entry.sha1, &type, &sub.size);
+			unsigned long size;
+
+			data = read_sha1_file(entry.sha1, &type, &size);
 			if (!data)
 				die("unable to read tree (%s)",
 				    sha1_to_hex(entry.sha1));
-			sub.buf = data;
+			init_tree_desc(&sub, data, size);
 			hit |= grep_tree(opt, paths, &sub, tree_name, down);
 			free(data);
 		}
@@ -408,12 +410,13 @@ static int grep_object(struct grep_opt *opt, const char **paths,
 	if (obj->type == OBJ_COMMIT || obj->type == OBJ_TREE) {
 		struct tree_desc tree;
 		void *data;
+		unsigned long size;
 		int hit;
 		data = read_object_with_reference(obj->sha1, tree_type,
-						  &tree.size, NULL);
+						  &size, NULL);
 		if (!data)
 			die("unable to read tree (%s)", sha1_to_hex(obj->sha1));
-		tree.buf = data;
+		init_tree_desc(&tree, data, size);
 		hit = grep_tree(opt, paths, &tree, name, "");
 		free(data);
 		return hit;
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 9231b65..b5f9648 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -873,8 +873,7 @@ static void add_pbase_object(struct tree_desc *tree,
 			tree = pbase_tree_get(entry.sha1);
 			if (!tree)
 				return;
-			sub.buf = tree->tree_data;
-			sub.size = tree->tree_size;
+			init_tree_desc(&sub, tree->tree_data, tree->tree_size);
 
 			add_pbase_object(&sub, down, downlen, fullname);
 			pbase_tree_put(tree);
@@ -937,8 +936,7 @@ static void add_preferred_base_object(const char *name, unsigned hash)
 		}
 		else {
 			struct tree_desc tree;
-			tree.buf = it->pcache.tree_data;
-			tree.size = it->pcache.tree_size;
+			init_tree_desc(&tree, it->pcache.tree_data, it->pcache.tree_size);
 			add_pbase_object(&tree, name, cmplen, name);
 		}
 	}
diff --git a/builtin-read-tree.c b/builtin-read-tree.c
index e477155..82df941 100644
--- a/builtin-read-tree.c
+++ b/builtin-read-tree.c
@@ -55,8 +55,7 @@ static void prime_cache_tree_rec(struct cache_tree *it, struct tree *tree)
 	int cnt;
 
 	hashcpy(it->sha1, tree->object.sha1);
-	desc.buf = tree->buffer;
-	desc.size = tree->size;
+	init_tree_desc(&desc, tree->buffer, tree->size);
 	cnt = 0;
 	while (tree_entry(&desc, &entry)) {
 		if (!S_ISDIR(entry.mode))
diff --git a/builtin-reflog.c b/builtin-reflog.c
index 186aabc..4c39f1d 100644
--- a/builtin-reflog.c
+++ b/builtin-reflog.c
@@ -52,18 +52,18 @@ static int tree_is_complete(const unsigned char *sha1)
 	if (tree->object.flags & INCOMPLETE)
 		return 0;
 
-	desc.buf = tree->buffer;
-	desc.size = tree->size;
-	if (!desc.buf) {
+	if (!tree->buffer) {
 		enum object_type type;
-		void *data = read_sha1_file(sha1, &type, &desc.size);
+		unsigned long size;
+		void *data = read_sha1_file(sha1, &type, &size);
 		if (!data) {
 			tree->object.flags |= INCOMPLETE;
 			return 0;
 		}
-		desc.buf = data;
 		tree->buffer = data;
+		tree->size = size;
 	}
+	init_tree_desc(&desc, tree->buffer, tree->size);
 	complete = 1;
 	while (tree_entry(&desc, &entry)) {
 		if (!has_sha1_file(entry.sha1) ||
diff --git a/fetch.c b/fetch.c
index f69be82..8e29d31 100644
--- a/fetch.c
+++ b/fetch.c
@@ -42,8 +42,7 @@ static int process_tree(struct tree *tree)
 	if (parse_tree(tree))
 		return -1;
 
-	desc.buf = tree->buffer;
-	desc.size = tree->size;
+	init_tree_desc(&desc, tree->buffer, tree->size);
 	while (tree_entry(&desc, &entry)) {
 		struct object *obj = NULL;
 
diff --git a/http-push.c b/http-push.c
index cbb02d3..724720c 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1750,8 +1750,7 @@ static struct object_list **process_tree(struct tree *tree,
 	me.elem = name;
 	me.elem_len = strlen(name);
 
-	desc.buf = tree->buffer;
-	desc.size = tree->size;
+	init_tree_desc(&desc, tree->buffer, tree->size);
 
 	while (tree_entry(&desc, &entry)) {
 		if (S_ISDIR(entry.mode))
diff --git a/list-objects.c b/list-objects.c
index f1fa21c..2ba2c95 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -49,8 +49,7 @@ static void process_tree(struct rev_info *revs,
 	me.elem = name;
 	me.elem_len = strlen(name);
 
-	desc.buf = tree->buffer;
-	desc.size = tree->size;
+	init_tree_desc(&desc, tree->buffer, tree->size);
 
 	while (tree_entry(&desc, &entry)) {
 		if (S_ISDIR(entry.mode))
diff --git a/reachable.c b/reachable.c
index 01760d7..ff3dd34 100644
--- a/reachable.c
+++ b/reachable.c
@@ -42,8 +42,7 @@ static void process_tree(struct tree *tree,
 	me.elem = name;
 	me.elem_len = strlen(name);
 
-	desc.buf = tree->buffer;
-	desc.size = tree->size;
+	init_tree_desc(&desc, tree->buffer, tree->size);
 
 	while (tree_entry(&desc, &entry)) {
 		if (S_ISDIR(entry.mode))
diff --git a/revision.c b/revision.c
index c680dcb..adc381c 100644
--- a/revision.c
+++ b/revision.c
@@ -62,8 +62,7 @@ void mark_tree_uninteresting(struct tree *tree)
 	if (parse_tree(tree) < 0)
 		die("bad tree %s", sha1_to_hex(obj->sha1));
 
-	desc.buf = tree->buffer;
-	desc.size = tree->size;
+	init_tree_desc(&desc, tree->buffer, tree->size);
 	while (tree_entry(&desc, &entry)) {
 		if (S_ISDIR(entry.mode))
 			mark_tree_uninteresting(lookup_tree(entry.sha1));
@@ -275,18 +274,17 @@ int rev_same_tree_as_empty(struct rev_info *revs, struct tree *t1)
 {
 	int retval;
 	void *tree;
+	unsigned long size;
 	struct tree_desc empty, real;
 
 	if (!t1)
 		return 0;
 
-	tree = read_object_with_reference(t1->object.sha1, tree_type, &real.size, NULL);
+	tree = read_object_with_reference(t1->object.sha1, tree_type, &size, NULL);
 	if (!tree)
 		return 0;
-	real.buf = tree;
-
-	empty.buf = "";
-	empty.size = 0;
+	init_tree_desc(&real, tree, size);
+	init_tree_desc(&empty, "", 0);
 
 	tree_difference = REV_TREE_SAME;
 	revs->pruning.has_changes = 0;
diff --git a/tree-diff.c b/tree-diff.c
index b2f35dc..3678805 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -154,12 +154,13 @@ static void show_entry(struct diff_options *opt, const char *prefix, struct tree
 		char *newbase = malloc_base(base, baselen, path, pathlen);
 		struct tree_desc inner;
 		void *tree;
+		unsigned long size;
 
-		tree = read_sha1_file(sha1, &type, &inner.size);
+		tree = read_sha1_file(sha1, &type, &size);
 		if (!tree || type != OBJ_TREE)
 			die("corrupt tree sha %s", sha1_to_hex(sha1));
 
-		inner.buf = tree;
+		init_tree_desc(&inner, tree, size);
 		show_tree(opt, prefix, &inner, newbase, baselen + 1 + pathlen);
 
 		free(tree);
@@ -227,16 +228,17 @@ int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const cha
 {
 	void *tree1, *tree2;
 	struct tree_desc t1, t2;
+	unsigned long size1, size2;
 	int retval;
 
-	tree1 = read_object_with_reference(old, tree_type, &t1.size, NULL);
+	tree1 = read_object_with_reference(old, tree_type, &size1, NULL);
 	if (!tree1)
 		die("unable to read source tree (%s)", sha1_to_hex(old));
-	tree2 = read_object_with_reference(new, tree_type, &t2.size, NULL);
+	tree2 = read_object_with_reference(new, tree_type, &size2, NULL);
 	if (!tree2)
 		die("unable to read destination tree (%s)", sha1_to_hex(new));
-	t1.buf = tree1;
-	t2.buf = tree2;
+	init_tree_desc(&t1, tree1, size1);
+	init_tree_desc(&t2, tree2, size2);
 	retval = diff_tree(&t1, &t2, base, opt);
 	free(tree1);
 	free(tree2);
@@ -247,15 +249,15 @@ int diff_root_tree_sha1(const unsigned char *new, const char *base, struct diff_
 {
 	int retval;
 	void *tree;
+	unsigned long size;
 	struct tree_desc empty, real;
 
-	tree = read_object_with_reference(new, tree_type, &real.size, NULL);
+	tree = read_object_with_reference(new, tree_type, &size, NULL);
 	if (!tree)
 		die("unable to read root tree (%s)", sha1_to_hex(new));
-	real.buf = tree;
+	init_tree_desc(&real, tree, size);
 
-	empty.size = 0;
-	empty.buf = "";
+	init_tree_desc(&empty, "", 0);
 	retval = diff_tree(&empty, &real, base, opt);
 	free(tree);
 	return retval;
diff --git a/tree-walk.c b/tree-walk.c
index 1869bae..c65492c 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -2,6 +2,12 @@
 #include "tree-walk.h"
 #include "tree.h"
 
+void init_tree_desc(struct tree_desc *desc, const void *buffer, unsigned long size)
+{
+	desc->buffer = buffer;
+	desc->size = size;
+}
+
 void *fill_tree_descriptor(struct tree_desc *desc, const unsigned char *sha1)
 {
 	unsigned long size = 0;
@@ -12,8 +18,7 @@ void *fill_tree_descriptor(struct tree_desc *desc, const unsigned char *sha1)
 		if (!buf)
 			die("unable to read tree %s", sha1_to_hex(sha1));
 	}
-	desc->size = size;
-	desc->buf = buf;
+	init_tree_desc(desc, buf, size);
 	return buf;
 }
 
@@ -36,13 +41,13 @@ static void entry_extract(struct tree_desc *t, struct name_entry *a)
 
 void update_tree_entry(struct tree_desc *desc)
 {
-	const void *buf = desc->buf;
+	const void *buf = desc->buffer;
 	unsigned long size = desc->size;
 	int len = strlen(buf) + 1 + 20;
 
 	if (size < len)
 		die("corrupt tree file");
-	desc->buf = (char *) buf + len;
+	desc->buffer = (char *) buf + len;
 	desc->size = size - len;
 }
 
@@ -62,7 +67,7 @@ static const char *get_mode(const char *str, unsigned int *modep)
 
 const unsigned char *tree_entry_extract(struct tree_desc *desc, const char **pathp, unsigned int *modep)
 {
-	const void *tree = desc->buf;
+	const void *tree = desc->buffer;
 	unsigned long size = desc->size;
 	int len = strlen(tree)+1;
 	const unsigned char *sha1 = (unsigned char *) tree + len;
@@ -79,7 +84,7 @@ const unsigned char *tree_entry_extract(struct tree_desc *desc, const char **pat
 
 int tree_entry(struct tree_desc *desc, struct name_entry *entry)
 {
-	const void *tree = desc->buf;
+	const void *tree = desc->buffer;
 	const char *path;
 	unsigned long len, size = desc->size;
 
@@ -101,7 +106,7 @@ int tree_entry(struct tree_desc *desc, struct name_entry *entry)
 	if (len > size)
 		die("corrupt tree file");
 
-	desc->buf = path;
+	desc->buffer = path;
 	desc->size = size - len;
 	return 1;
 }
@@ -196,10 +201,11 @@ int get_tree_entry(const unsigned char *tree_sha1, const char *name, unsigned ch
 {
 	int retval;
 	void *tree;
+	unsigned long size;
 	struct tree_desc t;
 	unsigned char root[20];
 
-	tree = read_object_with_reference(tree_sha1, tree_type, &t.size, root);
+	tree = read_object_with_reference(tree_sha1, tree_type, &size, root);
 	if (!tree)
 		return -1;
 
@@ -208,7 +214,7 @@ int get_tree_entry(const unsigned char *tree_sha1, const char *name, unsigned ch
 		return 0;
 	}
 
-	t.buf = tree;
+	init_tree_desc(&t, tree, size);
 	retval = find_tree_entry(&t, name, sha1, mode);
 	free(tree);
 	return retval;
diff --git a/tree-walk.h b/tree-walk.h
index 149393a..ca0c29f 100644
--- a/tree-walk.h
+++ b/tree-walk.h
@@ -2,8 +2,8 @@
 #define TREE_WALK_H
 
 struct tree_desc {
-	const void *buf;
-	unsigned long size;
+	const void *buffer;
+	unsigned int size;
 };
 
 struct name_entry {
@@ -18,6 +18,7 @@ static inline int tree_entry_len(const char *name, const unsigned char *sha1)
 }
 
 void update_tree_entry(struct tree_desc *);
+void init_tree_desc(struct tree_desc *desc, const void *buf, unsigned long size);
 const unsigned char *tree_entry_extract(struct tree_desc *, const char **, unsigned int *);
 
 /* Helper function that does both of the above and returns true for success */
diff --git a/tree.c b/tree.c
index 705a481..d188c0f 100644
--- a/tree.c
+++ b/tree.c
@@ -83,8 +83,7 @@ int read_tree_recursive(struct tree *tree,
 	if (parse_tree(tree))
 		return -1;
 
-	desc.buf = tree->buffer;
-	desc.size = tree->size;
+	init_tree_desc(&desc, tree->buffer, tree->size);
 
 	while (tree_entry(&desc, &entry)) {
 		if (!match_tree_entry(base, baselen, entry.path, entry.mode, match))
@@ -152,16 +151,14 @@ static void track_tree_refs(struct tree *item)
 	struct name_entry entry;
 
 	/* Count how many entries there are.. */
-	desc.buf = item->buffer;
-	desc.size = item->size;
+	init_tree_desc(&desc, item->buffer, item->size);
 	while (tree_entry(&desc, &entry))
 		n_refs++;
 
 	/* Allocate object refs and walk it again.. */
 	i = 0;
 	refs = alloc_object_refs(n_refs);
-	desc.buf = item->buffer;
-	desc.size = item->size;
+	init_tree_desc(&desc, item->buffer, item->size);
 	while (tree_entry(&desc, &entry)) {
 		struct object *obj;
 
diff --git a/unpack-trees.c b/unpack-trees.c
index 2e2232c..ee10eea 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -27,8 +27,7 @@ static struct tree_entry_list *create_tree_entry_list(struct tree *tree)
 	if (!tree->object.parsed)
 		parse_tree(tree);
 
-	desc.buf = tree->buffer;
-	desc.size = tree->size;
+	init_tree_desc(&desc, tree->buffer, tree->size);
 
 	while (tree_entry(&desc, &one)) {
 		struct tree_entry_list *entry;
-- 
1.5.1.rc1.13.g0872-dirty

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH 3/3] Switch over tree descriptors to contain a pre-parsed entry
  2007-03-21 17:07 [PATCH 0/3] Clean up and optimize tree walking some more Linus Torvalds
  2007-03-21 17:07 ` [PATCH 1/3] Remove "pathlen" from "struct name_entry" Linus Torvalds
  2007-03-21 17:08 ` [PATCH 2/3] Initialize tree descriptors with a helper function rather than by hand Linus Torvalds
@ 2007-03-21 17:09 ` Linus Torvalds
  2007-03-21 18:32 ` Resend: [PATCH 0/3] Clean up and optimize tree walking some more Linus Torvalds
  3 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2007-03-21 17:09 UTC (permalink / raw
  To: Junio C Hamano, Git Mailing List


This makes the tree descriptor contain a "struct name_entry" as part of 
it, and it gets filled in so that it always contains a valid entry. On 
some benchmarks, it improves performance by up to 15%.

That makes tree entry "extract" trivial, and means that we only actually
need to decode each tree entry just once: we decode the first one when
we initialize the tree descriptor, and each subsequent one when doing
"update_tree_entry()".  In particular, this means that we don't need to
do strlen() both at extract time _and_ at update time.

Finally, it also allows more sharing of code (entry_extract(), that
wanted a "struct name_entry", just got totally trivial, along with the
"tree_entry()" function).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 tree-walk.c |  101 +++++++++++++++++++++++++---------------------------------
 tree-walk.h |   18 +++++++---
 2 files changed, 57 insertions(+), 62 deletions(-)

diff --git a/tree-walk.c b/tree-walk.c
index c65492c..3cb757b 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -2,10 +2,42 @@
 #include "tree-walk.h"
 #include "tree.h"
 
+static const char *get_mode(const char *str, unsigned int *modep)
+{
+	unsigned char c;
+	unsigned int mode = 0;
+
+	while ((c = *str++) != ' ') {
+		if (c < '0' || c > '7')
+			return NULL;
+		mode = (mode << 3) + (c - '0');
+	}
+	*modep = mode;
+	return str;
+}
+
+static void decode_tree_entry(struct tree_desc *desc, const void *buf, unsigned long size)
+{
+	const char *path;
+	unsigned int mode, len;
+
+	path = get_mode(buf, &mode);
+	if (!path)
+		die("corrupt tree file");
+	len = strlen(path) + 1;
+
+	/* Initialize the descriptor entry */
+	desc->entry.path = path;
+	desc->entry.mode = mode;
+	desc->entry.sha1 = (const unsigned char *)(path + len);
+}
+
 void init_tree_desc(struct tree_desc *desc, const void *buffer, unsigned long size)
 {
 	desc->buffer = buffer;
 	desc->size = size;
+	if (size)
+		decode_tree_entry(desc, buffer, size);
 }
 
 void *fill_tree_descriptor(struct tree_desc *desc, const unsigned char *sha1)
@@ -36,78 +68,33 @@ static void entry_clear(struct name_entry *a)
 
 static void entry_extract(struct tree_desc *t, struct name_entry *a)
 {
-	a->sha1 = tree_entry_extract(t, &a->path, &a->mode);
+	*a = t->entry;
 }
 
 void update_tree_entry(struct tree_desc *desc)
 {
 	const void *buf = desc->buffer;
+	const void *end = desc->entry.sha1 + 20;
 	unsigned long size = desc->size;
-	int len = strlen(buf) + 1 + 20;
+	unsigned long len = end - buf;
 
 	if (size < len)
 		die("corrupt tree file");
-	desc->buffer = (char *) buf + len;
-	desc->size = size - len;
-}
-
-static const char *get_mode(const char *str, unsigned int *modep)
-{
-	unsigned char c;
-	unsigned int mode = 0;
-
-	while ((c = *str++) != ' ') {
-		if (c < '0' || c > '7')
-			return NULL;
-		mode = (mode << 3) + (c - '0');
-	}
-	*modep = mode;
-	return str;
-}
-
-const unsigned char *tree_entry_extract(struct tree_desc *desc, const char **pathp, unsigned int *modep)
-{
-	const void *tree = desc->buffer;
-	unsigned long size = desc->size;
-	int len = strlen(tree)+1;
-	const unsigned char *sha1 = (unsigned char *) tree + len;
-	const char *path;
-	unsigned int mode;
-
-	path = get_mode(tree, &mode);
-	if (!path || size < len + 20)
-		die("corrupt tree file");
-	*pathp = path;
-	*modep = canon_mode(mode);
-	return sha1;
+	buf = end;
+	size -= len;
+	desc->buffer = buf;
+	desc->size = size;
+	if (size)
+		decode_tree_entry(desc, buf, size);
 }
 
 int tree_entry(struct tree_desc *desc, struct name_entry *entry)
 {
-	const void *tree = desc->buffer;
-	const char *path;
-	unsigned long len, size = desc->size;
-
-	if (!size)
+	if (!desc->size)
 		return 0;
 
-	path = get_mode(tree, &entry->mode);
-	if (!path)
-		die("corrupt tree file");
-
-	entry->path = path;
-	len = strlen(path);
-
-	path += len + 1;
-	entry->sha1 = (const unsigned char *) path;
-
-	path += 20;
-	len = path - (char *) tree;
-	if (len > size)
-		die("corrupt tree file");
-
-	desc->buffer = path;
-	desc->size = size - len;
+	*entry = desc->entry;
+	update_tree_entry(desc);
 	return 1;
 }
 
diff --git a/tree-walk.h b/tree-walk.h
index ca0c29f..43458cf 100644
--- a/tree-walk.h
+++ b/tree-walk.h
@@ -1,17 +1,25 @@
 #ifndef TREE_WALK_H
 #define TREE_WALK_H
 
-struct tree_desc {
-	const void *buffer;
-	unsigned int size;
-};
-
 struct name_entry {
 	const unsigned char *sha1;
 	const char *path;
 	unsigned int mode;
 };
 
+struct tree_desc {
+	const void *buffer;
+	struct name_entry entry;
+	unsigned int size;
+};
+
+static inline const unsigned char *tree_entry_extract(struct tree_desc *desc, const char **pathp, unsigned int *modep)
+{
+	*pathp = desc->entry.path;
+	*modep = canon_mode(desc->entry.mode);
+	return desc->entry.sha1;
+}
+
 static inline int tree_entry_len(const char *name, const unsigned char *sha1)
 {
 	return (char *)sha1 - (char *)name - 1;
-- 
1.5.1.rc1.13.g0872-dirty

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Resend: [PATCH 0/3] Clean up and optimize tree walking some more
  2007-03-21 17:07 [PATCH 0/3] Clean up and optimize tree walking some more Linus Torvalds
                   ` (2 preceding siblings ...)
  2007-03-21 17:09 ` [PATCH 3/3] Switch over tree descriptors to contain a pre-parsed entry Linus Torvalds
@ 2007-03-21 18:32 ` Linus Torvalds
  3 siblings, 0 replies; 5+ messages in thread
From: Linus Torvalds @ 2007-03-21 18:32 UTC (permalink / raw
  To: Junio C Hamano, Git Mailing List


Hmm.. There may be something wrong with my isp and/or email setup, but I 
never saw this one come back to me, so I'm re-sending just in case it got 
lost.. 

		Linus

On Wed, 21 Mar 2007, Linus Torvalds wrote:
> 
> This series of three patches improves blame performance on the eclipse 
> tree by about 15% in my tests by improving the tree walk a bit.
> 
>   [ Before-best-of-five: 11.649s
>     After-best-of-five:   9.675s ]
> 
> The first two patches are just boring cleanups: the first removes an 
> unnecessary field from the "name_entry" structure, because I wanted to 
> embed it into the "tree_desc" one and it was just totally redundant and I 
> felt bad about growing tree_desc more than necessary. No real code 
> changes, just replacing the use of "pathlen" with the helper function we 
> introduced ealier ("tree_entry_len()").
> 
> The second one just makes sure that we always initialize the tree_desc 
> structure with a helper function rather than doing it by hand. Again, this 
> doesn't actually change any code, although I changed the name of the "buf" 
> entry to "buffer", and the type of "size", so that we get nice compiler 
> warnings if they are used the old way by mistake.
> 
> The second patch is the largest of the lot, but really doesn't do 
> anything interesting, just preparatory cleanup.
> 
> The third patch is the one that actually changes any code, and is fairly 
> straightforward: it just switches around where we actually do the tree 
> entry parsing, which is now possible thanks to patch #2. By doing it 
> up-front, we only need to do it once (we used to have to do it both when 
> doing the "extract" *and* when doing the "update" op - now we do it only 
> once per entry, and "extract" is just about looking at the cached 
> contents).
> 
> The resulting diffstat of the tree patches ends up removing a few more 
> lines than it adds (not by a lot), but perhaps more importantly (even more 
> than the performance advantage) the code looks nicer, I think.
> 
>  builtin-fsck.c         |    5 +-
>  builtin-grep.c         |   13 +++--
>  builtin-pack-objects.c |    8 +--
>  builtin-read-tree.c    |    3 +-
>  builtin-reflog.c       |   10 ++--
>  fetch.c                |    3 +-
>  http-push.c            |    3 +-
>  list-objects.c         |    3 +-
>  merge-tree.c           |    9 ++--
>  reachable.c            |    3 +-
>  revision.c             |   12 ++---
>  tree-diff.c            |   22 +++++----
>  tree-walk.c            |  123 ++++++++++++++++++++++--------------------------
>  tree-walk.h            |   20 ++++++--
>  tree.c                 |   18 +++----
>  unpack-trees.c         |    3 +-
>  16 files changed, 125 insertions(+), 133 deletions(-)
> 
> I'm pretty sure this is all good, and it obviously passes all the tests, 
> but more importantly none of the changes were really very complicated, and 
> patch#2 (which is the big one) was set up so that the compiler would not 
> even compile code that wasn't properly converted, so it should all be 
> good.
> 
> 			Linus
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2007-03-21 18:32 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-03-21 17:07 [PATCH 0/3] Clean up and optimize tree walking some more Linus Torvalds
2007-03-21 17:07 ` [PATCH 1/3] Remove "pathlen" from "struct name_entry" Linus Torvalds
2007-03-21 17:08 ` [PATCH 2/3] Initialize tree descriptors with a helper function rather than by hand Linus Torvalds
2007-03-21 17:09 ` [PATCH 3/3] Switch over tree descriptors to contain a pre-parsed entry Linus Torvalds
2007-03-21 18:32 ` Resend: [PATCH 0/3] Clean up and optimize tree walking some more Linus Torvalds

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).