git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] find_containing_dir(): allocate strbuf less extravagantly
@ 2012-05-22 13:16 mhagger
  2012-05-22 17:34 ` Jeff King
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: mhagger @ 2012-05-22 13:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Michael Haggerty

From: Michael Haggerty <mhagger@alum.mit.edu>

It might seem like allocating a fixed-length buffer of uninitialized
memory should be pretty cheap even if the buffer is of size PATH_MAX.
But empirically, it is measurably faster to allocated only the strlen
of the input string.

Thanks to Peff for pointing out a performance regression in this area
that might be fixed by this patch.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
I am not able to reproduce performance regressions as bad as those
observed by Peff.  It seems to depend on the amount of memory
pressure.  The smaller regression that I *did* see is fixed by this
patch, reducing the time for "git fetch . refs/*:refs/*" from 10.1 s
to 9.3 s.  The change is sensible in any case, but we will have to
wait for Peff's verdict about whether it fixes the whole problem for
him, too.

 refs.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs.c b/refs.c
index d6bdb47..fffbb17 100644
--- a/refs.c
+++ b/refs.c
@@ -383,7 +383,7 @@ static struct ref_dir *find_containing_dir(struct ref_dir *dir,
 {
 	struct strbuf dirname;
 	const char *slash;
-	strbuf_init(&dirname, PATH_MAX);
+	strbuf_init(&dirname, strlen(refname));
 	for (slash = strchr(refname, '/'); slash; slash = strchr(slash + 1, '/')) {
 		struct ref_dir *subdir;
 		strbuf_add(&dirname,
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] find_containing_dir(): allocate strbuf less extravagantly
  2012-05-22 13:16 [PATCH] find_containing_dir(): allocate strbuf less extravagantly mhagger
@ 2012-05-22 17:34 ` Jeff King
  2012-05-22 18:50 ` [PATCH 1/3] refs: convert parameter of search_ref_dir() to length-limited string René Scharfe
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2012-05-22 17:34 UTC (permalink / raw)
  To: mhagger; +Cc: Junio C Hamano, git

On Tue, May 22, 2012 at 03:16:06PM +0200, mhagger@alum.mit.edu wrote:

> From: Michael Haggerty <mhagger@alum.mit.edu>
> 
> It might seem like allocating a fixed-length buffer of uninitialized
> memory should be pretty cheap even if the buffer is of size PATH_MAX.
> But empirically, it is measurably faster to allocated only the strlen
> of the input string.
> 
> Thanks to Peff for pointing out a performance regression in this area
> that might be fixed by this patch.
> 
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
> I am not able to reproduce performance regressions as bad as those
> observed by Peff.  It seems to depend on the amount of memory
> pressure.  The smaller regression that I *did* see is fixed by this
> patch, reducing the time for "git fetch . refs/*:refs/*" from 10.1 s
> to 9.3 s.  The change is sensible in any case, but we will have to
> wait for Peff's verdict about whether it fixes the whole problem for
> him, too.

I see a similar small improvement. So I think it is worth doing, but it
does not fix the main regression I found.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] refs: convert parameter of search_ref_dir() to length-limited string
  2012-05-22 13:16 [PATCH] find_containing_dir(): allocate strbuf less extravagantly mhagger
  2012-05-22 17:34 ` Jeff King
@ 2012-05-22 18:50 ` René Scharfe
  2012-05-22 18:50 ` [PATCH 2/3] refs: convert parameter of create_dir_entry() " René Scharfe
  2012-05-22 18:50 ` [PATCH 3/3] refs: use strings directly in find_containing_dir() René Scharfe
  3 siblings, 0 replies; 12+ messages in thread
From: René Scharfe @ 2012-05-22 18:50 UTC (permalink / raw)
  To: mhagger; +Cc: Jeff King, Junio C Hamano, git

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
 refs.c |   13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/refs.c b/refs.c
index d6bdb47..c028333 100644
--- a/refs.c
+++ b/refs.c
@@ -319,19 +319,19 @@ static void sort_ref_dir(struct ref_dir *dir);
  * (non-recursively), sorting dir if necessary.  Return NULL if no
  * such entry is found.  dir must already be complete.
  */
-static struct ref_entry *search_ref_dir(struct ref_dir *dir, const char *refname)
+static struct ref_entry *search_ref_dir(struct ref_dir *dir,
+					const char *refname, size_t len)
 {
 	struct ref_entry *e, **r;
-	int len;
 
 	if (refname == NULL || !dir->nr)
 		return NULL;
 
 	sort_ref_dir(dir);
 
-	len = strlen(refname) + 1;
-	e = xmalloc(sizeof(struct ref_entry) + len);
+	e = xmalloc(sizeof(struct ref_entry) + len + 1);
 	memcpy(e->name, refname, len);
+	e->name[len] = '\0';
 
 	r = bsearch(&e, dir->entries, dir->nr, sizeof(*dir->entries), ref_entry_cmp);
 
@@ -353,7 +353,8 @@ static struct ref_entry *search_ref_dir(struct ref_dir *dir, const char *refname
 static struct ref_dir *search_for_subdir(struct ref_dir *dir,
 					 const char *subdirname, int mkdir)
 {
-	struct ref_entry *entry = search_ref_dir(dir, subdirname);
+	size_t len = strlen(subdirname);
+	struct ref_entry *entry = search_ref_dir(dir, subdirname, len);
 	if (!entry) {
 		if (!mkdir)
 			return NULL;
@@ -412,7 +413,7 @@ static struct ref_entry *find_ref(struct ref_dir *dir, const char *refname)
 	dir = find_containing_dir(dir, refname, 0);
 	if (!dir)
 		return NULL;
-	entry = search_ref_dir(dir, refname);
+	entry = search_ref_dir(dir, refname, strlen(refname));
 	return (entry && !(entry->flag & REF_DIR)) ? entry : NULL;
 }
 
-- 
1.7.10.2

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] refs: convert parameter of create_dir_entry() to length-limited string
  2012-05-22 13:16 [PATCH] find_containing_dir(): allocate strbuf less extravagantly mhagger
  2012-05-22 17:34 ` Jeff King
  2012-05-22 18:50 ` [PATCH 1/3] refs: convert parameter of search_ref_dir() to length-limited string René Scharfe
@ 2012-05-22 18:50 ` René Scharfe
  2012-05-22 18:50 ` [PATCH 3/3] refs: use strings directly in find_containing_dir() René Scharfe
  3 siblings, 0 replies; 12+ messages in thread
From: René Scharfe @ 2012-05-22 18:50 UTC (permalink / raw)
  To: mhagger; +Cc: Jeff King, Junio C Hamano, git

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
 refs.c |   18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/refs.c b/refs.c
index c028333..c5e167b 100644
--- a/refs.c
+++ b/refs.c
@@ -294,12 +294,13 @@ static void clear_ref_dir(struct ref_dir *dir)
  * "refs/heads/") or "" for the top-level directory.
  */
 static struct ref_entry *create_dir_entry(struct ref_cache *ref_cache,
-					  const char *dirname, int incomplete)
+					  const char *dirname, size_t len,
+					  int incomplete)
 {
 	struct ref_entry *direntry;
-	int len = strlen(dirname);
 	direntry = xcalloc(1, sizeof(struct ref_entry) + len + 1);
-	memcpy(direntry->name, dirname, len + 1);
+	memcpy(direntry->name, dirname, len);
+	direntry->name[len] = '\0';
 	direntry->u.subdir.ref_cache = ref_cache;
 	direntry->flag = REF_DIR | (incomplete ? REF_INCOMPLETE : 0);
 	return direntry;
@@ -364,7 +365,7 @@ static struct ref_dir *search_for_subdir(struct ref_dir *dir,
 		 * therefore, create an empty record for it but mark
 		 * the record complete.
 		 */
-		entry = create_dir_entry(dir->ref_cache, subdirname, 0);
+		entry = create_dir_entry(dir->ref_cache, subdirname, len, 0);
 		add_entry_to_dir(dir, entry);
 	}
 	return get_ref_dir(entry);
@@ -823,7 +824,7 @@ static struct ref_dir *get_packed_refs(struct ref_cache *refs)
 		const char *packed_refs_file;
 		FILE *f;
 
-		refs->packed = create_dir_entry(refs, "", 0);
+		refs->packed = create_dir_entry(refs, "", 0, 0);
 		if (*refs->name)
 			packed_refs_file = git_path_submodule(refs->name, "packed-refs");
 		else
@@ -888,7 +889,8 @@ static void read_loose_refs(const char *dirname, struct ref_dir *dir)
 		} else if (S_ISDIR(st.st_mode)) {
 			strbuf_addch(&refname, '/');
 			add_entry_to_dir(dir,
-					 create_dir_entry(refs, refname.buf, 1));
+					 create_dir_entry(refs, refname.buf,
+							  refname.len, 1));
 		} else {
 			if (*refs->name) {
 				hashclr(sha1);
@@ -918,12 +920,12 @@ static struct ref_dir *get_loose_refs(struct ref_cache *refs)
 		 * are about to read the only subdirectory that can
 		 * hold references:
 		 */
-		refs->loose = create_dir_entry(refs, "", 0);
+		refs->loose = create_dir_entry(refs, "", 0, 0);
 		/*
 		 * Create an incomplete entry for "refs/":
 		 */
 		add_entry_to_dir(get_ref_dir(refs->loose),
-				 create_dir_entry(refs, "refs/", 1));
+				 create_dir_entry(refs, "refs/", 5, 1));
 	}
 	return get_ref_dir(refs->loose);
 }
-- 
1.7.10.2

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] refs: use strings directly in find_containing_dir()
  2012-05-22 13:16 [PATCH] find_containing_dir(): allocate strbuf less extravagantly mhagger
                   ` (2 preceding siblings ...)
  2012-05-22 18:50 ` [PATCH 2/3] refs: convert parameter of create_dir_entry() " René Scharfe
@ 2012-05-22 18:50 ` René Scharfe
  2012-05-22 21:27   ` Junio C Hamano
  3 siblings, 1 reply; 12+ messages in thread
From: René Scharfe @ 2012-05-22 18:50 UTC (permalink / raw)
  To: mhagger; +Cc: Jeff King, Junio C Hamano, git

Convert the parameter subdirname of search_for_subdir() to a
length-limted string and then simply pass the interesting slice of the
refname from find_containing_dir(), thereby avoiding to duplicate the
string.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
---
Why allocate a NUL-terminated copy at all when we can teach the code to
stop after a given number of characters just as easily?  Alas, this
will still trigger an allocation in search_ref_dir() (see first patch).

 refs.c |   13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/refs.c b/refs.c
index c5e167b..96e943c 100644
--- a/refs.c
+++ b/refs.c
@@ -352,9 +352,9 @@ static struct ref_entry *search_ref_dir(struct ref_dir *dir,
  * directory cannot be found.  dir must already be complete.
  */
 static struct ref_dir *search_for_subdir(struct ref_dir *dir,
-					 const char *subdirname, int mkdir)
+					 const char *subdirname, size_t len,
+					 int mkdir)
 {
-	size_t len = strlen(subdirname);
 	struct ref_entry *entry = search_ref_dir(dir, subdirname, len);
 	if (!entry) {
 		if (!mkdir)
@@ -383,15 +383,11 @@ static struct ref_dir *search_for_subdir(struct ref_dir *dir,
 static struct ref_dir *find_containing_dir(struct ref_dir *dir,
 					   const char *refname, int mkdir)
 {
-	struct strbuf dirname;
 	const char *slash;
-	strbuf_init(&dirname, PATH_MAX);
 	for (slash = strchr(refname, '/'); slash; slash = strchr(slash + 1, '/')) {
+		size_t dirnamelen = slash - refname + 1;
 		struct ref_dir *subdir;
-		strbuf_add(&dirname,
-			   refname + dirname.len,
-			   (slash + 1) - (refname + dirname.len));
-		subdir = search_for_subdir(dir, dirname.buf, mkdir);
+		subdir = search_for_subdir(dir, refname, dirnamelen, mkdir);
 		if (!subdir) {
 			dir = NULL;
 			break;
@@ -399,7 +395,6 @@ static struct ref_dir *find_containing_dir(struct ref_dir *dir,
 		dir = subdir;
 	}
 
-	strbuf_release(&dirname);
 	return dir;
 }
 
-- 
1.7.10.2

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] refs: use strings directly in find_containing_dir()
  2012-05-22 18:50 ` [PATCH 3/3] refs: use strings directly in find_containing_dir() René Scharfe
@ 2012-05-22 21:27   ` Junio C Hamano
  2012-05-22 22:11     ` René Scharfe
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2012-05-22 21:27 UTC (permalink / raw)
  To: René Scharfe; +Cc: mhagger, Jeff King, git

René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:

> Why allocate a NUL-terminated copy at all when we can teach the code to
> stop after a given number of characters just as easily?  Alas, this
> will still trigger an allocation in search_ref_dir() (see first patch).

Yeah, but it is only because search_ref_dir() tries to use ref_entry_cmp(),
whose signature is geared more towards being used as a qsort(3) callback,
as the comparison function for bsearch(3).

A bsearch() callback takes two pointers, one is for the key and the other
for an array element, and there is no reason to require the two types be
the same.

In other words, something like this patch and we won't need an allocation
of the ref_entry that did not have to be a full ref_entry in the first
place (it only had to be something that supplies the "key" into a sorted
array).

 refs.c | 32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/refs.c b/refs.c
index 96e943c..52709ab 100644
--- a/refs.c
+++ b/refs.c
@@ -315,6 +315,23 @@ static int ref_entry_cmp(const void *a, const void *b)
 
 static void sort_ref_dir(struct ref_dir *dir);
 
+struct string_slice {
+	size_t len;
+	const char *str;
+};
+
+static int ref_entry_cmp_sslice(const void *key_, const void *ent_)
+{
+	struct string_slice *key = (struct string_slice *)key_;
+	struct ref_entry *ent = *(struct ref_entry **)ent_;
+	int entlen = strlen(ent->name);
+	int cmplen = key->len < entlen ? key->len : entlen;
+	int cmp = memcmp(key->str, ent->name, cmplen);
+	if (cmp)
+		return cmp;
+	return key->len - entlen;
+}
+
 /*
  * Return the entry with the given refname from the ref_dir
  * (non-recursively), sorting dir if necessary.  Return NULL if no
@@ -323,20 +340,17 @@ static void sort_ref_dir(struct ref_dir *dir);
 static struct ref_entry *search_ref_dir(struct ref_dir *dir,
 					const char *refname, size_t len)
 {
-	struct ref_entry *e, **r;
+	struct ref_entry **r;
+	struct string_slice key;
 
 	if (refname == NULL || !dir->nr)
 		return NULL;
 
 	sort_ref_dir(dir);
-
-	e = xmalloc(sizeof(struct ref_entry) + len + 1);
-	memcpy(e->name, refname, len);
-	e->name[len] = '\0';
-
-	r = bsearch(&e, dir->entries, dir->nr, sizeof(*dir->entries), ref_entry_cmp);
-
-	free(e);
+	key.len = len;
+	key.str = refname;
+	r = bsearch(&key, dir->entries, dir->nr, sizeof(*dir->entries),
+		    ref_entry_cmp_sslice);
 
 	if (r == NULL)
 		return NULL;

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] refs: use strings directly in find_containing_dir()
  2012-05-22 21:27   ` Junio C Hamano
@ 2012-05-22 22:11     ` René Scharfe
  2012-05-22 22:18       ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: René Scharfe @ 2012-05-22 22:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: mhagger, Jeff King, git

Am 22.05.2012 23:27, schrieb Junio C Hamano:
> René Scharfe<rene.scharfe@lsrfire.ath.cx>  writes:
>
>> Why allocate a NUL-terminated copy at all when we can teach the code to
>> stop after a given number of characters just as easily?  Alas, this
>> will still trigger an allocation in search_ref_dir() (see first patch).
>
> Yeah, but it is only because search_ref_dir() tries to use ref_entry_cmp(),
> whose signature is geared more towards being used as a qsort(3) callback,
> as the comparison function for bsearch(3).
>
> A bsearch() callback takes two pointers, one is for the key and the other
> for an array element, and there is no reason to require the two types be
> the same.
>
> In other words, something like this patch and we won't need an allocation
> of the ref_entry that did not have to be a full ref_entry in the first
> place (it only had to be something that supplies the "key" into a sorted
> array).

Right, and this order (key-first) is documented for Linux, *BSD and by 
Microsoft, so we can probably rely on it.  The proposed asymmetry 
between sorting and lookup is a bit ... untidy, nevertheless.  But it's 
certainly worth it to avoid that ugly allocation.

Here's a random observation that led me to write the three patches: When 
running the following command under valgrind, it reports a few 
interesting numbers for total heap usage:

	$ git grep guess xdiff/xutils.c

   v1.7.8       591 allocs,    96 frees,   383,565 bytes allocated
   v1.7.9     2,940 allocs,   121 frees,   361,001 bytes allocated
   v1.7.10    3,002 allocs,   129 frees,   366,487 bytes allocated
   master     4,555 allocs, 1,586 frees, 2,380,265 bytes allocated
   3 patches  4,079 allocs, 1,110 frees,   430,093 bytes allocated
   4 patches  3,093 allocs,   124 frees,   377,749 bytes allocated

With your last patch, I think we're doing fine again, as the total
allocated size is within a few KB of the smallest one in this
arbitrary list of versions.

What has git grep to do with refs?  It checks if the path in the command
above is a ref, which makes it iterate over all of them..

René

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] refs: use strings directly in find_containing_dir()
  2012-05-22 22:11     ` René Scharfe
@ 2012-05-22 22:18       ` Junio C Hamano
  2012-05-23 16:20         ` René Scharfe
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2012-05-22 22:18 UTC (permalink / raw)
  To: René Scharfe; +Cc: mhagger, Jeff King, git

René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:

> What has git grep to do with refs?  It checks if the path in the command
> above is a ref, which makes it iterate over all of them..

Do you mean:

	/* Is it a rev? */
        get_sha1()
        -> ...
          -> get_sha1_basic()
            -> dwim_ref()

callpath?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] refs: use strings directly in find_containing_dir()
  2012-05-22 22:18       ` Junio C Hamano
@ 2012-05-23 16:20         ` René Scharfe
  2012-05-23 16:56           ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: René Scharfe @ 2012-05-23 16:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: mhagger, Jeff King, git

Am 23.05.2012 00:18, schrieb Junio C Hamano:
> René Scharfe<rene.scharfe@lsrfire.ath.cx>  writes:
>
>> What has git grep to do with refs?  It checks if the path in the command
>> above is a ref, which makes it iterate over all of them..
>
> Do you mean:
>
> 	/* Is it a rev? */
>          get_sha1()
>          ->  ...
>            ->  get_sha1_basic()
>              ->  dwim_ref()
>
> callpath?

Yes, indeed.  Hmm, this is done even if the paths come after a 
double-dash.  Anyway, I don't consider the check to be a performance 
issue, just a quick way to test the allocation count that i stumbled 
upon while working on the recent grep patches.

René

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] refs: use strings directly in find_containing_dir()
  2012-05-23 16:20         ` René Scharfe
@ 2012-05-23 16:56           ` Junio C Hamano
  2012-05-23 17:15             ` René Scharfe
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2012-05-23 16:56 UTC (permalink / raw)
  To: René Scharfe; +Cc: mhagger, Jeff King, git

René Scharfe <rene.scharfe@lsrfire.ath.cx> writes:

> Am 23.05.2012 00:18, schrieb Junio C Hamano:
>> René Scharfe<rene.scharfe@lsrfire.ath.cx>  writes:
>>
>>> What has git grep to do with refs?  It checks if the path in the command
>>> above is a ref, which makes it iterate over all of them..
>>
>> Do you mean:
>>
>> 	/* Is it a rev? */
>>          get_sha1()
>>          ->  ...
>>            ->  get_sha1_basic()
>>              ->  dwim_ref()
>>
>> callpath?
>
> Yes, indeed.  Hmm, this is done even if the paths come after a
> double-dash.  Anyway, I don't consider the check to be a performance
> issue, just a quick way to test the allocation count that i stumbled
> upon while working on the recent grep patches.

I was merely reacting "iterate over all of them"; dwim_ref() only checks
if .git/blah, .git/refs/heads/blah, .git/refs/tags/blah, etc.  exists and
the number of checks do not depend on the number of refs you have, so I
was wondering if I overlooked something that does for_each_ref() of
everything.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] refs: use strings directly in find_containing_dir()
  2012-05-23 16:56           ` Junio C Hamano
@ 2012-05-23 17:15             ` René Scharfe
  2012-05-24  4:34               ` Michael Haggerty
  0 siblings, 1 reply; 12+ messages in thread
From: René Scharfe @ 2012-05-23 17:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: mhagger, Jeff King, git

Am 23.05.2012 18:56, schrieb Junio C Hamano:
> René Scharfe<rene.scharfe@lsrfire.ath.cx>  writes:
> 
>> Am 23.05.2012 00:18, schrieb Junio C Hamano:
>>> René Scharfe<rene.scharfe@lsrfire.ath.cx>   writes:
>>>
>>>> What has git grep to do with refs?  It checks if the path in the command
>>>> above is a ref, which makes it iterate over all of them..
>>>
>>> Do you mean:
>>>
>>> 	/* Is it a rev? */
>>>           get_sha1()
>>>           ->   ...
>>>             ->   get_sha1_basic()
>>>               ->   dwim_ref()
>>>
>>> callpath?
>>
>> Yes, indeed.  Hmm, this is done even if the paths come after a
>> double-dash.  Anyway, I don't consider the check to be a performance
>> issue, just a quick way to test the allocation count that i stumbled
>> upon while working on the recent grep patches.
> 
> I was merely reacting "iterate over all of them"; dwim_ref() only checks
> if .git/blah, .git/refs/heads/blah, .git/refs/tags/blah, etc.  exists and
> the number of checks do not depend on the number of refs you have, so I
> was wondering if I overlooked something that does for_each_ref() of
> everything.

Yeah, for loose refs that's true. However, I have 470 packed refs, and
this command:

	$ valgrind --tool=exp-dhat ./git grep guess xdiff/xutils.c

reports (among other findings):

==28255== max-live:    30,334 in 470 blocks
==28255== tot-alloc:   30,334 in 470 blocks (avg size 64.54)
==28255== deaths:      none (none of these blocks were freed)
==28255== acc-ratios:  7.76 rd, 0.95 wr  (235,582 b-read, 28,924 b-written)
==28255==    at 0x402AEE8: malloc (in /usr/lib/valgrind/vgpreload_exp-dhat-x86-linux.so)
==28255==    by 0x813691D: xmalloc (wrapper.c:50)
==28255==    by 0x8106B1A: create_ref_entry.constprop.8 (refs.c:250)
==28255==    by 0x8107761: read_packed_refs (refs.c:817)
==28255==    by 0x810785F: get_packed_refs (refs.c:843)
==28255==    by 0x8107BE7: resolve_ref_unsafe (refs.c:1028)
==28255==    by 0x81090AA: dwim_ref (refs.c:1549)
==28255==    by 0x8122E06: get_sha1_1 (sha1_name.c:304)
==28255==    by 0x81237EF: get_sha1_with_context_1 (sha1_name.c:1044)
==28255==    by 0x8124016: get_sha1 (cache.h:795)
==28255==    by 0x75782F65: ???

René

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] refs: use strings directly in find_containing_dir()
  2012-05-23 17:15             ` René Scharfe
@ 2012-05-24  4:34               ` Michael Haggerty
  0 siblings, 0 replies; 12+ messages in thread
From: Michael Haggerty @ 2012-05-24  4:34 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, Jeff King, git

On 05/23/2012 07:15 PM, René Scharfe wrote:
> Am 23.05.2012 18:56, schrieb Junio C Hamano:
>> René Scharfe<rene.scharfe@lsrfire.ath.cx>   writes:
>>
>>> Am 23.05.2012 00:18, schrieb Junio C Hamano:
>>>> René Scharfe<rene.scharfe@lsrfire.ath.cx>    writes:
>>>>
>>>>> What has git grep to do with refs?  It checks if the path in the command
>>>>> above is a ref, which makes it iterate over all of them..
>>>>
>>>> Do you mean:
>>>>
>>>> 	/* Is it a rev? */
>>>>            get_sha1()
>>>>            ->    ...
>>>>              ->    get_sha1_basic()
>>>>                ->    dwim_ref()
>>>>
>>>> callpath?
>>>
>>> Yes, indeed.  Hmm, this is done even if the paths come after a
>>> double-dash.  Anyway, I don't consider the check to be a performance
>>> issue, just a quick way to test the allocation count that i stumbled
>>> upon while working on the recent grep patches.
>>
>> I was merely reacting "iterate over all of them"; dwim_ref() only checks
>> if .git/blah, .git/refs/heads/blah, .git/refs/tags/blah, etc.  exists and
>> the number of checks do not depend on the number of refs you have, so I
>> was wondering if I overlooked something that does for_each_ref() of
>> everything.
>
> Yeah, for loose refs that's true. However, I have 470 packed refs, and
> this command:
>
> 	$ valgrind --tool=exp-dhat ./git grep guess xdiff/xutils.c
>
> reports (among other findings):
>
> ==28255== max-live:    30,334 in 470 blocks
> ==28255== tot-alloc:   30,334 in 470 blocks (avg size 64.54)
> ==28255== deaths:      none (none of these blocks were freed)
> ==28255== acc-ratios:  7.76 rd, 0.95 wr  (235,582 b-read, 28,924 b-written)
> ==28255==    at 0x402AEE8: malloc (in /usr/lib/valgrind/vgpreload_exp-dhat-x86-linux.so)
> ==28255==    by 0x813691D: xmalloc (wrapper.c:50)
> ==28255==    by 0x8106B1A: create_ref_entry.constprop.8 (refs.c:250)
> ==28255==    by 0x8107761: read_packed_refs (refs.c:817)
> ==28255==    by 0x810785F: get_packed_refs (refs.c:843)
> ==28255==    by 0x8107BE7: resolve_ref_unsafe (refs.c:1028)
> ==28255==    by 0x81090AA: dwim_ref (refs.c:1549)
> ==28255==    by 0x8122E06: get_sha1_1 (sha1_name.c:304)
> ==28255==    by 0x81237EF: get_sha1_with_context_1 (sha1_name.c:1044)
> ==28255==    by 0x8124016: get_sha1 (cache.h:795)
> ==28255==    by 0x75782F65: ???

You are seeing all of the packed refs being *read*, which is triggered 
in this case by resolve_ref_unsafe() -> get_packed_refs().  This stack 
trace does not imply that all references are being iterated through as 
part of a search.  IOW, if dwim_ref() were called again, there would not 
be another 470 allocations.

This is similar to the old code, which also read all of the packed refs 
whenever any of them were accessed.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2012-05-24  4:41 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-05-22 13:16 [PATCH] find_containing_dir(): allocate strbuf less extravagantly mhagger
2012-05-22 17:34 ` Jeff King
2012-05-22 18:50 ` [PATCH 1/3] refs: convert parameter of search_ref_dir() to length-limited string René Scharfe
2012-05-22 18:50 ` [PATCH 2/3] refs: convert parameter of create_dir_entry() " René Scharfe
2012-05-22 18:50 ` [PATCH 3/3] refs: use strings directly in find_containing_dir() René Scharfe
2012-05-22 21:27   ` Junio C Hamano
2012-05-22 22:11     ` René Scharfe
2012-05-22 22:18       ` Junio C Hamano
2012-05-23 16:20         ` René Scharfe
2012-05-23 16:56           ` Junio C Hamano
2012-05-23 17:15             ` René Scharfe
2012-05-24  4:34               ` Michael Haggerty

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).