git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread
@ 2017-03-28 19:07 git
  2017-03-28 19:07 ` [PATCH v3 1/2] read-cache: core.checksumindex git
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: git @ 2017-03-28 19:07 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Version 3 of this patch series simplifies this effort to just turn
on/off the hash verification using a "core.checksumindex" config variable.

I've preserved the original checksum validation code so that we can
force it on in fsck if desired.

It eliminates the original threading model completely.

Jeff Hostetler (2):
  read-cache: core.checksumindex
  test-core-checksum-index: core.checksumindex test helper

 Makefile                            |  1 +
 read-cache.c                        | 12 ++++++
 t/helper/.gitignore                 |  1 +
 t/helper/test-core-checksum-index.c | 77 +++++++++++++++++++++++++++++++++++++
 4 files changed, 91 insertions(+)
 create mode 100644 t/helper/test-core-checksum-index.c

-- 
2.9.3


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v3 1/2] read-cache: core.checksumindex
  2017-03-28 19:07 [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread git
@ 2017-03-28 19:07 ` git
  2017-03-28 19:07 ` [PATCH v3 2/2] test-core-checksum-index: core.checksumindex test helper git
  2017-03-28 19:16 ` [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread Jeff King
  2 siblings, 0 replies; 13+ messages in thread
From: git @ 2017-03-28 19:07 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach git to skip verification of the index SHA in verify_hdr()
in read_index().

This is a performance optimization.  The index file SHA verification
can be considered an ancient relic from the early days of git and only
useful for detecting disk corruption.  For small repositories, this
SHA calculation is not that significant, but for gigantic repositories
this calculation adds significant time to every command.

Added "core.checksumindex" to enable/disable the SHA verification.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 read-cache.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/read-cache.c b/read-cache.c
index 9054369..2ab4b74 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1376,12 +1376,24 @@ static int verify_hdr(struct cache_header *hdr, unsigned long size)
 	git_SHA_CTX c;
 	unsigned char sha1[20];
 	int hdr_version;
+	int do_checksum = 0;
 
 	if (hdr->hdr_signature != htonl(CACHE_SIGNATURE))
 		return error("bad signature");
 	hdr_version = ntohl(hdr->hdr_version);
 	if (hdr_version < INDEX_FORMAT_LB || INDEX_FORMAT_UB < hdr_version)
 		return error("bad index version %d", hdr_version);
+
+	/*
+	 * Since we run very early in command startup, git_config()
+	 * may not have been called yet and the various "core_*"
+	 * global variables haven't been set.  So look it up
+	 * explicitly.
+	 */
+	git_config_get_bool("core.checksumindex", &do_checksum);
+	if (!do_checksum)
+		return 0;
+
 	git_SHA1_Init(&c);
 	git_SHA1_Update(&c, hdr, size - 20);
 	git_SHA1_Final(sha1, &c);
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v3 2/2] test-core-checksum-index: core.checksumindex test helper
  2017-03-28 19:07 [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread git
  2017-03-28 19:07 ` [PATCH v3 1/2] read-cache: core.checksumindex git
@ 2017-03-28 19:07 ` git
  2017-03-28 19:16 ` [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread Jeff King
  2 siblings, 0 replies; 13+ messages in thread
From: git @ 2017-03-28 19:07 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Created test helper to measure read_index() with and without
core.checksumindex set and report performance.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 Makefile                            |  1 +
 t/helper/.gitignore                 |  1 +
 t/helper/test-core-checksum-index.c | 77 +++++++++++++++++++++++++++++++++++++
 3 files changed, 79 insertions(+)
 create mode 100644 t/helper/test-core-checksum-index.c

diff --git a/Makefile b/Makefile
index 9ec6065..6049427 100644
--- a/Makefile
+++ b/Makefile
@@ -607,6 +607,7 @@ PROGRAMS += $(patsubst %.o,git-%$X,$(PROGRAM_OBJS))
 TEST_PROGRAMS_NEED_X += test-chmtime
 TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-config
+TEST_PROGRAMS_NEED_X += test-core-checksum-index
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index d6e8b36..f651a6b 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -1,6 +1,7 @@
 /test-chmtime
 /test-ctype
 /test-config
+/test-core-checksum-index
 /test-date
 /test-delta
 /test-dump-cache-tree
diff --git a/t/helper/test-core-checksum-index.c b/t/helper/test-core-checksum-index.c
new file mode 100644
index 0000000..0452dc2
--- /dev/null
+++ b/t/helper/test-core-checksum-index.c
@@ -0,0 +1,77 @@
+#include "cache.h"
+#include "parse-options.h"
+
+uint64_t time_runs(int do_checksum, int count)
+{
+	uint64_t t0, t1;
+	uint64_t sum = 0;
+	uint64_t avg;
+	int i;
+
+	git_config_set_gently("core.checksumindex",
+		(do_checksum ? "true" : "false"));
+
+	for (i = 0; i < count; i++) {
+		t0 = getnanotime();
+		read_cache();
+		t1 = getnanotime();
+
+		sum += (t1 - t0);
+
+		printf("%f %d [cache_nr %d]\n",
+			   ((double)(t1 - t0))/1000000000,
+			   do_checksum,
+			   the_index.cache_nr);
+		fflush(stdout);
+
+		discard_cache();
+	}
+
+	avg = sum / count;
+	if (count > 1) {
+		printf("%f %d avg\n",
+			   (double)avg/1000000000,
+			   do_checksum);
+		fflush(stdout);
+	}
+
+	return avg;
+}
+
+int cmd_main(int argc, const char **argv)
+{
+	int original_do_checksum;
+	int count = 1;
+	const char *usage[] = {
+		"test-core-checksum-index",
+		NULL
+	};
+	struct option options[] = {
+		OPT_INTEGER('c', "count", &count, "number of passes"),
+		OPT_END(),
+	};
+	const char *prefix;
+	uint64_t avg_0, avg_1;
+
+	prefix = setup_git_directory();
+	argc = parse_options(argc, argv, prefix, options, usage, 0);
+
+	if (count < 1)
+		die("count must greater than zero");
+
+	/* throw away call to get the index in the disk cache */
+	read_cache();
+	discard_cache();
+
+	git_config_get_bool("core.checksumindex", &original_do_checksum);
+
+	avg_1 = time_runs(1, count);
+	avg_0 = time_runs(0, count);
+
+	git_config_set_gently("core.checksumindex",
+		((original_do_checksum) ? "true" : "false"));
+
+	if (avg_0 >= avg_1)
+		die("skipping index checksum verification did not help");
+	return 0;
+}
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread
  2017-03-28 19:07 [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread git
  2017-03-28 19:07 ` [PATCH v3 1/2] read-cache: core.checksumindex git
  2017-03-28 19:07 ` [PATCH v3 2/2] test-core-checksum-index: core.checksumindex test helper git
@ 2017-03-28 19:16 ` Jeff King
  2017-03-28 19:50   ` Jeff Hostetler
  2 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2017-03-28 19:16 UTC (permalink / raw)
  To: git; +Cc: git, gitster, Jeff Hostetler

On Tue, Mar 28, 2017 at 07:07:30PM +0000, git@jeffhostetler.com wrote:

> From: Jeff Hostetler <jeffhost@microsoft.com>
> 
> Version 3 of this patch series simplifies this effort to just turn
> on/off the hash verification using a "core.checksumindex" config variable.
> 
> I've preserved the original checksum validation code so that we can
> force it on in fsck if desired.
> 
> It eliminates the original threading model completely.
> 
> Jeff Hostetler (2):
>   read-cache: core.checksumindex
>   test-core-checksum-index: core.checksumindex test helper
> 
>  Makefile                            |  1 +
>  read-cache.c                        | 12 ++++++
>  t/helper/.gitignore                 |  1 +
>  t/helper/test-core-checksum-index.c | 77 +++++++++++++++++++++++++++++++++++++

Do we still need test-core-checksum-index? Can we just time ls-files or
something in t/perf?

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread
  2017-03-28 19:16 ` [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread Jeff King
@ 2017-03-28 19:50   ` Jeff Hostetler
  2017-03-28 19:56     ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff Hostetler @ 2017-03-28 19:50 UTC (permalink / raw)
  To: Jeff King; +Cc: git, gitster, Jeff Hostetler



On 3/28/2017 3:16 PM, Jeff King wrote:
> On Tue, Mar 28, 2017 at 07:07:30PM +0000, git@jeffhostetler.com wrote:
>
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> Version 3 of this patch series simplifies this effort to just turn
>> on/off the hash verification using a "core.checksumindex" config variable.
>>
>> I've preserved the original checksum validation code so that we can
>> force it on in fsck if desired.
>>
>> It eliminates the original threading model completely.
>>
>> Jeff Hostetler (2):
>>   read-cache: core.checksumindex
>>   test-core-checksum-index: core.checksumindex test helper
>>
>>  Makefile                            |  1 +
>>  read-cache.c                        | 12 ++++++
>>  t/helper/.gitignore                 |  1 +
>>  t/helper/test-core-checksum-index.c | 77 +++++++++++++++++++++++++++++++++++++
>
> Do we still need test-core-checksum-index? Can we just time ls-files or
> something in t/perf?

It was a convenient way to isolate, average, and compare
read_index() times, but I suppose we could do something
like that.

I did confirm that a ls-files does show a slight 0.008
second difference on the 58K file Linux tree when toggled
on or off.

But I'm tempted to suggest that we just omit my helper exe
and not worry about a test -- since we don't have any test
repos large enough to really demonstrate the differences.
My concern is that that 0.008 would be lost in the noise
of the rest of the test and make for an unreliable result.

Jeff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread
  2017-03-28 19:50   ` Jeff Hostetler
@ 2017-03-28 19:56     ` Jeff King
  2017-03-30 19:49       ` Junio C Hamano
  2017-03-30 20:06       ` Thomas Gummerer
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff King @ 2017-03-28 19:56 UTC (permalink / raw)
  To: Jeff Hostetler; +Cc: git, gitster, Jeff Hostetler

On Tue, Mar 28, 2017 at 03:50:34PM -0400, Jeff Hostetler wrote:

> It was a convenient way to isolate, average, and compare
> read_index() times, but I suppose we could do something
> like that.
> 
> I did confirm that a ls-files does show a slight 0.008
> second difference on the 58K file Linux tree when toggled
> on or off.

Yeah, I agree it helps isolate the change. I'm just not sure we want to
carry a bunch of function-specific perf-testing code. And one of the
nice things about testing a real command is that it's...a real command.
So it's an actual improvement a user might see.

> But I'm tempted to suggest that we just omit my helper exe
> and not worry about a test -- since we don't have any test
> repos large enough to really demonstrate the differences.
> My concern is that that 0.008 would be lost in the noise
> of the rest of the test and make for an unreliable result.

Yeah, I think that would be fine. You _could_ write a t/perf test and
then use your 400MB monstrosity as GIT_PERF_LARGE_REPO. But given that
most people don't have such a thing, there's not much value over you
just showing off the perf improvement in the commit message.

We could also have a t/perf test that generates a monstrous index and
shows that it's faster. But frankly, I don't think this is all that
interesting as a performance regression test. It's not like there's
something subtle about the performance improvement; we stopped computing
the SHA-1, and (gasp!) it takes exactly one SHA-1 computation's less
time.

So just mentioning the test case and the improvement in the commit
message is sufficient, IMHO.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread
  2017-03-28 19:56     ` Jeff King
@ 2017-03-30 19:49       ` Junio C Hamano
  2017-03-30 19:58         ` Jeff King
  2017-03-30 20:06       ` Thomas Gummerer
  1 sibling, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2017-03-30 19:49 UTC (permalink / raw)
  To: Jeff Hostetler, Jeff King; +Cc: git, Jeff Hostetler

Jeff King <peff@peff.net> writes:

> So just mentioning the test case and the improvement in the commit
> message is sufficient, IMHO.

So here is how I butchered [v3 1/2] to tentatively queue it on 'pu'.

Notable suggested changes I have in this one are:

 * I stole the numbers from the cover letter of v2 and added them at
   the end of the log message.

 * As the checksum is not a useless relic, but is an integrity
   check, I dropped the "ancient relic" from the proposed log
   message.  It is just that the modern disks are reliable enough to
   make it worthwhile to think about a trade-off this patch makes
   between performance and integrity.

 * As it is customary, the configuration variable starts as an opt
   in feature.  In a few releases, perhaps we can flip the default,
   but we do not do so from day one.

 * Updated the code around the call to config-get-bool to avoid
   asking the same question twice.

 * Added minimum documentation.

By the way, are we sure we have something that validates the
checksum regardless of the configuration setting?  If not, we may
want to tweak this further so that we can force the validation from
"git fsck" or something.  I am not going to do that myself, but it
may be necessary before this graduates to 'master'.

Thanks.

-- >8 --
From: Jeff Hostetler <jeffhost@microsoft.com>
Date: Tue, 28 Mar 2017 19:07:31 +0000
Subject: [PATCH] read-cache: core.checksumindex

Teach git to skip verification of the SHA-1 checksum at the end of
the index file in verify_hdr() called from read_index() when the
core.checksumIndex configuration variable is set to false.

The checksum verification is for detecting disk corruption, and for
small projects, the time it takes to compute SHA-1 is not that
significant, but for gigantic repositories this calculation adds
significant time to every command.

On the Linux kernel repository, the effect is rather trivial.
The time to reading its index with 58k entries drops from 0.0284 sec
down to 0.0155 sec.

On my Windows source tree (450MB index), I'm seeing a savings of 0.6
seconds -- read_index() went from 1.2 to 0.6 seconds.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt |  8 ++++++++
 read-cache.c             | 16 ++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 1df1965457..bc7b216d43 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -329,6 +329,14 @@ advice.*::
 		show directions on how to proceed from the current state.
 --
 
+core.checksumIndex::
+	Tell Git to validate the checksum at the end of the index
+	file to detect corruption.  Defaults to `true`.  Those who
+	work on a project with too many files may want to set this
+	variable to `false` to make it faster to load the index (in
+	exchange for reliability, but in general modern disks are
+	reliable enough for most people).
+
 core.fileMode::
 	Tells Git if the executable bit of files in the working tree
 	is to be honored.
diff --git a/read-cache.c b/read-cache.c
index e447751823..3195702cf7 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1376,12 +1376,28 @@ static int verify_hdr(struct cache_header *hdr, unsigned long size)
 	git_SHA_CTX c;
 	unsigned char sha1[20];
 	int hdr_version;
+	static int do_checksum = -1;
 
 	if (hdr->hdr_signature != htonl(CACHE_SIGNATURE))
 		return error("bad signature");
 	hdr_version = ntohl(hdr->hdr_version);
 	if (hdr_version < INDEX_FORMAT_LB || INDEX_FORMAT_UB < hdr_version)
 		return error("bad index version %d", hdr_version);
+
+	if (do_checksum < 0) {
+		/*
+		 * Since we run very early in command startup, git_config()
+		 * may not have been called yet and the various "core_*"
+		 * global variables haven't been set.  So look it up
+		 * explicitly.
+		 */
+		git_config_get_bool("core.checksumindex", &do_checksum);
+		if (do_checksum < 0)
+			do_checksum = 0; /* default to false */
+	}
+	if (!do_checksum)
+		return 0;
+
 	git_SHA1_Init(&c);
 	git_SHA1_Update(&c, hdr, size - 20);
 	git_SHA1_Final(sha1, &c);
-- 
2.12.2-727-gf32eb5229d


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread
  2017-03-30 19:49       ` Junio C Hamano
@ 2017-03-30 19:58         ` Jeff King
  2017-03-30 20:44           ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2017-03-30 19:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff Hostetler, git, Jeff Hostetler

On Thu, Mar 30, 2017 at 12:49:15PM -0700, Junio C Hamano wrote:

> Notable suggested changes I have in this one are:
> 
>  * I stole the numbers from the cover letter of v2 and added them at
>    the end of the log message.

Yeah, good.

>  * As the checksum is not a useless relic, but is an integrity
>    check, I dropped the "ancient relic" from the proposed log
>    message.  It is just that the modern disks are reliable enough to
>    make it worthwhile to think about a trade-off this patch makes
>    between performance and integrity.

Yeah, I'd agree this is more of a tradeoff than a cleanup.

>  * As it is customary, the configuration variable starts as an opt
>    in feature.  In a few releases, perhaps we can flip the default,
>    but we do not do so from day one.

I think this is so user-invisible that it doesn't really matter much,
but I'm OK with taking it slow.

>  * Updated the code around the call to config-get-bool to avoid
>    asking the same question twice.

A few comments on that below.

>  * Added minimum documentation.

Yep, looks good.

> By the way, are we sure we have something that validates the
> checksum regardless of the configuration setting?  If not, we may
> want to tweak this further so that we can force the validation from
> "git fsck" or something.  I am not going to do that myself, but it
> may be necessary before this graduates to 'master'.

Yes, I'd agree this shouldn't graduate without the matching change to
teach fsck to flip the switch back.

> +	static int do_checksum = -1;
> [...]
> +	if (do_checksum < 0) {
> +		/*
> +		 * Since we run very early in command startup, git_config()
> +		 * may not have been called yet and the various "core_*"
> +		 * global variables haven't been set.  So look it up
> +		 * explicitly.
> +		 */
> +		git_config_get_bool("core.checksumindex", &do_checksum);
> +		if (do_checksum < 0)
> +			do_checksum = 0; /* default to false */
> +	}
> +	if (!do_checksum)
> +		return 0;

This is basically introducing a new caching layer, but there's no way to
invalidate it (if, say, we looked at the config once and then we knew
that the config changed).  I think it's probably OK, because the main
reason for the config to change is reading it before/after repository
setup. But since this is index code, it shouldn't be called at all
before repository setup.

Still, I'm not sure the extra layer of cache is all that valuable. It
should be a single hash lookup in the config cache (in an operation that
otherwise reads the entire index).

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread
  2017-03-28 19:56     ` Jeff King
  2017-03-30 19:49       ` Junio C Hamano
@ 2017-03-30 20:06       ` Thomas Gummerer
  2017-03-30 20:39         ` Jeff King
  1 sibling, 1 reply; 13+ messages in thread
From: Thomas Gummerer @ 2017-03-30 20:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Jeff Hostetler, git, gitster, Jeff Hostetler

On 03/28, Jeff King wrote:
> On Tue, Mar 28, 2017 at 03:50:34PM -0400, Jeff Hostetler wrote:
> 
> > It was a convenient way to isolate, average, and compare
> > read_index() times, but I suppose we could do something
> > like that.
> > 
> > I did confirm that a ls-files does show a slight 0.008
> > second difference on the 58K file Linux tree when toggled
> > on or off.
> 
> Yeah, I agree it helps isolate the change. I'm just not sure we want to
> carry a bunch of function-specific perf-testing code. And one of the
> nice things about testing a real command is that it's...a real command.
> So it's an actual improvement a user might see.
> 
> > But I'm tempted to suggest that we just omit my helper exe
> > and not worry about a test -- since we don't have any test
> > repos large enough to really demonstrate the differences.
> > My concern is that that 0.008 would be lost in the noise
> > of the rest of the test and make for an unreliable result.
> 
> Yeah, I think that would be fine. You _could_ write a t/perf test and
> then use your 400MB monstrosity as GIT_PERF_LARGE_REPO. But given that
> most people don't have such a thing, there's not much value over you
> just showing off the perf improvement in the commit message.

Sorry if this was already discussed, but we already do have a perf
test for the index (p0002), and a corresponding helper program which
just does read_cache() and discard_cache().  Maybe we could re-use
that and add a second test running the same using the new config?

> We could also have a t/perf test that generates a monstrous index and
> shows that it's faster. But frankly, I don't think this is all that
> interesting as a performance regression test. It's not like there's
> something subtle about the performance improvement; we stopped computing
> the SHA-1, and (gasp!) it takes exactly one SHA-1 computation's less
> time.
> 
> So just mentioning the test case and the improvement in the commit
> message is sufficient, IMHO.
> 
> -Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread
  2017-03-30 20:06       ` Thomas Gummerer
@ 2017-03-30 20:39         ` Jeff King
  2017-03-31 13:23           ` Jeff Hostetler
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2017-03-30 20:39 UTC (permalink / raw)
  To: Thomas Gummerer; +Cc: Jeff Hostetler, git, gitster, Jeff Hostetler

On Thu, Mar 30, 2017 at 09:06:48PM +0100, Thomas Gummerer wrote:

> > Yeah, I think that would be fine. You _could_ write a t/perf test and
> > then use your 400MB monstrosity as GIT_PERF_LARGE_REPO. But given that
> > most people don't have such a thing, there's not much value over you
> > just showing off the perf improvement in the commit message.
> 
> Sorry if this was already discussed, but we already do have a perf
> test for the index (p0002), and a corresponding helper program which
> just does read_cache() and discard_cache().  Maybe we could re-use
> that and add a second test running the same using the new config?

Oh, indeed. Yes, I would think the results of p0002 would probably show
off Jeff's improvements.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread
  2017-03-30 19:58         ` Jeff King
@ 2017-03-30 20:44           ` Junio C Hamano
  2017-03-31 13:20             ` Jeff Hostetler
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2017-03-30 20:44 UTC (permalink / raw)
  To: Jeff King; +Cc: Jeff Hostetler, git, Jeff Hostetler

Jeff King <peff@peff.net> writes:

> Still, I'm not sure the extra layer of cache is all that valuable. It
> should be a single hash lookup in the config cache (in an operation that
> otherwise reads the entire index).

OK, let's drop that part, then.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread
  2017-03-30 20:44           ` Junio C Hamano
@ 2017-03-31 13:20             ` Jeff Hostetler
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff Hostetler @ 2017-03-31 13:20 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: git, Jeff Hostetler



On 3/30/2017 4:44 PM, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
>> Still, I'm not sure the extra layer of cache is all that valuable. It
>> should be a single hash lookup in the config cache (in an operation that
>> otherwise reads the entire index).
>
> OK, let's drop that part, then.
>

Yes, let's omit the caching there.  That way perf tests can run it
multiple times with it on or off as they see fit.  And in the normal
case it will only be executed once anyway.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread
  2017-03-30 20:39         ` Jeff King
@ 2017-03-31 13:23           ` Jeff Hostetler
  0 siblings, 0 replies; 13+ messages in thread
From: Jeff Hostetler @ 2017-03-31 13:23 UTC (permalink / raw)
  To: Jeff King, Thomas Gummerer; +Cc: git, gitster, Jeff Hostetler



On 3/30/2017 4:39 PM, Jeff King wrote:
> On Thu, Mar 30, 2017 at 09:06:48PM +0100, Thomas Gummerer wrote:
>
>>> Yeah, I think that would be fine. You _could_ write a t/perf test and
>>> then use your 400MB monstrosity as GIT_PERF_LARGE_REPO. But given that
>>> most people don't have such a thing, there's not much value over you
>>> just showing off the perf improvement in the commit message.
>>
>> Sorry if this was already discussed, but we already do have a perf
>> test for the index (p0002), and a corresponding helper program which
>> just does read_cache() and discard_cache().  Maybe we could re-use
>> that and add a second test running the same using the new config?
>
> Oh, indeed. Yes, I would think the results of p0002 would probably show
> off Jeff's improvements.
>
> -Peff
>

Let me re-roll it with Junio's cleanup, update fsck to force it on,
and look at using p0002.

Thanks,
Jeff

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-03-31 13:24 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-28 19:07 [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread git
2017-03-28 19:07 ` [PATCH v3 1/2] read-cache: core.checksumindex git
2017-03-28 19:07 ` [PATCH v3 2/2] test-core-checksum-index: core.checksumindex test helper git
2017-03-28 19:16 ` [PATCH v3 0/2] read-cache: call verify_hdr() in a background thread Jeff King
2017-03-28 19:50   ` Jeff Hostetler
2017-03-28 19:56     ` Jeff King
2017-03-30 19:49       ` Junio C Hamano
2017-03-30 19:58         ` Jeff King
2017-03-30 20:44           ` Junio C Hamano
2017-03-31 13:20             ` Jeff Hostetler
2017-03-30 20:06       ` Thomas Gummerer
2017-03-30 20:39         ` Jeff King
2017-03-31 13:23           ` Jeff Hostetler

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).