git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v5] read-cache: call verify_hdr() in a background thread
@ 2017-04-05 14:55 git
  2017-04-05 14:55 ` [PATCH v5] read-cache: force_verify_index_checksum git
  0 siblings, 1 reply; 4+ messages in thread
From: git @ 2017-04-05 14:55 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Version 5 of this patch series further simplifies the change to a
single global variable for use by fsck.  It eliminates the core
config setting.  Further discussion on the mailing list indicated
that the config setting only added confusion and probably would not
be used by anyone.

Jeff Hostetler (1):
  read-cache: force_verify_index_checksum

 builtin/fsck.c  |  1 +
 cache.h         |  2 ++
 read-cache.c    |  7 +++++++
 t/t1450-fsck.sh | 11 +++++++++++
 4 files changed, 21 insertions(+)

-- 
2.9.3


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v5] read-cache: force_verify_index_checksum
  2017-04-05 14:55 [PATCH v5] read-cache: call verify_hdr() in a background thread git
@ 2017-04-05 14:55 ` git
  2017-04-05 17:03   ` Jonathan Nieder
  0 siblings, 1 reply; 4+ messages in thread
From: git @ 2017-04-05 14:55 UTC (permalink / raw)
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach git to skip verification of the SHA1-1 checksum at the end of
the index file in verify_hdr() which is called from read_index()
unless the "force_verify_index_checksum" global variable is set.

Teach fsck to force this verification.

The checksum verification is for detecting disk corruption, and for
small projects, the time it takes to compute SHA-1 is not that
significant, but for gigantic repositories this calculation adds
significant time to every command.

These effect can be seen using t/perf/p0002-read-cache.sh:

================
BEFORE:
$ ./p0002-read-cache.sh
0002.1: read_cache/discard_cache 1000 times   0.66(0.58+0.08)

AFTER:
$ ./p0002-read-cache.sh
0002.1: read_cache/discard_cache 1000 times   0.30(0.28+0.01)
================

Also, using the t/perf/p0004-read-tree.sh test that I created
in a parallel patch series.

https://public-inbox.org/git/20170404210847.50860-1-git@jeffhostetler.com/

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 builtin/fsck.c  |  1 +
 cache.h         |  2 ++
 read-cache.c    |  7 +++++++
 t/t1450-fsck.sh | 11 +++++++++++
 4 files changed, 21 insertions(+)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 1a5cacc..04c5e50 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -771,6 +771,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	}
 
 	if (keep_cache_objects) {
+		force_verify_index_checksum = 1;
 		read_cache();
 		for (i = 0; i < active_nr; i++) {
 			unsigned int mode;
diff --git a/cache.h b/cache.h
index 80b6372..1cea517 100644
--- a/cache.h
+++ b/cache.h
@@ -685,6 +685,8 @@ extern void update_index_if_able(struct index_state *, struct lock_file *);
 extern int hold_locked_index(struct lock_file *, int);
 extern void set_alternate_index_output(const char *);
 
+extern int force_verify_index_checksum;
+
 /* Environment bits from configuration mechanism */
 extern int trust_executable_bit;
 extern int trust_ctime;
diff --git a/read-cache.c b/read-cache.c
index 9054369..5637fa3 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1371,6 +1371,9 @@ struct ondisk_cache_entry_extended {
 			    ondisk_cache_entry_extended_size(ce_namelen(ce)) : \
 			    ondisk_cache_entry_size(ce_namelen(ce)))
 
+/* Allow fsck to force verification of the index checksum. */
+int force_verify_index_checksum;
+
 static int verify_hdr(struct cache_header *hdr, unsigned long size)
 {
 	git_SHA_CTX c;
@@ -1382,6 +1385,10 @@ static int verify_hdr(struct cache_header *hdr, unsigned long size)
 	hdr_version = ntohl(hdr->hdr_version);
 	if (hdr_version < INDEX_FORMAT_LB || INDEX_FORMAT_UB < hdr_version)
 		return error("bad index version %d", hdr_version);
+
+	if (!force_verify_index_checksum)
+		return 0;
+
 	git_SHA1_Init(&c);
 	git_SHA1_Update(&c, hdr, size - 20);
 	git_SHA1_Final(sha1, &c);
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 33a51c9..86757ff 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -689,4 +689,15 @@ test_expect_success 'bogus head does not fallback to all heads' '
 	! grep $blob out
 '
 
+test_expect_success PERL 'detect corrupt index file in fsck' '
+	cp .git/index .git/index.backup &&
+	echo zzzzzzzz >zzzzzzzz &&
+	git add zzzzzzzz &&
+	perl -pi -e "s/zzzzzzzz/yyyyyyyy/" .git/index &&
+	test_must_fail git fsck --cache &&
+	rm .git/index &&
+	mv .git/index.backup .git/index &&
+	rm zzzzzzzz
+'
+
 test_done
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v5] read-cache: force_verify_index_checksum
  2017-04-05 14:55 ` [PATCH v5] read-cache: force_verify_index_checksum git
@ 2017-04-05 17:03   ` Jonathan Nieder
  2017-04-05 18:08     ` Jeff Hostetler
  0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Nieder @ 2017-04-05 17:03 UTC (permalink / raw)
  To: git; +Cc: git, gitster, peff, Jeff Hostetler

Hi,

git@jeffhostetler.com wrote:

> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>  builtin/fsck.c  |  1 +
>  cache.h         |  2 ++
>  read-cache.c    |  7 +++++++
>  t/t1450-fsck.sh | 11 +++++++++++
>  4 files changed, 21 insertions(+)

Yay!  I love this version.

> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -771,6 +771,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
>  	}
>  
>  	if (keep_cache_objects) {
> +		force_verify_index_checksum = 1;

nit: now that there isn't a config this is overriding, this isn't
"force_verify" so much as "verify".

[...]
> +++ b/t/t1450-fsck.sh
> @@ -689,4 +689,15 @@ test_expect_success 'bogus head does not fallback to all heads' '
>  	! grep $blob out
>  '
>  
> +test_expect_success PERL 'detect corrupt index file in fsck' '
> +	cp .git/index .git/index.backup &&
> +	echo zzzzzzzz >zzzzzzzz &&
> +	git add zzzzzzzz &&
> +	perl -pi -e "s/zzzzzzzz/yyyyyyyy/" .git/index &&
> +	test_must_fail git fsck --cache &&
> +	rm .git/index &&
> +	mv .git/index.backup .git/index &&
> +	rm zzzzzzzz
> +'

This is great.

optional: you can do the cleanup commands in test_when_finished to
make sure they happen even if the test fails.

Tests don't seem to use "perl -pi" anywhere else.  This instance could
be simplified by using sed.

With whatever subset of the changes below look good,
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

diff --git i/builtin/fsck.c w/builtin/fsck.c
index 92324e130c..b5e13a4556 100644
--- i/builtin/fsck.c
+++ w/builtin/fsck.c
@@ -771,7 +771,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 	}
 
 	if (keep_cache_objects) {
-		force_verify_index_checksum = 1;
+		verify_index_checksum = 1;
 		read_cache();
 		for (i = 0; i < active_nr; i++) {
 			unsigned int mode;
diff --git i/cache.h w/cache.h
index 48b47083d3..9834d71f28 100644
--- i/cache.h
+++ w/cache.h
@@ -706,7 +706,7 @@ extern void update_index_if_able(struct index_state *, struct lock_file *);
 extern int hold_locked_index(struct lock_file *, int);
 extern void set_alternate_index_output(const char *);
 
-extern int force_verify_index_checksum;
+extern int verify_index_checksum;
 
 /* Environment bits from configuration mechanism */
 extern int trust_executable_bit;
diff --git i/read-cache.c w/read-cache.c
index 13326464d4..008b335844 100644
--- i/read-cache.c
+++ w/read-cache.c
@@ -1372,7 +1372,7 @@ struct ondisk_cache_entry_extended {
 			    ondisk_cache_entry_size(ce_namelen(ce)))
 
 /* Allow fsck to force verification of the index checksum. */
-int force_verify_index_checksum;
+int verify_index_checksum;
 
 static int verify_hdr(struct cache_header *hdr, unsigned long size)
 {
@@ -1386,7 +1386,7 @@ static int verify_hdr(struct cache_header *hdr, unsigned long size)
 	if (hdr_version < INDEX_FORMAT_LB || INDEX_FORMAT_UB < hdr_version)
 		return error("bad index version %d", hdr_version);
 
-	if (!force_verify_index_checksum)
+	if (!verify_index_checksum)
 		return 0;
 
 	git_SHA1_Init(&c);
diff --git i/t/t1450-fsck.sh w/t/t1450-fsck.sh
index 86757ffa52..e69b32f219 100755
--- i/t/t1450-fsck.sh
+++ w/t/t1450-fsck.sh
@@ -689,15 +689,15 @@ test_expect_success 'bogus head does not fallback to all heads' '
 	! grep $blob out
 '
 
-test_expect_success PERL 'detect corrupt index file in fsck' '
+test_expect_success 'detect corrupt index file in fsck' '
 	cp .git/index .git/index.backup &&
+	test_when_finished "mv .git/index.backup .git/index" &&
 	echo zzzzzzzz >zzzzzzzz &&
+	test_when_finished "rm zzzzzzzz" &&
 	git add zzzzzzzz &&
-	perl -pi -e "s/zzzzzzzz/yyyyyyyy/" .git/index &&
-	test_must_fail git fsck --cache &&
-	rm .git/index &&
-	mv .git/index.backup .git/index &&
-	rm zzzzzzzz
+	sed -e "s/zzzzzzzz/yyyyyyyy/" .git/index >.git/index+ &&
+	mv .git/index+ .git/index &&
+	test_must_fail git fsck --cache
 '
 
 test_done

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v5] read-cache: force_verify_index_checksum
  2017-04-05 17:03   ` Jonathan Nieder
@ 2017-04-05 18:08     ` Jeff Hostetler
  0 siblings, 0 replies; 4+ messages in thread
From: Jeff Hostetler @ 2017-04-05 18:08 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, gitster, peff, Jeff Hostetler



On 4/5/2017 1:03 PM, Jonathan Nieder wrote:
> Hi,
>
> git@jeffhostetler.com wrote:
>
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> ---
>>  builtin/fsck.c  |  1 +
>>  cache.h         |  2 ++
>>  read-cache.c    |  7 +++++++
>>  t/t1450-fsck.sh | 11 +++++++++++
>>  4 files changed, 21 insertions(+)
>
> Yay!  I love this version.
>
>> --- a/builtin/fsck.c
>> +++ b/builtin/fsck.c
>> @@ -771,6 +771,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
>>  	}
>>
>>  	if (keep_cache_objects) {
>> +		force_verify_index_checksum = 1;
>
> nit: now that there isn't a config this is overriding, this isn't
> "force_verify" so much as "verify".

good point.

>
> [...]
>> +++ b/t/t1450-fsck.sh
>> @@ -689,4 +689,15 @@ test_expect_success 'bogus head does not fallback to all heads' '
>>  	! grep $blob out
>>  '
>>
>> +test_expect_success PERL 'detect corrupt index file in fsck' '
>> +	cp .git/index .git/index.backup &&
>> +	echo zzzzzzzz >zzzzzzzz &&
>> +	git add zzzzzzzz &&
>> +	perl -pi -e "s/zzzzzzzz/yyyyyyyy/" .git/index &&
>> +	test_must_fail git fsck --cache &&
>> +	rm .git/index &&
>> +	mv .git/index.backup .git/index &&
>> +	rm zzzzzzzz
>> +'
>
> This is great.
>
> optional: you can do the cleanup commands in test_when_finished to
> make sure they happen even if the test fails.
>
> Tests don't seem to use "perl -pi" anywhere else.  This instance could
> be simplified by using sed.
>
> With whatever subset of the changes below look good,
> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
> [...]

good point.

I just pushed another version.
Thanks,
Jeff



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-04-05 18:10 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-05 14:55 [PATCH v5] read-cache: call verify_hdr() in a background thread git
2017-04-05 14:55 ` [PATCH v5] read-cache: force_verify_index_checksum git
2017-04-05 17:03   ` Jonathan Nieder
2017-04-05 18:08     ` Jeff Hostetler

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).