git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
* [PATCH] Put sha1dc on a diet
@ 2017-03-01  0:30 Linus Torvalds
  2017-03-01 18:42 ` Junio C Hamano
  2017-03-01 19:07 ` Jeff King
  0 siblings, 2 replies; 36+ messages in thread
From: Linus Torvalds @ 2017-03-01  0:30 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King, Marc Stevens, Dan Shumow; +Cc: Git Mailing List


From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Tue, 28 Feb 2017 16:12:32 -0800
Subject: [PATCH] Put sha1dc on a diet

This removes the unnecessary parts of the sha1dc code, shrinking things from

	[torvalds@i7 git]$ size sha1dc/*.o
	   text	   data	    bss	    dec	    hex	filename
	 277559	    640	      0	 278199	  43eb7	sha1dc/sha1.o
	   4438	  11352	      0	  15790	   3dae	sha1dc/ubc_check.o

to

	[torvalds@i7 git]$ size sha1dc/*.o
	   text	   data	    bss	    dec	    hex	filename
	  13287	      0	      0	  13287	   33e7	sha1dc/sha1.o
	   4438	  11352	      0	  15790	   3dae	sha1dc/ubc_check.o

so the sha1.o text size shrinks from about 271kB to about 13kB.

The shrinking comes mainly from only generating the recompressio functions 
for the two rounds that are actually used (58 and 65), but also from 
removing a couple of other unused functions. The sha1dc library lost its 
"safe_hash" parameter to do that, since we check - and refuse to touch - 
the colliding cases manually.

The git binary itself is about 2MB of text on my system. For other helper 
binaries the size reduction is even more noticeable.  A quarter MB here 
and a quarter MB there, and suddenly you have a big binary ;)

This has been tested with the bad pdf image:

	[torvalds@i7 git]$ ./t/helper/test-sha1 < ~/Downloads/bad.pdf 
	fatal: The SHA1 computation detected evidence of a collision attack;
	refusing to process the contents.

although we obviously still don't have an actual git object to test with.

The normal git test-suite obviously also passes.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

I notice that sha1dc is in the 'pu' branch now, so let's put my money 
where my mouth is, and send in the sha1dc diet patch.

 sha1dc/sha1.c | 356 +++-------------------------------------------------------
 sha1dc/sha1.h |  24 ----
 2 files changed, 18 insertions(+), 362 deletions(-)

diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
index 6569b403e..4910f0c35 100644
--- a/sha1dc/sha1.c
+++ b/sha1dc/sha1.c
@@ -9,6 +9,15 @@
 #include "sha1dc/sha1.h"
 #include "sha1dc/ubc_check.h"
 
+// function type for sha1_recompression_step_T (uint32_t ihvin[5], uint32_t ihvout[5], const uint32_t me2[80], const uint32_t state[5])
+// where 0 <= T < 80
+//       me2 is an expanded message (the expansion of an original message block XOR'ed with a disturbance vector's message block difference)
+//       state is the internal state (a,b,c,d,e) before step T of the SHA-1 compression function while processing the original message block
+// the function will return:
+//       ihvin: the reconstructed input chaining value
+//       ihvout: the reconstructed output chaining value
+typedef void(*sha1_recompression_type)(uint32_t*, uint32_t*, const uint32_t*, const uint32_t*);
+
 #define rotate_right(x,n) (((x)>>(n))|((x)<<(32-(n))))
 #define rotate_left(x,n)  (((x)<<(n))|((x)>>(32-(n))))
 
@@ -39,212 +48,14 @@
 
 
 
-void sha1_message_expansion(uint32_t W[80])
+static void sha1_message_expansion(uint32_t W[80])
 {
 	unsigned i;
 	for (i = 16; i < 80; ++i)
 		W[i] = rotate_left(W[i - 3] ^ W[i - 8] ^ W[i - 14] ^ W[i - 16], 1);
 }
 
-void sha1_compression(uint32_t ihv[5], const uint32_t m[16])
-{
-	uint32_t W[80];
-	uint32_t a, b, c, d, e;
-	unsigned i;
-
-	memcpy(W, m, 16 * 4);
-	for (i = 16; i < 80; ++i)
-		W[i] = rotate_left(W[i - 3] ^ W[i - 8] ^ W[i - 14] ^ W[i - 16], 1);
-
-	a = ihv[0];
-	b = ihv[1];
-	c = ihv[2];
-	d = ihv[3];
-	e = ihv[4];
-
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(a, b, c, d, e, W, 0);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(e, a, b, c, d, W, 1);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(d, e, a, b, c, W, 2);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(c, d, e, a, b, W, 3);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(b, c, d, e, a, W, 4);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(a, b, c, d, e, W, 5);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(e, a, b, c, d, W, 6);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(d, e, a, b, c, W, 7);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(c, d, e, a, b, W, 8);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(b, c, d, e, a, W, 9);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(a, b, c, d, e, W, 10);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(e, a, b, c, d, W, 11);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(d, e, a, b, c, W, 12);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(c, d, e, a, b, W, 13);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(b, c, d, e, a, W, 14);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(a, b, c, d, e, W, 15);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(e, a, b, c, d, W, 16);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(d, e, a, b, c, W, 17);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(c, d, e, a, b, W, 18);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(b, c, d, e, a, W, 19);
-
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(a, b, c, d, e, W, 20);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(e, a, b, c, d, W, 21);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(d, e, a, b, c, W, 22);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(c, d, e, a, b, W, 23);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(b, c, d, e, a, W, 24);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(a, b, c, d, e, W, 25);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(e, a, b, c, d, W, 26);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(d, e, a, b, c, W, 27);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(c, d, e, a, b, W, 28);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(b, c, d, e, a, W, 29);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(a, b, c, d, e, W, 30);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(e, a, b, c, d, W, 31);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(d, e, a, b, c, W, 32);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(c, d, e, a, b, W, 33);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(b, c, d, e, a, W, 34);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(a, b, c, d, e, W, 35);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(e, a, b, c, d, W, 36);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(d, e, a, b, c, W, 37);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(c, d, e, a, b, W, 38);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(b, c, d, e, a, W, 39);
-
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(a, b, c, d, e, W, 40);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(e, a, b, c, d, W, 41);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(d, e, a, b, c, W, 42);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(c, d, e, a, b, W, 43);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(b, c, d, e, a, W, 44);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(a, b, c, d, e, W, 45);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(e, a, b, c, d, W, 46);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(d, e, a, b, c, W, 47);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(c, d, e, a, b, W, 48);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(b, c, d, e, a, W, 49);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(a, b, c, d, e, W, 50);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(e, a, b, c, d, W, 51);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(d, e, a, b, c, W, 52);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(c, d, e, a, b, W, 53);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(b, c, d, e, a, W, 54);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(a, b, c, d, e, W, 55);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(e, a, b, c, d, W, 56);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(d, e, a, b, c, W, 57);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(c, d, e, a, b, W, 58);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(b, c, d, e, a, W, 59);
-
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(a, b, c, d, e, W, 60);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(e, a, b, c, d, W, 61);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(d, e, a, b, c, W, 62);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(c, d, e, a, b, W, 63);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(b, c, d, e, a, W, 64);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(a, b, c, d, e, W, 65);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(e, a, b, c, d, W, 66);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(d, e, a, b, c, W, 67);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(c, d, e, a, b, W, 68);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(b, c, d, e, a, W, 69);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(a, b, c, d, e, W, 70);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(e, a, b, c, d, W, 71);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(d, e, a, b, c, W, 72);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(c, d, e, a, b, W, 73);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(b, c, d, e, a, W, 74);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(a, b, c, d, e, W, 75);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(e, a, b, c, d, W, 76);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(d, e, a, b, c, W, 77);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(c, d, e, a, b, W, 78);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(b, c, d, e, a, W, 79);
-
-	ihv[0] += a; ihv[1] += b; ihv[2] += c; ihv[3] += d; ihv[4] += e;
-}
-
-
-
-void sha1_compression_W(uint32_t ihv[5], const uint32_t W[80])
-{
-	uint32_t a = ihv[0], b = ihv[1], c = ihv[2], d = ihv[3], e = ihv[4];
-
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(a, b, c, d, e, W, 0);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(e, a, b, c, d, W, 1);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(d, e, a, b, c, W, 2);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(c, d, e, a, b, W, 3);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(b, c, d, e, a, W, 4);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(a, b, c, d, e, W, 5);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(e, a, b, c, d, W, 6);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(d, e, a, b, c, W, 7);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(c, d, e, a, b, W, 8);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(b, c, d, e, a, W, 9);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(a, b, c, d, e, W, 10);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(e, a, b, c, d, W, 11);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(d, e, a, b, c, W, 12);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(c, d, e, a, b, W, 13);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(b, c, d, e, a, W, 14);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(a, b, c, d, e, W, 15);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(e, a, b, c, d, W, 16);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(d, e, a, b, c, W, 17);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(c, d, e, a, b, W, 18);
-	HASHCLASH_SHA1COMPRESS_ROUND1_STEP(b, c, d, e, a, W, 19);
-
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(a, b, c, d, e, W, 20);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(e, a, b, c, d, W, 21);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(d, e, a, b, c, W, 22);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(c, d, e, a, b, W, 23);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(b, c, d, e, a, W, 24);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(a, b, c, d, e, W, 25);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(e, a, b, c, d, W, 26);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(d, e, a, b, c, W, 27);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(c, d, e, a, b, W, 28);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(b, c, d, e, a, W, 29);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(a, b, c, d, e, W, 30);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(e, a, b, c, d, W, 31);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(d, e, a, b, c, W, 32);
- 	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(c, d, e, a, b, W, 33);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(b, c, d, e, a, W, 34);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(a, b, c, d, e, W, 35);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(e, a, b, c, d, W, 36);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(d, e, a, b, c, W, 37);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(c, d, e, a, b, W, 38);
-	HASHCLASH_SHA1COMPRESS_ROUND2_STEP(b, c, d, e, a, W, 39);
-
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(a, b, c, d, e, W, 40);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(e, a, b, c, d, W, 41);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(d, e, a, b, c, W, 42);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(c, d, e, a, b, W, 43);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(b, c, d, e, a, W, 44);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(a, b, c, d, e, W, 45);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(e, a, b, c, d, W, 46);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(d, e, a, b, c, W, 47);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(c, d, e, a, b, W, 48);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(b, c, d, e, a, W, 49);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(a, b, c, d, e, W, 50);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(e, a, b, c, d, W, 51);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(d, e, a, b, c, W, 52);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(c, d, e, a, b, W, 53);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(b, c, d, e, a, W, 54);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(a, b, c, d, e, W, 55);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(e, a, b, c, d, W, 56);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(d, e, a, b, c, W, 57);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(c, d, e, a, b, W, 58);
-	HASHCLASH_SHA1COMPRESS_ROUND3_STEP(b, c, d, e, a, W, 59);
-
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(a, b, c, d, e, W, 60);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(e, a, b, c, d, W, 61);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(d, e, a, b, c, W, 62);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(c, d, e, a, b, W, 63);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(b, c, d, e, a, W, 64);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(a, b, c, d, e, W, 65);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(e, a, b, c, d, W, 66);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(d, e, a, b, c, W, 67);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(c, d, e, a, b, W, 68);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(b, c, d, e, a, W, 69);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(a, b, c, d, e, W, 70);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(e, a, b, c, d, W, 71);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(d, e, a, b, c, W, 72);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(c, d, e, a, b, W, 73);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(b, c, d, e, a, W, 74);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(a, b, c, d, e, W, 75);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(e, a, b, c, d, W, 76);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(d, e, a, b, c, W, 77);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(c, d, e, a, b, W, 78);
-	HASHCLASH_SHA1COMPRESS_ROUND4_STEP(b, c, d, e, a, W, 79);
-
-	ihv[0] += a; ihv[1] += b; ihv[2] += c; ihv[3] += d; ihv[4] += e;
-}
-
-
-
-void sha1_compression_states(uint32_t ihv[5], const uint32_t W[80], uint32_t states[80][5])
+static void sha1_compression_states(uint32_t ihv[5], const uint32_t W[80], uint32_t states[80][5])
 {
 	uint32_t a = ihv[0], b = ihv[1], c = ihv[2], d = ihv[3], e = ihv[4];
 
@@ -664,7 +475,7 @@ void sha1_compression_states(uint32_t ihv[5], const uint32_t W[80], uint32_t sta
 
 
 #define SHA1_RECOMPRESS(t) \
-void sha1recompress_fast_ ## t (uint32_t ihvin[5], uint32_t ihvout[5], const uint32_t me2[80], const uint32_t state[5]) \
+static void sha1recompress_fast_ ## t (uint32_t ihvin[5], uint32_t ihvout[5], const uint32_t me2[80], const uint32_t state[5]) \
 { \
 	uint32_t a = state[0], b = state[1], c = state[2], d = state[3], e = state[4]; \
 	if (t > 79) HASHCLASH_SHA1COMPRESS_ROUND4_STEP_BW(b, c, d, e, a, me2, 79); \
@@ -832,111 +643,20 @@ void sha1recompress_fast_ ## t (uint32_t ihvin[5], uint32_t ihvout[5], const uin
 	ihvout[0] = ihvin[0] + a; ihvout[1] = ihvin[1] + b; ihvout[2] = ihvin[2] + c; ihvout[3] = ihvin[3] + d; ihvout[4] = ihvin[4] + e; \
 } 
 
-SHA1_RECOMPRESS(0)
-SHA1_RECOMPRESS(1)
-SHA1_RECOMPRESS(2)
-SHA1_RECOMPRESS(3)
-SHA1_RECOMPRESS(4)
-SHA1_RECOMPRESS(5)
-SHA1_RECOMPRESS(6)
-SHA1_RECOMPRESS(7)
-SHA1_RECOMPRESS(8)
-SHA1_RECOMPRESS(9)
-
-SHA1_RECOMPRESS(10)
-SHA1_RECOMPRESS(11)
-SHA1_RECOMPRESS(12)
-SHA1_RECOMPRESS(13)
-SHA1_RECOMPRESS(14)
-SHA1_RECOMPRESS(15)
-SHA1_RECOMPRESS(16)
-SHA1_RECOMPRESS(17)
-SHA1_RECOMPRESS(18)
-SHA1_RECOMPRESS(19)
-
-SHA1_RECOMPRESS(20)
-SHA1_RECOMPRESS(21)
-SHA1_RECOMPRESS(22)
-SHA1_RECOMPRESS(23)
-SHA1_RECOMPRESS(24)
-SHA1_RECOMPRESS(25)
-SHA1_RECOMPRESS(26)
-SHA1_RECOMPRESS(27)
-SHA1_RECOMPRESS(28)
-SHA1_RECOMPRESS(29)
-
-SHA1_RECOMPRESS(30)
-SHA1_RECOMPRESS(31)
-SHA1_RECOMPRESS(32)
-SHA1_RECOMPRESS(33)
-SHA1_RECOMPRESS(34)
-SHA1_RECOMPRESS(35)
-SHA1_RECOMPRESS(36)
-SHA1_RECOMPRESS(37)
-SHA1_RECOMPRESS(38)
-SHA1_RECOMPRESS(39)
-
-SHA1_RECOMPRESS(40)
-SHA1_RECOMPRESS(41)
-SHA1_RECOMPRESS(42)
-SHA1_RECOMPRESS(43)
-SHA1_RECOMPRESS(44)
-SHA1_RECOMPRESS(45)
-SHA1_RECOMPRESS(46)
-SHA1_RECOMPRESS(47)
-SHA1_RECOMPRESS(48)
-SHA1_RECOMPRESS(49)
-
-SHA1_RECOMPRESS(50)
-SHA1_RECOMPRESS(51)
-SHA1_RECOMPRESS(52)
-SHA1_RECOMPRESS(53)
-SHA1_RECOMPRESS(54)
-SHA1_RECOMPRESS(55)
-SHA1_RECOMPRESS(56)
-SHA1_RECOMPRESS(57)
 SHA1_RECOMPRESS(58)
-SHA1_RECOMPRESS(59)
-
-SHA1_RECOMPRESS(60)
-SHA1_RECOMPRESS(61)
-SHA1_RECOMPRESS(62)
-SHA1_RECOMPRESS(63)
-SHA1_RECOMPRESS(64)
 SHA1_RECOMPRESS(65)
-SHA1_RECOMPRESS(66)
-SHA1_RECOMPRESS(67)
-SHA1_RECOMPRESS(68)
-SHA1_RECOMPRESS(69)
-
-SHA1_RECOMPRESS(70)
-SHA1_RECOMPRESS(71)
-SHA1_RECOMPRESS(72)
-SHA1_RECOMPRESS(73)
-SHA1_RECOMPRESS(74)
-SHA1_RECOMPRESS(75)
-SHA1_RECOMPRESS(76)
-SHA1_RECOMPRESS(77)
-SHA1_RECOMPRESS(78)
-SHA1_RECOMPRESS(79)
-
-sha1_recompression_type sha1_recompression_step[80] =
+
+static sha1_recompression_type sha1_recompression_step[80] =
 {
-	sha1recompress_fast_0, sha1recompress_fast_1, sha1recompress_fast_2, sha1recompress_fast_3, sha1recompress_fast_4, sha1recompress_fast_5, sha1recompress_fast_6, sha1recompress_fast_7, sha1recompress_fast_8, sha1recompress_fast_9,
-	sha1recompress_fast_10, sha1recompress_fast_11, sha1recompress_fast_12, sha1recompress_fast_13, sha1recompress_fast_14, sha1recompress_fast_15, sha1recompress_fast_16, sha1recompress_fast_17, sha1recompress_fast_18, sha1recompress_fast_19,
-	sha1recompress_fast_20, sha1recompress_fast_21, sha1recompress_fast_22, sha1recompress_fast_23, sha1recompress_fast_24, sha1recompress_fast_25, sha1recompress_fast_26, sha1recompress_fast_27, sha1recompress_fast_28, sha1recompress_fast_29,
-	sha1recompress_fast_30, sha1recompress_fast_31, sha1recompress_fast_32, sha1recompress_fast_33, sha1recompress_fast_34, sha1recompress_fast_35, sha1recompress_fast_36, sha1recompress_fast_37, sha1recompress_fast_38, sha1recompress_fast_39,
-	sha1recompress_fast_40, sha1recompress_fast_41, sha1recompress_fast_42, sha1recompress_fast_43, sha1recompress_fast_44, sha1recompress_fast_45, sha1recompress_fast_46, sha1recompress_fast_47, sha1recompress_fast_48, sha1recompress_fast_49,
-	sha1recompress_fast_50, sha1recompress_fast_51, sha1recompress_fast_52, sha1recompress_fast_53, sha1recompress_fast_54, sha1recompress_fast_55, sha1recompress_fast_56, sha1recompress_fast_57, sha1recompress_fast_58, sha1recompress_fast_59,
-	sha1recompress_fast_60, sha1recompress_fast_61, sha1recompress_fast_62, sha1recompress_fast_63, sha1recompress_fast_64, sha1recompress_fast_65, sha1recompress_fast_66, sha1recompress_fast_67, sha1recompress_fast_68, sha1recompress_fast_69,
-	sha1recompress_fast_70, sha1recompress_fast_71, sha1recompress_fast_72, sha1recompress_fast_73, sha1recompress_fast_74, sha1recompress_fast_75, sha1recompress_fast_76, sha1recompress_fast_77, sha1recompress_fast_78, sha1recompress_fast_79,
+	[58] = sha1recompress_fast_58,
+	[65] = sha1recompress_fast_65,
 };
 
 
 
 
 
-void sha1_process(SHA1_CTX* ctx, const uint32_t block[16]) 
+static void sha1_process(SHA1_CTX* ctx, const uint32_t block[16])
 {
 	unsigned i, j;
 	uint32_t ubc_dv_mask[DVMASKSIZE];
@@ -973,12 +693,6 @@ void sha1_process(SHA1_CTX* ctx, const uint32_t block[16])
 					if (ctx->callback != NULL)
 						ctx->callback(ctx->total - 64, ctx->ihv1, ctx->ihv2, ctx->m1, ctx->m2);
 
-					if (ctx->safe_hash) 
-					{
-						sha1_compression_W(ctx->ihv, ctx->m1);
-						sha1_compression_W(ctx->ihv, ctx->m1);
-					}
-
 					break;
 				}
 			}
@@ -990,7 +704,7 @@ void sha1_process(SHA1_CTX* ctx, const uint32_t block[16])
 
 
 
-void swap_bytes(uint32_t val[16]) 
+static void swap_bytes(uint32_t val[16])
 {
 	unsigned i;
 	for (i = 0; i < 16; ++i) 
@@ -1011,7 +725,6 @@ void SHA1DCInit(SHA1_CTX* ctx)
 	ctx->ihv[3] = 0x10325476;
 	ctx->ihv[4] = 0xC3D2E1F0;
 	ctx->found_collision = 0;
-	ctx->safe_hash = 1;
 	ctx->ubc_check = 1;
 	ctx->detect_coll = 1;
 	ctx->reduced_round_coll = 0;
@@ -1019,39 +732,6 @@ void SHA1DCInit(SHA1_CTX* ctx)
 	ctx->callback = NULL;
 }
 
-void SHA1DCSetSafeHash(SHA1_CTX* ctx, int safehash)
-{
-	if (safehash)
-		ctx->safe_hash = 1;
-	else
-		ctx->safe_hash = 0;
-}
-
-
-void SHA1DCSetUseUBC(SHA1_CTX* ctx, int ubc_check)
-{
-	if (ubc_check)
-		ctx->ubc_check = 1;
-	else
-		ctx->ubc_check = 0;
-}
-
-void SHA1DCSetUseDetectColl(SHA1_CTX* ctx, int detect_coll)
-{
-	if (detect_coll)
-		ctx->detect_coll = 1;
-	else
-		ctx->detect_coll = 0;
-}
-
-void SHA1DCSetDetectReducedRoundCollision(SHA1_CTX* ctx, int reduced_round_coll)
-{
-	if (reduced_round_coll)
-		ctx->reduced_round_coll = 1;
-	else
-		ctx->reduced_round_coll = 0;
-}
-
 void SHA1DCSetCallback(SHA1_CTX* ctx, collision_block_callback callback)
 {
 	ctx->callback = callback;
diff --git a/sha1dc/sha1.h b/sha1dc/sha1.h
index 1bb0ace99..f126bf63b 100644
--- a/sha1dc/sha1.h
+++ b/sha1dc/sha1.h
@@ -5,29 +5,6 @@
 * https://opensource.org/licenses/MIT
 ***/
 
-// uses SHA-1 message expansion to expand the first 16 words of W[] to 80 words
-void sha1_message_expansion(uint32_t W[80]);
-
-// sha-1 compression function; first version takes a message block pre-parsed as 16 32-bit integers, second version takes an already expanded message)
-void sha1_compression(uint32_t ihv[5], const uint32_t m[16]);
-void sha1_compression_W(uint32_t ihv[5], const uint32_t W[80]);
-
-// same as sha1_compression_W, but additionally store intermediate states
-// only stores states ii (the state between step ii-1 and step ii) when DOSTORESTATEii is defined in ubc_check.h
-void sha1_compression_states(uint32_t ihv[5], const uint32_t W[80], uint32_t states[80][5]);
-
-// function type for sha1_recompression_step_T (uint32_t ihvin[5], uint32_t ihvout[5], const uint32_t me2[80], const uint32_t state[5])
-// where 0 <= T < 80
-//       me2 is an expanded message (the expansion of an original message block XOR'ed with a disturbance vector's message block difference)
-//       state is the internal state (a,b,c,d,e) before step T of the SHA-1 compression function while processing the original message block
-// the function will return:
-//       ihvin: the reconstructed input chaining value
-//       ihvout: the reconstructed output chaining value
-typedef void(*sha1_recompression_type)(uint32_t*, uint32_t*, const uint32_t*, const uint32_t*);
-
-// table of sha1_recompression_step_0, ... , sha1_recompression_step_79
-extern sha1_recompression_type sha1_recompression_step[80];
-
 // a callback function type that can be set to be called when a collision block has been found:
 // void collision_block_callback(uint64_t byteoffset, const uint32_t ihvin1[5], const uint32_t ihvin2[5], const uint32_t m1[80], const uint32_t m2[80])
 typedef void(*collision_block_callback)(uint64_t, const uint32_t*, const uint32_t*, const uint32_t*, const uint32_t*);
@@ -39,7 +16,6 @@ typedef struct {
 	unsigned char buffer[64];
 	int bigendian;
 	int found_collision;
-	int safe_hash;
 	int detect_coll;
 	int ubc_check;
 	int reduced_round_coll;
-- 
2.12.0.4.g94589516d


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01  0:30 [PATCH] Put sha1dc on a diet Linus Torvalds
@ 2017-03-01 18:42 ` Junio C Hamano
  2017-03-01 18:49   ` Linus Torvalds
  2017-03-01 19:07 ` Jeff King
  1 sibling, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2017-03-01 18:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff King, Marc Stevens, Dan Shumow, Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> I notice that sha1dc is in the 'pu' branch now, so let's put my money 
> where my mouth is, and send in the sha1dc diet patch.

I see //c99 comments and also T array[] = { [58] = val } both of
which I think we stay away from (and the former is from the initial
import), so some people on other platforms MAY have trouble with
this topic.

Let's see what happens by queuing it on 'pu' ;-)

Thanks.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 18:42 ` Junio C Hamano
@ 2017-03-01 18:49   ` Linus Torvalds
  2017-03-01 19:41     ` Junio C Hamano
  2017-03-01 19:53     ` Jeff King
  0 siblings, 2 replies; 36+ messages in thread
From: Linus Torvalds @ 2017-03-01 18:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Marc Stevens, Dan Shumow, Git Mailing List

On Wed, Mar 1, 2017 at 10:42 AM, Junio C Hamano <gitster@pobox.com> wrote:
> I see //c99 comments

sha1dc is already full of // style comments. I just followed the
existing practice.

> and also T array[] = { [58] = val } both of
> which I think we stay away from (and the former is from the initial
> import), so some people on other platforms MAY have trouble with
> this topic.

Hmm. The "{ [58] = val; }" kind of initialization would be easy to
work around by just filling in everything else with NULL, but it would
make for a pretty nasty readability issue.

That said, if you mis-count the NULL's, the end result will pretty
immediately SIGSEGV, so I guess it wouldn't be much of a maintenance
problem.

But if you're just willing to take the "let's see" approach, I think
the explicitly numbered initializer is much better.

The main people who I assume would really want to use the sha1dc
library are hosting places. And they won't be using crazy compilers
from the last century.

That said, I think that it would be lovely to just default to
USE_SHA1DC and just put the whole attack behind us. Yes, it's slower.
No, it doesn't really seem to matter that much in practice.

              Linus

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01  0:30 [PATCH] Put sha1dc on a diet Linus Torvalds
  2017-03-01 18:42 ` Junio C Hamano
@ 2017-03-01 19:07 ` Jeff King
  2017-03-01 19:10   ` Linus Torvalds
  1 sibling, 1 reply; 36+ messages in thread
From: Jeff King @ 2017-03-01 19:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Marc Stevens, Dan Shumow, Git Mailing List

On Tue, Feb 28, 2017 at 04:30:26PM -0800, Linus Torvalds wrote:

> From: Linus Torvalds <torvalds@linux-foundation.org>
> Date: Tue, 28 Feb 2017 16:12:32 -0800
> Subject: [PATCH] Put sha1dc on a diet
> 
> This removes the unnecessary parts of the sha1dc code, shrinking things from
> [...]

So obviously the smaller object size is nice, and the diffstat is
certainly satisfying. My only qualm would be whether this conflicts with
the optimizations that Dan is working on (probably not conceptually, but
textually).

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 19:07 ` Jeff King
@ 2017-03-01 19:10   ` Linus Torvalds
  0 siblings, 0 replies; 36+ messages in thread
From: Linus Torvalds @ 2017-03-01 19:10 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Marc Stevens, Dan Shumow, Git Mailing List

On Wed, Mar 1, 2017 at 11:07 AM, Jeff King <peff@peff.net> wrote:
>
> So obviously the smaller object size is nice, and the diffstat is
> certainly satisfying. My only qualm would be whether this conflicts with
> the optimizations that Dan is working on (probably not conceptually, but
> textually).

Yeah. But I'll happily just re-apply the patch on any new version that
Dan posts.  The patch is obviously trivial, even if size-wise it's a
fair number of lines.

So I wouldn't suggest using the patched version as some kind of
starting point. It's much easier to just take a new version of
upstream and repeat the diet patch on it.

.. and obviously later versions of the upstream sha1dc code may not
even need this at all since Dan and Marc are now aware of the issue.

                  Linus

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 18:49   ` Linus Torvalds
@ 2017-03-01 19:41     ` Junio C Hamano
  2017-03-01 21:56       ` Johannes Schindelin
  2017-03-01 19:53     ` Jeff King
  1 sibling, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2017-03-01 19:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff King, Marc Stevens, Dan Shumow, Git Mailing List

Linus Torvalds <torvalds@linux-foundation.org> writes:

> That said, I think that it would be lovely to just default to
> USE_SHA1DC and just put the whole attack behind us. Yes, it's slower.
> No, it doesn't really seem to matter that much in practice.

Yes.  It would be a very good goal.


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 18:49   ` Linus Torvalds
  2017-03-01 19:41     ` Junio C Hamano
@ 2017-03-01 19:53     ` Jeff King
       [not found]       ` <CA+55aFwf3sxKW+dGTMjNAeHMOf=rvctEQohm+rbhEb=e3KLpHw@mail.gmail.com>
  1 sibling, 1 reply; 36+ messages in thread
From: Jeff King @ 2017-03-01 19:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Marc Stevens, Dan Shumow, Git Mailing List

On Wed, Mar 01, 2017 at 10:49:55AM -0800, Linus Torvalds wrote:

> That said, I think that it would be lovely to just default to
> USE_SHA1DC and just put the whole attack behind us. Yes, it's slower.
> No, it doesn't really seem to matter that much in practice.

My biggest concern is the index-pack operation. Try this:

  time git clone --no-local --bare linux tmp.git

with and without USE_SHA1DC. I get:

  [w/ openssl]
  real	1m52.307s
  user	2m47.928s
  sys	0m14.992s

  [w/ sha1dc]
  real	3m4.043s
  user	6m16.412s
  sys	0m13.772s

That's real latency the user will see. It's hard to break it down,
though. The actual "receiving" phase is generally going to be network
bound. The delta-resolution that happens afterwards is totally local and
CPU-bound (but does run in parallel).

And of course this repository tends to the larger side (though certainly
there are bigger ones), and you only feel the pain on clone or when
doing an initial push, not day-to-day.

So maybe we just suck it up and accept that it's a bit slower.

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
       [not found]       ` <CA+55aFwf3sxKW+dGTMjNAeHMOf=rvctEQohm+rbhEb=e3KLpHw@mail.gmail.com>
@ 2017-03-01 20:34         ` Jeff King
       [not found]           ` <CA+55aFwr1jncrk-cekn0Y8rs_S+zs7RrgQ-Jb-ZbgCvmVrHT_A@mail.gmail.com>
  2017-03-01 23:38           ` Linus Torvalds
  0 siblings, 2 replies; 36+ messages in thread
From: Jeff King @ 2017-03-01 20:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List, Dan Shumow, Marc Stevens

On Wed, Mar 01, 2017 at 12:14:34PM -0800, Linus Torvalds wrote:

> > My biggest concern is the index-pack operation. Try this:
> 
> I'm mobile right now, so I can't test, but I'd this perhaps at least partly
> due to the full checksum over the pack-file?
>
> We have two very different uses of SHA1: the actual object name hash, but
> also the sha1file checksums that we do on the index file and the pack files.
>
> And the checksum code really doesn't need the collision checking at all.

I don't think that helps. The sha1 over the pack-file takes about 1.3s
with openssl, and 5s with sha1dc. So we already know the increase there
is only a few seconds, not a few minutes.

And it makes sense if you think about the index-pack operation. It has
to inflate each object, resolving deltas, and checksum the result. And
the number of inflated bytes is _much_ larger than the on-disk bytes.
You can see the difference with:

  git cat-file --batch-all-objects \
    --batch-check='%(objectsize:disk) %(objectsize)' |
  perl -alne '
    $disk += $F[0]; $raw += $F[1];
    END { print "$disk $raw" }
  '

On linux.git that yields:

  1210521959 63279680406

That's over a 50x increase in the bytes we have to sha1 for objects
versus pack-checksums.

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 19:41     ` Junio C Hamano
@ 2017-03-01 21:56       ` Johannes Schindelin
  2017-03-01 22:05         ` Junio C Hamano
  2017-03-01 22:16         ` Linus Torvalds
  0 siblings, 2 replies; 36+ messages in thread
From: Johannes Schindelin @ 2017-03-01 21:56 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Linus Torvalds, Jeff King, Marc Stevens, Dan Shumow, Git Mailing List

Hi,

On Wed, 1 Mar 2017, Junio C Hamano wrote:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
> 
> > That said, I think that it would be lovely to just default to
> > USE_SHA1DC and just put the whole attack behind us. Yes, it's slower.
> > No, it doesn't really seem to matter that much in practice.
> 
> Yes.  It would be a very good goal.

So let me get this straight: not only do we now implicitly want to bump
the required C compiler to C99 without any grace period worth mentioning
[*1*], we are also all of a sudden no longer worried about a double digit
percentage drop of speed [*2*]?

Puzzled,
Johannes

Footnote *1*: I know, it is easy to forget that some developers cannot
choose their tools, or even their hardware. In the past, we seemed to take
appropriate care, though.

Footnote *2*: With real-world repositories of notable size, that
performance regression hurts. A lot. We just spent time to get the speed
of SHA-1 down by a couple percent and it was a noticeable improvement here.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 21:56       ` Johannes Schindelin
@ 2017-03-01 22:05         ` Junio C Hamano
  2017-03-01 22:16         ` Linus Torvalds
  1 sibling, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2017-03-01 22:05 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Linus Torvalds, Jeff King, Marc Stevens, Dan Shumow, Git Mailing List

On Wed, Mar 1, 2017 at 1:56 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Wed, 1 Mar 2017, Junio C Hamano wrote:
>
>> Linus Torvalds <torvalds@linux-foundation.org> writes:
>>
>> > That said, I think that it would be lovely to just default to
>> > USE_SHA1DC and just put the whole attack behind us. Yes, it's slower.
>> > No, it doesn't really seem to matter that much in practice.
>>
>> Yes.  It would be a very good goal.
>
> So let me get this straight: not only do we now implicitly want to bump
> the required C compiler to C99 without any grace period worth mentioning
> [*1*], we are also all of a sudden no longer worried about a double digit
> percentage drop of speed [*2*]?

Before we get the code into shape suitable for 'next', it is more important to
make sure it operates correctly, adding necessary features if any (e.g. "hash
with or without check" knob) while it is in 'pu', and *1* is to allow
it to progress
faster without having to worry about something we could do mechanically
before making it ready for 'next'.

The performance thing is really "let's see how well it goes". With effort to
optimize still "just has began", I think it is too early to tell if
Linus's "doesn't
really seem to matter" is the case or not.

Queuing such a topic on 'pu' is one effective way to make sure people are
working off of the same codebase.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 21:56       ` Johannes Schindelin
  2017-03-01 22:05         ` Junio C Hamano
@ 2017-03-01 22:16         ` Linus Torvalds
  2017-03-01 22:51           ` Johannes Schindelin
  1 sibling, 1 reply; 36+ messages in thread
From: Linus Torvalds @ 2017-03-01 22:16 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Jeff King, Marc Stevens, Dan Shumow, Git Mailing List

On Wed, Mar 1, 2017 at 1:56 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Footnote *1*: I know, it is easy to forget that some developers cannot
> choose their tools, or even their hardware. In the past, we seemed to take
> appropriate care, though.

I don't think you need to worry about the Windows side. That can
continue to do something else.

When I advocated perhaps using  USE_SHA1DC by default, I definitely
did not mean it in a "everywhere, regardless of issues" manner.

For example, the conmfig.mak.uname script already explicitly asks for
"BLK_SHA1 = YesPlease" for Windows. Don't bother changing that, it's
an explicit choice.

But the Linux rules don't actually specify which SHA1 version to use,
so the main Makefile currently defaults to just using openssl.

So that's the "default" choice I think we might want to change. Not
the "we're windows, and explicitly want BLK_SHA1 because of
environment and build infrastructure".

             Linus

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 22:16         ` Linus Torvalds
@ 2017-03-01 22:51           ` Johannes Schindelin
  2017-03-01 23:05             ` Linus Torvalds
  0 siblings, 1 reply; 36+ messages in thread
From: Johannes Schindelin @ 2017-03-01 22:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, Jeff King, Marc Stevens, Dan Shumow, Git Mailing List

Hi,

On Wed, 1 Mar 2017, Linus Torvalds wrote:

> On Wed, Mar 1, 2017 at 1:56 PM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> > Footnote *1*: I know, it is easy to forget that some developers cannot
> > choose their tools, or even their hardware. In the past, we seemed to take
> > appropriate care, though.
> 
> I don't think you need to worry about the Windows side.

I am not. I build G?t for Windows using GCC.

My concern is about that unexpected turn "oh, let's just switch to C99
because, well, because my compiler canehandle it, and everybody else
should just switch tn a modern compiler". That really sounded careless.

> That can continue to do something else.
> 
> When I advocated perhaps using  USE_SHA1DC by default, I definitely did
> not mean it in a "everywhere, regardless of issues" manner.
> 
> For example, the conmfig.mak.uname script already explicitly asks for
> "BLK_SHA1 = YesPlease" for Windows. Don't bother changing that, it's an
> explicit choice.

That setting is only in git.git's version, not in gxt-for-windows/git.git.
We switched to OpenSSL because of speed improvements, in particular with
recent Intel processors.

> But the Linux rules don't actually specify which SHA1 version to use,
> so the main Makefile currently defaults to just using openssl.
> 
> So that's the "default" choice I think we might want to change. Not
> the "we're windows, and explicitly want BLK_SHA1 because of
> environment and build infrastructure".

Since we switched away from BLOCK_SHA1, any such change would affect Git
for Windews.

But I think bigger than just developers on Windows OS. There are many
developers out there working on large repositories (yes, much larger than
Linux). Also using Macs and Linux. I am not at all sure that we want to
give them an updated Git they cannot fail to notice to be much slower than
before.

Don't get me wrong: I *hope* that you'll manage to get sha1dc
competitively fast. If you don't, well, then we simply cannot use it by
default for *all* of our calls (you already pointed out that the pack
index' checksum does not need collision detection, and in fact, *any*
operation that works on implicitly trusted data falls into the same court,
e.g. `git add`).

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 22:51           ` Johannes Schindelin
@ 2017-03-01 23:05             ` Linus Torvalds
  2017-03-01 23:19               ` Jeff King
  2017-03-02 14:37               ` Johannes Schindelin
  0 siblings, 2 replies; 36+ messages in thread
From: Linus Torvalds @ 2017-03-01 23:05 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Jeff King, Marc Stevens, Dan Shumow, Git Mailing List

On Wed, Mar 1, 2017 at 2:51 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> But I think bigger than just developers on Windows OS. There are many
> developers out there working on large repositories (yes, much larger than
> Linux). Also using Macs and Linux. I am not at all sure that we want to
> give them an updated Git they cannot fail to notice to be much slower than
> before.

Johannes, have you *tried* the patches?

I really don't think you have. It is completely unnoticeable in any
normal situation. The two cases that it's noticeable is:

 - a full fsck is noticeable slower

 - a full non-local clone is slower (but not really noticeably so
since the network traffic dominates).

In other words, I think you're making shit up. I don't think you
understand how little the SHA1 performance actually matters. It's
noticeable in benchmarks. It's not noticeable in any normal operation.

.. and yes, I've actually been running the patches locally since I
posted my first version (which apparently didn't go out to the list
because of list size limits) and now running the version in 'pu'.

                Linus

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
       [not found]           ` <CA+55aFwr1jncrk-cekn0Y8rs_S+zs7RrgQ-Jb-ZbgCvmVrHT_A@mail.gmail.com>
@ 2017-03-01 23:13             ` Jeff King
  0 siblings, 0 replies; 36+ messages in thread
From: Jeff King @ 2017-03-01 23:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Dan Shumow, Git Mailing List, Junio C Hamano, Marc Stevens

On Wed, Mar 01, 2017 at 12:58:39PM -0800, Linus Torvalds wrote:

>> I don't think that helps. The sha1 over the pack-file takes about 1.3s
>> with openssl, and 5s with sha1dc. So we already know the increase there
>> is only a few seconds, not a few minutes.
> 
> OK. I guess what w could easily do is to just add an argument to
> git_SHA1_Init() to say whether we want checking or not, and only use the
> checking functions when receiving objects (which would include creating new
> objects from files, and obviously deck).
> 
> You'd still eat the cost on the receiving side of a clone, but that's when
> you really want the checking anyway. At least it wouldn't be so visible on
> the sending side, which is all the hosting etc, where there might be server
> utilization issues.
> 
> Would that make deployment happier? It should be an easy little flag to
> add, I think.

I don't think it makes all that big a difference. The sending side
wastes the extra 2-3 seconds of CPU to checksum the whole packfile, but
it's not inflating all the object contents in the first place (between
reachability bitmaps to get the list of objects in the first place, and
then verbatim reuse of pack contents).

Which isn't to say it's not reasonable to limit the checking to a few
spots (basically anything that's _writing_ objects). But I don't think
it makes a big difference to the server side of a fetch or clone.

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 23:05             ` Linus Torvalds
@ 2017-03-01 23:19               ` Jeff King
  2017-03-02  6:10                 ` Duy Nguyen
  2017-03-02 14:39                 ` Johannes Schindelin
  2017-03-02 14:37               ` Johannes Schindelin
  1 sibling, 2 replies; 36+ messages in thread
From: Jeff King @ 2017-03-01 23:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Johannes Schindelin, Junio C Hamano, Marc Stevens, Dan Shumow,
	Git Mailing List

On Wed, Mar 01, 2017 at 03:05:25PM -0800, Linus Torvalds wrote:

> On Wed, Mar 1, 2017 at 2:51 PM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> >
> > But I think bigger than just developers on Windows OS. There are many
> > developers out there working on large repositories (yes, much larger than
> > Linux). Also using Macs and Linux. I am not at all sure that we want to
> > give them an updated Git they cannot fail to notice to be much slower than
> > before.
> 
> Johannes, have you *tried* the patches?
> 
> I really don't think you have. It is completely unnoticeable in any
> normal situation. The two cases that it's noticeable is:
> 
>  - a full fsck is noticeable slower
> 
>  - a full non-local clone is slower (but not really noticeably so
> since the network traffic dominates).
> 
> In other words, I think you're making shit up. I don't think you
> understand how little the SHA1 performance actually matters. It's
> noticeable in benchmarks. It's not noticeable in any normal operation.
> 
> .. and yes, I've actually been running the patches locally since I
> posted my first version (which apparently didn't go out to the list
> because of list size limits) and now running the version in 'pu'.

You have to remember that some of the Git for Windows users are doing
horrific things like using repositories with 450MB .git/index files, and
the speed to compute the sha1 during an update is noticeable there.

IMHO that is a good sign that the right approach is to switch to an
index format that doesn't require rewriting all 450MB to update one
entry. But obviously that is a much harder problem than just using an
optimized sha1 implementation.

I do think that could argue for turning on the collision detection only
during object-write operations, which is where it matters. It would be
really trivial to flip the "check collisions" bit on sha1dc. But I
suspect you could go faster still by compiling against two separate
implementations: the fast-as-possible one (which could be openssl or
blk-sha1), and the slower-but-careful sha1dc.

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 20:34         ` Jeff King
       [not found]           ` <CA+55aFwr1jncrk-cekn0Y8rs_S+zs7RrgQ-Jb-ZbgCvmVrHT_A@mail.gmail.com>
@ 2017-03-01 23:38           ` Linus Torvalds
  2017-03-02  1:31             ` Dan Shumow
  1 sibling, 1 reply; 36+ messages in thread
From: Linus Torvalds @ 2017-03-01 23:38 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Git Mailing List, Dan Shumow, Marc Stevens

On Wed, Mar 1, 2017 at 12:34 PM, Jeff King <peff@peff.net> wrote:
>
> I don't think that helps. The sha1 over the pack-file takes about 1.3s
> with openssl, and 5s with sha1dc. So we already know the increase there
> is only a few seconds, not a few minutes.

Yeah, I did a few statistics by adding just logging of "SHA1_Init()"
calls. For that network clone situation, the call distribution is

      1        SHA1: Init at builtin/index-pack.c:326
 841228        SHA1: Init at builtin/index-pack.c:450
      2        SHA1: Init at csum-file.c:152
4415756        SHA1: Init at sha1_file.c:3218

(the line numbers are a bit off from 'pu', because I obviously have
the logging code).

The big number (one for every object) is from
write_sha1_file_prepare(), which we'd want to be the strong collision
checking version because those are things we're about to create git
objects out of. It's called from

 - hash_sha1_file() - doesn't actually write the object, but is used
to calculate the sha for incoming data after applying the delta, for
example.

 - write_sha1_file() - many uses, actually writes the object

 - hash_sha1_file_literally() - git hash-object

and that index-pack.c:450 is from unpack_entry_data() for the base
non-delta objects (which should also be the strong kind).

So all of them should check against collision attacks, so none of them
seem to be things you'd want to optimize away..

So I was wrong in thinking that there were a lot of unnecessary SHA1
calculations in that load. They all look like they should be done with
the slower checking code.

Oh well.

                      Linus

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH] Put sha1dc on a diet
  2017-03-01 23:38           ` Linus Torvalds
@ 2017-03-02  1:31             ` Dan Shumow
  2017-03-02  4:38               ` Junio C Hamano
  0 siblings, 1 reply; 36+ messages in thread
From: Dan Shumow @ 2017-03-02  1:31 UTC (permalink / raw)
  To: Linus Torvalds, Jeff King; +Cc: Junio C Hamano, Git Mailing List, Marc Stevens

I played around tweaking the code a bit more and I got our performance down to a 2.077182x slowdown with check and a 1.055961x slowdown without checking.  However, that slowdown is basically with the check turned off through our API.  If I rip extraneous code for storing states and checking if we are doing collision detection out, I can reach performance parity with the block-sha1 implementation in the Git codebase, which basically tells me that is about as good as I can do for optimizing the C code.

SHA1 is more amenable to assembler implementation because its use of rotations, which are notoriously difficult to access through C code.  And as this happens in the inner loop of the function, the inline asm tends to not cut it.  This is one of the reasons that the OpenSSL SHA-1 runs like a scalded monkey, compared to the C implemenations.  Marc and I have also discussed using SIMD operations to speed up the UBC checks, which could definitely help achieve better performance, but is highly dependent on processor support.  It will take some time to do either a SIMD implementation of the UBC checks or an assembler implementation.

At this point, I would suggest that I take the C optimizations, clean them up and fold them in with the diet changes Linus has suggested.  The slowdown is still 2x over block-sha1 and more over OpenSSL.  But it is better than nothing.  And then if there is interest Marc and I can investigate other processor specific optimizations like ASM or SIMD and circle back with those performance optimizations at a later date.

Also, to Johannes Schindelin's point:
> My concern is about that unexpected turn "oh, let's just switch to C99 because, well, because my compiler canehandle it, and everybody else should just switch tn a modern compiler". That really sounded careless.

While it will probably be a pain, if it is a requirement, we can modify the code to move away from any c99 specific stuff we have in here, if it makes adopting the code more palatable for Git.

Thanks,
Dan



^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-02  1:31             ` Dan Shumow
@ 2017-03-02  4:38               ` Junio C Hamano
  2017-03-04  1:07                 ` Dan Shumow
  0 siblings, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2017-03-02  4:38 UTC (permalink / raw)
  To: Dan Shumow; +Cc: Linus Torvalds, Jeff King, Git Mailing List, Marc Stevens

Dan Shumow <danshu@microsoft.com> writes:

> At this point, I would suggest that I take the C optimizations,
> clean them up and fold them in with the diet changes Linus has
> suggested.  The slowdown is still 2x over block-sha1 and more over
> OpenSSL.  But it is better than nothing.  And then if there is
> interest Marc and I can investigate other processor specific
> optimizations like ASM or SIMD and circle back with those
> performance optimizations at a later date.
>
> Also, to Johannes Schindelin's point:
>> My concern is about that unexpected turn "oh, let's just switch
>> to C99 because, well, because my compiler canehandle it, and
>> everybody else should just switch tn a modern compiler". That
>> really sounded careless.
>
> While it will probably be a pain, if it is a requirement, we can
> modify the code to move away from any c99 specific stuff we have
> in here, if it makes adopting the code more palatable for Git.

I was assuming that we would treat your code just like how we treat
any other "borrowed code from elsewhere".  The usual way for us to
do so is to take code that was released by the "upstream" (under a
license that allows us to use it---yours is MIT, which does) in the
style and language of upstream's choice, and then we in the Git
development community takes responsiblity for massaging the code to
match our style, for trimming what we won't use and for doing any
other customization to fit our needs.

As you and Marc seemed to be still working on speeding up, such a
customization work to fully adjust your code to our codebase was
premature, so I tentatively queued what we saw on the list as-is on
our 'pu' branch so that people can have a reference point.  Which
unfortunately solicited a premature reaction by Johannes.  Please do
not worry too much about the comment.

But if you are willing to help us by getting involved in the
"customization" part, too, that would be a very welcome news to us.
In that case, "welcome to the Git development community" ;-)

So,... from my point of view, we are OK either way.  It is OK if you
are a third-party upstream that is not particularly interested in
Git project's specific requirement.  We surely would be happier if
you and Marc, the upstream authors of the code in question, also act
as participants in the Git development community.

Either way, thanks for your great help.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 23:19               ` Jeff King
@ 2017-03-02  6:10                 ` Duy Nguyen
  2017-03-02 14:45                   ` Johannes Schindelin
  2017-03-02 14:39                 ` Johannes Schindelin
  1 sibling, 1 reply; 36+ messages in thread
From: Duy Nguyen @ 2017-03-02  6:10 UTC (permalink / raw)
  To: Jeff King
  Cc: Linus Torvalds, Johannes Schindelin, Junio C Hamano,
	Marc Stevens, Dan Shumow, Git Mailing List

On Thu, Mar 2, 2017 at 6:19 AM, Jeff King <peff@peff.net> wrote:
> You have to remember that some of the Git for Windows users are doing
> horrific things like using repositories with 450MB .git/index files, and
> the speed to compute the sha1 during an update is noticeable there.

We probably should separate this use case from the object hashing
anyway. Here we need a better, more reliable crc32 basically, to
detect bit flips. Even if we move to SHA-something, we can keep
staying with SHA-1 here (and with the fastest implementation)
-- 
Duy

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 23:05             ` Linus Torvalds
  2017-03-01 23:19               ` Jeff King
@ 2017-03-02 14:37               ` Johannes Schindelin
  1 sibling, 0 replies; 36+ messages in thread
From: Johannes Schindelin @ 2017-03-02 14:37 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Junio C Hamano, Jeff King, Marc Stevens, Dan Shumow, Git Mailing List

Hi Linus,

On Wed, 1 Mar 2017, Linus Torvalds wrote:

> On Wed, Mar 1, 2017 at 2:51 PM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> >
> > But I think bigger than just developers on Windows OS. There are many
> > developers out there working on large repositories (yes, much larger
> > than Linux). Also using Macs and Linux. I am not at all sure that we
> > want to give them an updated Git they cannot fail to notice to be much
> > slower than before.
> 
> Johannes, have you *tried* the patches?
> 
> I really don't think you have. It is completely unnoticeable in any
> normal situation. The two cases that it's noticeable is:
> 
>  - a full fsck is noticeable slower
> 
>  - a full non-local clone is slower (but not really noticeably so
> since the network traffic dominates).
> 
> In other words, I think you're making shit up. I don't think you
> understand how little the SHA1 performance actually matters. It's
> noticeable in benchmarks. It's not noticeable in any normal operation.
> 
> .. and yes, I've actually been running the patches locally since I
> posted my first version (which apparently didn't go out to the list
> because of list size limits) and now running the version in 'pu'.

If you think that the Linux repository is a big one, then your reaction is
understandable.

I have zero interest in potty language, therefore my reply is very terse:
yes, I have been looking ad SHA-1 performance, and yes, it matters. Think
an index file of 300-400MB.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-01 23:19               ` Jeff King
  2017-03-02  6:10                 ` Duy Nguyen
@ 2017-03-02 14:39                 ` Johannes Schindelin
  1 sibling, 0 replies; 36+ messages in thread
From: Johannes Schindelin @ 2017-03-02 14:39 UTC (permalink / raw)
  To: Jeff King
  Cc: Linus Torvalds, Junio C Hamano, Marc Stevens, Dan Shumow,
	Git Mailing List

Hi Peff,

On Wed, 1 Mar 2017, Jeff King wrote:

> I do think that could argue for turning on the collision detection only
> during object-write operations, which is where it matters. It would be
> really trivial to flip the "check collisions" bit on sha1dc. But I
> suspect you could go faster still by compiling against two separate
> implementations: the fast-as-possible one (which could be openssl or
> blk-sha1), and the slower-but-careful sha1dc.

Given the speed difference between OpenSSL and sha1dc, it would be a wise
thing indeed to do sha1dc only where objects enter from possibly untrusted
sources, and use OpenSSL for all other hashing.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-02  6:10                 ` Duy Nguyen
@ 2017-03-02 14:45                   ` Johannes Schindelin
  2017-03-02 16:35                     ` Linus Torvalds
  0 siblings, 1 reply; 36+ messages in thread
From: Johannes Schindelin @ 2017-03-02 14:45 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Jeff King, Linus Torvalds, Junio C Hamano, Marc Stevens,
	Dan Shumow, Git Mailing List

Hi Duy,

On Thu, 2 Mar 2017, Duy Nguyen wrote:

> On Thu, Mar 2, 2017 at 6:19 AM, Jeff King <peff@peff.net> wrote:
> > You have to remember that some of the Git for Windows users are doing
> > horrific things like using repositories with 450MB .git/index files,
> > and the speed to compute the sha1 during an update is noticeable
> > there.
> 
> We probably should separate this use case from the object hashing
> anyway. Here we need a better, more reliable crc32 basically, to detect
> bit flips. Even if we move to SHA-something, we can keep staying with
> SHA-1 here (and with the fastest implementation)

I guess it was convenient to use the same hash algorithm for all hashing
purposes in the beginning. The downside, of course, was that we kept
talking about SHA-1s instead of commit hashes and the index checksum (i.e.
using labels based on implementation details rather than semantically
meaningful names).

In the meantime, we use different hash algorithms where appropriate, of
course, and we typically encapsulate the exact hash algorithm so that it
is easy to switch when/if necessary (think the hash functions for strings
in our hashtables, and the hash functions in xdiff).

It would probably make sense to switch the index integrity check away from
SHA-1 because we really only care about detecting bit flips there, and we
have no need for the computational overhead of using a full-blown
cryptographic hash for that purpose.

Ciao,
Johannes

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-02 14:45                   ` Johannes Schindelin
@ 2017-03-02 16:35                     ` Linus Torvalds
  2017-03-02 18:37                       ` Jeff Hostetler
  0 siblings, 1 reply; 36+ messages in thread
From: Linus Torvalds @ 2017-03-02 16:35 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Duy Nguyen, Jeff King, Junio C Hamano, Marc Stevens, Dan Shumow,
	Git Mailing List

On Thu, Mar 2, 2017 at 6:45 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> It would probably make sense to switch the index integrity check away from
> SHA-1 because we really only care about detecting bit flips there, and we
> have no need for the computational overhead of using a full-blown
> cryptographic hash for that purpose.

Which index do you actually see as being a problem, btw? The main file
index (.git/index) or the pack-file indexes?

We definitely don't need the checking version of sha1 for either of
those, but as Jeff already did the math, at least the pack-file index
is almost negligible, because the pack-file operations that update it
end up doing SHA1 over the objects - and the object SHA1 calculations
are much bigger.

And I don't think we even check the pack-file index hashes except on fsck.

Now, if your _file_ index is 300-400MB (and I do think we check the
SHA fingerprint on that even on just reading it - verify_hdr() in
do_read_index()), then that's going to be a somewhat noticeable hit on
every normal "git diff" etc.

But I'd have expected the stat() calls of all the files listed by that
index to be the _much_ bigger problem in that case. Or do you just
turn those off with assume-unchanged?

Yeah, those stat calls are threaded when preloading, but even so..

Anyway, the file index SHA1 checking could probably just be disabled
entirely (with a config flag). It's a corruption check that simply
isn't that important. So if that's your main SHA1 issue, that would be
easy to fix.

Everything else - like pack-file generation etc for a big clone() may
end up using a ton of SHA1 too, but the SHA1 costs all scale with the
other costs that drown them out (ie zlib, network, etc).

I'd love to see a profile if you have one.

                      Linus

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-02 16:35                     ` Linus Torvalds
@ 2017-03-02 18:37                       ` Jeff Hostetler
  2017-03-02 19:04                         ` Linus Torvalds
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff Hostetler @ 2017-03-02 18:37 UTC (permalink / raw)
  To: Linus Torvalds, Johannes Schindelin
  Cc: Duy Nguyen, Jeff King, Junio C Hamano, Marc Stevens, Dan Shumow,
	Git Mailing List



On 3/2/2017 11:35 AM, Linus Torvalds wrote:
> On Thu, Mar 2, 2017 at 6:45 AM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>> It would probably make sense to switch the index integrity check away from
>> SHA-1 because we really only care about detecting bit flips there, and we
>> have no need for the computational overhead of using a full-blown
>> cryptographic hash for that purpose.
> Which index do you actually see as being a problem, btw? The main file
> index (.git/index) or the pack-file indexes?
>
> We definitely don't need the checking version of sha1 for either of
> those, but as Jeff already did the math, at least the pack-file index
> is almost negligible, because the pack-file operations that update it
> end up doing SHA1 over the objects - and the object SHA1 calculations
> are much bigger.
>
> And I don't think we even check the pack-file index hashes except on fsck.
>
> Now, if your _file_ index is 300-400MB (and I do think we check the
> SHA fingerprint on that even on just reading it - verify_hdr() in
> do_read_index()), then that's going to be a somewhat noticeable hit on
> every normal "git diff" etc.

Yes, the .git/index is 450MB with ~3.1M entries.  verify_hdr() is called 
each time
we read it into memory.

We have been testing a patch in GfW to run the verification in a 
separate thread
while the main thread parses (and mallocs) the cache_entries.  This does 
help
offset the time.
         https://github.com/git-for-windows/git/pull/978/files

> But I'd have expected the stat() calls of all the files listed by that
> index to be the _much_ bigger problem in that case. Or do you just
> turn those off with assume-unchanged?
>
> Yeah, those stat calls are threaded when preloading, but even so..

Yes, the stat() calls are more significant percentage of the time (and 
having
core.fscache and core.preloadindex help that greatly), but the total 
time for a command
is just that -- the total -- so using the philosophy of "every little 
bit helps", the faster
routines help us here.

> Anyway, the file index SHA1 checking could probably just be disabled
> entirely (with a config flag). It's a corruption check that simply
> isn't that important. So if that's your main SHA1 issue, that would be
> easy to fix.

Yes, in the GVFS effort, we disabled the verification with a config 
setting and haven't
had any incidents.


> Everything else - like pack-file generation etc for a big clone() may
> end up using a ton of SHA1 too, but the SHA1 costs all scale with the
> other costs that drown them out (ie zlib, network, etc).
>
> I'd love to see a profile if you have one.
>
>                        Linus


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-02 18:37                       ` Jeff Hostetler
@ 2017-03-02 19:04                         ` Linus Torvalds
  0 siblings, 0 replies; 36+ messages in thread
From: Linus Torvalds @ 2017-03-02 19:04 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Johannes Schindelin, Duy Nguyen, Jeff King, Junio C Hamano,
	Marc Stevens, Dan Shumow, Git Mailing List

On Thu, Mar 2, 2017 at 10:37 AM, Jeff Hostetler <git@jeffhostetler.com> wrote:
>>
>> Now, if your _file_ index is 300-400MB (and I do think we check the
>> SHA fingerprint on that even on just reading it - verify_hdr() in
>> do_read_index()), then that's going to be a somewhat noticeable hit on
>> every normal "git diff" etc.
>
> Yes, the .git/index is 450MB with ~3.1M entries.  verify_hdr() is called
> each time we read it into memory.

Ok. So that's really just a purely historical artifact.

The file index is actually the first part of git to have ever been
written. You can't even see it in the history, because the initial
revision from Apr 7, 2005, obviously depended on the actual object
hashing.

But the file index actually came first. You can _kind_ of see that in
the layout of the original git tree, and how the main header file is
still called "cache.h", and how the original ".git" directory was
actually called ".dircache".

And the two biggest files (by a fairly big margin) are "read-cache.c"
and "update-cache.c".

So that file index cache was in many ways _the_ central part of the
original git model. The sha1 file indexing and object database was
just the backing store for the file index.

But part of that history is then how much I worried about corruption
of that index (and, let's face it, general corruption resistance _was_
one of the primary design goals - performance was high up there too,
but safety in the face of filesystem corruption was and is a primary
issue).

But realistically, I don't think we've *ever* hit anything serious on
the index file, and it's obviously not a security issue. It also isn't
even a compatibility issue, so it would be trivial to just bump the
version header and saying that the signature changes the meaning of
the checksum.

That said:

> We have been testing a patch in GfW to run the verification in a separate thread
> while the main thread parses (and mallocs) the cache_entries.  This does help
> offset the time.

Yeah, that seems an even better solution, honestly.

The patch would be cleaner without the NO_PTHREADS things.

I wonder how meaningful that thing even is today. Looking at what
seems to select NO_PTHREADS, I suspect that's all entirely historical.
For example, you'll see it for QNX etc, which seems wrong - QNX
definitely has pthreads according to their docs, for example.

                     Linus

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH] Put sha1dc on a diet
  2017-03-02  4:38               ` Junio C Hamano
@ 2017-03-04  1:07                 ` Dan Shumow
  2017-03-13 15:13                   ` Jeff King
  0 siblings, 1 reply; 36+ messages in thread
From: Dan Shumow @ 2017-03-04  1:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Jeff King, Git Mailing List, Marc Stevens

From: Junio C Hamano [mailto:gitster@pobox.com] 

> As you and Marc seemed to be still working on speeding up, such a customization work to fully adjust your code to our codebase was premature, so I tentatively queued what we saw on the list as-is on our 'pu' branch so that people can have a reference point.  Which unfortunately solicited a premature reaction by Johannes.  Please do not worry too much about the comment.

> But if you are willing to help us by getting involved in the "customization" part, too, that would be a very welcome news to us.
>In that case, "welcome to the Git development community" ;-)

> So,... from my point of view, we are OK either way.  It is OK if you are a third-party upstream that is not particularly interested in Git project's specific requirement.  We surely would be happier if you and Marc, the upstream authors of the code in question, also act as participants in the Git development community.

> Either way, thanks for your great help.

You are very welcome.  Thank you for the warm welcome.  As it turns out, Marc and I are working on the simplifications / removal of c99 and performance upstream in our GitHub repo.  I am happy to help for any GitHub specific customizations that are needed as well.  But for now, lets see if we can get you everything you want upstream -- I think that's the most simple.

Thanks again,
Dan


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-04  1:07                 ` Dan Shumow
@ 2017-03-13 15:13                   ` Jeff King
       [not found]                     ` <CY1PR0301MB2107B3C5131D5DC7F91A0147C4250@CY1PR0301MB2107.namprd03.prod.outlook.com>
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff King @ 2017-03-13 15:13 UTC (permalink / raw)
  To: Dan Shumow; +Cc: Junio C Hamano, Linus Torvalds, Git Mailing List, Marc Stevens

On Sat, Mar 04, 2017 at 01:07:16AM +0000, Dan Shumow wrote:

> You are very welcome.  Thank you for the warm welcome.  As it turns
> out, Marc and I are working on the simplifications / removal of c99
> and performance upstream in our GitHub repo.  I am happy to help for
> any GitHub specific customizations that are needed as well.  But for
> now, lets see if we can get you everything you want upstream -- I
> think that's the most simple.

I've been watching the repo at:

  https://github.com/cr-marcstevens/sha1collisiondetection

The work on the feature/performance branch seems to be producing good
results. The best timings I got show sha1dc (with checks enabled) at
1.75x block-sha1, which is pretty good. That was using your
ad744c8b7a841d2afcb2d4c04f8952d9005501be.

Curiously, the performance gets worse after that. Even more curious, the
bad performance bisects to a merge, and it performs worse than either
side of the merge.

Try this:

  # mine is a 1.2GB linux packfile, but anything big should do
  file=/some/large/file

  # the merge with the funny behavior
  merge=55d1db0980501e582f6cd103a04f493995b1df78

  for i in $merge^ $merge^2 $merge; do
    git checkout $i &&
    rm -f bin/* &&
    make &&
    time bin/sha1dcsum $file
  done

I get:

  [$merge^, the feature/performance branch before the merge]
  real	0m3.391s
  user	0m3.304s
  sys	0m0.084s

  [$merge^2, the master branch before the merge]
  real	0m5.272s
  user	0m5.164s
  sys	0m0.096s

  [$merge, the merge of the two]
  real	0m7.038s
  user	0m6.924s
  sys	0m0.104s

So that's odd. Looking at the diff, I don't see anything that obviously
jumps out as a mis-merge.

Feel free to tell me "stop looking at that branch; it's a work in
progress". But I think the results from $merge^ (ad744c8b7) are getting
good enough to consider moving forward with integrating it into git.

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
       [not found]                       ` <CY1PR0301MB2107876B6E47FBCF03AB1EA1C4250@CY1PR0301MB2107.namprd03.prod.outlook.com>
@ 2017-03-13 19:48                         ` Jeff King
  2017-03-13 20:12                           ` Marc Stevens
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff King @ 2017-03-13 19:48 UTC (permalink / raw)
  To: Dan Shumow; +Cc: Linus Torvalds, Marc Stevens, Junio C Hamano, Git Mailing List

On Mon, Mar 13, 2017 at 07:42:17PM +0000, Dan Shumow wrote:

> Marc just made a commit this morning fixing problems with the merge.
> Please give the latest in feature/performance a try, as that seems to
> eliminate the problem.

Yeah, b17728507 makes the problem go away for me. Thanks.

FWIW, I have all sha1s on github.com running through this right now
(actually, the ad744c8b7 version), and logging any false-positives on
the collision detection. Nothing so far, after a few hours.

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-13 19:48                         ` Jeff King
@ 2017-03-13 20:12                           ` Marc Stevens
  2017-03-13 20:20                             ` Linus Torvalds
  0 siblings, 1 reply; 36+ messages in thread
From: Marc Stevens @ 2017-03-13 20:12 UTC (permalink / raw)
  To: Jeff King, Dan Shumow; +Cc: Linus Torvalds, Junio C Hamano, Git Mailing List

Indeed, I've committed a fix, and a small bug fix for the new code just now.

The merge incorrectly removed some control logic,
which caused more unnecessary checks to happen.
I already marked this in the PR, but committed a fix only today.

BTW as noted in the Readme, the theoretic false positive probability is
<<2^-90, almost non-existent.

Best regards,
Marc Stevens


On 3/13/2017 8:48 PM, Jeff King wrote:
> On Mon, Mar 13, 2017 at 07:42:17PM +0000, Dan Shumow wrote:
>
>> Marc just made a commit this morning fixing problems with the merge.
>> Please give the latest in feature/performance a try, as that seems to
>> eliminate the problem.
> Yeah, b17728507 makes the problem go away for me. Thanks.
>
> FWIW, I have all sha1s on github.com running through this right now
> (actually, the ad744c8b7 version), and logging any false-positives on
> the collision detection. Nothing so far, after a few hours.
>
> -Peff


^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-13 20:12                           ` Marc Stevens
@ 2017-03-13 20:20                             ` Linus Torvalds
  2017-03-13 20:47                               ` Marc Stevens
  0 siblings, 1 reply; 36+ messages in thread
From: Linus Torvalds @ 2017-03-13 20:20 UTC (permalink / raw)
  To: Marc Stevens; +Cc: Jeff King, Dan Shumow, Junio C Hamano, Git Mailing List

On Mon, Mar 13, 2017 at 1:12 PM, Marc Stevens <marc.stevens@cwi.nl> wrote:
> Indeed, I've committed a fix, and a small bug fix for the new code just now.

Unrelated side note: there may be some missing dependencies in the
build infrastructure or something, because when I tried Jeff's script
that did that "test the merge and the two parents", and used the
pack-file of the kernel for testing, I got:

  5611971c610143e6d38bbdca463f4c9f79a056a0
/home/torvalds/v2.6/linux/.git/objects/pack/pack-153bb8cd11846cf9a27ef7b1069aa9cb9f5b724f.pack

  real 0m2.432s
  user 0m2.348s
  sys 0m0.084s

  5611971c610143e6d38bbdca463f4c9f79a056a0
/home/torvalds/v2.6/linux/.git/objects/pack/pack-153bb8cd11846cf9a27ef7b1069aa9cb9f5b724f.pack

  real 0m3.747s
  user 0m3.672s
  sys 0m0.076s

  5611971c610143e6d38bbdca463f4c9f79a056a0 *coll*
/home/torvalds/v2.6/linux/.git/objects/pack/pack-153bb8cd11846cf9a27ef7b1069aa9cb9f5b724f.pack

  real 0m5.061s
  user 0m4.984s
  sys 0m0.077s

never mind the performace, notice the *coll* in that last case.

But doing a "git clean -dqfx; make -j8" and re-testing the same tree,
the issue is gone.

I suspect some dependency on a header file is broken, causing some
object file to not be properly re-built, which in turn then
incorrectly causes the 'ctx2.found_collision' test to test the wrong
bit or something.

                 Linus

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-13 20:20                             ` Linus Torvalds
@ 2017-03-13 20:47                               ` Marc Stevens
  2017-03-13 21:00                                 ` Jeff King
  0 siblings, 1 reply; 36+ messages in thread
From: Marc Stevens @ 2017-03-13 20:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jeff King, Dan Shumow, Junio C Hamano, Git Mailing List

Linus:
I would be surprised, the dependencies should be automatically determined.

BTW Did you make local changes to this perf branch?
Specifically did you disable the safe hash mode that is on by default?
Because if you did not, it might also be something else as all three hashes below are the same.

-- Marc

----- Original Message -----
From: "Linus Torvalds" <torvalds@linux-foundation.org>
To: "Marc Stevens" <marc.stevens@cwi.nl>
Cc: "Jeff King" <peff@peff.net>, "Dan Shumow" <danshu@microsoft.com>, "Junio C Hamano" <gitster@pobox.com>, "Git Mailing List" <git@vger.kernel.org>
Sent: Monday, March 13, 2017 9:20:02 PM
Subject: Re: [PATCH] Put sha1dc on a diet

On Mon, Mar 13, 2017 at 1:12 PM, Marc Stevens <marc.stevens@cwi.nl> wrote:
> Indeed, I've committed a fix, and a small bug fix for the new code just now.

Unrelated side note: there may be some missing dependencies in the
build infrastructure or something, because when I tried Jeff's script
that did that "test the merge and the two parents", and used the
pack-file of the kernel for testing, I got:

  5611971c610143e6d38bbdca463f4c9f79a056a0
/home/torvalds/v2.6/linux/.git/objects/pack/pack-153bb8cd11846cf9a27ef7b1069aa9cb9f5b724f.pack

  real 0m2.432s
  user 0m2.348s
  sys 0m0.084s

  5611971c610143e6d38bbdca463f4c9f79a056a0
/home/torvalds/v2.6/linux/.git/objects/pack/pack-153bb8cd11846cf9a27ef7b1069aa9cb9f5b724f.pack

  real 0m3.747s
  user 0m3.672s
  sys 0m0.076s

  5611971c610143e6d38bbdca463f4c9f79a056a0 *coll*
/home/torvalds/v2.6/linux/.git/objects/pack/pack-153bb8cd11846cf9a27ef7b1069aa9cb9f5b724f.pack

  real 0m5.061s
  user 0m4.984s
  sys 0m0.077s

never mind the performace, notice the *coll* in that last case.

But doing a "git clean -dqfx; make -j8" and re-testing the same tree,
the issue is gone.

I suspect some dependency on a header file is broken, causing some
object file to not be properly re-built, which in turn then
incorrectly causes the 'ctx2.found_collision' test to test the wrong
bit or something.

                 Linus

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-13 20:47                               ` Marc Stevens
@ 2017-03-13 21:00                                 ` Jeff King
  2017-03-13 21:15                                   ` Marc Stevens
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff King @ 2017-03-13 21:00 UTC (permalink / raw)
  To: Marc Stevens; +Cc: Linus Torvalds, Dan Shumow, Junio C Hamano, Git Mailing List

On Mon, Mar 13, 2017 at 09:47:54PM +0100, Marc Stevens wrote:

> Linus:
> I would be surprised, the dependencies should be automatically determined.
> 
> BTW Did you make local changes to this perf branch?

I can reproduce it with:

  cd sha1collisiondetection
  git clean -dqfx ;# make sure we are starting from scratch

  git checkout 9c8e73cadb35776d3310e3f8ceda7183fa75a39f
  make
  bin/sha1dcsum $file

  git checkout 55d1db0980501e582f6cd103a04f493995b1df78
  make
  bin/sha1dcsum $file

The final call to sha1dcsum will report a collision, even though the
first one did not.

It also reproduces with the original snippet I posted. I didn't notice
because I was just collecting the timings then (and I originally noticed
the problem on the versions I had pulled into Git, where it works as
expected; but then I am just pulling in the two source files, without
all of the libtool magic).

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-13 21:00                                 ` Jeff King
@ 2017-03-13 21:15                                   ` Marc Stevens
  2017-03-16 18:22                                     ` Marc Stevens
  0 siblings, 1 reply; 36+ messages in thread
From: Marc Stevens @ 2017-03-13 21:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, Dan Shumow, Junio C Hamano, Git Mailing List

I think I now understand.
The Makefile indeed seems to fail to correctly rebuild when a header has changed.

As the performance branch has removed the 'int bigendian' from SHA1_CTX in lib/sha1.h,
the perf-branch and master-branch are binary incompatible.
So the command-line utility does not get fully recompiled 
and instead of the value of found_collision will read a different value of SHA1_CTX.

So be careful to always do a 'make clean' for now.

-- Marc

----- Original Message -----
From: "Jeff King" <peff@peff.net>
To: "Marc Stevens" <Marc.Stevens@cwi.nl>
Cc: "Linus Torvalds" <torvalds@linux-foundation.org>, "Dan Shumow" <danshu@microsoft.com>, "Junio C Hamano" <gitster@pobox.com>, "Git Mailing List" <git@vger.kernel.org>
Sent: Monday, March 13, 2017 10:00:23 PM
Subject: Re: [PATCH] Put sha1dc on a diet

On Mon, Mar 13, 2017 at 09:47:54PM +0100, Marc Stevens wrote:

> Linus:
> I would be surprised, the dependencies should be automatically determined.
> 
> BTW Did you make local changes to this perf branch?

I can reproduce it with:

  cd sha1collisiondetection
  git clean -dqfx ;# make sure we are starting from scratch

  git checkout 9c8e73cadb35776d3310e3f8ceda7183fa75a39f
  make
  bin/sha1dcsum $file

  git checkout 55d1db0980501e582f6cd103a04f493995b1df78
  make
  bin/sha1dcsum $file

The final call to sha1dcsum will report a collision, even though the
first one did not.

It also reproduces with the original snippet I posted. I didn't notice
because I was just collecting the timings then (and I originally noticed
the problem on the versions I had pulled into Git, where it works as
expected; but then I am just pulling in the two source files, without
all of the libtool magic).

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-13 21:15                                   ` Marc Stevens
@ 2017-03-16 18:22                                     ` Marc Stevens
  2017-03-16 22:06                                       ` Jeff King
  0 siblings, 1 reply; 36+ messages in thread
From: Marc Stevens @ 2017-03-16 18:22 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, Dan Shumow, Junio C Hamano, Git Mailing List

Hi all,

Today I merged the perf-branch into master after code review and correctness testing.
So master is now more performant and safe to use.

-- Marc

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] Put sha1dc on a diet
  2017-03-16 18:22                                     ` Marc Stevens
@ 2017-03-16 22:06                                       ` Jeff King
  2017-03-16 22:07                                         ` Dan Shumow
  0 siblings, 1 reply; 36+ messages in thread
From: Jeff King @ 2017-03-16 22:06 UTC (permalink / raw)
  To: Marc Stevens; +Cc: Linus Torvalds, Dan Shumow, Junio C Hamano, Git Mailing List

On Thu, Mar 16, 2017 at 07:22:14PM +0100, Marc Stevens wrote:

> Today I merged the perf-branch into master after code review and correctness testing.
> So master is now more performant and safe to use.

Great, thank you (and Dan) so much for all your work. We're looking at
integrating this version in a nearby thread.

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

* RE: [PATCH] Put sha1dc on a diet
  2017-03-16 22:06                                       ` Jeff King
@ 2017-03-16 22:07                                         ` Dan Shumow
  0 siblings, 0 replies; 36+ messages in thread
From: Dan Shumow @ 2017-03-16 22:07 UTC (permalink / raw)
  To: Jeff King, Marc Stevens; +Cc: Linus Torvalds, Junio C Hamano, Git Mailing List

Great! Keep us posted if there is anything else that you would like from the code.  Or anyway we can make the process go more smoothly.

Thanks,
Dan


-----Original Message-----
From: Jeff King [mailto:peff@peff.net] 
Sent: Thursday, March 16, 2017 3:06 PM
To: Marc Stevens <Marc.Stevens@cwi.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>; Dan Shumow <danshu@microsoft.com>; Junio C Hamano <gitster@pobox.com>; Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH] Put sha1dc on a diet

On Thu, Mar 16, 2017 at 07:22:14PM +0100, Marc Stevens wrote:

> Today I merged the perf-branch into master after code review and correctness testing.
> So master is now more performant and safe to use.

Great, thank you (and Dan) so much for all your work. We're looking at integrating this version in a nearby thread.

-Peff

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, back to index

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-01  0:30 [PATCH] Put sha1dc on a diet Linus Torvalds
2017-03-01 18:42 ` Junio C Hamano
2017-03-01 18:49   ` Linus Torvalds
2017-03-01 19:41     ` Junio C Hamano
2017-03-01 21:56       ` Johannes Schindelin
2017-03-01 22:05         ` Junio C Hamano
2017-03-01 22:16         ` Linus Torvalds
2017-03-01 22:51           ` Johannes Schindelin
2017-03-01 23:05             ` Linus Torvalds
2017-03-01 23:19               ` Jeff King
2017-03-02  6:10                 ` Duy Nguyen
2017-03-02 14:45                   ` Johannes Schindelin
2017-03-02 16:35                     ` Linus Torvalds
2017-03-02 18:37                       ` Jeff Hostetler
2017-03-02 19:04                         ` Linus Torvalds
2017-03-02 14:39                 ` Johannes Schindelin
2017-03-02 14:37               ` Johannes Schindelin
2017-03-01 19:53     ` Jeff King
     [not found]       ` <CA+55aFwf3sxKW+dGTMjNAeHMOf=rvctEQohm+rbhEb=e3KLpHw@mail.gmail.com>
2017-03-01 20:34         ` Jeff King
     [not found]           ` <CA+55aFwr1jncrk-cekn0Y8rs_S+zs7RrgQ-Jb-ZbgCvmVrHT_A@mail.gmail.com>
2017-03-01 23:13             ` Jeff King
2017-03-01 23:38           ` Linus Torvalds
2017-03-02  1:31             ` Dan Shumow
2017-03-02  4:38               ` Junio C Hamano
2017-03-04  1:07                 ` Dan Shumow
2017-03-13 15:13                   ` Jeff King
     [not found]                     ` <CY1PR0301MB2107B3C5131D5DC7F91A0147C4250@CY1PR0301MB2107.namprd03.prod.outlook.com>
     [not found]                       ` <CY1PR0301MB2107876B6E47FBCF03AB1EA1C4250@CY1PR0301MB2107.namprd03.prod.outlook.com>
2017-03-13 19:48                         ` Jeff King
2017-03-13 20:12                           ` Marc Stevens
2017-03-13 20:20                             ` Linus Torvalds
2017-03-13 20:47                               ` Marc Stevens
2017-03-13 21:00                                 ` Jeff King
2017-03-13 21:15                                   ` Marc Stevens
2017-03-16 18:22                                     ` Marc Stevens
2017-03-16 22:06                                       ` Jeff King
2017-03-16 22:07                                         ` Dan Shumow
2017-03-01 19:07 ` Jeff King
2017-03-01 19:10   ` Linus Torvalds

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox