git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease being on by default
@ 2017-05-15 12:49 Ævar Arnfjörð Bjarmason
  2017-05-15 13:58 ` Marc Stevens
  0 siblings, 1 reply; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-15 12:49 UTC (permalink / raw)
  To: Git Mailing List; +Cc: michael.kebe, Jeff King, Marc Stevens

Since 2.13.0 just running "git status" on a newly init'd repo on
Solaris SPARC[1] segfaults. Michael (CC'd) reported this issue on #git
and I helped him debug it.

Just compiling with BLK_SHA1=YesPlease solves the issue.

There are at least two different issues with DC_SHA1 here:

 * We don't properly detect that this platform is big endian. The
check at the top of sha1dc/sha1.c needs to test for _BIG_ENDIAN. This
comes from sys/isa_defs.h which (I'm told by #solaris) is included on
Solaris by default, at least by stdio.h.

Hacking the endian detection makes t0013-sha1dc.sh pass.

  * Even with that & the test passing just a plain "git init x && cd x
&& touch A && git add A && git commit" will segfault.

This is some bug in the sha1dc code, presumably some big endian issue
that's not resolved by the change above. Backtrace for that (censored
actual author info):

 Program received signal SIGSEGV, Segmentation fault.
 [Switching to Thread 1 (LWP 1)]
 0x002f8c84 in sha1_compression_states (ihv=0xffbf8268, m=0x398ad5,
W=0xffbf82fc, states=0xffbf857c) at sha1dc/sha1.c:291
 291             SHA1COMPRESS_FULL_ROUND1_STEP_LOAD(a, b, c, d, e, m,
W, 0, temp);
 (gdb) bt
 #0  0x002f8c84 in sha1_compression_states (ihv=0xffbf8268,
m=0x398ad5, W=0xffbf82fc, states=0xffbf857c) at sha1dc/sha1.c:291
 #1  0x00300b60 in sha1_process (ctx=0xffbf8260, block=0x398ad5) at
sha1dc/sha1.c:1616
 #2  0x00301188 in SHA1DCUpdate (ctx=0xffbf8260,
     buf=0x398ad5 "deadbeefdeadbeefdeadbeefdeadbeefdeadbeef\nauthor Au
Thor <au.thor@example.com> 123456789 +0000\ncommitter Au Thor
<au.thor@example.com> 123456789
                  +0000\n\nBlah Blah"..., len=220)
     at sha1dc/sha1.c:1731
 #3  0x0030168c in git_SHA1DCUpdate (ctx=0xffbf8260, vdata=0x398aa0,
len=273) at sha1dc/sha1.c:1808
 #4  0x002a6f7c in write_sha1_file_prepare (buf=0x398aa0, len=273,
type=0x959c8 "commit", sha1=0xffbfd630 "",
     hdr=0xffbf8c28 "commit 273", hdrlen=0xffbf8c24) at sha1_file.c:3207
 #5  0x002a71ac in hash_sha1_file (buf=0x398aa0, len=273, type=0x959c8
"commit", sha1=0xffbfd630 "") at sha1_file.c:3266
 #6  0x002a25f8 in check_sha1_signature (sha1=0xffbfdbb8
"\375\067\356\337\002", map=0x398aa0, size=273, type=0x959c8 "commit")
     at sha1_file.c:1644
 #7  0x0022816c in parse_object (sha1=0xffbfdbb8
"\375\067\356\337\002") at object.c:269
 #8  0x0027c258 in get_reference (revs=0xffbfdc88, name=0xa87f0
"HEAD", sha1=0xffbfdbb8 "\375\067\356\337\002", flags=0)
     at revision.c:196
 #9  0x00284714 in setup_revisions (argc=0, argv=0x0, revs=0xffbfdc88,
opt=0xffbfdc74) at revision.c:2295
 #10 0x002ee4d8 in wt_status_collect_changes_index (s=0x348ea8
<s.24114>) at wt-status.c:585
 #11 0x002eeae8 in wt_status_collect (s=0x348ea8 <s.24114>) at wt-status.c:701
 #12 0x000db4fc in cmd_status (argc=0, argv=0xffbfe6fc, prefix=0x0) at
builtin/commit.c:1393
 #13 0x000acc24 in run_builtin (p=0x340ccc <commands+1200>, argc=1,
argv=0xffbfe6fc) at git.c:371
 #14 0x000acf98 in handle_builtin (argc=1, argv=0xffbfe6fc) at git.c:572
 #15 0x000ad1a4 in run_argv (argcp=0xffbfe66c, argv=0xffbfe670) at git.c:624
 #16 0x000ad3fc in cmd_main (argc=1, argv=0xffbfe6fc) at git.c:701
 #17 0x001747a0 in main (argc=5, argv=0xffbfe6ec) at common-main.c:43

1. Relevant part of uname -a: "SunOS <hostname> 5.11 11.3 sun4v sparc sun4v".

^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease being on by default
  2017-05-15 12:49 Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease being on by default Ævar Arnfjörð Bjarmason
@ 2017-05-15 13:58 ` Marc Stevens
  2017-05-15 14:13   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 17+ messages in thread
From: Marc Stevens @ 2017-05-15 13:58 UTC (permalink / raw)
  To: 'Ævar Arnfjörð Bjarmason',
	'Git Mailing List'
  Cc: michael.kebe, 'Jeff King'

Hi Aevar,

Thank you for notifying us of this issue.
Big endianness is a tricky issue, also since I don't have access or accurate knowledge about all big endian systems.
Our github repo does check correct functioning, including an endianness mistake, with 'make test'.
But I guess this is not included for SHA1DC in Git.

Anyway, we can easily add the _BIG_ENDIAN macrotest to the git repo and will do so soon.

I don't think the segfault is caused by buffer overflow, inproper access, or the endianness issue.
But I did notice an unexpected issue: the message block pointer m=0x398ad5 is odd. 
Can you confirm whether loading an uint32_t from an odd address triggers a hardware interrupt on your platform?
This is not problem for x86, but maybe for your platform it is?
If it is then we should always copy buffer contents to the sha1context to avoid this issue.

Best regards,
Marc Stevens

-----Oorspronkelijk bericht-----
Van: Ævar Arnfjörð Bjarmason [mailto:avarab@gmail.com] 
Verzonden: maandag 15 mei 2017 14:49
Aan: Git Mailing List <git@vger.kernel.org>
CC: michael.kebe@gmail.com; Jeff King <peff@peff.net>; Marc Stevens <marc@marc-stevens.nl>
Onderwerp: Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease being on by default

Since 2.13.0 just running "git status" on a newly init'd repo on Solaris SPARC[1] segfaults. Michael (CC'd) reported this issue on #git and I helped him debug it.

Just compiling with BLK_SHA1=YesPlease solves the issue.

There are at least two different issues with DC_SHA1 here:

 * We don't properly detect that this platform is big endian. The check at the top of sha1dc/sha1.c needs to test for _BIG_ENDIAN. This comes from sys/isa_defs.h which (I'm told by #solaris) is included on Solaris by default, at least by stdio.h.

Hacking the endian detection makes t0013-sha1dc.sh pass.

  * Even with that & the test passing just a plain "git init x && cd x && touch A && git add A && git commit" will segfault.

This is some bug in the sha1dc code, presumably some big endian issue that's not resolved by the change above. Backtrace for that (censored actual author info):

 Program received signal SIGSEGV, Segmentation fault.
 [Switching to Thread 1 (LWP 1)]
 0x002f8c84 in sha1_compression_states (ihv=0xffbf8268, m=0x398ad5, W=0xffbf82fc, states=0xffbf857c) at sha1dc/sha1.c:291
 291             SHA1COMPRESS_FULL_ROUND1_STEP_LOAD(a, b, c, d, e, m,
W, 0, temp);
 (gdb) bt
 #0  0x002f8c84 in sha1_compression_states (ihv=0xffbf8268, m=0x398ad5, W=0xffbf82fc, states=0xffbf857c) at sha1dc/sha1.c:291
 #1  0x00300b60 in sha1_process (ctx=0xffbf8260, block=0x398ad5) at
sha1dc/sha1.c:1616
 #2  0x00301188 in SHA1DCUpdate (ctx=0xffbf8260,
     buf=0x398ad5 "deadbeefdeadbeefdeadbeefdeadbeefdeadbeef\nauthor Au Thor <au.thor@example.com> 123456789 +0000\ncommitter Au Thor <au.thor@example.com> 123456789
                  +0000\n\nBlah Blah"..., len=220)
     at sha1dc/sha1.c:1731
 #3  0x0030168c in git_SHA1DCUpdate (ctx=0xffbf8260, vdata=0x398aa0,
len=273) at sha1dc/sha1.c:1808
 #4  0x002a6f7c in write_sha1_file_prepare (buf=0x398aa0, len=273,
type=0x959c8 "commit", sha1=0xffbfd630 "",
     hdr=0xffbf8c28 "commit 273", hdrlen=0xffbf8c24) at sha1_file.c:3207
 #5  0x002a71ac in hash_sha1_file (buf=0x398aa0, len=273, type=0x959c8 "commit", sha1=0xffbfd630 "") at sha1_file.c:3266
 #6  0x002a25f8 in check_sha1_signature (sha1=0xffbfdbb8 "\375\067\356\337\002", map=0x398aa0, size=273, type=0x959c8 "commit")
     at sha1_file.c:1644
 #7  0x0022816c in parse_object (sha1=0xffbfdbb8
"\375\067\356\337\002") at object.c:269
 #8  0x0027c258 in get_reference (revs=0xffbfdc88, name=0xa87f0 "HEAD", sha1=0xffbfdbb8 "\375\067\356\337\002", flags=0)
     at revision.c:196
 #9  0x00284714 in setup_revisions (argc=0, argv=0x0, revs=0xffbfdc88,
opt=0xffbfdc74) at revision.c:2295
 #10 0x002ee4d8 in wt_status_collect_changes_index (s=0x348ea8
<s.24114>) at wt-status.c:585
 #11 0x002eeae8 in wt_status_collect (s=0x348ea8 <s.24114>) at wt-status.c:701
 #12 0x000db4fc in cmd_status (argc=0, argv=0xffbfe6fc, prefix=0x0) at
builtin/commit.c:1393
 #13 0x000acc24 in run_builtin (p=0x340ccc <commands+1200>, argc=1,
argv=0xffbfe6fc) at git.c:371
 #14 0x000acf98 in handle_builtin (argc=1, argv=0xffbfe6fc) at git.c:572
 #15 0x000ad1a4 in run_argv (argcp=0xffbfe66c, argv=0xffbfe670) at git.c:624
 #16 0x000ad3fc in cmd_main (argc=1, argv=0xffbfe6fc) at git.c:701
 #17 0x001747a0 in main (argc=5, argv=0xffbfe6ec) at common-main.c:43

1. Relevant part of uname -a: "SunOS <hostname> 5.11 11.3 sun4v sparc sun4v".


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease being on by default
  2017-05-15 13:58 ` Marc Stevens
@ 2017-05-15 14:13   ` Ævar Arnfjörð Bjarmason
  2017-05-15 22:09     ` Jeff King
       [not found]     ` <CAKKM46vwM9pxyMxTc4jA0z_8vGKdDGCGg9ziKkFAsqr5ULYJxA@mail.gmail.com>
  0 siblings, 2 replies; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-15 14:13 UTC (permalink / raw)
  To: Marc Stevens; +Cc: Git Mailing List, Michael Kebe, Jeff King

On Mon, May 15, 2017 at 3:58 PM, Marc Stevens <marc@marc-stevens.nl> wrote:
> Hi Aevar,
>
> Thank you for notifying us of this issue.
> Big endianness is a tricky issue, also since I don't have access or accurate knowledge about all big endian systems.
> Our github repo does check correct functioning, including an endianness mistake, with 'make test'.
> But I guess this is not included for SHA1DC in Git.
>
> Anyway, we can easily add the _BIG_ENDIAN macrotest to the git repo and will do so soon.
>
> I don't think the segfault is caused by buffer overflow, inproper access, or the endianness issue.
> But I did notice an unexpected issue: the message block pointer m=0x398ad5 is odd.
> Can you confirm whether loading an uint32_t from an odd address triggers a hardware interrupt on your platform?
> This is not problem for x86, but maybe for your platform it is?
> If it is then we should always copy buffer contents to the sha1context to avoid this issue.

I don't have access to the box in question, Michael was testing this
code for me. But unaligned access is probably the cause, although
according to some info I found online that should give a SIGBUS not a
SIGSEGV, but that may have changed:

https://bugs.python.org/issue12181
https://github.com/magnumripper/JohnTheRipper/issues/2187

> Best regards,
> Marc Stevens
>
> -----Oorspronkelijk bericht-----
> Van: Ævar Arnfjörð Bjarmason [mailto:avarab@gmail.com]
> Verzonden: maandag 15 mei 2017 14:49
> Aan: Git Mailing List <git@vger.kernel.org>
> CC: michael.kebe@gmail.com; Jeff King <peff@peff.net>; Marc Stevens <marc@marc-stevens.nl>
> Onderwerp: Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease being on by default
>
> Since 2.13.0 just running "git status" on a newly init'd repo on Solaris SPARC[1] segfaults. Michael (CC'd) reported this issue on #git and I helped him debug it.
>
> Just compiling with BLK_SHA1=YesPlease solves the issue.
>
> There are at least two different issues with DC_SHA1 here:
>
>  * We don't properly detect that this platform is big endian. The check at the top of sha1dc/sha1.c needs to test for _BIG_ENDIAN. This comes from sys/isa_defs.h which (I'm told by #solaris) is included on Solaris by default, at least by stdio.h.
>
> Hacking the endian detection makes t0013-sha1dc.sh pass.
>
>   * Even with that & the test passing just a plain "git init x && cd x && touch A && git add A && git commit" will segfault.
>
> This is some bug in the sha1dc code, presumably some big endian issue that's not resolved by the change above. Backtrace for that (censored actual author info):
>
>  Program received signal SIGSEGV, Segmentation fault.
>  [Switching to Thread 1 (LWP 1)]
>  0x002f8c84 in sha1_compression_states (ihv=0xffbf8268, m=0x398ad5, W=0xffbf82fc, states=0xffbf857c) at sha1dc/sha1.c:291
>  291             SHA1COMPRESS_FULL_ROUND1_STEP_LOAD(a, b, c, d, e, m,
> W, 0, temp);
>  (gdb) bt
>  #0  0x002f8c84 in sha1_compression_states (ihv=0xffbf8268, m=0x398ad5, W=0xffbf82fc, states=0xffbf857c) at sha1dc/sha1.c:291
>  #1  0x00300b60 in sha1_process (ctx=0xffbf8260, block=0x398ad5) at
> sha1dc/sha1.c:1616
>  #2  0x00301188 in SHA1DCUpdate (ctx=0xffbf8260,
>      buf=0x398ad5 "deadbeefdeadbeefdeadbeefdeadbeefdeadbeef\nauthor Au Thor <au.thor@example.com> 123456789 +0000\ncommitter Au Thor <au.thor@example.com> 123456789
>                   +0000\n\nBlah Blah"..., len=220)
>      at sha1dc/sha1.c:1731
>  #3  0x0030168c in git_SHA1DCUpdate (ctx=0xffbf8260, vdata=0x398aa0,
> len=273) at sha1dc/sha1.c:1808
>  #4  0x002a6f7c in write_sha1_file_prepare (buf=0x398aa0, len=273,
> type=0x959c8 "commit", sha1=0xffbfd630 "",
>      hdr=0xffbf8c28 "commit 273", hdrlen=0xffbf8c24) at sha1_file.c:3207
>  #5  0x002a71ac in hash_sha1_file (buf=0x398aa0, len=273, type=0x959c8 "commit", sha1=0xffbfd630 "") at sha1_file.c:3266
>  #6  0x002a25f8 in check_sha1_signature (sha1=0xffbfdbb8 "\375\067\356\337\002", map=0x398aa0, size=273, type=0x959c8 "commit")
>      at sha1_file.c:1644
>  #7  0x0022816c in parse_object (sha1=0xffbfdbb8
> "\375\067\356\337\002") at object.c:269
>  #8  0x0027c258 in get_reference (revs=0xffbfdc88, name=0xa87f0 "HEAD", sha1=0xffbfdbb8 "\375\067\356\337\002", flags=0)
>      at revision.c:196
>  #9  0x00284714 in setup_revisions (argc=0, argv=0x0, revs=0xffbfdc88,
> opt=0xffbfdc74) at revision.c:2295
>  #10 0x002ee4d8 in wt_status_collect_changes_index (s=0x348ea8
> <s.24114>) at wt-status.c:585
>  #11 0x002eeae8 in wt_status_collect (s=0x348ea8 <s.24114>) at wt-status.c:701
>  #12 0x000db4fc in cmd_status (argc=0, argv=0xffbfe6fc, prefix=0x0) at
> builtin/commit.c:1393
>  #13 0x000acc24 in run_builtin (p=0x340ccc <commands+1200>, argc=1,
> argv=0xffbfe6fc) at git.c:371
>  #14 0x000acf98 in handle_builtin (argc=1, argv=0xffbfe6fc) at git.c:572
>  #15 0x000ad1a4 in run_argv (argcp=0xffbfe66c, argv=0xffbfe670) at git.c:624
>  #16 0x000ad3fc in cmd_main (argc=1, argv=0xffbfe6fc) at git.c:701
>  #17 0x001747a0 in main (argc=5, argv=0xffbfe6ec) at common-main.c:43
>
> 1. Relevant part of uname -a: "SunOS <hostname> 5.11 11.3 sun4v sparc sun4v".
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease being on by default
  2017-05-15 14:13   ` Ævar Arnfjörð Bjarmason
@ 2017-05-15 22:09     ` Jeff King
  2017-06-01 14:03       ` demerphq
       [not found]     ` <CAKKM46vwM9pxyMxTc4jA0z_8vGKdDGCGg9ziKkFAsqr5ULYJxA@mail.gmail.com>
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff King @ 2017-05-15 22:09 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Marc Stevens, Git Mailing List, Michael Kebe

On Mon, May 15, 2017 at 04:13:58PM +0200, Ævar Arnfjörð Bjarmason wrote:

> On Mon, May 15, 2017 at 3:58 PM, Marc Stevens <marc@marc-stevens.nl> wrote:
> > Hi Aevar,
> >
> > Thank you for notifying us of this issue.
> > Big endianness is a tricky issue, also since I don't have access or accurate knowledge about all big endian systems.
> > Our github repo does check correct functioning, including an endianness mistake, with 'make test'.
> > But I guess this is not included for SHA1DC in Git.
> >
> > Anyway, we can easily add the _BIG_ENDIAN macrotest to the git repo and will do so soon.
> >
> > I don't think the segfault is caused by buffer overflow, inproper access, or the endianness issue.
> > But I did notice an unexpected issue: the message block pointer m=0x398ad5 is odd.
> > Can you confirm whether loading an uint32_t from an odd address triggers a hardware interrupt on your platform?
> > This is not problem for x86, but maybe for your platform it is?
> > If it is then we should always copy buffer contents to the sha1context to avoid this issue.
> 
> I don't have access to the box in question, Michael was testing this
> code for me. But unaligned access is probably the cause, although
> according to some info I found online that should give a SIGBUS not a
> SIGSEGV, but that may have changed:

Yeah, I would have expected SIGBUS there. If we have alignment issues,
though, I'd expect that ARM systems will experience problems.

Block-sha1 uses a macro which allows unaligned loads on platforms that
support it, and otherwise does the endian conversion on the fly as we
load the bytes into a local variable (which presumably happens all
in-register). That may be faster than doing a mass copy of the buffer.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease being on by default
       [not found]       ` <007001d2cd88$2b916180$82b42480$@marc-stevens.nl>
@ 2017-05-16  5:43         ` Michael Kebe
  2017-05-17  5:39           ` [PATCH] sha1dc: fix issues with a big endian platform Junio C Hamano
  0 siblings, 1 reply; 17+ messages in thread
From: Michael Kebe @ 2017-05-16  5:43 UTC (permalink / raw)
  To: Marc Stevens
  Cc: Ævar Arnfjörð Bjarmason, Jeff King,
	Git Mailing List

Hi Marc,

works like a charm!

Michael


2017-05-15 16:33 GMT+02:00 Marc Stevens <marc@marc-stevens.nl>:
> Hi Michael,
>
>
>
> See the latest commit to the SHA1DC repo:
>
> https://github.com/cr-marcstevens/sha1collisiondetection/commit/33a694a9ee1b79c24be45f9eab5ac0e1aeeaf271
>
> Just apply this same patch to SHA1DC in Git.
>
>
>
> Best regards,
>
> Marc Stevens
>
>
>
> Van: Michael Kebe [mailto:michael.kebe@gmail.com]
> Verzonden: maandag 15 mei 2017 16:22
> Aan: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> CC: Jeff King <peff@peff.net>; Git Mailing List <git@vger.kernel.org>; Marc
> Stevens <marc@marc-stevens.nl>
> Onderwerp: Re: Git 2.13.0 segfaults on Solaris SPARC due to
> DC_SHA1=YesPlease being on by default
>
>
>
> Just give me patch or repo, I will test it tomorrow.
>
>
>
> Michael
>
>
>
>
>
> Am 15.05.2017 4:14 nachm. schrieb "Ævar Arnfjörð Bjarmason"
> <avarab@gmail.com>:
>
> On Mon, May 15, 2017 at 3:58 PM, Marc Stevens <marc@marc-stevens.nl> wrote:
>> Hi Aevar,
>>
>> Thank you for notifying us of this issue.
>> Big endianness is a tricky issue, also since I don't have access or
>> accurate knowledge about all big endian systems.
>> Our github repo does check correct functioning, including an endianness
>> mistake, with 'make test'.
>> But I guess this is not included for SHA1DC in Git.
>>
>> Anyway, we can easily add the _BIG_ENDIAN macrotest to the git repo and
>> will do so soon.
>>
>> I don't think the segfault is caused by buffer overflow, inproper access,
>> or the endianness issue.
>> But I did notice an unexpected issue: the message block pointer m=0x398ad5
>> is odd.
>> Can you confirm whether loading an uint32_t from an odd address triggers a
>> hardware interrupt on your platform?
>> This is not problem for x86, but maybe for your platform it is?
>> If it is then we should always copy buffer contents to the sha1context to
>> avoid this issue.
>
> I don't have access to the box in question, Michael was testing this
> code for me. But unaligned access is probably the cause, although
> according to some info I found online that should give a SIGBUS not a
> SIGSEGV, but that may have changed:
>
> https://bugs.python.org/issue12181
> https://github.com/magnumripper/JohnTheRipper/issues/2187
>
>> Best regards,
>> Marc Stevens
>>
>> -----Oorspronkelijk bericht-----
>> Van: Ævar Arnfjörð Bjarmason [mailto:avarab@gmail.com]
>> Verzonden: maandag 15 mei 2017 14:49
>> Aan: Git Mailing List <git@vger.kernel.org>
>> CC: michael.kebe@gmail.com; Jeff King <peff@peff.net>; Marc Stevens
>> <marc@marc-stevens.nl>
>> Onderwerp: Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease
>> being on by default
>>
>> Since 2.13.0 just running "git status" on a newly init'd repo on Solaris
>> SPARC[1] segfaults. Michael (CC'd) reported this issue on #git and I helped
>> him debug it.
>>
>> Just compiling with BLK_SHA1=YesPlease solves the issue.
>>
>> There are at least two different issues with DC_SHA1 here:
>>
>>  * We don't properly detect that this platform is big endian. The check at
>> the top of sha1dc/sha1.c needs to test for _BIG_ENDIAN. This comes from
>> sys/isa_defs.h which (I'm told by #solaris) is included on Solaris by
>> default, at least by stdio.h.
>>
>> Hacking the endian detection makes t0013-sha1dc.sh pass.
>>
>>   * Even with that & the test passing just a plain "git init x && cd x &&
>> touch A && git add A && git commit" will segfault.
>>
>> This is some bug in the sha1dc code, presumably some big endian issue
>> that's not resolved by the change above. Backtrace for that (censored actual
>> author info):
>>
>>  Program received signal SIGSEGV, Segmentation fault.
>>  [Switching to Thread 1 (LWP 1)]
>>  0x002f8c84 in sha1_compression_states (ihv=0xffbf8268, m=0x398ad5,
>> W=0xffbf82fc, states=0xffbf857c) at sha1dc/sha1.c:291
>>  291             SHA1COMPRESS_FULL_ROUND1_STEP_LOAD(a, b, c, d, e, m,
>> W, 0, temp);
>>  (gdb) bt
>>  #0  0x002f8c84 in sha1_compression_states (ihv=0xffbf8268, m=0x398ad5,
>> W=0xffbf82fc, states=0xffbf857c) at sha1dc/sha1.c:291
>>  #1  0x00300b60 in sha1_process (ctx=0xffbf8260, block=0x398ad5) at
>> sha1dc/sha1.c:1616
>>  #2  0x00301188 in SHA1DCUpdate (ctx=0xffbf8260,
>>      buf=0x398ad5 "deadbeefdeadbeefdeadbeefdeadbeefdeadbeef\nauthor Au
>> Thor <au.thor@example.com> 123456789 +0000\ncommitter Au Thor
>> <au.thor@example.com> 123456789
>>                   +0000\n\nBlah Blah"..., len=220)
>>      at sha1dc/sha1.c:1731
>>  #3  0x0030168c in git_SHA1DCUpdate (ctx=0xffbf8260, vdata=0x398aa0,
>> len=273) at sha1dc/sha1.c:1808
>>  #4  0x002a6f7c in write_sha1_file_prepare (buf=0x398aa0, len=273,
>> type=0x959c8 "commit", sha1=0xffbfd630 "",
>>      hdr=0xffbf8c28 "commit 273", hdrlen=0xffbf8c24) at sha1_file.c:3207
>>  #5  0x002a71ac in hash_sha1_file (buf=0x398aa0, len=273, type=0x959c8
>> "commit", sha1=0xffbfd630 "") at sha1_file.c:3266
>>  #6  0x002a25f8 in check_sha1_signature (sha1=0xffbfdbb8
>> "\375\067\356\337\002", map=0x398aa0, size=273, type=0x959c8 "commit")
>>      at sha1_file.c:1644
>>  #7  0x0022816c in parse_object (sha1=0xffbfdbb8
>> "\375\067\356\337\002") at object.c:269
>>  #8  0x0027c258 in get_reference (revs=0xffbfdc88, name=0xa87f0 "HEAD",
>> sha1=0xffbfdbb8 "\375\067\356\337\002", flags=0)
>>      at revision.c:196
>>  #9  0x00284714 in setup_revisions (argc=0, argv=0x0, revs=0xffbfdc88,
>> opt=0xffbfdc74) at revision.c:2295
>>  #10 0x002ee4d8 in wt_status_collect_changes_index (s=0x348ea8
>> <s.24114>) at wt-status.c:585
>>  #11 0x002eeae8 in wt_status_collect (s=0x348ea8 <s.24114>) at
>> wt-status.c:701
>>  #12 0x000db4fc in cmd_status (argc=0, argv=0xffbfe6fc, prefix=0x0) at
>> builtin/commit.c:1393
>>  #13 0x000acc24 in run_builtin (p=0x340ccc <commands+1200>, argc=1,
>> argv=0xffbfe6fc) at git.c:371
>>  #14 0x000acf98 in handle_builtin (argc=1, argv=0xffbfe6fc) at git.c:572
>>  #15 0x000ad1a4 in run_argv (argcp=0xffbfe66c, argv=0xffbfe670) at
>> git.c:624
>>  #16 0x000ad3fc in cmd_main (argc=1, argv=0xffbfe6fc) at git.c:701
>>  #17 0x001747a0 in main (argc=5, argv=0xffbfe6ec) at common-main.c:43
>>
>> 1. Relevant part of uname -a: "SunOS <hostname> 5.11 11.3 sun4v sparc
>> sun4v".
>>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH] sha1dc: fix issues with a big endian platform
  2017-05-16  5:43         ` Michael Kebe
@ 2017-05-17  5:39           ` Junio C Hamano
  2017-05-17  6:30             ` Ævar Arnfjörð Bjarmason
  2017-05-17 15:26             ` [PATCH] sha1dc: fix issues with a big endian platform Johannes Schindelin
  0 siblings, 2 replies; 17+ messages in thread
From: Junio C Hamano @ 2017-05-17  5:39 UTC (permalink / raw)
  To: Marc Stevens
  Cc: Michael Kebe, Ævar Arnfjörð Bjarmason, Jeff King,
	Git Mailing List

From: Marc Stevens <marc@marc-stevens.nl>

Some big-endian platforms define _BIG_ENDIAN, which the test at the
beginning of file has missed.  Also, when the input is not aligned,
some platforms trigger SIGBUS.

This change corresponds to 33a694a9 ("Fix issues with a big endian
platform", 2017-05-15) in the history of the upstream repository
https://github.com/cr-marcstevens/sha1collisiondetection

---

 * So here is my attempt to clarify the log message (I left the
   title as-is, but this change deals both with endianness and
   alignment requirement).

   Please look it over, and then sign-off your patch ;-)

   Thanks.

   P.S. I wonder how often "buf" is not aligned---could we somehow
   optimize out memcpy when it is not necessary, or is it not worth
   it?

 sha1dc/sha1.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
index 35e9dd5bf4..ae25318c47 100644
--- a/sha1dc/sha1.c
+++ b/sha1dc/sha1.c
@@ -20,7 +20,7 @@
  */
 #if (defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || \
     (defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __BIG_ENDIAN__)) || \
-    defined(__BIG_ENDIAN__) || defined(__ARMEB__) || defined(__THUMBEB__) ||  defined(__AARCH64EB__) || \
+    defined(_BIG_ENDIAN) || defined(__BIG_ENDIAN__) || defined(__ARMEB__) || defined(__THUMBEB__) ||  defined(__AARCH64EB__) || \
     defined(_MIPSEB) || defined(__MIPSEB) || defined(__MIPSEB__)
 
 #define SHA1DC_BIGENDIAN	1
@@ -1728,7 +1728,8 @@ void SHA1DCUpdate(SHA1_CTX* ctx, const char* buf, size_t len)
 	while (len >= 64)
 	{
 		ctx->total += 64;
-		sha1_process(ctx, (uint32_t*)(buf));
+		memcpy(ctx->buffer, buf, 64);
+		sha1_process(ctx, (uint32_t*)(ctx->buffer));
 		buf += 64;
 		len -= 64;
 	}
-- 
2.13.0-416-g4c6b804423


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] sha1dc: fix issues with a big endian platform
  2017-05-17  5:39           ` [PATCH] sha1dc: fix issues with a big endian platform Junio C Hamano
@ 2017-05-17  6:30             ` Ævar Arnfjörð Bjarmason
  2017-05-17  7:09               ` Junio C Hamano
  2017-05-17 15:26             ` [PATCH] sha1dc: fix issues with a big endian platform Johannes Schindelin
  1 sibling, 1 reply; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-17  6:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Marc Stevens, Michael Kebe, Jeff King, Git Mailing List

On Wed, May 17, 2017 at 7:39 AM, Junio C Hamano <gitster@pobox.com> wrote:
> From: Marc Stevens <marc@marc-stevens.nl>
>
> Some big-endian platforms define _BIG_ENDIAN, which the test at the
> beginning of file has missed.  Also, when the input is not aligned,
> some platforms trigger SIGBUS.
>
> This change corresponds to 33a694a9 ("Fix issues with a big endian
> platform", 2017-05-15) in the history of the upstream repository
> https://github.com/cr-marcstevens/sha1collisiondetection

Then why not just update sha1dc from upstream instead of
cherry-picking one patch from them? My update-sha1dc change does this:
https://github.com/git/git/compare/master...avar:update-sha1dc

This follows the instructions Jeff detailed in 28dc98e343. I solved a
few merge conflicts, I think I got them right, but review of that
especially welcome.

Maybe you just meant this patch for maint, but if that's the case it's
not clear from your commit message or E-Mail.

> ---
>
>  * So here is my attempt to clarify the log message (I left the
>    title as-is, but this change deals both with endianness and
>    alignment requirement).
>
>    Please look it over, and then sign-off your patch ;-)
>
>    Thanks.
>
>    P.S. I wonder how often "buf" is not aligned---could we somehow
>    optimize out memcpy when it is not necessary, or is it not worth
>    it?
>
>  sha1dc/sha1.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
> index 35e9dd5bf4..ae25318c47 100644
> --- a/sha1dc/sha1.c
> +++ b/sha1dc/sha1.c
> @@ -20,7 +20,7 @@
>   */
>  #if (defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || \
>      (defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __BIG_ENDIAN__)) || \
> -    defined(__BIG_ENDIAN__) || defined(__ARMEB__) || defined(__THUMBEB__) ||  defined(__AARCH64EB__) || \
> +    defined(_BIG_ENDIAN) || defined(__BIG_ENDIAN__) || defined(__ARMEB__) || defined(__THUMBEB__) ||  defined(__AARCH64EB__) || \
>      defined(_MIPSEB) || defined(__MIPSEB) || defined(__MIPSEB__)
>
>  #define SHA1DC_BIGENDIAN       1
> @@ -1728,7 +1728,8 @@ void SHA1DCUpdate(SHA1_CTX* ctx, const char* buf, size_t len)
>         while (len >= 64)
>         {
>                 ctx->total += 64;
> -               sha1_process(ctx, (uint32_t*)(buf));
> +               memcpy(ctx->buffer, buf, 64);
> +               sha1_process(ctx, (uint32_t*)(ctx->buffer));
>                 buf += 64;
>                 len -= 64;
>         }
> --
> 2.13.0-416-g4c6b804423
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] sha1dc: fix issues with a big endian platform
  2017-05-17  6:30             ` Ævar Arnfjörð Bjarmason
@ 2017-05-17  7:09               ` Junio C Hamano
  2017-05-17 11:38                 ` [PATCH/RFC 0/3] Use sha1collisiondetection as a submodule Ævar Arnfjörð Bjarmason
                                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Junio C Hamano @ 2017-05-17  7:09 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Marc Stevens, Michael Kebe, Jeff King, Git Mailing List

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Wed, May 17, 2017 at 7:39 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> From: Marc Stevens <marc@marc-stevens.nl>
>>
>> Some big-endian platforms define _BIG_ENDIAN, which the test at the
>> beginning of file has missed.  Also, when the input is not aligned,
>> some platforms trigger SIGBUS.
>>
>> This change corresponds to 33a694a9 ("Fix issues with a big endian
>> platform", 2017-05-15) in the history of the upstream repository
>> https://github.com/cr-marcstevens/sha1collisiondetection
>
> Then why not just update sha1dc from upstream instead of
> cherry-picking one patch from them?

See the original message upthread.  I did the cherry-pick simply
because that was what Marc instructed the patch recipient to do ;-).


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH/RFC 0/3] Use sha1collisiondetection as a submodule
  2017-05-17  7:09               ` Junio C Hamano
@ 2017-05-17 11:38                 ` Ævar Arnfjörð Bjarmason
  2017-05-17 18:52                   ` Stefan Beller
  2017-05-17 11:38                 ` [PATCH/RFC 1/3] sha1dc: update from my fork of upstream Ævar Arnfjörð Bjarmason
  2017-05-17 11:38                 ` [PATCH/RFC 2/3] sha1dc: use sha1collisiondetection as a submodule Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-17 11:38 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Marc Stevens, Michael Kebe, Jeff King,
	Ævar Arnfjörð Bjarmason

On Wed, May 17, 2017 at 9:09 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Wed, May 17, 2017 at 7:39 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>> From: Marc Stevens <marc@marc-stevens.nl>
>>>
>>> Some big-endian platforms define _BIG_ENDIAN, which the test at the
>>> beginning of file has missed.  Also, when the input is not aligned,
>>> some platforms trigger SIGBUS.
>>>
>>> This change corresponds to 33a694a9 ("Fix issues with a big endian
>>> platform", 2017-05-15) in the history of the upstream repository
>>> https://github.com/cr-marcstevens/sha1collisiondetection
>>
>> Then why not just update sha1dc from upstream instead of
>> cherry-picking one patch from them?
>
> See the original message upthread.  I did the cherry-pick simply
> because that was what Marc instructed the patch recipient to do ;-).

Since that patch is now in Marc's upstream code we can just update to
that.

While we're at it let's see if Marc will take a patch so that we can
use his code as-is with some Makefile trickery of our own, instead of
having to solve merge conflicts each time we update from him. I'll
submit a pull request for that shortly.

And since if and when that pull request gets accepted we're at the
point of being able to use the upstream code as-is & don't need to
locally modify it, we can just use a submodule to track it.

Ævar Arnfjörð Bjarmason (3):
  sha1dc: update from my fork of upstream
  sha1dc: use sha1collisiondetection as a submodule
  sha1dc: remove our old copy of the sha1dc code

 .gitmodules            |    4 +
 Makefile               |   13 +-
 hash.h                 |    2 +-
 sha1collisiondetection |    1 +
 sha1dc/LICENSE.txt     |   30 -
 sha1dc/sha1.c          | 1809 ------------------------------------------------
 sha1dc/sha1.h          |  122 ----
 sha1dc/ubc_check.c     |  363 ----------
 sha1dc/ubc_check.h     |   44 --
 sha1dc_git.c           |   19 +
 sha1dc_git.h           |   14 +
 11 files changed, 49 insertions(+), 2372 deletions(-)
 create mode 100644 .gitmodules
 create mode 160000 sha1collisiondetection
 delete mode 100644 sha1dc/LICENSE.txt
 delete mode 100644 sha1dc/sha1.c
 delete mode 100644 sha1dc/sha1.h
 delete mode 100644 sha1dc/ubc_check.c
 delete mode 100644 sha1dc/ubc_check.h
 create mode 100644 sha1dc_git.c
 create mode 100644 sha1dc_git.h

-- 
2.13.0.303.g4ebf302169


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH/RFC 1/3] sha1dc: update from my fork of upstream
  2017-05-17  7:09               ` Junio C Hamano
  2017-05-17 11:38                 ` [PATCH/RFC 0/3] Use sha1collisiondetection as a submodule Ævar Arnfjörð Bjarmason
@ 2017-05-17 11:38                 ` Ævar Arnfjörð Bjarmason
  2017-05-17 11:38                 ` [PATCH/RFC 2/3] sha1dc: use sha1collisiondetection as a submodule Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-17 11:38 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Marc Stevens, Michael Kebe, Jeff King,
	Ævar Arnfjörð Bjarmason

Upgrade the sha1dc code from my fork of the project, which has one
patch I'm hoping will be integrated upstream after this series
demonstrates its usefulness:

    https://github.com/cr-marcstevens/sha1collisiondetection/compare/master...avar:easier-inclusion-in-other-programs

That minor change allows us to in the future just use the upstream
code as-is, without any local modifications.

Now all the local modifications we've done in this directory (see the
git history of sha1dc/) are done either with defines, or via the newly
added sha1dc_git.[ch] files.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile           |  9 +++++-
 sha1dc/sha1.c      | 79 ++++++++++++++++++++++++++---------------------
 sha1dc/sha1.h      | 90 +++++++++++++++++++++++-------------------------------
 sha1dc/ubc_check.c |  9 ++++--
 sha1dc/ubc_check.h | 10 ++++--
 sha1dc_git.c       | 19 ++++++++++++
 sha1dc_git.h       | 14 +++++++++
 7 files changed, 139 insertions(+), 91 deletions(-)
 create mode 100644 sha1dc_git.c
 create mode 100644 sha1dc_git.h

diff --git a/Makefile b/Makefile
index e35542e631..342466d83a 100644
--- a/Makefile
+++ b/Makefile
@@ -1414,7 +1414,14 @@ else
 	DC_SHA1 := YesPlease
 	LIB_OBJS += sha1dc/sha1.o
 	LIB_OBJS += sha1dc/ubc_check.o
-	BASIC_CFLAGS += -DSHA1_DC
+	BASIC_CFLAGS += \
+		-DSHA1_DC \
+		-DSHA1DC_NO_STANDARD_INCLUDES \
+		-DSHA1DC_INIT_SAFE_HASH_DEFAULT=0 \
+		-DSHA1DC_CUSTOM_INCLUDE_SHA1_C="\"cache.h\"" \
+		-DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_C="\"../sha1dc_git.c\"" \
+		-DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_H="\"../sha1dc_git.h\"" \
+		-DSHA1DC_CUSTOM_INCLUDE_UBC_CHECK_C="\"git-compat-util.h\""
 endif
 endif
 endif
diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
index 35e9dd5bf4..c53f68b844 100644
--- a/sha1dc/sha1.c
+++ b/sha1dc/sha1.c
@@ -5,12 +5,26 @@
 * https://opensource.org/licenses/MIT
 ***/
 
-#include "cache.h"
-#include "sha1dc/sha1.h"
-#include "sha1dc/ubc_check.h"
+#ifndef SHA1DC_NO_STANDARD_INCLUDES
+#include <string.h>
+#include <memory.h>
+#include <stdio.h>
+#include <stdlib.h>
+#endif
+
+#ifdef SHA1DC_CUSTOM_INCLUDE_SHA1_C
+#include SHA1DC_CUSTOM_INCLUDE_SHA1_C
+#endif
 
+#ifndef SHA1DC_INIT_SAFE_HASH_DEFAULT
+#define SHA1DC_INIT_SAFE_HASH_DEFAULT 1
+#endif
 
-/*
+#include "sha1.h"
+#include "ubc_check.h"
+
+
+/* 
    Because Little-Endian architectures are most common,
    we only set SHA1DC_BIGENDIAN if one of these conditions is met.
    Note that all MSFT platforms are little endian,
@@ -18,14 +32,17 @@
    If you are compiling on a big endian platform and your compiler does not define one of these,
    you will have to add whatever macros your tool chain defines to indicate Big-Endianness.
  */
-#if (defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || \
+#ifdef SHA1DC_BIGENDIAN
+#undef SHA1DC_BIGENDIAN
+#endif
+#if (!defined SHA1DC_FORCE_LITTLEENDIAN) && \
+    ((defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || \
     (defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __BIG_ENDIAN__)) || \
-    defined(__BIG_ENDIAN__) || defined(__ARMEB__) || defined(__THUMBEB__) ||  defined(__AARCH64EB__) || \
-    defined(_MIPSEB) || defined(__MIPSEB) || defined(__MIPSEB__)
+    defined(_BIG_ENDIAN) || defined(__BIG_ENDIAN__) || defined(__ARMEB__) || defined(__THUMBEB__) ||  defined(__AARCH64EB__) || \
+    defined(_MIPSEB) || defined(__MIPSEB) || defined(__MIPSEB__) || defined(SHA1DC_FORCE_BIGENDIAN))
+
+#define SHA1DC_BIGENDIAN
 
-#define SHA1DC_BIGENDIAN	1
-#else
-#undef SHA1DC_BIGENDIAN
 #endif /*ENDIANNESS SELECTION*/
 
 #define rotate_right(x,n) (((x)>>(n))|((x)<<(32-(n))))
@@ -36,11 +53,11 @@
 
 #define sha1_mix(W, t)  (rotate_left(W[t - 3] ^ W[t - 8] ^ W[t - 14] ^ W[t - 16], 1))
 
-#if defined(SHA1DC_BIGENDIAN)
+#ifdef SHA1DC_BIGENDIAN
 	#define sha1_load(m, t, temp)  { temp = m[t]; }
 #else
 	#define sha1_load(m, t, temp)  { temp = m[t]; sha1_bswap32(temp); }
-#endif /* !defined(SHA1DC_BIGENDIAN) */
+#endif
 
 #define sha1_store(W, t, x)	*(volatile uint32_t *)&W[t] = x
 
@@ -869,6 +886,11 @@ static void sha1recompress_fast_ ## t (uint32_t ihvin[5], uint32_t ihvout[5], co
 	ihvout[0] = ihvin[0] + a; ihvout[1] = ihvin[1] + b; ihvout[2] = ihvin[2] + c; ihvout[3] = ihvin[3] + d; ihvout[4] = ihvin[4] + e; \
 }
 
+#ifdef _MSC_VER
+#pragma warning(push)
+#pragma warning(disable: 4127)  /* Complier complains about the checks in the above macro being constant. */
+#endif
+
 #ifdef DOSTORESTATE0
 SHA1_RECOMPRESS(0)
 #endif
@@ -1189,6 +1211,10 @@ SHA1_RECOMPRESS(78)
 SHA1_RECOMPRESS(79)
 #endif
 
+#ifdef _MSC_VER
+#pragma warning(pop)
+#endif
+
 static void sha1_recompression_step(uint32_t step, uint32_t ihvin[5], uint32_t ihvout[5], const uint32_t me2[80], const uint32_t state[5])
 {
 	switch (step)
@@ -1606,7 +1632,7 @@ static void sha1_process(SHA1_CTX* ctx, const uint32_t block[16])
 	unsigned i, j;
 	uint32_t ubc_dv_mask[DVMASKSIZE] = { 0xFFFFFFFF };
 	uint32_t ihvtmp[5];
-
+	
 	ctx->ihv1[0] = ctx->ihv[0];
 	ctx->ihv1[1] = ctx->ihv[1];
 	ctx->ihv1[2] = ctx->ihv[2];
@@ -1662,7 +1688,7 @@ void SHA1DCInit(SHA1_CTX* ctx)
 	ctx->ihv[3] = 0x10325476;
 	ctx->ihv[4] = 0xC3D2E1F0;
 	ctx->found_collision = 0;
-	ctx->safe_hash = 0;
+	ctx->safe_hash = SHA1DC_INIT_SAFE_HASH_DEFAULT;
 	ctx->ubc_check = 1;
 	ctx->detect_coll = 1;
 	ctx->reduced_round_coll = 0;
@@ -1728,7 +1754,8 @@ void SHA1DCUpdate(SHA1_CTX* ctx, const char* buf, size_t len)
 	while (len >= 64)
 	{
 		ctx->total += 64;
-		sha1_process(ctx, (uint32_t*)(buf));
+		memcpy(ctx->buffer, buf, 64);
+		sha1_process(ctx, (uint32_t*)(ctx->buffer));
 		buf += 64;
 		len -= 64;
 	}
@@ -1788,22 +1815,6 @@ int SHA1DCFinal(unsigned char output[20], SHA1_CTX *ctx)
 	return ctx->found_collision;
 }
 
-void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx)
-{
-	if (!SHA1DCFinal(hash, ctx))
-		return;
-	die("SHA-1 appears to be part of a collision attack: %s",
-	    sha1_to_hex(hash));
-}
-
-void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, unsigned long len)
-{
-	const char *data = vdata;
-	/* We expect an unsigned long, but sha1dc only takes an int */
-	while (len > INT_MAX) {
-		SHA1DCUpdate(ctx, data, INT_MAX);
-		data += INT_MAX;
-		len -= INT_MAX;
-	}
-	SHA1DCUpdate(ctx, data, len);
-}
+#ifdef SHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_C
+#include SHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_C
+#endif
diff --git a/sha1dc/sha1.h b/sha1dc/sha1.h
index bd8bd928fb..285161ee35 100644
--- a/sha1dc/sha1.h
+++ b/sha1dc/sha1.h
@@ -4,6 +4,7 @@
 * See accompanying file LICENSE.txt or copy at
 * https://opensource.org/licenses/MIT
 ***/
+
 #ifndef SHA1DC_SHA1_H
 #define SHA1DC_SHA1_H
 
@@ -11,36 +12,30 @@
 extern "C" {
 #endif
 
-/* uses SHA-1 message expansion to expand the first 16 words of W[] to 80 words */
-/* void sha1_message_expansion(uint32_t W[80]); */
-
-/* sha-1 compression function; first version takes a message block pre-parsed as 16 32-bit integers, second version takes an already expanded message) */
-/* void sha1_compression(uint32_t ihv[5], const uint32_t m[16]);
-void sha1_compression_W(uint32_t ihv[5], const uint32_t W[80]); */
+#ifndef SHA1DC_NO_STANDARD_INCLUDES
+#include <stdint.h>
+#endif
 
-/* same as sha1_compression_W, but additionally store intermediate states */
+/* sha-1 compression function that takes an already expanded message, and additionally store intermediate states */
 /* only stores states ii (the state between step ii-1 and step ii) when DOSTORESTATEii is defined in ubc_check.h */
 void sha1_compression_states(uint32_t[5], const uint32_t[16], uint32_t[80], uint32_t[80][5]);
 
 /*
-// function type for sha1_recompression_step_T (uint32_t ihvin[5], uint32_t ihvout[5], const uint32_t me2[80], const uint32_t state[5])
-// where 0 <= T < 80
-//       me2 is an expanded message (the expansion of an original message block XOR'ed with a disturbance vector's message block difference)
-//       state is the internal state (a,b,c,d,e) before step T of the SHA-1 compression function while processing the original message block
-// the function will return:
-//       ihvin: the reconstructed input chaining value
-//       ihvout: the reconstructed output chaining value
+// Function type for sha1_recompression_step_T (uint32_t ihvin[5], uint32_t ihvout[5], const uint32_t me2[80], const uint32_t state[5]).
+// Where 0 <= T < 80
+//       me2 is an expanded message (the expansion of an original message block XOR'ed with a disturbance vector's message block difference.)
+//       state is the internal state (a,b,c,d,e) before step T of the SHA-1 compression function while processing the original message block.
+// The function will return:
+//       ihvin: The reconstructed input chaining value.
+//       ihvout: The reconstructed output chaining value.
 */
 typedef void(*sha1_recompression_type)(uint32_t*, uint32_t*, const uint32_t*, const uint32_t*);
 
-/* table of sha1_recompression_step_0, ... , sha1_recompression_step_79 */
-/* extern sha1_recompression_type sha1_recompression_step[80];*/
-
-/* a callback function type that can be set to be called when a collision block has been found: */
+/* A callback function type that can be set to be called when a collision block has been found: */
 /* void collision_block_callback(uint64_t byteoffset, const uint32_t ihvin1[5], const uint32_t ihvin2[5], const uint32_t m1[80], const uint32_t m2[80]) */
 typedef void(*collision_block_callback)(uint64_t, const uint32_t*, const uint32_t*, const uint32_t*, const uint32_t*);
 
-/* the SHA-1 context */
+/* The SHA-1 context. */
 typedef struct {
 	uint64_t total;
 	uint32_t ihv[5];
@@ -59,30 +54,34 @@ typedef struct {
 	uint32_t states[80][5];
 } SHA1_CTX;
 
-/* initialize SHA-1 context */
+/* Initialize SHA-1 context. */
 void SHA1DCInit(SHA1_CTX*);
 
 /*
-// function to enable safe SHA-1 hashing:
-// collision attacks are thwarted by hashing a detected near-collision block 3 times
-// think of it as extending SHA-1 from 80-steps to 240-steps for such blocks:
-//   the best collision attacks against SHA-1 have complexity about 2^60,
-//   thus for 240-steps an immediate lower-bound for the best cryptanalytic attacks would 2^180
-//   an attacker would be better off using a generic birthday search of complexity 2^80
-//
-// enabling safe SHA-1 hashing will result in the correct SHA-1 hash for messages where no collision attack was detected
-// but it will result in a different SHA-1 hash for messages where a collision attack was detected
-// this will automatically invalidate SHA-1 based digital signature forgeries
-// enabled by default
+    Function to enable safe SHA-1 hashing:
+    Collision attacks are thwarted by hashing a detected near-collision block 3 times.
+    Think of it as extending SHA-1 from 80-steps to 240-steps for such blocks:
+        The best collision attacks against SHA-1 have complexity about 2^60,
+        thus for 240-steps an immediate lower-bound for the best cryptanalytic attacks would be 2^180.
+        An attacker would be better off using a generic birthday search of complexity 2^80.
+  
+   Enabling safe SHA-1 hashing will result in the correct SHA-1 hash for messages where no collision attack was detected,
+   but it will result in a different SHA-1 hash for messages where a collision attack was detected.
+   This will automatically invalidate SHA-1 based digital signature forgeries.
+   Enabled by default.
 */
 void SHA1DCSetSafeHash(SHA1_CTX*, int);
 
-/* function to disable or enable the use of Unavoidable Bitconditions (provides a significant speed up) */
-/* enabled by default */
+/*
+    Function to disable or enable the use of Unavoidable Bitconditions (provides a significant speed up).
+    Enabled by default
+ */
 void SHA1DCSetUseUBC(SHA1_CTX*, int);
 
-/* function to disable or enable the use of Collision Detection */
-/* enabled by default */
+/*
+    Function to disable or enable the use of Collision Detection.
+    Enabled by default.
+ */
 void SHA1DCSetUseDetectColl(SHA1_CTX*, int);
 
 /* function to disable or enable the detection of reduced-round SHA-1 collisions */
@@ -98,25 +97,14 @@ void SHA1DCUpdate(SHA1_CTX*, const char*, size_t);
 
 /* obtain SHA-1 hash from SHA-1 context */
 /* returns: 0 = no collision detected, otherwise = collision found => warn user for active attack */
-int  SHA1DCFinal(unsigned char[20], SHA1_CTX*);
+int  SHA1DCFinal(unsigned char[20], SHA1_CTX*); 
 
-/*
- * Same as SHA1DCFinal, but convert collision attack case into a verbose die().
- */
-void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *);
-
-/*
- * Same as SHA1DCUpdate, but adjust types to match git's usual interface.
- */
-void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len);
-
-#define platform_SHA_CTX SHA1_CTX
-#define platform_SHA1_Init SHA1DCInit
-#define platform_SHA1_Update git_SHA1DCUpdate
-#define platform_SHA1_Final git_SHA1DCFinal
+#ifdef SHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_H
+#include SHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_H
+#endif
 
 #if defined(__cplusplus)
 }
 #endif
 
-#endif /* SHA1DC_SHA1_H */
+#endif
diff --git a/sha1dc/ubc_check.c b/sha1dc/ubc_check.c
index 089dd4743d..0614926648 100644
--- a/sha1dc/ubc_check.c
+++ b/sha1dc/ubc_check.c
@@ -24,8 +24,13 @@
 // ubc_check has been verified against ubc_check_verify using the 'ubc_check_test' program in the tools section
 */
 
-#include "git-compat-util.h"
-#include "sha1dc/ubc_check.h"
+#ifndef SHA1DC_NO_STANDARD_INCLUDES
+#include <stdint.h>
+#endif
+#ifdef SHA1DC_CUSTOM_INCLUDE_UBC_CHECK_C
+#include SHA1DC_CUSTOM_INCLUDE_UBC_CHECK_C
+#endif
+#include "ubc_check.h"
 
 static const uint32_t DV_I_43_0_bit 	= (uint32_t)(1) << 0;
 static const uint32_t DV_I_44_0_bit 	= (uint32_t)(1) << 1;
diff --git a/sha1dc/ubc_check.h b/sha1dc/ubc_check.h
index b64c306d77..f6bb62547a 100644
--- a/sha1dc/ubc_check.h
+++ b/sha1dc/ubc_check.h
@@ -20,13 +20,17 @@
 // thus one needs to do the recompression check for each DV that has its bit set
 */
 
-#ifndef UBC_CHECK_H
-#define UBC_CHECK_H
+#ifndef SHA1DC_UBC_CHECK_H
+#define SHA1DC_UBC_CHECK_H
 
 #if defined(__cplusplus)
 extern "C" {
 #endif
 
+#ifndef SHA1DC_NO_STANDARD_INCLUDES
+#include <stdint.h>
+#endif
+
 #define DVMASKSIZE 1
 typedef struct { int dvType; int dvK; int dvB; int testt; int maski; int maskb; uint32_t dm[80]; } dv_info_t;
 extern dv_info_t sha1_dvs[];
@@ -41,4 +45,4 @@ void ubc_check(const uint32_t W[80], uint32_t dvmask[DVMASKSIZE]);
 }
 #endif
 
-#endif /* UBC_CHECK_H */
+#endif
diff --git a/sha1dc_git.c b/sha1dc_git.c
new file mode 100644
index 0000000000..e59c1d8103
--- /dev/null
+++ b/sha1dc_git.c
@@ -0,0 +1,19 @@
+void git_SHA1DCFinal(unsigned char hash[20], SHA1_CTX *ctx)
+{
+	if (!SHA1DCFinal(hash, ctx))
+		return;
+	die("SHA-1 appears to be part of a collision attack: %s",
+	    sha1_to_hex(hash));
+}
+
+void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *vdata, unsigned long len)
+{
+	const char *data = vdata;
+	/* We expect an unsigned long, but sha1dc only takes an int */
+	while (len > INT_MAX) {
+		SHA1DCUpdate(ctx, data, INT_MAX);
+		data += INT_MAX;
+		len -= INT_MAX;
+	}
+	SHA1DCUpdate(ctx, data, len);
+}
diff --git a/sha1dc_git.h b/sha1dc_git.h
new file mode 100644
index 0000000000..801b5a1fcb
--- /dev/null
+++ b/sha1dc_git.h
@@ -0,0 +1,14 @@
+/*
+ * Same as SHA1DCFinal, but convert collision attack case into a verbose die().
+ */
+void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *);
+
+/*
+ * Same as SHA1DCUpdate, but adjust types to match git's usual interface.
+ */
+void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len);
+
+#define platform_SHA_CTX SHA1_CTX
+#define platform_SHA1_Init SHA1DCInit
+#define platform_SHA1_Update git_SHA1DCUpdate
+#define platform_SHA1_Final git_SHA1DCFinal
-- 
2.13.0.303.g4ebf302169


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH/RFC 2/3] sha1dc: use sha1collisiondetection as a submodule
  2017-05-17  7:09               ` Junio C Hamano
  2017-05-17 11:38                 ` [PATCH/RFC 0/3] Use sha1collisiondetection as a submodule Ævar Arnfjörð Bjarmason
  2017-05-17 11:38                 ` [PATCH/RFC 1/3] sha1dc: update from my fork of upstream Ævar Arnfjörð Bjarmason
@ 2017-05-17 11:38                 ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-17 11:38 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Marc Stevens, Michael Kebe, Jeff King,
	Ævar Arnfjörð Bjarmason

Now that we don't need local modifications to the code anymore with my
avar/sha1collisiondetection:easier-inclusion-in-other-programs branch
we can use the upstream code-as is.

If and when this change gets accepted by upstream we should of course
point to the https://github.com/cr-marcstevens/sha1collisiondetection
master branch instead.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 .gitmodules            | 4 ++++
 Makefile               | 8 ++++----
 hash.h                 | 2 +-
 sha1collisiondetection | 1 +
 4 files changed, 10 insertions(+), 5 deletions(-)
 create mode 100644 .gitmodules
 create mode 160000 sha1collisiondetection

diff --git a/.gitmodules b/.gitmodules
new file mode 100644
index 0000000000..b60238a087
--- /dev/null
+++ b/.gitmodules
@@ -0,0 +1,4 @@
+[submodule "sha1collisiondetection"]
+	path = sha1collisiondetection
+	url = https://github.com/avar/sha1collisiondetection.git
+	branch = easier-inclusion-in-other-programs
diff --git a/Makefile b/Makefile
index 342466d83a..63223fd02f 100644
--- a/Makefile
+++ b/Makefile
@@ -1412,15 +1412,15 @@ ifdef APPLE_COMMON_CRYPTO
 	BASIC_CFLAGS += -DSHA1_APPLE
 else
 	DC_SHA1 := YesPlease
-	LIB_OBJS += sha1dc/sha1.o
-	LIB_OBJS += sha1dc/ubc_check.o
+	LIB_OBJS += sha1dc/lib/sha1.o
+	LIB_OBJS += sha1dc/lib/ubc_check.o
 	BASIC_CFLAGS += \
 		-DSHA1_DC \
 		-DSHA1DC_NO_STANDARD_INCLUDES \
 		-DSHA1DC_INIT_SAFE_HASH_DEFAULT=0 \
 		-DSHA1DC_CUSTOM_INCLUDE_SHA1_C="\"cache.h\"" \
-		-DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_C="\"../sha1dc_git.c\"" \
-		-DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_H="\"../sha1dc_git.h\"" \
+		-DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_C="\"../../sha1dc_git.c\"" \
+		-DSHA1DC_CUSTOM_TRAILING_INCLUDE_SHA1_H="\"../../sha1dc_git.h\"" \
 		-DSHA1DC_CUSTOM_INCLUDE_UBC_CHECK_C="\"git-compat-util.h\""
 endif
 endif
diff --git a/hash.h b/hash.h
index a11fc9233f..e585237136 100644
--- a/hash.h
+++ b/hash.h
@@ -8,7 +8,7 @@
 #elif defined(SHA1_OPENSSL)
 #include <openssl/sha.h>
 #elif defined(SHA1_DC)
-#include "sha1dc/sha1.h"
+#include "sha1dc/lib/sha1.h"
 #else /* SHA1_BLK */
 #include "block-sha1/sha1.h"
 #endif
diff --git a/sha1collisiondetection b/sha1collisiondetection
new file mode 160000
index 0000000000..5423dd8bcf
--- /dev/null
+++ b/sha1collisiondetection
@@ -0,0 +1 @@
+Subproject commit 5423dd8bcf3cdbd029dfa08d5d5f6dc044f7ac6d
-- 
2.13.0.303.g4ebf302169


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] sha1dc: fix issues with a big endian platform
  2017-05-17  5:39           ` [PATCH] sha1dc: fix issues with a big endian platform Junio C Hamano
  2017-05-17  6:30             ` Ævar Arnfjörð Bjarmason
@ 2017-05-17 15:26             ` Johannes Schindelin
  1 sibling, 0 replies; 17+ messages in thread
From: Johannes Schindelin @ 2017-05-17 15:26 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Marc Stevens, Michael Kebe,
	Ævar Arnfjörð Bjarmason, Jeff King,
	Git Mailing List

Hi Junio,

On Wed, 17 May 2017, Junio C Hamano wrote:

> diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
> index 35e9dd5bf4..ae25318c47 100644
> --- a/sha1dc/sha1.c
> +++ b/sha1dc/sha1.c
> @@ -20,7 +20,7 @@
>   */
>  #if (defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || \
>      (defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __BIG_ENDIAN__)) || \
> -    defined(__BIG_ENDIAN__) || defined(__ARMEB__) || defined(__THUMBEB__) ||  defined(__AARCH64EB__) || \
> +    defined(_BIG_ENDIAN) || defined(__BIG_ENDIAN__) || defined(__ARMEB__) || defined(__THUMBEB__) ||  defined(__AARCH64EB__) || \
>      defined(_MIPSEB) || defined(__MIPSEB) || defined(__MIPSEB__)
>  
>  #define SHA1DC_BIGENDIAN	1
> @@ -1728,7 +1728,8 @@ void SHA1DCUpdate(SHA1_CTX* ctx, const char* buf, size_t len)
>  	while (len >= 64)
>  	{
>  		ctx->total += 64;
> -		sha1_process(ctx, (uint32_t*)(buf));
> +		memcpy(ctx->buffer, buf, 64);
> +		sha1_process(ctx, (uint32_t*)(ctx->buffer));
>  		buf += 64;

There is actually a discussion going on about this. See

https://github.com/cr-marcstevens/sha1collisiondetection/commit/33a694a9ee1b79c24be45f9eab5ac0e1aeeaf271#commitcomment-22158243

for details.

The fixup commit under discussion is here:

https://github.com/cr-marcstevens/sha1collisiondetection/commit/40f07a0c12d525b24ac1235ee8a81bbf33957ca5

Ciao,
dscho

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH/RFC 0/3] Use sha1collisiondetection as a submodule
  2017-05-17 11:38                 ` [PATCH/RFC 0/3] Use sha1collisiondetection as a submodule Ævar Arnfjörð Bjarmason
@ 2017-05-17 18:52                   ` Stefan Beller
  2017-05-17 19:45                     ` Ævar Arnfjörð Bjarmason
  2017-05-18  0:11                     ` Brandon Williams
  0 siblings, 2 replies; 17+ messages in thread
From: Stefan Beller @ 2017-05-17 18:52 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git@vger.kernel.org, Junio C Hamano, Marc Stevens, Michael Kebe,
	Jeff King

On Wed, May 17, 2017 at 4:38 AM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Wed, May 17, 2017 at 9:09 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>
>>> On Wed, May 17, 2017 at 7:39 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>>> From: Marc Stevens <marc@marc-stevens.nl>
>>>>
>>>> Some big-endian platforms define _BIG_ENDIAN, which the test at the
>>>> beginning of file has missed.  Also, when the input is not aligned,
>>>> some platforms trigger SIGBUS.
>>>>
>>>> This change corresponds to 33a694a9 ("Fix issues with a big endian
>>>> platform", 2017-05-15) in the history of the upstream repository
>>>> https://github.com/cr-marcstevens/sha1collisiondetection
>>>
>>> Then why not just update sha1dc from upstream instead of
>>> cherry-picking one patch from them?
>>
>> See the original message upthread.  I did the cherry-pick simply
>> because that was what Marc instructed the patch recipient to do ;-).
>
> Since that patch is now in Marc's upstream code we can just update to
> that.
>
> While we're at it let's see if Marc will take a patch so that we can
> use his code as-is with some Makefile trickery of our own, instead of
> having to solve merge conflicts each time we update from him. I'll
> submit a pull request for that shortly.
>
> And since if and when that pull request gets accepted we're at the
> point of being able to use the upstream code as-is & don't need to
> locally modify it, we can just use a submodule to track it.

As someone who works on submodules: YAY! This sounds
wonderful in the sense that more git developers experience the
sharp edges (if any) of submodules and would fix them.

With the reset work on submodules (checkout, reset,
ls-files, grep) and more in the works (better push/pull)
we may be close to prime time.

However we do distribute tarballs (well, Junio does),
which is not yet supported to include submodules.

I did follow the SHA1DC discussion just from the sideline,
how crucial is that library for us?

Also now that we discuss having submodules:
Would we just point the submodule URL to github\
and call it a day?

We could make a friendly fork of it and put it next to all the repositories
of https://git-blame.blogspot.com/p/git-public-repositories.html
and then use relative URLs in the .gitmodules file.

On a tangent, in an off-list discussion we discussed having
the git-annex tests as an optional submodule instead of an
"external" test, but in conclusion we considered that idea not
ideal to implement (because our tests are like contracts, we should
actually write our own tests and not rely on downstream to write
tests for us)

looking forward for a discussion here,
Stefan

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH/RFC 0/3] Use sha1collisiondetection as a submodule
  2017-05-17 18:52                   ` Stefan Beller
@ 2017-05-17 19:45                     ` Ævar Arnfjörð Bjarmason
  2017-05-17 19:57                       ` Stefan Beller
  2017-05-18  0:11                     ` Brandon Williams
  1 sibling, 1 reply; 17+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2017-05-17 19:45 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git@vger.kernel.org, Junio C Hamano, Marc Stevens, Michael Kebe,
	Jeff King

On Wed, May 17, 2017 at 8:52 PM, Stefan Beller <sbeller@google.com> wrote:
> On Wed, May 17, 2017 at 4:38 AM, Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>> On Wed, May 17, 2017 at 9:09 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>>
>>>> On Wed, May 17, 2017 at 7:39 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>>>> From: Marc Stevens <marc@marc-stevens.nl>
>>>>>
>>>>> Some big-endian platforms define _BIG_ENDIAN, which the test at the
>>>>> beginning of file has missed.  Also, when the input is not aligned,
>>>>> some platforms trigger SIGBUS.
>>>>>
>>>>> This change corresponds to 33a694a9 ("Fix issues with a big endian
>>>>> platform", 2017-05-15) in the history of the upstream repository
>>>>> https://github.com/cr-marcstevens/sha1collisiondetection
>>>>
>>>> Then why not just update sha1dc from upstream instead of
>>>> cherry-picking one patch from them?
>>>
>>> See the original message upthread.  I did the cherry-pick simply
>>> because that was what Marc instructed the patch recipient to do ;-).
>>
>> Since that patch is now in Marc's upstream code we can just update to
>> that.
>>
>> While we're at it let's see if Marc will take a patch so that we can
>> use his code as-is with some Makefile trickery of our own, instead of
>> having to solve merge conflicts each time we update from him. I'll
>> submit a pull request for that shortly.
>>
>> And since if and when that pull request gets accepted we're at the
>> point of being able to use the upstream code as-is & don't need to
>> locally modify it, we can just use a submodule to track it.
>
> As someone who works on submodules: YAY! This sounds
> wonderful in the sense that more git developers experience the
> sharp edges (if any) of submodules and would fix them.

Yeah I agree git.git should dogfood submodules, and this seemed like a
perfect opportunity for thaht.

As noted later in your mail everything may not be ready, so when I
submit a non-RFC version of this (after upstream pulls my changes,
hopefully), the 2nd and 3rd patches can just be dropped.

> With the reset work on submodules (checkout, reset,
> ls-files, grep) and more in the works (better push/pull)
> we may be close to prime time.
>
> However we do distribute tarballs (well, Junio does),
> which is not yet supported to include submodules.

Ouch. So there's no non-manual way with git-archive to make a tarball
release of a git project using submodules?

> I did follow the SHA1DC discussion just from the sideline,
> how crucial is that library for us?

Since 2.13.0 it's git's default sha1 implementation.

> Also now that we discuss having submodules:
> Would we just point the submodule URL to github\
> and call it a day?
>
> We could make a friendly fork of it and put it next to all the repositories
> of https://git-blame.blogspot.com/p/git-public-repositories.html
> and then use relative URLs in the .gitmodules file.

Wouldn't we lose the ability to just "pull" the module to see if
upstream changed, i.e. now we'd need to manually munge the URL first.

If so having a fetchurl be a relative URL similar to git-push's
pushurl would solve it wouldn't it? I.e. a project like git could say
"here's my mirrors" different from "here's my upstream".

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH/RFC 0/3] Use sha1collisiondetection as a submodule
  2017-05-17 19:45                     ` Ævar Arnfjörð Bjarmason
@ 2017-05-17 19:57                       ` Stefan Beller
  0 siblings, 0 replies; 17+ messages in thread
From: Stefan Beller @ 2017-05-17 19:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git@vger.kernel.org, Junio C Hamano, Marc Stevens, Michael Kebe,
	Jeff King

On Wed, May 17, 2017 at 12:45 PM, Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Wed, May 17, 2017 at 8:52 PM, Stefan Beller <sbeller@google.com> wrote:
>> On Wed, May 17, 2017 at 4:38 AM, Ævar Arnfjörð Bjarmason
>> <avarab@gmail.com> wrote:
>>> On Wed, May 17, 2017 at 9:09 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>>>
>>>>> On Wed, May 17, 2017 at 7:39 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>>>>> From: Marc Stevens <marc@marc-stevens.nl>
>>>>>>
>>>>>> Some big-endian platforms define _BIG_ENDIAN, which the test at the
>>>>>> beginning of file has missed.  Also, when the input is not aligned,
>>>>>> some platforms trigger SIGBUS.
>>>>>>
>>>>>> This change corresponds to 33a694a9 ("Fix issues with a big endian
>>>>>> platform", 2017-05-15) in the history of the upstream repository
>>>>>> https://github.com/cr-marcstevens/sha1collisiondetection
>>>>>
>>>>> Then why not just update sha1dc from upstream instead of
>>>>> cherry-picking one patch from them?
>>>>
>>>> See the original message upthread.  I did the cherry-pick simply
>>>> because that was what Marc instructed the patch recipient to do ;-).
>>>
>>> Since that patch is now in Marc's upstream code we can just update to
>>> that.
>>>
>>> While we're at it let's see if Marc will take a patch so that we can
>>> use his code as-is with some Makefile trickery of our own, instead of
>>> having to solve merge conflicts each time we update from him. I'll
>>> submit a pull request for that shortly.
>>>
>>> And since if and when that pull request gets accepted we're at the
>>> point of being able to use the upstream code as-is & don't need to
>>> locally modify it, we can just use a submodule to track it.
>>
>> As someone who works on submodules: YAY! This sounds
>> wonderful in the sense that more git developers experience the
>> sharp edges (if any) of submodules and would fix them.
>
> Yeah I agree git.git should dogfood submodules, and this seemed like a
> perfect opportunity for thaht.
>
> As noted later in your mail everything may not be ready, so when I
> submit a non-RFC version of this (after upstream pulls my changes,
> hopefully), the 2nd and 3rd patches can just be dropped.
>
>> With the reset work on submodules (checkout, reset,
>> ls-files, grep) and more in the works (better push/pull)
>> we may be close to prime time.
>>
>> However we do distribute tarballs (well, Junio does),
>> which is not yet supported to include submodules.
>
> Ouch. So there's no non-manual way with git-archive to make a tarball
> release of a git project using submodules?

Yes. Ouch.

>> We could make a friendly fork of it and put it next to all the repositories
>> of https://git-blame.blogspot.com/p/git-public-repositories.html
>> and then use relative URLs in the .gitmodules file.
>
> Wouldn't we lose the ability to just "pull" the module to see if
> upstream changed, i.e. now we'd need to manually munge the URL first.

I assumed more people would fetch git and trust the community rather
than people checking if SHA1DC is up-to-date, so I thought it is easier for
the minority to rewrite a URL.

> If so having a fetchurl be a relative URL similar to git-push's
> pushurl would solve it wouldn't it? I.e. a project like git could say
> "here's my mirrors" different from "here's my upstream".

I would think so.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH/RFC 0/3] Use sha1collisiondetection as a submodule
  2017-05-17 18:52                   ` Stefan Beller
  2017-05-17 19:45                     ` Ævar Arnfjörð Bjarmason
@ 2017-05-18  0:11                     ` Brandon Williams
  1 sibling, 0 replies; 17+ messages in thread
From: Brandon Williams @ 2017-05-18  0:11 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Ævar Arnfjörð Bjarmason, git@vger.kernel.org,
	Junio C Hamano, Marc Stevens, Michael Kebe, Jeff King

On 05/17, Stefan Beller wrote:
> On Wed, May 17, 2017 at 4:38 AM, Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
> > On Wed, May 17, 2017 at 9:09 AM, Junio C Hamano <gitster@pobox.com> wrote:
> >> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> >>
> >>> On Wed, May 17, 2017 at 7:39 AM, Junio C Hamano <gitster@pobox.com> wrote:
> >>>> From: Marc Stevens <marc@marc-stevens.nl>
> >>>>
> >>>> Some big-endian platforms define _BIG_ENDIAN, which the test at the
> >>>> beginning of file has missed.  Also, when the input is not aligned,
> >>>> some platforms trigger SIGBUS.
> >>>>
> >>>> This change corresponds to 33a694a9 ("Fix issues with a big endian
> >>>> platform", 2017-05-15) in the history of the upstream repository
> >>>> https://github.com/cr-marcstevens/sha1collisiondetection
> >>>
> >>> Then why not just update sha1dc from upstream instead of
> >>> cherry-picking one patch from them?
> >>
> >> See the original message upthread.  I did the cherry-pick simply
> >> because that was what Marc instructed the patch recipient to do ;-).
> >
> > Since that patch is now in Marc's upstream code we can just update to
> > that.
> >
> > While we're at it let's see if Marc will take a patch so that we can
> > use his code as-is with some Makefile trickery of our own, instead of
> > having to solve merge conflicts each time we update from him. I'll
> > submit a pull request for that shortly.
> >
> > And since if and when that pull request gets accepted we're at the
> > point of being able to use the upstream code as-is & don't need to
> > locally modify it, we can just use a submodule to track it.
> 
> As someone who works on submodules: YAY! This sounds
> wonderful in the sense that more git developers experience the
> sharp edges (if any) of submodules and would fix them.

Haha, I wish I had your enthusiasm for this.  That is a good argument
for us to include them in git.git, but I still feel apprehensive.  There
is still so much to do in submodule land!  We'll make them great though!

> 
> With the reset work on submodules (checkout, reset,
> ls-files, grep) and more in the works (better push/pull)
> we may be close to prime time.
> 
> However we do distribute tarballs (well, Junio does),
> which is not yet supported to include submodules.
> 
> I did follow the SHA1DC discussion just from the sideline,
> how crucial is that library for us?
> 
> Also now that we discuss having submodules:
> Would we just point the submodule URL to github\
> and call it a day?
> 
> We could make a friendly fork of it and put it next to all the repositories
> of https://git-blame.blogspot.com/p/git-public-repositories.html
> and then use relative URLs in the .gitmodules file.
> 
> On a tangent, in an off-list discussion we discussed having
> the git-annex tests as an optional submodule instead of an
> "external" test, but in conclusion we considered that idea not
> ideal to implement (because our tests are like contracts, we should
> actually write our own tests and not rely on downstream to write
> tests for us)
> 
> looking forward for a discussion here,
> Stefan

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease being on by default
  2017-05-15 22:09     ` Jeff King
@ 2017-06-01 14:03       ` demerphq
  0 siblings, 0 replies; 17+ messages in thread
From: demerphq @ 2017-06-01 14:03 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, Marc Stevens,
	Git Mailing List, Michael Kebe

On 16 May 2017 at 00:09, Jeff King <peff@peff.net> wrote:
> On Mon, May 15, 2017 at 04:13:58PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> On Mon, May 15, 2017 at 3:58 PM, Marc Stevens <marc@marc-stevens.nl> wrote:
>> > Hi Aevar,
>> >
>> > Thank you for notifying us of this issue.
>> > Big endianness is a tricky issue, also since I don't have access or accurate knowledge about all big endian systems.
>> > Our github repo does check correct functioning, including an endianness mistake, with 'make test'.
>> > But I guess this is not included for SHA1DC in Git.
>> >
>> > Anyway, we can easily add the _BIG_ENDIAN macrotest to the git repo and will do so soon.
>> >
>> > I don't think the segfault is caused by buffer overflow, inproper access, or the endianness issue.
>> > But I did notice an unexpected issue: the message block pointer m=0x398ad5 is odd.
>> > Can you confirm whether loading an uint32_t from an odd address triggers a hardware interrupt on your platform?
>> > This is not problem for x86, but maybe for your platform it is?
>> > If it is then we should always copy buffer contents to the sha1context to avoid this issue.
>>
>> I don't have access to the box in question, Michael was testing this
>> code for me. But unaligned access is probably the cause, although
>> according to some info I found online that should give a SIGBUS not a
>> SIGSEGV, but that may have changed:
>
> Yeah, I would have expected SIGBUS there. If we have alignment issues,
> though, I'd expect that ARM systems will experience problems.
>
> Block-sha1 uses a macro which allows unaligned loads on platforms that
> support it, and otherwise does the endian conversion on the fly as we
> load the bytes into a local variable (which presumably happens all
> in-register). That may be faster than doing a mass copy of the buffer.

I agree. It is fairly normal to use that kind of macro and not do a
memcpy with hash functions.

In fact many hash functions ONLY use that kind of macro, as decent
compilers will automagically convert the macro into an unaligned load
on platforms that support fast unaligned loads.

The only reason to expose the direct unaligned load is to make sure
that the hashing code is fast on such platforms even when compiled
under -g.

Yves


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2017-06-01 14:03 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-05-15 12:49 Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease being on by default Ævar Arnfjörð Bjarmason
2017-05-15 13:58 ` Marc Stevens
2017-05-15 14:13   ` Ævar Arnfjörð Bjarmason
2017-05-15 22:09     ` Jeff King
2017-06-01 14:03       ` demerphq
     [not found]     ` <CAKKM46vwM9pxyMxTc4jA0z_8vGKdDGCGg9ziKkFAsqr5ULYJxA@mail.gmail.com>
     [not found]       ` <007001d2cd88$2b916180$82b42480$@marc-stevens.nl>
2017-05-16  5:43         ` Michael Kebe
2017-05-17  5:39           ` [PATCH] sha1dc: fix issues with a big endian platform Junio C Hamano
2017-05-17  6:30             ` Ævar Arnfjörð Bjarmason
2017-05-17  7:09               ` Junio C Hamano
2017-05-17 11:38                 ` [PATCH/RFC 0/3] Use sha1collisiondetection as a submodule Ævar Arnfjörð Bjarmason
2017-05-17 18:52                   ` Stefan Beller
2017-05-17 19:45                     ` Ævar Arnfjörð Bjarmason
2017-05-17 19:57                       ` Stefan Beller
2017-05-18  0:11                     ` Brandon Williams
2017-05-17 11:38                 ` [PATCH/RFC 1/3] sha1dc: update from my fork of upstream Ævar Arnfjörð Bjarmason
2017-05-17 11:38                 ` [PATCH/RFC 2/3] sha1dc: use sha1collisiondetection as a submodule Ævar Arnfjörð Bjarmason
2017-05-17 15:26             ` [PATCH] sha1dc: fix issues with a big endian platform Johannes Schindelin

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).