git@vger.kernel.org mailing list mirror (one of many)
* One failed self test on Fedora 29
@ 2019-03-08 10:48 Jeffrey Walton
  2019-03-08 17:43 ` Todd Zullinger
  0 siblings, 1 reply; 25+ messages in thread
From: Jeffrey Walton @ 2019-03-08 10:48 UTC (permalink / raw)
  To: Git List

Fedora 29, x86_64. One failed self test:

*** t0021-conversion.sh ***
ok 1 - setup
ok 2 - check
ok 3 - expanded_in_repo
ok 4 - filter shell-escaped filenames
ok 5 - required filter should filter data
ok 6 - required filter smudge failure
ok 7 - required filter clean failure
ok 8 - filtering large input to small output should use little memory
ok 9 - filter that does not read is fine
ok 10 # skip filter large file (missing EXPENSIVE)
ok 11 - filter: clean empty file
ok 12 - filter: smudge empty file
not ok 13 - disable filter with empty override
#
#               test_config_global filter.disable.smudge false &&
#               test_config_global filter.disable.clean false &&
#               test_config filter.disable.smudge false &&
#               test_config filter.disable.clean false &&
#
#               echo "*.disable filter=disable" >.gitattributes &&
#
#               echo test >test.disable &&
#               git -c filter.disable.clean= add test.disable 2>err &&
#               test_must_be_empty err &&
#               rm -f test.disable &&
#               git -c filter.disable.smudge= checkout -- test.disable 2>err &&
#               test_must_be_empty err
#
ok 14 - diff does not reuse worktree files that need cleaning
ok 15 - required process filter should filter data
ok 16 - required process filter takes precedence
ok 17 - required process filter should be used only for "clean" operation only
ok 18 - required process filter should process multiple packets
ok 19 - required process filter with clean error should fail
ok 20 - process filter should restart after unexpected write failure
ok 21 - process filter should not be restarted if it signals an error
ok 22 - process filter abort stops processing of all further files
ok 23 - invalid process filter must fail (and not hang!)
ok 24 - delayed checkout in process filter
ok 25 - missing file in delayed checkout
ok 26 - invalid file in delayed checkout
# failed 1 among 26 test(s)
1..26
gmake[2]: *** [Makefile:56: t0021-conversion.sh] Error 1

Does anyone need a config.log or other test data?

* Re: One failed self test on Fedora 29
  2019-03-08 10:48 One failed self test on Fedora 29 Jeffrey Walton
@ 2019-03-08 17:43 ` Todd Zullinger
  2019-03-09 12:34   ` Jeffrey Walton
  0 siblings, 1 reply; 25+ messages in thread
From: Todd Zullinger @ 2019-03-08 17:43 UTC (permalink / raw)
  To: Jeffrey Walton; +Cc: Git List

Hi,

Jeffrey Walton wrote:
> Fedora 29, x86_64. One failed self test:
> 
> *** t0021-conversion.sh ***
[...]
> not ok 13 - disable filter with empty override
> #
> #               test_config_global filter.disable.smudge false &&
> #               test_config_global filter.disable.clean false &&
> #               test_config filter.disable.smudge false &&
> #               test_config filter.disable.clean false &&
> #
> #               echo "*.disable filter=disable" >.gitattributes &&
> #
> #               echo test >test.disable &&
> #               git -c filter.disable.clean= add test.disable 2>err &&
> #               test_must_be_empty err &&
> #               rm -f test.disable &&
> #               git -c filter.disable.smudge= checkout -- test.disable 2>err &&
> #               test_must_be_empty err
> #
[...]
> # failed 1 among 26 test(s)
> 1..26
> gmake[2]: *** [Makefile:56: t0021-conversion.sh] Error 1
> 
> Does anyone need a config.log or other test data?

It would probably help to know what commit you're building.
The verbose test output would also be useful, e.g.:

    cd t && ./t0021-conversion.sh -v -i

If it's not reliably reproducible, the --stress* options
might help catch a failing run.

FWIW, I just built and ran the tests on a Fedora 29
container for master, next, and pu a few times (some with
various --stress options) without any test failures.

I did this with and without a config.mak from the fedora git
packages.  I've never used the configure script; it seems
like unnecessary overhead.

    $ git branch -v
      master 6e0cc67761 Start 2.22 cycle
      next   541d9dca55 Merge branch 'yb/utf-16le-bom-spellfix' into next
    * pu     7eadd8ba98 Merge branch 'js/remote-curl-i18n' into pu

-- 
Todd

* Re: One failed self test on Fedora 29
  2019-03-08 17:43 ` Todd Zullinger
@ 2019-03-09 12:34   ` Jeffrey Walton
  2019-03-09 13:12     ` Jeffrey Walton
  2019-03-11  3:29     ` One failed self test on Fedora 29 Jeff King
  0 siblings, 2 replies; 25+ messages in thread
From: Jeffrey Walton @ 2019-03-09 12:34 UTC (permalink / raw)
  To: Todd Zullinger; +Cc: Git List

On Fri, Mar 8, 2019 at 12:43 PM Todd Zullinger <tmz@pobox.com> wrote:
>
> Jeffrey Walton wrote:
> > Fedora 29, x86_64. One failed self test:
> >
> > *** t0021-conversion.sh ***
> [...]
> > not ok 13 - disable filter with empty override
> > #
> > #               test_config_global filter.disable.smudge false &&
> > #               test_config_global filter.disable.clean false &&
> > #               test_config filter.disable.smudge false &&
> > #               test_config filter.disable.clean false &&
> > #
> > #               echo "*.disable filter=disable" >.gitattributes &&
> > #
> > #               echo test >test.disable &&
> > #               git -c filter.disable.clean= add test.disable 2>err &&
> > #               test_must_be_empty err &&
> > #               rm -f test.disable &&
> > #               git -c filter.disable.smudge= checkout -- test.disable 2>err &&
> > #               test_must_be_empty err
> > #
> [...]
> > # failed 1 among 26 test(s)
> > 1..26
> > gmake[2]: *** [Makefile:56: t0021-conversion.sh] Error 1
> >
> > Does anyone need a config.log or other test data?
>
> It would probably help to know what commit you're building.
> The verbose test output would also be useful, e.g.:

I built with CFLAGS += -fsanitize=undefined. It looks like the
misaligned accesses generate UBsan findings, which is causing
t0021-conversion to fail.

git-2.21.0$ grep -IR 'runtime error'
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:392:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:397:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:402:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:407:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:412:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:417:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:422:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:427:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:432:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:437:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:442:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:447:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:452:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:457:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:462:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/err:sha1dc/sha1.c:467:2: runtime
error: load of misaligned address 0x0000024fc245 for type 'const
uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:392:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:397:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:402:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:407:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:412:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:417:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:422:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:427:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:432:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:437:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:442:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:447:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:452:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:457:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:462:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
t/trash directory.t0021-conversion/git-stderr.log:sha1dc/sha1.c:467:2:
runtime error: load of misaligned address 0x000001a39cf5 for type
'const uint32_t', which requires 4 byte alignment
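
For readers following along: each finding above is a read of a
'const uint32_t' through a pointer that is not 4-byte aligned. Below is a
minimal sketch of a load that sidesteps this. It is not the sha1dc code;
load_u32_unaligned and demo_roundtrip are hypothetical names used only
for illustration. The idea is to copy the bytes with memcpy() instead of
dereferencing a cast pointer:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of sha1dc: read a 32-bit value in native
 * byte order from a pointer that may not be 4-byte aligned. Copying the
 * bytes with memcpy() avoids dereferencing a misaligned
 * `const uint32_t *`, which is the access UBSan reports.
 */
static uint32_t load_u32_unaligned(const unsigned char *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/*
 * Store a value at offset 1 of a byte buffer (misaligned whenever the
 * buffer itself is 4-byte aligned) and check that it reads back intact.
 */
static void demo_roundtrip(void)
{
	unsigned char buf[8] = {0};
	const uint32_t want = 0x01020304u;

	memcpy(buf + 1, &want, sizeof(want));
	assert(load_u32_unaligned(buf + 1) == want);
}
```

Compilers commonly turn a fixed-size memcpy() like this into a single
load on targets that tolerate unaligned access, so the cost is likely
small, but that is an assumption worth measuring rather than a given.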

* Re: One failed self test on Fedora 29
  2019-03-09 12:34   ` Jeffrey Walton
@ 2019-03-09 13:12     ` Jeffrey Walton
  2019-03-11  2:00       ` Junio C Hamano
  2019-03-11  3:29     ` One failed self test on Fedora 29 Jeff King
  1 sibling, 1 reply; 25+ messages in thread
From: Jeffrey Walton @ 2019-03-09 13:12 UTC (permalink / raw)
  To: Todd Zullinger; +Cc: Git List

On Sat, Mar 9, 2019 at 7:34 AM Jeffrey Walton <noloader@gmail.com> wrote:
>
> On Fri, Mar 8, 2019 at 12:43 PM Todd Zullinger <tmz@pobox.com> wrote:
> >
> > Jeffrey Walton wrote:
> > > Fedora 29, x86_64. One failed self test:
> > >
> > > *** t0021-conversion.sh ***
> > [...]
> > > not ok 13 - disable filter with empty override
> > > #
> > > #               test_config_global filter.disable.smudge false &&
> > > #               test_config_global filter.disable.clean false &&
> > > #               test_config filter.disable.smudge false &&
> > > #               test_config filter.disable.clean false &&
> > > #
> > > #               echo "*.disable filter=disable" >.gitattributes &&
> > > #
> > > #               echo test >test.disable &&
> > > #               git -c filter.disable.clean= add test.disable 2>err &&
> > > #               test_must_be_empty err &&
> > > #               rm -f test.disable &&
> > > #               git -c filter.disable.smudge= checkout -- test.disable 2>err &&
> > > #               test_must_be_empty err
> > > #
> > [...]
> > > # failed 1 among 26 test(s)
> > > 1..26
> > > gmake[2]: *** [Makefile:56: t0021-conversion.sh] Error 1
> > >
> > > Does anyone need a config.log or other test data?
> >
> > It would probably help to know what commit you're building.
> > The verbose test output would also be useful, e.g.:
>
> I built with CFLAGS += -fsanitize=undefined. It looks like the
> misaligned accesses generate UBsan findings, which is causing
> t0021-conversion to fail.
>
> git-2.21.0$ grep -IR 'runtime error'
> t/trash directory.t0021-conversion/err:sha1dc/sha1.c:392:2: runtime
> error: load of misaligned address 0x0000024fc245 for type 'const
> uint32_t', which requires 4 byte alignment
> t/trash directory.t0021-conversion/err:sha1dc/sha1.c:397:2: runtime
> error: load of misaligned address 0x0000024fc245 for type 'const
> uint32_t', which requires 4 byte alignment
> t/trash directory.t0021-conversion/err:sha1dc/sha1.c:402:2: runtime
> error: load of misaligned address 0x0000024fc245 for type 'const
> uint32_t', which requires 4 byte alignment

I think this is the patch for sha1dc/sha1.c . It stops using unaligned
accesses by default, but still honors SHA1DC_FORCE_UNALIGNED_ACCESS
for those who want it. Folks who want the undefined behavior have to
do something special.

Jeff

[-- Attachment #2: git.patch --]
[-- Type: application/octet-stream, Size: 966 bytes --]

--- sha1dc/sha1.c
+++ sha1dc/sha1.c
@@ -26,14 +26,6 @@
 #include "sha1.h"
 #include "ubc_check.h"
 
-#if (defined(__amd64__) || defined(__amd64) || defined(__x86_64__) || defined(__x86_64) || \
-     defined(i386) || defined(__i386) || defined(__i386__) || defined(__i486__)  || \
-     defined(__i586__) || defined(__i686__) || defined(_M_IX86) || defined(__X86__) || \
-     defined(_X86_) || defined(__THW_INTEL__) || defined(__I86__) || defined(__INTEL__) || \
-     defined(__386) || defined(_M_X64) || defined(_M_AMD64))
-#define SHA1DC_ON_INTEL_LIKE_PROCESSOR
-#endif
-
 /*
    Because Little-Endian architectures are most common,
    we only set SHA1DC_BIGENDIAN if one of these conditions is met.
@@ -124,7 +116,7 @@
 #endif
 /*ENDIANNESS SELECTION*/
 
-#if defined(SHA1DC_FORCE_UNALIGNED_ACCESS) || defined(SHA1DC_ON_INTEL_LIKE_PROCESSOR)
+#if defined(SHA1DC_FORCE_UNALIGNED_ACCESS)
 #define SHA1DC_ALLOW_UNALIGNED_ACCESS
 #endif /*UNALIGNMENT DETECTION*/

* Re: One failed self test on Fedora 29
  2019-03-09 13:12     ` Jeffrey Walton
@ 2019-03-11  2:00       ` Junio C Hamano
  2019-03-11  2:16         ` Jeffrey Walton
  2019-03-11  3:37         ` disabling sha1dc unaligned access, was " Jeff King
  0 siblings, 2 replies; 25+ messages in thread
From: Junio C Hamano @ 2019-03-11  2:00 UTC (permalink / raw)
  To: Jeffrey Walton; +Cc: Todd Zullinger, Git List

Jeffrey Walton <noloader@gmail.com> writes:

> I think this is the patch for sha1dc/sha1.c . It stops using unaligned
> accesses by default, but still honors SHA1DC_FORCE_UNALIGNED_ACCESS
> for those who want it. Folks who want the undefined behavior have to
> do something special.

Hmph, I somehow thought that folks who want to stick to the
standard printed on paper penalizing what practically works well in
the real world would be the ones doing extra things.

* Re: One failed self test on Fedora 29
  2019-03-11  2:00       ` Junio C Hamano
@ 2019-03-11  2:16         ` Jeffrey Walton
  2019-03-11  3:37         ` disabling sha1dc unaligned access, was " Jeff King
  1 sibling, 0 replies; 25+ messages in thread
From: Jeffrey Walton @ 2019-03-11  2:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Todd Zullinger, Git List

On Sun, Mar 10, 2019 at 10:00 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jeffrey Walton <noloader@gmail.com> writes:
>
> > I think this is the patch for sha1dc/sha1.c . It stops using unaligned
> > accesses by default, but still honors SHA1DC_FORCE_UNALIGNED_ACCESS
> > for those who want it. Folks who want the undefined behavior have to
> > do something special.
>
> Hmph, I somehow thought that folks who want to stick to the
> standard printed on paper penalizing what practically works well in
> the real world would be the ones doing extra things.

In the paper example, we do. I prefer printed books. Just last month I
went to Office Depot and had portions of an ARM manual distributed as
PDF printed and spiral bound. I don't mind doing extra work as the
special case.

Jeff

* Re: One failed self test on Fedora 29
  2019-03-09 12:34   ` Jeffrey Walton
  2019-03-09 13:12     ` Jeffrey Walton
@ 2019-03-11  3:29     ` Jeff King
  1 sibling, 0 replies; 25+ messages in thread
From: Jeff King @ 2019-03-11  3:29 UTC (permalink / raw)
  To: Jeffrey Walton; +Cc: Todd Zullinger, Git List

On Sat, Mar 09, 2019 at 07:34:15AM -0500, Jeffrey Walton wrote:

> > It would probably help to know what commit you're building.
> > The verbose test output would also be useful, e.g.:
> 
> I built with CFLAGS += -fsanitize=undefined. It looks like the
> misaligned accesses generate UBsan findings, which is causing
> t0021-conversion to fail.

You probably should use SANITIZE=undefined instead. The Makefile has
some smarts to tweak build parameters based on your sanitize flag (e.g.,
defining NO_UNALIGNED_LOADS).

That said, I do not think sha1dc works with UBsan at this point at all.
I usually do error-checking builds with:

  make SANITIZE=address,undefined BLK_SHA1=Yes

What puzzles me is not that t0021 failed, but that everything else
_didn't_. Almost every script fails for me when Git is built with UBSan
and sha1dc.

It would be nice to make sha1dc respect NO_UNALIGNED_LOADS. But barring
that, we should probably default to BLK_SHA1 when we see
SANITIZE=undefined.

-Peff

* disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-11  2:00       ` Junio C Hamano
  2019-03-11  2:16         ` Jeffrey Walton
@ 2019-03-11  3:37         ` Jeff King
  2019-03-11 10:40           ` Jeffrey Walton
                             ` (2 more replies)
  1 sibling, 3 replies; 25+ messages in thread
From: Jeff King @ 2019-03-11  3:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeffrey Walton, Todd Zullinger, Git List

On Mon, Mar 11, 2019 at 11:00:25AM +0900, Junio C Hamano wrote:

> Jeffrey Walton <noloader@gmail.com> writes:
> 
> > I think this is the patch for sha1dc/sha1.c . It stops using unaligned
> > accesses by default, but still honors SHA1DC_FORCE_UNALIGNED_ACCESS
> > for those who want it. Folks who want the undefined behavior have to
> > do something special.
> 
> Hmph, I somehow thought that folks who want to stick to the
> standard printed on paper penalizing what practically works well in
> the real world would be the ones doing extra things.

Unfortunately, I don't think sha1dc currently supports #defines in that
direction. The only logic is "if we are on intel, do unaligned loads"
and "even if we are not on intel, do it anyway". There is no "even if we
are on intel, do not do unaligned loads".

I think you'd need something like this:

diff --git a/Makefile b/Makefile
index 148668368b..705c54dcd8 100644
--- a/Makefile
+++ b/Makefile
@@ -1194,6 +1194,7 @@ BASIC_CFLAGS += -fsanitize=$(SANITIZE) -fno-sanitize-recover=$(SANITIZE)
 BASIC_CFLAGS += -fno-omit-frame-pointer
 ifneq ($(filter undefined,$(SANITIZERS)),)
 BASIC_CFLAGS += -DNO_UNALIGNED_LOADS
+BASIC_CFLAGS += -DSHA1DC_DISALLOW_UNALIGNED_ACCESS
 endif
 ifneq ($(filter leak,$(SANITIZERS)),)
 BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
index df0630bc6d..0bdf80d778 100644
--- a/sha1dc/sha1.c
+++ b/sha1dc/sha1.c
@@ -124,9 +124,11 @@
 #endif
 /*ENDIANNESS SELECTION*/
 
+#ifndef SHA1DC_DISALLOW_UNALIGNED_ACCESS
 #if defined(SHA1DC_FORCE_UNALIGNED_ACCESS) || defined(SHA1DC_ON_INTEL_LIKE_PROCESSOR)
 #define SHA1DC_ALLOW_UNALIGNED_ACCESS
 #endif /*UNALIGNMENT DETECTION*/
+#endif
 
 
 #define rotate_right(x,n) (((x)>>(n))|((x)<<(32-(n))))

but of course we cannot touch sha1dc/*, because we might actually be
using the submodule copy instead. And AFAIK there is no good way to
modify the submodule-provided content as part of the build. Why do we
even have the submodule again? ;P

I guess the same would be true for DC_SHA1_EXTERNAL, too, though.

So anyway, I think this needs a patch to the upstream sha1dc project.
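
The precedence described above can be sketched as a small standalone C
file. It assumes the proposed SHA1DC_DISALLOW_UNALIGNED_ACCESS macro from
the diff, uses a shortened stand-in for sha1dc's processor detection, and
adds hypothetical helpers (unaligned_access_allowed, demo_disallow_wins)
purely for illustration:

```c
#include <assert.h>

/* Simulate a build that passes -DSHA1DC_DISALLOW_UNALIGNED_ACCESS. */
#ifndef SHA1DC_DISALLOW_UNALIGNED_ACCESS
#define SHA1DC_DISALLOW_UNALIGNED_ACCESS
#endif

/* Shortened stand-in for sha1dc's much longer Intel detection list. */
#if defined(__x86_64__) || defined(__i386__) || \
    defined(_M_X64) || defined(_M_IX86)
#define SHA1DC_ON_INTEL_LIKE_PROCESSOR
#endif

/*
 * The "disallow" switch is checked first, so it overrides both the
 * explicit force flag and the processor detection.
 */
#ifndef SHA1DC_DISALLOW_UNALIGNED_ACCESS
#if defined(SHA1DC_FORCE_UNALIGNED_ACCESS) || \
    defined(SHA1DC_ON_INTEL_LIKE_PROCESSOR)
#define SHA1DC_ALLOW_UNALIGNED_ACCESS
#endif
#endif

/* Hypothetical helper: 1 if unaligned loads were selected, else 0. */
static int unaligned_access_allowed(void)
{
#ifdef SHA1DC_ALLOW_UNALIGNED_ACCESS
	return 1;
#else
	return 0;
#endif
}

/* With the disallow macro defined, even an Intel build must get 0. */
static void demo_disallow_wins(void)
{
	assert(unaligned_access_allowed() == 0);
}
```

Removing the first #define block should make the helper return 1 on an
x86 build, which matches the current "if we are on intel, do unaligned
loads" behavior.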

-Peff

* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-11  3:37         ` disabling sha1dc unaligned access, was " Jeff King
@ 2019-03-11 10:40           ` Jeffrey Walton
  2019-03-11 18:19             ` Jeff King
  2019-03-11 11:58           ` Duy Nguyen
  2019-03-12 21:06           ` [PATCH] Makefile: fix unaligned loads in sha1dc with UBSan Jeff King
  2 siblings, 1 reply; 25+ messages in thread
From: Jeffrey Walton @ 2019-03-11 10:40 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List

On Sun, Mar 10, 2019 at 11:37 PM Jeff King <peff@peff.net> wrote:
>
> On Mon, Mar 11, 2019 at 11:00:25AM +0900, Junio C Hamano wrote:
>
> > Jeffrey Walton <noloader@gmail.com> writes:
> >
> > > I think this is the patch for sha1dc/sha1.c . It stops using unaligned
> > > accesses by default, but still honors SHA1DC_FORCE_UNALIGNED_ACCESS
> > > for those who want it. Folks who want the undefined behavior have to
> > > do something special.
> >
> > Hmph, I somehow thought that folks who want to stick to the
> > standard printed on paper penalizing what practically works well in
> > the real world would be the ones doing extra things.
>
> Unfortunately, I don't think sha1dc currently supports #defines in that
> direction. The only logic is "if we are on intel, do unaligned loads"
> and "even if we are not on intel, do it anyway". There is no "even if we
> are on intel, do not do unaligned loads".
>
> I think you'd need something like this:
>
> diff --git a/Makefile b/Makefile
> index 148668368b..705c54dcd8 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1194,6 +1194,7 @@ BASIC_CFLAGS += -fsanitize=$(SANITIZE) -fno-sanitize-recover=$(SANITIZE)
>  BASIC_CFLAGS += -fno-omit-frame-pointer
>  ifneq ($(filter undefined,$(SANITIZERS)),)
>  BASIC_CFLAGS += -DNO_UNALIGNED_LOADS
> +BASIC_CFLAGS += -DSHA1DC_DISALLOW_UNALIGNED_ACCESS
>  endif
>  ifneq ($(filter leak,$(SANITIZERS)),)
>  BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
> diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
> index df0630bc6d..0bdf80d778 100644
> --- a/sha1dc/sha1.c
> +++ b/sha1dc/sha1.c
> @@ -124,9 +124,11 @@
>  #endif
>  /*ENDIANNESS SELECTION*/
>
> +#ifndef SHA1DC_DISALLOW_UNALIGNED_ACCESS
>  #if defined(SHA1DC_FORCE_UNALIGNED_ACCESS) || defined(SHA1DC_ON_INTEL_LIKE_PROCESSOR)
>  #define SHA1DC_ALLOW_UNALIGNED_ACCESS
>  #endif /*UNALIGNMENT DETECTION*/
> +#endif
>
>
>  #define rotate_right(x,n) (((x)>>(n))|((x)<<(32-(n))))
>
> but of course we cannot touch sha1dc/*, because we might actually be
> using the submodule copy instead. And AFAIK there is no good way to
> modify the submodule-provided content as part of the build. Why do we
> even have the submodule again? ;P
>
> I guess the same would be true for DC_SHA1_EXTERNAL, too, though.
>
> So anyway, I think this needs a patch to the upstream sha1dc project.

https://github.com/cr-marcstevens/sha1collisiondetection/issues/47

Jeff

* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-11  3:37         ` disabling sha1dc unaligned access, was " Jeff King
  2019-03-11 10:40           ` Jeffrey Walton
@ 2019-03-11 11:58           ` Duy Nguyen
  2019-03-11 18:15             ` Thomas Braun
  2019-03-12 21:06           ` [PATCH] Makefile: fix unaligned loads in sha1dc with UBSan Jeff King
  2 siblings, 1 reply; 25+ messages in thread
From: Duy Nguyen @ 2019-03-11 11:58 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Jeffrey Walton, Todd Zullinger, Git List

On Mon, Mar 11, 2019 at 10:48 AM Jeff King <peff@peff.net> wrote:
> And AFAIK there is no good way to
> modify the submodule-provided content as part of the build. Why do we
> even have the submodule again? ;P

Because of dogfooding of course. This is an interesting use case
though. I wonder if people often want to "patch" submodules like this
(and what we could do if that's the case)
-- 
Duy

* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-11 11:58           ` Duy Nguyen
@ 2019-03-11 18:15             ` Thomas Braun
  2019-03-11 18:23               ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Thomas Braun @ 2019-03-11 18:15 UTC (permalink / raw)
  To: Duy Nguyen, Jeff King
  Cc: Junio C Hamano, Jeffrey Walton, Todd Zullinger, Git List

Am 11.03.2019 um 12:58 schrieb Duy Nguyen:
> On Mon, Mar 11, 2019 at 10:48 AM Jeff King <peff@peff.net> wrote:
>> And AFAIK there is no good way to
>> modify the submodule-provided content as part of the build. Why do we
>> even have the submodule again? ;P
> 
> Because of dogfooding of course. This is an interesting use case
> though. I wonder if people often want to "patch" submodules like this
> (and what we could do if that's the case)

I usually do the following:

- Fork the sub-project
- Add a branch with my proposed patches
- Update the URL and the commit of the submodule in the super-project

This of course requires all users to do

git submodule sync

which is a bit inconvenient, but works.

* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-11 10:40           ` Jeffrey Walton
@ 2019-03-11 18:19             ` Jeff King
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2019-03-11 18:19 UTC (permalink / raw)
  To: Jeffrey Walton; +Cc: Git List

On Mon, Mar 11, 2019 at 06:40:21AM -0400, Jeffrey Walton wrote:

> > So anyway, I think this needs a patch to the upstream sha1dc project.
> 
> https://github.com/cr-marcstevens/sha1collisiondetection/issues/47

Thanks, it looks like the turnaround on that may be pretty quick. Once
it's merged there, we'll need a local patch to bump the gitlink sha1 and
tweak the Makefile.

-Peff

* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-11 18:15             ` Thomas Braun
@ 2019-03-11 18:23               ` Jeff King
  2019-03-12  7:27                 ` Junio C Hamano
  2019-03-12  8:53                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 25+ messages in thread
From: Jeff King @ 2019-03-11 18:23 UTC (permalink / raw)
  To: Thomas Braun
  Cc: Duy Nguyen, Junio C Hamano, Jeffrey Walton, Todd Zullinger,
	Git List

On Mon, Mar 11, 2019 at 07:15:12PM +0100, Thomas Braun wrote:

> Am 11.03.2019 um 12:58 schrieb Duy Nguyen:
> > On Mon, Mar 11, 2019 at 10:48 AM Jeff King <peff@peff.net> wrote:
> >> And AFAIK there is no good way to
> >> modify the submodule-provided content as part of the build. Why do we
> >> even have the submodule again? ;P
> > 
> > Because of dogfooding of course. This is an interesting use case
> > though. I wonder if people often want to "patch" submodules like this
> > (and what we could do if that's the case)
> 
> I usually do the following:
> 
> - Fork the sub-project
> - Add a branch with my proposed patches
> - Update the URL and the commit of the submodule in the super-project
> 
> This of course requires all users to do
> 
> git submodule sync
> 
> which is a bit inconvenient, but works.

The problem to me is not the steps that a developer has to do, but
rather that we are dependent on the upstream project to make a simple
fix (which they may not agree to do, or may take a long time to do).

Whereas if we import the content into our repo as a subtree, we are free
to hack it up as we see fit, and then occasionally pull from upstream
and reconcile the changes. Changing upstream isn't advisable in the
general case, but I think it makes a lot of sense for small changes
(especially if you have the discipline to actually get the same or
similar change pushed upstream).

In this particular case, though, the sha1dc project is pretty
responsive, so I don't think it's going to be a big deal. It just seems
like an anti-pattern in general.

-Peff

* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-11 18:23               ` Jeff King
@ 2019-03-12  7:27                 ` Junio C Hamano
  2019-03-12 10:51                   ` Jeff King
  2019-03-12  8:53                 ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2019-03-12  7:27 UTC (permalink / raw)
  To: Jeff King
  Cc: Thomas Braun, Duy Nguyen, Jeffrey Walton, Todd Zullinger,
	Git List

Jeff King <peff@peff.net> writes:

> The problem to me is not the steps that a developer has to do, but
> rather that we are dependent on the upstream project to make a simple
> fix (which they may not agree to do, or may take a long time to do).

Yeah.  In practice, I think the recommended way to work for a
depending project like us is to keep a fork in a separate repository
we control of the submodule project, and allow our fork to be
slightly ahead of the upstream while feeding our change to them.


* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-11 18:23               ` Jeff King
  2019-03-12  7:27                 ` Junio C Hamano
@ 2019-03-12  8:53                 ` Ævar Arnfjörð Bjarmason
  2019-03-12 11:05                   ` Jeff King
  1 sibling, 1 reply; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-12  8:53 UTC (permalink / raw)
  To: Jeff King
  Cc: Thomas Braun, Duy Nguyen, Junio C Hamano, Jeffrey Walton,
	Todd Zullinger, Git List, Marc Stevens


On Mon, Mar 11 2019, Jeff King wrote:

> On Mon, Mar 11, 2019 at 07:15:12PM +0100, Thomas Braun wrote:
>
>> On 11.03.2019 at 12:58, Duy Nguyen wrote:
>> > On Mon, Mar 11, 2019 at 10:48 AM Jeff King <peff@peff.net> wrote:
>> >> And AFAIK there is no good way to
>> >> modify the submodule-provided content as part of the build. Why do we
>> >> even have the submodule again? ;P
>> >
>> > Because of dogfooding of course. This is an interesting use case
>> > though. I wonder if people often want to "patch" submodules like this
>> > (and what we could do if that's the case)
>>
>> I usually do the following:
>>
>> - Fork the sub-project
>> - Add a branch with my proposed patches
>> - Update the URL and the commit of the submodule in the super-project
>>
>> This of course requires all users to do
>>
>> git submodule sync
>>
>> which is a bit inconvenient, but works.
>
> The problem to me is not that the steps that a developer has to do, but
> rather that we are dependent on the upstream project to make a simple
> fix (which they may not agree to do, or may take a long time to do).
>
> Whereas if we import the content into our repo as a subtree, we are free
> to hack it up as we see fit, and then occasionally pull from upstream
> and reconcile the changes. Changing upstream isn't advisable in the
> general case, but I think makes a lot of sense for small changes
> (especially if you have the discipline to actually get the same or
> similar change pushed upstream).
>
> In this particular case, though, the sha1dc project is pretty
> responsive, so I don't think it's going to be a big deal. It just seems
> like an anti-pattern in general.

There are at least a couple of aspects to this.

One is whether we should have the submodule in
sha1collisiondetection/. I agree that's probably a bad idea now
per-se. Honestly I wasn't expecting the answer when I submitted the
final patch to switch to it fully to be to the effect of submodules
being too immature for the git project itself to use. So now we're
effectively mid-series, and should maybe just back out.

But the other is the developer social engineering question of how we
strike the right trade-off when we import upstream code.

I fully agree with what you've said in theory, but if we look at what's
happened in practice we as a project are demonstrably not disciplined
enough to manage upstream code like this without overtly perma-forking
it.

E.g. I gave up on updating compat/regex some time ago because of the
various cross-tree patches that had ended up modifying it. Now we can't
just upstream a new engine anymore.

Someone needs to first go through those various modifications, upstream
them one-by-one or prove they're not needed anymore (and many are
portability / obscure compiler fixes, so that's hard...). The
compat/regex isn't unique here, e.g. compat/poll/ is another example of
this.

As far as I can tell none of the people changing that code went through
the process of submitting a parallel upstream fix or seeing if the issue
was fixed upstream and we could just update the code we were carrying,
and of course that gets progressively harder for any one contributor as
our divergence grows.

So even though the theory of the sha1collisiondetection/ submodule +
sha1dc/ code fork is silly, perhaps we've stumbled upon some way where
we at least file an upstream bug for issues we find and fix. As
demonstrated by other such changes that's already leaps and bounds ahead
of what we're usually doing.


* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-12  7:27                 ` Junio C Hamano
@ 2019-03-12 10:51                   ` Jeff King
  2019-03-13 11:47                     ` Thomas Braun
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2019-03-12 10:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Thomas Braun, Duy Nguyen, Jeffrey Walton, Todd Zullinger,
	Git List

On Tue, Mar 12, 2019 at 04:27:57PM +0900, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > The problem to me is not that the steps that a developer has to do, but
> > rather that we are dependent on the upstream project to make a simple
> > fix (which they may not agree to do, or may take a long time to do).
> 
> Yeah.  In practice, I think the recommended way to work for a
> depending project like us is to keep a fork in a separate repository
> we control of the submodule project, and allow our fork to be
> slightly ahead of the upstream while feeding our change to them.

Reading Thomas's email again, that might actually have been what he was
recommending. If so, sorry for the confusion. And I agree that's a valid
solution.

That said, I do wonder at some point if there's a huge value in using a
submodule at that point. I think there is if the dependent project is
large (and if it's optional, and some people might not need it). But in
this case, it is not a big deal to just carry the sha1dc code in-tree.

-Peff


* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-12  8:53                 ` Ævar Arnfjörð Bjarmason
@ 2019-03-12 11:05                   ` Jeff King
  2019-03-12 12:09                     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2019-03-12 11:05 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Thomas Braun, Duy Nguyen, Junio C Hamano, Jeffrey Walton,
	Todd Zullinger, Git List, Marc Stevens

On Tue, Mar 12, 2019 at 09:53:41AM +0100, Ævar Arnfjörð Bjarmason wrote:

> There are at least a couple of aspects to this.
> 
> One is whether we should have the submodule in
> sha1collisiondetection/. I agree that's probably a bad idea now
> per-se. Honestly I wasn't expecting the answer when I submitted the
> final patch to switch to it fully to be to the effect of submodules
> being too immature for the git project itself to use. So now we're
> effectively mid-series, and should maybe just back out.

I think it's especially funky because we have three different ways of
getting sha1dc (in-tree, submodule, or against an external library). And
I almost blindly submitted a patch making the in-tree version work
(since that's what's used by default, and what I use) which could have
totally broken things for the other use cases without anybody realizing
until the change trickled down to somebody who uses those flags.

(Technically in this case it wouldn't actually have _broken_ them, but
just not helped them, so they'd be no worse off. But hopefully you get
the point).

Speaking of external libraries, in some ways the issue I raised is no
different than it is for any external library, where we're at the mercy
of whatever version is on the system. The big dependency for us is
usually libcurl, and we do have to sometimes work around old versions
there.

But I do think one thing that makes the sha1dc submodule approach
more painful is that we don't control the content of the code, but we
_do_ build it ourselves with our usual compiler flags. So we're weirdly
intimate with it (and in fact, an external library would not have the
problem being discussed here, since it would have been built separately
without UBSan).

> I fully agree with what you've said in theory, but if we look at what's
> happened in practice we as a project are demonstrably not disciplined
> enough to manage upstream code like this without overtly perma-forking
> it.

I'm not sure I agree completely. Most of the things we've imported are
small enough that we're reasonably happy to accept them as a snapshot in
time and take ownership. I.e., I do not recall a lot of instances of
fixing bugs in compat/regex or compat/poll that we could have gotten
more easily by merging from upstream. But I admit I don't actually pay
much attention to those areas, so I might be completely off-base.

The one place I really _would_ have liked to remain compatible with
upstream is xdiff. And we were traditionally pretty hesitant to clean
things up there for fear of diverging. But in practice, upstream there
has been stagnant, and we've done most of the bug fixes and improvements
to it (in-tree).

> As far as I can tell none of the people changing that code went through
> the process of submitting a parallel upstream fix or seeing if the issue
> was fixed upstream and we could just update the code we were carrying,
> and of course that gets progressively harder for any one contributor as
> our divergence grows.

To be clear, I do sympathize with the notion that not pulling things
in-tree keeps our relationship with upstream more disciplined, and that
has value. I'm just not altogether clear how much it's really hurt us
overall to be undisciplined.

-Peff


* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-12 11:05                   ` Jeff King
@ 2019-03-12 12:09                     ` Ævar Arnfjörð Bjarmason
  2019-03-12 21:01                       ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-12 12:09 UTC (permalink / raw)
  To: Jeff King
  Cc: Thomas Braun, Duy Nguyen, Junio C Hamano, Jeffrey Walton,
	Todd Zullinger, Git List, Marc Stevens


On Tue, Mar 12 2019, Jeff King wrote:

> On Tue, Mar 12, 2019 at 09:53:41AM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> There are at least a couple of aspects to this.
>>
>> One is whether we should have the submodule in
>> sha1collisiondetection/. I agree that's probably a bad idea now
>> per-se. Honestly I wasn't expecting the answer when I submitted the
>> final patch to switch to it fully to be to the effect of submodules
>> being too immature for the git project itself to use. So now we're
>> effectively mid-series, and should maybe just back out.
>
> I think it's especially funky because we have three different ways of
> getting sha1dc (in-tree, submodule, or against an external library). And
> I almost blindly submitted a patch making the in-tree version work
> (since that's what's used by default, and what I use) which could have
> totally broken things for the other use cases without anybody realizing
> until the change trickled down to somebody who uses those flags.
>
> (Technically in this case it wouldn't actually have _broken_ them, but
> just not helped them, so they'd be no worse off. But hopefully you get
> the point).
>
> Speaking of external libraries, in some ways the issue I raised is no
> different than it is for any external library, where we're at the mercy
> of whatever version is on the system. The big dependency for us is
> usually libcurl, and we do have to sometimes work around old versions
> there.
>
> But I do think one thing that makes the sha1dc submodule approach
> more painful is that we don't control the content of the code, but we
> _do_ build it ourselves with our usual compiler flags. So we're weirdly
> intimate with it (and in fact, an external library would not have the
> problem being discussed here, since it would have been built separately
> without UBSan).
>
>> I fully agree with what you've said in theory, but if we look at what's
>> happened in practice we as a project are demonstrably not disciplined
>> enough to manage upstream code like this without overtly perma-forking
>> it.
>
> I'm not sure I agree completely. Most of the things we've imported are
> small enough that we're reasonably happy to accept them as a snapshot in
> time and take ownership. I.e., I do not recall a lot of instances of
> fixing bugs in compat/regex or compat/poll that we could have gotten
> more easily by merging from upstream. But I admit I don't actually pay
> much attention to those areas, so I might be completely off-base.
>
> The one place I really _would_ have liked to remain compatible with
> upstream is xdiff. And we were traditionally pretty hesitant to clean
> things up there for fear of diverging. But in practice, upstream there
> has been stagnant, and we've done most of the bug fixes and improvements
> to it (in-tree).
>
>> As far as I can tell none of the people changing that code went through
>> the process of submitting a parallel upstream fix or seeing if the issue
>> was fixed upstream and we could just update the code we were carrying,
>> and of course that gets progressively harder for any one contributor as
>> our divergence grows.
>
> To be clear, I do sympathize with the notion that not pulling things
> in-tree keeps our relationship with upstream more disciplined, and that
> has value. I'm just not altogether clear how much it's really hurt us
> overall to be undisciplined.

I agree that say the compat/regex divergence hasn't hurt us much if at
all. Just that we have a few conflicting desires:

 A. Make sure you can "make" git by default without pulling down a bunch
    of libraries, especially if they're not ubiquitous. Thus shipping
    the likes of sha1dc.

 B. Being able to hotfix those libraries.

 C. Upstreaming those hotfixes when they happen.

 D. Updating the library we pulled in due to "A" from upstream.

In practice because we've wanted "A" we've felt the need to do "B", but
then also not "C" without ever having the discussion that skipping that
part was a good idea, and as a result "D" is hard.

Do we urgently need "D"? No. I'm just pointing out that in addition to
optimizing for "B" being easy we should also weigh being good free
software citizens and coordinate fixes with upstream, which also makes a
future "D" easier.

Also, while I don't know of any bugs in compat/regex, I think it's a bit
concerning that we're carrying 2010-era code for something like the
regex engine that we expose e.g. over gitweb.


* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-12 12:09                     ` Ævar Arnfjörð Bjarmason
@ 2019-03-12 21:01                       ` Jeff King
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2019-03-12 21:01 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Thomas Braun, Duy Nguyen, Junio C Hamano, Jeffrey Walton,
	Todd Zullinger, Git List, Marc Stevens

On Tue, Mar 12, 2019 at 01:09:42PM +0100, Ævar Arnfjörð Bjarmason wrote:

> > To be clear, I do sympathize with the notion that not pulling things
> > in-tree keeps our relationship with upstream more disciplined, and that
> > has value. I'm just not altogether clear how much it's really hurt us
> > overall to be undisciplined.
> 
> I agree that say the compat/regex divergence hasn't hurt us much if at
> all. Just that we have a few conflicting desires:
> 
>  A. Make sure you can "make" git by default without pulling down a bunch
>     of libraries, especially if they're not ubiquitous. Thus shipping
>     the likes of sha1dc.
> 
>  B. Being able to hotfix those libraries.
> 
>  C. Upstreaming those hotfixes when they happen.
> 
>  D. Updating the library we pulled in due to "A" from upstream.
> 
> In practice because we've wanted "A" we've felt the need to do "B", but
> then also not "C" without ever having the discussion that skipping that
> part was a good idea, and as a result "D" is hard.
> 
> Do we urgently need "D"? No. I'm just pointing out that in addition to
> optimizing for "B" being easy we should also weigh being good free
> software citizens and coordinate fixes with upstream, which also makes a
> future "D" easier.

Yeah, I think your mental model there makes some sense.

> Also, while I don't know of any bugs in compat/regex, I think it's a bit
> concerning that we're carrying 2010-era code for something like the
> regex engine that we expose e.g. over gitweb.

TBH, I think it's probably a bad idea to open any version of that code
to untrusted people, because it's easy to write a DoS regex against it.
There are libraries (like re2) that would be a better fit (I seem to
recall that there may be some DFA-only support in pcre, too, but I
haven't looked at it in a long time; if there is, it might be a sane
gitweb feature to only expose that engine).

-Peff


* [PATCH] Makefile: fix unaligned loads in sha1dc with UBSan
  2019-03-11  3:37         ` disabling sha1dc unaligned access, was " Jeff King
  2019-03-11 10:40           ` Jeffrey Walton
  2019-03-11 11:58           ` Duy Nguyen
@ 2019-03-12 21:06           ` Jeff King
  2019-03-12 21:17             ` Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2019-03-12 21:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Ævar Arnfjörð Bjarmason, Jeffrey Walton,
	Todd Zullinger, Git List

On Sun, Mar 10, 2019 at 11:37:55PM -0400, Jeff King wrote:

> Unfortunately, I don't think sha1dc currently supports #defines in that
> direction. The only logic is "if we are on intel, do unaligned loads"
> and "even if we are not on intel, do it anyway". There is no "even if we
> are on intel, do not do unaligned loads".
> 
> I think you'd need something like this:
> [...]

The sha1dc folks gave us a very nice and quick turnaround on this.
Thanks to them, and to Jeffrey for opening an issue there.

Here's a commit which updates Git to use the new feature. I've tested it
with both the in-tree and submodule builds like:

  make DC_SHA1_SUBMODULE=Yes SANITIZE=undefined && (cd t && ./t0001-*)
  make DC_SHA1_SUBMODULE=    SANITIZE=undefined && (cd t && ./t0001-*)

both of which fail without this patch and succeed without it.

-- >8 --
Subject: [PATCH] Makefile: fix unaligned loads in sha1dc with UBSan

The sha1dc library uses unaligned loads on platforms that support them.
This is normally what you'd want for performance, but it does cause
UBSan to complain when we compile with SANITIZE=undefined. Just like we
set -DNO_UNALIGNED_LOADS for our own code in that case, we should set
-DSHA1DC_FORCE_ALIGNED_ACCESS.

Of course that does nothing without pulling in the patches from sha1dc
to respect that define. So let's do that, too, updating both the
submodule link and our in-tree copy (from the same commit).

Signed-off-by: Jeff King <peff@peff.net>
---
 Makefile               | 1 +
 sha1collisiondetection | 2 +-
 sha1dc/sha1.c          | 5 +++--
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/Makefile b/Makefile
index 537493822b..593c2c729a 100644
--- a/Makefile
+++ b/Makefile
@@ -1195,6 +1195,7 @@ BASIC_CFLAGS += -fsanitize=$(SANITIZE) -fno-sanitize-recover=$(SANITIZE)
 BASIC_CFLAGS += -fno-omit-frame-pointer
 ifneq ($(filter undefined,$(SANITIZERS)),)
 BASIC_CFLAGS += -DNO_UNALIGNED_LOADS
+BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
 endif
 ifneq ($(filter leak,$(SANITIZERS)),)
 BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
diff --git a/sha1collisiondetection b/sha1collisiondetection
index 232357eb2e..16033998da 160000
--- a/sha1collisiondetection
+++ b/sha1collisiondetection
@@ -1 +1 @@
-Subproject commit 232357eb2ea0397388254a4b188333a227bf5b10
+Subproject commit 16033998da4b273aebd92c84b1e1b12e4aaf7009
diff --git a/sha1dc/sha1.c b/sha1dc/sha1.c
index df0630bc6d..5931cf25d5 100644
--- a/sha1dc/sha1.c
+++ b/sha1dc/sha1.c
@@ -124,10 +124,11 @@
 #endif
 /*ENDIANNESS SELECTION*/
 
+#ifndef SHA1DC_FORCE_ALIGNED_ACCESS
 #if defined(SHA1DC_FORCE_UNALIGNED_ACCESS) || defined(SHA1DC_ON_INTEL_LIKE_PROCESSOR)
 #define SHA1DC_ALLOW_UNALIGNED_ACCESS
-#endif /*UNALIGNMENT DETECTION*/
-
+#endif /*UNALIGNED ACCESS DETECTION*/
+#endif /*FORCE ALIGNED ACCESS*/
 
 #define rotate_right(x,n) (((x)>>(n))|((x)<<(32-(n))))
 #define rotate_left(x,n)  (((x)<<(n))|((x)>>(32-(n))))
-- 
2.21.0.539.gcf54785f87
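
For readers following along, here is a minimal sketch of what the guard in the patch above does, written as a standalone C fragment rather than as the actual sha1dc code. It assumes a build where both SHA1DC_FORCE_ALIGNED_ACCESS and SHA1DC_ON_INTEL_LIKE_PROCESSOR are defined (simulated here with two #define lines instead of compiler flags), and it uses two hypothetical helpers, unaligned_access_allowed() and load32(), that do not exist in sha1dc. The aim is only to show that the "force aligned" define takes precedence, and what an alignment-safe load can look like.

```c
#include <stdint.h>
#include <string.h>

/* Assumption for illustration: simulate a UBSan build on an Intel-like CPU,
 * as if the compiler had been given -DSHA1DC_FORCE_ALIGNED_ACCESS. */
#define SHA1DC_FORCE_ALIGNED_ACCESS
#define SHA1DC_ON_INTEL_LIKE_PROCESSOR

/* Same shape as the guard in the patch: the "force aligned" define
 * overrides both conditions that would otherwise allow unaligned access. */
#ifndef SHA1DC_FORCE_ALIGNED_ACCESS
#if defined(SHA1DC_FORCE_UNALIGNED_ACCESS) || defined(SHA1DC_ON_INTEL_LIKE_PROCESSOR)
#define SHA1DC_ALLOW_UNALIGNED_ACCESS
#endif
#endif

/* Hypothetical helper: report which path the preprocessor selected. */
static int unaligned_access_allowed(void)
{
#ifdef SHA1DC_ALLOW_UNALIGNED_ACCESS
	return 1;
#else
	return 0;
#endif
}

/* Hypothetical helper: read a 32-bit word in host byte order from any address. */
static uint32_t load32(const unsigned char *p)
{
#ifdef SHA1DC_ALLOW_UNALIGNED_ACCESS
	/* Direct load; UBSan reports this when p is not suitably aligned. */
	return *(const uint32_t *)p;
#else
	/* memcpy() has no alignment requirement, so this is well-defined. */
	uint32_t v;
	memcpy(&v, p, sizeof(v));
	return v;
#endif
}
```

With both defines present, load32() stays on the memcpy() path even for an odd address such as buf + 1; dropping the SHA1DC_FORCE_ALIGNED_ACCESS define would switch it to the direct load that UBSan complains about.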



* Re: [PATCH] Makefile: fix unaligned loads in sha1dc with UBSan
  2019-03-12 21:06           ` [PATCH] Makefile: fix unaligned loads in sha1dc with UBSan Jeff King
@ 2019-03-12 21:17             ` Ævar Arnfjörð Bjarmason
  2019-03-12 21:19               ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-12 21:17 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Jeffrey Walton, Todd Zullinger, Git List


On Tue, Mar 12 2019, Jeff King wrote:

> On Sun, Mar 10, 2019 at 11:37:55PM -0400, Jeff King wrote:
>
>> Unfortunately, I don't think sha1dc currently supports #defines in that
>> direction. The only logic is "if we are on intel, do unaligned loads"
>> and "even if we are not on intel, do it anyway". There is no "even if we
>> are on intel, do not do unaligned loads".
>>
>> I think you'd need something like this:
>> [...]
>
> The sha1dc folks gave us a very nice and quick turnaround on this.
> Thanks to them, and to Jeffrey for opening an issue there.

Thanks. Good to have it resolved this way.

> Here's a commit which updates Git to use the new feature. I've tested it
> with both the in-tree and submodule builds like:
>
>   make DC_SHA1_SUBMODULE=Yes SANITIZE=undefined && (cd t && ./t0001-*)
>   make DC_SHA1_SUBMODULE=    SANITIZE=undefined && (cd t && ./t0001-*)
>
> both of which fail without this patch and succeed without it.

FWIW I've reproduced this testing and found the same thing. Looks good
to me.


* Re: [PATCH] Makefile: fix unaligned loads in sha1dc with UBSan
  2019-03-12 21:17             ` Ævar Arnfjörð Bjarmason
@ 2019-03-12 21:19               ` Jeff King
  0 siblings, 0 replies; 25+ messages in thread
From: Jeff King @ 2019-03-12 21:19 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Jeffrey Walton, Todd Zullinger, Git List

On Tue, Mar 12, 2019 at 10:17:56PM +0100, Ævar Arnfjörð Bjarmason wrote:

> > Here's a commit which updates Git to use the new feature. I've tested it
> > with both the in-tree and submodule builds like:
> >
> >   make DC_SHA1_SUBMODULE=Yes SANITIZE=undefined && (cd t && ./t0001-*)
> >   make DC_SHA1_SUBMODULE=    SANITIZE=undefined && (cd t && ./t0001-*)
> >
> > both of which fail without this patch and succeed without it.
> 
> FWIW I've reproduced this testing and found the same thing. Looks good
> to me.

Er, that second "without" should be "with", but hopefully you figured
that out during your testing. :)

-Peff


* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-12 10:51                   ` Jeff King
@ 2019-03-13 11:47                     ` Thomas Braun
  2019-03-13 15:39                       ` Jeff King
  0 siblings, 1 reply; 25+ messages in thread
From: Thomas Braun @ 2019-03-13 11:47 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano
  Cc: Duy Nguyen, Jeffrey Walton, Todd Zullinger, Git List

On 12.03.2019 at 11:51, Jeff King wrote:
> On Tue, Mar 12, 2019 at 04:27:57PM +0900, Junio C Hamano wrote:
> 
>> Jeff King <peff@peff.net> writes:
>>
>>> The problem to me is not that the steps that a developer has to do, but
>>> rather that we are dependent on the upstream project to make a simple
>>> fix (which they may not agree to do, or may take a long time to do).
>>
>> Yeah.  In practice, I think the recommended way to work for a
>> depending project like us is to keep a fork in a separate repository
>> we control of the submodule project, and allow our fork to be
>> slightly ahead of the upstream while feeding our change to them.
> 
> Reading Thomas's email again, that might actually have been what he was
> recommending. If so, sorry for the confusion. And I agree that's a valid
> solution.

Yes that is what I tried to explain. Looks like it was lost in translation.

> That said, I do wonder at some point if there's a huge value in using a
> submodule at that point. I think there is if the dependent project is
> large (and if it's optional, and some people might not need it). But in
> this case, it is not a big deal to just carry the sha1dc code in-tree.

A big win with submodules is that you have separate histories and can,
quite easily, update to newer versions without manual copying.

One grievance with submodules is the URL switching if you need to go
with a forked repo for some time and then back to the original.
Is it possible to have multiple remotes for a submodule?

Something like:

[submodule "libfoo"]
	path = include/foo
	url1 = git://foo.com/upstream/lib.git
	url2 = git://foo.com/myFork/lib.git

With that, the error-prone git submodule sync step is not required anymore.

submodule.alternateLocation looks like it is going in the right direction.


* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-13 11:47                     ` Thomas Braun
@ 2019-03-13 15:39                       ` Jeff King
  2019-03-13 16:00                         ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2019-03-13 15:39 UTC (permalink / raw)
  To: Thomas Braun
  Cc: Junio C Hamano, Duy Nguyen, Jeffrey Walton, Todd Zullinger,
	Git List

On Wed, Mar 13, 2019 at 12:47:51PM +0100, Thomas Braun wrote:

> > Reading Thomas's email again, that might actually have been what he was
> > recommending. If so, sorry for the confusion. And I agree that's a valid
> > solution.
> 
> Yes that is what I tried to explain. Looks like it was lost in translation.

I think the problem was on the reading end. :)

> > That said, I do wonder at some point if there's a huge value in using a
> > submodule at that point. I think there is if the dependent project is
> > large (and if it's optional, and some people might not need it). But in
> > this case, it is not a big deal to just carry the sha1dc code in-tree.
> 
> A big win with submodules is that you have separate histories and can,
> quite easily, update to newer versions without manual copying.

True. We'd generally be picking up snapshots in our in-tree sha1dc/, so
bisecting on it is not as fine-grained. We _could_ pull in the full
history using something like git-subtree, but that comes with its own
complications.

> One grievance with submodules is the URL switching if you need to go
> with a forked repo for some time and then back to the original.
> Is it possible to have multiple remotes for a submodule?
> 
> Something like:
> 
> [submodule "libfoo"]
> 	path = include/foo
> 	url1 = git://foo.com/upstream/lib.git
> 	url2 = git://foo.com/myFork/lib.git
> 
> With that, the error-prone git submodule sync step is not required anymore.

I assume you'd fetch from _all_ of them during a fetch, and assume that
one of them will get you the objects you need (or I guess if you are
looking for a specific object, you'd try them one at a time until you
get the object).

That makes sense, though it might be kind of annoying when fetching is
expensive (especially if it involves manually authenticating).

> submodule.alternateLocation looks like it is going in the right direction.

I think that's mostly about pointing back to the superproject for local
storage. Though I think there's a pretty reasonable solution to the
problem we're discussing there: git.git could carry a "sha1dc" branch
that points to our modified submodule history. So it's "in-tree" in the
sense that it is in our repo, and under our full control, but still
managed like a submodule.

And we'd probably not even duplicate a lot of storage in the actual
clone of the upstream project, because it would be pointing to us as an
alternate.

-Peff


* Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29
  2019-03-13 15:39                       ` Jeff King
@ 2019-03-13 16:00                         ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2019-03-13 16:00 UTC (permalink / raw)
  To: Jeff King
  Cc: Thomas Braun, Junio C Hamano, Duy Nguyen, Jeffrey Walton,
	Todd Zullinger, Git List


On Wed, Mar 13 2019, Jeff King wrote:

> On Wed, Mar 13, 2019 at 12:47:51PM +0100, Thomas Braun wrote:
>
>> > Reading Thomas's email again, that might actually have been what he was
>> > recommending. If so, sorry for the confusion. And I agree that's a valid
>> > solution.
>>
>> Yes that is what I tried to explain. Looks like it was lost in translation.
>
> I think the problem was on the reading end. :)
>
>> > That said, I do wonder at some point if there's a huge value in using a
>> > submodule at that point. I think there is if the dependent project is
>> > large (and if it's optional, and some people might not need it). But in
>> > this case, it is not a big deal to just carry the sha1dc code in-tree.
>>
>> A big win with submodules is that you have separate histories and can,
>> quite easily, update to newer versions without manual copying.
>
> True. We'd generally be picking up snapshots in our in-tree sha1dc/, so
> bisecting on it is not as fine-grained. We _could_ pull in the full
> history using something like git-subtree, but that comes with its own
> complications.
>
>> One grievance with submodules is the URL switching if you need to go
>> with a forked repo for some time and then back to the original.
>> Is it possible to have multiple remotes for a submodule?
>>
>> Something like:
>>
>> [submodule "libfoo"]
>> 	path = include/foo
>> 	url1 = git://foo.com/upstream/lib.git
>> 	url2 = git://foo.com/myFork/lib.git
>>
>> With that, the error-prone git submodule sync step is not required anymore.
>
> I assume you'd fetch from _all_ of them during a fetch, and assume that
> one of them will get you the objects you need (or I guess if you are
> looking for a specific object, you'd try them one at a time until you
> get the object).
>
> That makes sense, though it might be kind of annoying when fetching is
> expensive (especially if it involves manually authenticating).
>
>> submodule.alternateLocation looks like it is going in the right direction.
>
> I think that's mostly about pointing back to the superproject for local
> storage. Though I think there's a pretty reasonable solution to the
> problem we're discussing there: git.git could carry a "sha1dc" branch
> that points to our modified submodule history. So it's "in-tree" in the
> sense that it is in our repo, and under our full control, but still
> managed like a submodule.
>
> And we'd probably not even duplicate a lot of storage in the actual
> clone of the upstream project, because it would be pointing to us as an
> alternate.

Now if only we could think of some way to give the people best
positioned to fix some of these UI issues for git users everywhere the
incentive to do so[1] :)

1. https://public-inbox.org/git/20171208223001.556-5-avarab@gmail.com/


end of thread, other threads:[~2019-03-13 16:00 UTC | newest]

Thread overview: 25+ messages
2019-03-08 10:48 One failed self test on Fedora 29 Jeffrey Walton
2019-03-08 17:43 ` Todd Zullinger
2019-03-09 12:34   ` Jeffrey Walton
2019-03-09 13:12     ` Jeffrey Walton
2019-03-11  2:00       ` Junio C Hamano
2019-03-11  2:16         ` Jeffrey Walton
2019-03-11  3:37         ` disabling sha1dc unaligned access, was " Jeff King
2019-03-11 10:40           ` Jeffrey Walton
2019-03-11 18:19             ` Jeff King
2019-03-11 11:58           ` Duy Nguyen
2019-03-11 18:15             ` Thomas Braun
2019-03-11 18:23               ` Jeff King
2019-03-12  7:27                 ` Junio C Hamano
2019-03-12 10:51                   ` Jeff King
2019-03-13 11:47                     ` Thomas Braun
2019-03-13 15:39                       ` Jeff King
2019-03-13 16:00                         ` Ævar Arnfjörð Bjarmason
2019-03-12  8:53                 ` Ævar Arnfjörð Bjarmason
2019-03-12 11:05                   ` Jeff King
2019-03-12 12:09                     ` Ævar Arnfjörð Bjarmason
2019-03-12 21:01                       ` Jeff King
2019-03-12 21:06           ` [PATCH] Makefile: fix unaligned loads in sha1dc with UBSan Jeff King
2019-03-12 21:17             ` Ævar Arnfjörð Bjarmason
2019-03-12 21:19               ` Jeff King
2019-03-11  3:29     ` One failed self test on Fedora 29 Jeff King
