git@vger.kernel.org mailing list mirror (one of many)
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Christian Couder <christian.couder@gmail.com>
Cc: git <git@vger.kernel.org>, "Junio C Hamano" <gitster@pobox.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Jonathan Tan" <jonathantanmy@google.com>,
	"Brandon Williams" <bmwill@google.com>,
	"Christian Couder" <chriscool@tuxfamily.org>
Subject: Re: [PATCH 2/3] t: add t0016-oidmap.sh
Date: Sun, 9 Jun 2019 23:21:30 +0200
Message-ID: <20190609212130.GC24208@szeder.dev>
In-Reply-To: <CAP8UFD2AD9NzOUcLfN+NuWp_9JzwdV9oUo9rGAPXt3EP95=_og@mail.gmail.com>

On Sun, Jun 09, 2019 at 10:24:55PM +0200, Christian Couder wrote:
> On Sun, Jun 9, 2019 at 11:23 AM SZEDER Gábor <szeder.dev@gmail.com> wrote:
> >
> > On Sun, Jun 09, 2019 at 06:49:06AM +0200, Christian Couder wrote:
> > > +
> > > +test_oidmap() {
> > > +     echo "$1" | test-tool oidmap $3 > actual &&
> > > +     echo "$2" > expect &&
> >
> > Style nit: space between redirection op and filename.
> 
> Thanks for spotting this. It's fixed in my current version.
> 
> > > +test_oidhash() {
> > > +     git rev-parse "$1" | perl -ne 'print hex("$4$3$2$1") . "\n" if m/^(..)(..)(..)(..).*/;'
> >
> > New Perl dependencies always make Dscho sad... :)
> 
> Yeah, I was not sure how to do it properly in shell so I was hoping I
> would get suggestions about this. Thanks for looking at this!
> 
> I could have hardcoded the values as it is done in t0011-hashmap.sh,
> but I thought it was better to find a function that does the job.

Well, I'm fine with hardcoding the expected hash values (in network
byte order) as well, because then we won't add another git process
upstream of a pipe that would pop up during audit later...

> > So, 'test oidmap' from the previous patch prints the value we want to
> > check with:
> >
> >     printf("%u\n", sha1hash(oid.hash));
> 
> Yeah, I did it this way because "test-hashmap.c" does the same kind of
> thing to print hashes:
> 
>             printf("%u %u %u %u\n",
>                    strhash(p1), memhash(p1, strlen(p1)),
>                    strihash(p1), memihash(p1, strlen(p1)));
> 
> > First, since object ids inherently make more sense as hex values, it
> > would be more appropriate to print that hash with the '%x' format
> > specifier,
> 
> I would be ok with that, but then I think it would make sense to also
> print hex values in "test-hashmap.c".
> 
> > and then we wouldn't need Perl's hex() anymore, and thus
> > could swap the order of the first four bytes in oidmap's hash without
> > relying on Perl, e.g. with:
> >
> >   sed -e 's/^\(..\)\(..\)\(..\)\(..\).*/\4\3\2\1/'
> >
> > Second, and more importantly, the need for swapping the byte order
> > indicates that this test would fail on big-endian systems, I'm afraid.
> > So I think we need an additional bswap32() on the printing side,
> 
> Ok, but then shouldn't we also use bswap32() in "test-hashmap.c"?

No.  The two test scripts/helpers work with different hashes.  t0011
and 'test-hashmap.c' use the various FNV-1-based hash functions
(strhash(), memhash(), ...) to calculate an unsigned int hash of the
items stored in the hashmap, therefore their hashes will be the same
regardless of endianness.  In an oidmap, however, the hash is simply
the first four bytes of the object id copied into an unsigned int as
is.  Look at how sha1hash() does it, and in particular at the last
sentence of the comment in front of it:

 * [...] Note that
 * the results will be different on big-endian and little-endian
 * platforms, so they should not be stored or transferred over the net.
 */
static inline unsigned int sha1hash(const unsigned char *sha1)
{
        /*
         * Equivalent to 'return *(unsigned int *)sha1;', but safe on
         * platforms that don't support unaligned reads.
         */
        unsigned int hash;
        memcpy(&hash, sha1, sizeof(hash));
        return hash;
}

> By the way it seems that we use ntohl() or htonl() instead of
> bswap32() in the source code.

OK.

> > and then could further simplify 'test_oidhash':
> >
> > diff --git a/t/helper/test-oidmap.c b/t/helper/test-oidmap.c
> > index 0ba122a264..4177912f9a 100644
> > --- a/t/helper/test-oidmap.c
> > +++ b/t/helper/test-oidmap.c
> > @@ -51,7 +51,7 @@ int cmd__oidmap(int argc, const char **argv)
> >
> >                         /* print hash of oid */
> >                         if (!get_oid(p1, &oid))
> > -                               printf("%u\n", sha1hash(oid.hash));
> > +                               printf("%x\n", bswap32(sha1hash(oid.hash)));
> >                         else
> >                                 printf("Unknown oid: %s\n", p1);
> >
> > diff --git a/t/t0016-oidmap.sh b/t/t0016-oidmap.sh
> > index 3a8e8bdb3d..9c0d88a316 100755
> > --- a/t/t0016-oidmap.sh
> > +++ b/t/t0016-oidmap.sh
> > @@ -22,10 +22,10 @@ test_expect_success 'setup' '
> >  '
> >
> >  test_oidhash() {
> > -       git rev-parse "$1" | perl -ne 'print hex("$4$3$2$1") . "\n" if m/^(..)(..)(..)(..).*/;'
> > +       git rev-parse "$1" | cut -c1-8
> >  }
> >
> > -test_expect_success PERL 'hash' '
> > +test_expect_success 'hash' '
> 
> Yeah, I agree that it seems better to me this way.

