From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS31976 209.132.180.0/23 X-Spam-Status: No, score=-3.9 required=3.0 tests=AWL,BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_NONE shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by dcvr.yhbt.net (Postfix) with ESMTP id 3906E1F462 for ; Wed, 31 Jul 2019 00:59:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727129AbfGaA7f (ORCPT ); Tue, 30 Jul 2019 20:59:35 -0400 Received: from cloud.peff.net ([104.130.231.41]:56180 "HELO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1726369AbfGaA7f (ORCPT ); Tue, 30 Jul 2019 20:59:35 -0400 Received: (qmail 10007 invoked by uid 109); 31 Jul 2019 00:59:35 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with SMTP; Wed, 31 Jul 2019 00:59:35 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 12352 invoked by uid 111); 31 Jul 2019 01:01:18 -0000 Received: from sigill.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.7) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 30 Jul 2019 21:01:18 -0400 Authentication-Results: peff.net; auth=none Date: Tue, 30 Jul 2019 20:59:34 -0400 From: Jeff King To: SZEDER =?utf-8?B?R8OhYm9y?= Cc: Todd Zullinger , =?utf-8?B?UmVuw6k=?= Scharfe , "Randall S. Becker" , 'Junio C Hamano' , Christian Couder , git@vger.kernel.org, git-packagers@googlegroups.com Subject: Re: [ANNOUNCE] Git v2.23.0-rc0 - Initial test failures on NonStop Message-ID: <20190731005933.GA9610@sigill.intra.peff.net> References: <049a01d546f9$70be7a30$523b6e90$@nexbridge.com> <20190730194938.GZ4545@pobox.com> <20190730200203.GA4882@sigill.intra.peff.net> <20190730205624.GR20404@szeder.dev> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20190730205624.GR20404@szeder.dev> Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Tue, Jul 30, 2019 at 10:56:24PM +0200, SZEDER Gábor wrote: > > Ah, of course. Our oid hashing is done by just picking off the first > > bytes of the sha1, and it doesn't care about endianness (because these > > are just internal-to-memory hashes). > > Yeah. > > > We _could_ reconcile that like this: > > Do we really want that, though? It's a hashmap, after all, and the > order of iteration over various hashmap implementations tends to be > arbitrary. So an argument could be made that this test is overly > specific by expecting a particular order of elements (and perhaps by > checking the elements' oid as well), and it would be sufficient to > check that it iterates over all elements, no matter the order (IOW > sorting 'actual' before the comparison). I'd agree that this test is being overly specific. I guess what I'm feeling is a vague notion that it might be better if Git behaves deterministically regardless of endian-ness. Not because it _should_ matter for this test, but there could literally be a bug on big-endian platforms that nobody knows about because it's very rare for anybody to test there. I admit that's pretty hand-wavy though. And there may actually be a benefit in finding such a bug, because it means that some part of the code (or a test) is relying on something it ought not to. > OTOH, this is not just any hashmap, but an oidmap, and I could imagine > that there might be use cases where it would be beneficial if the > iteration order were to match the oid order (but don't know whether we > actually have such a use case). I don't think we can promise anything about iteration order. This test is relying on the order at least being deterministic between runs, but if we added a new entry and had to grow the table, all bets are off. So regardless of the endian thing above, it probably does make sense for any hashmap iteration output to be sorted before comparing. That goes for t0011, too; it doesn't have this endian thing, but it looks to be relying on hash order that could change if we swapped out hash functions. > > I > > wonder if it's appreciably less efficient. I'll bet I could nerd-snipe > > René into doing a bunch of measurements and explorations of the > > disassembled code. ;) > > Maybe it shows up in an oidmap-specific performance test, but with all > that's usually going on in Git hashmap performance tends to be > negligible (e.g. it's rarely visible in flame graphs). That's my guess, too, but data trumps guesses (you'll note that I'm not volunteering to _collect_ the data, though, which perhaps gives a sense of how invested I am in it. ;) ). -Peff