From: Jeff King
To: Emily Shaffer
Cc: Junio C Hamano, git@vger.kernel.org
Subject: Re: [RFC PATCH] unpack-trees: watch for out-of-range index position
Date: Fri, 10 Jan 2020 01:37:41 -0500
Message-ID: <20200110063741.GA409153@coredump.intra.peff.net>
In-Reply-To: <20200109224641.GF181522@google.com>

On Thu, Jan 09, 2020 at 02:46:41PM -0800, Emily Shaffer wrote:

> > Perhaps. The integrity check only protects against an index that was
> > modified after the fact, not one that was generated by a buggy Git. I'm
> > not sure we know how the index that led to this patch got into this
> > state (though it sounds like Emily has a copy and could check the hash
> > on it), but other cache-tree segfault I found recently was with an index
> > with an intact integrity hash.
>
> Yeah, I can do that, although I'm not sure how. The index itself is very
> small - it only contains one file and one tree extension - so I'll go
> ahead and paste some poking and prodding, and if it's not what you
> wanted then please let me know what else to run.

I was thinking you would run something like:

  size=$(stat --format=%s "$file")
  actual=$(head -c $(($size-20)) "$file" | sha1sum | awk '{print $1}')
  expect=$(xxd -s -20 -g 20 -c 20 "$file" | awk '{print $2}')
  if test "$actual" = "$expect"; then
    echo "OK ($actual)"
  else
    echo "FAIL ($actual != $expect)"
  fi

to manually check the sha1. But...

> $ g fsck --cache
> Checking object directories: 100% (256/256), done.
> Checking objects: 100% (20/20), done.
> broken link from commit 153a9a100eae7fdba5989ce39a5dd1782075517f
> to commit cca7ecaa5d8c398f41bfec7938cc6a526803579b
> broken link from commit 7d6bb91e31d18eadfaf855a9fb7ad6ba81b8b6d9
> to commit 03087a617bfe55f862cb1ef43273a2bd08e8b6d6
> missing commit 03087a617bfe55f862cb1ef43273a2bd08e8b6d6
> missing commit cca7ecaa5d8c398f41bfec7938cc6a526803579b
> dangling commit 5e2c635433bc46b13061b276e481f63b1f6642c8

...fsck would have reported a problem there, since we explicitly kept
the check there in a33fc72fe9 (read-cache: force_verify_index_checksum,
2017-04-14).

And just to be double-sure, I used this:

> $ hexdump -C .git/index
> 00000000  44 49 52 43 00 00 00 02  00 00 00 01 5d 89 5e 22  |DIRC........].^"|
> 00000010  23 bf a3 c4 5d 89 5e 22  23 bf a3 c4 00 00 fe 02  |#...].^"#.......|
> 00000020  02 c8 f5 83 00 00 81 a4  00 06 c1 dc 00 01 5f 53  |.............._S|
> 00000030  00 00 06 b3 78 88 a4 f4  22 34 7d ad b0 c4 73 0f  |....x..."4}...s.|
> 00000040  c5 bc f6 ea 1d 2d f0 3a  00 09 52 45 41 44 4d 45  |.....-.:..README|
> 00000050  2e 6d 64 00 54 52 45 45  00 00 00 3a 00 31 37 20  |.md.TREE...:.17 |
> 00000060  31 0a da 7f 67 25 40 7d  4e ce 9f d3 72 ce 4c e8  |1...g%@}N...r.L.|
> 00000070  40 6d 5d ad e9 79 67 69  74 6c 69 6e 74 00 34 20  |@m]..ygitlint.4 |
> 00000080  30 0a 93 63 25 17 69 e6  d6 92 78 97 55 4b 0f 8b  |0..c%.i...x.UK..|
> 00000090  ff a0 e8 2d 6d 71 32 d1  69 fc f2 38 42 f8 5a 6e  |...-mq2.i..8B.Zn|
> 000000a0  05 35 d6 94 41 c0 9f c7  ba 43                    |.5..A....C|
> 000000aa

to reconstruct the file and check its sha1, and indeed it is fine. So
this bogus index was probably actually created by Git, not an
after-the-fact byte corruption.

-Peff
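
For anyone wanting to repeat the reconstruction step, here is a rough sketch of how a pasted `hexdump -C` listing might be turned back into bytes and its trailing SHA-1 checked. This is not the exact procedure used in the message above. The function names `hexdump_to_bin` and `check_index_trailer` are hypothetical, and the check is a variation on the earlier snippet that assumes `wc`, `tail` and `od` in place of GNU `stat` and `xxd`.

```shell
# Rebuild a binary file from `hexdump -C` text on stdin (hypothetical helper).
# Strips an optional "> " quote prefix and the trailing |ASCII| column, then
# emits every hex byte after the offset as a raw byte.
hexdump_to_bin () {
	sed -e 's/^> *//' -e 's/ *|.*$//' |
	while read -r offset bytes
	do
		for byte in $bytes
		do
			# hex -> octal escape -> raw byte
			printf "\\$(printf '%03o' "0x$byte")"
		done
	done
}

# Compare the SHA-1 of everything except the last 20 bytes against the
# 20-byte trailer stored at the end of the file (hypothetical helper).
check_index_trailer () {
	file=$1
	size=$(($(wc -c <"$file")))
	if test "$size" -le 20
	then
		echo "FAIL (file too short for a trailer)"
		return 1
	fi
	actual=$(head -c "$((size - 20))" "$file" | sha1sum | awk '{print $1}')
	expect=$(tail -c 20 "$file" | od -An -tx1 | tr -d ' \n')
	if test "$actual" = "$expect"
	then
		echo "OK ($actual)"
	else
		echo "FAIL ($actual != $expect)"
		return 1
	fi
}
```

With the quoted dump saved to a file (say `dump.txt`, a made-up name), something like `hexdump_to_bin <dump.txt >index.rebuilt && check_index_trailer index.rebuilt` would perform both steps. Note that a matching trailer only shows the bytes were not altered after the index was written; it says nothing about whether the contents make sense.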