git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Emily Shaffer <emilyshaffer@google.com>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [RFC PATCH] unpack-trees: watch for out-of-range index position
Date: Fri, 10 Jan 2020 01:37:41 -0500
Message-ID: <20200110063741.GA409153@coredump.intra.peff.net>
In-Reply-To: <20200109224641.GF181522@google.com>

On Thu, Jan 09, 2020 at 02:46:41PM -0800, Emily Shaffer wrote:

> > Perhaps. The integrity check only protects against an index that was
> > modified after the fact, not one that was generated by a buggy Git. I'm
> > not sure we know how the index that led to this patch got into this
> > state (though it sounds like Emily has a copy and could check the hash
> > on it), but the other cache-tree segfault I found recently was with an index
> > with an intact integrity hash.
> 
> Yeah, I can do that, although I'm not sure how. The index itself is very
> small - it only contains one file and one tree extension - so I'll go
> ahead and paste some poking and prodding, and if it's not what you
> wanted then please let me know what else to run.

I was thinking you would run something like:

  size=$(stat --format=%s "$file")
  actual=$(head -c $(($size-20)) "$file" | sha1sum | awk '{print $1}')
  expect=$(xxd -s -20 -g 20 -c 20 "$file" | awk '{print $2}')
  if test "$actual" = "$expect"; then
          echo "OK ($actual)"
  else
          echo "FAIL ($actual != $expect)"
  fi

to manually check the sha1. But...

>   $ g fsck --cache
>   Checking object directories: 100% (256/256), done.
>   Checking objects: 100% (20/20), done.
>   broken link from  commit 153a9a100eae7fdba5989ce39a5dd1782075517f
>                 to  commit cca7ecaa5d8c398f41bfec7938cc6a526803579b
>   broken link from  commit 7d6bb91e31d18eadfaf855a9fb7ad6ba81b8b6d9
>                 to  commit 03087a617bfe55f862cb1ef43273a2bd08e8b6d6
>   missing commit 03087a617bfe55f862cb1ef43273a2bd08e8b6d6
>   missing commit cca7ecaa5d8c398f41bfec7938cc6a526803579b
>   dangling commit 5e2c635433bc46b13061b276e481f63b1f6642c8

...fsck would have reported a problem there, since we explicitly kept
the index checksum verification in fsck in a33fc72fe9 (read-cache:
force_verify_index_checksum, 2017-04-14).

And just to be double-sure, I used this:

>   $ hexdump -C .git/index
>   00000000  44 49 52 43 00 00 00 02  00 00 00 01 5d 89 5e 22  |DIRC........].^"|
>   00000010  23 bf a3 c4 5d 89 5e 22  23 bf a3 c4 00 00 fe 02  |#...].^"#.......|
>   00000020  02 c8 f5 83 00 00 81 a4  00 06 c1 dc 00 01 5f 53  |.............._S|
>   00000030  00 00 06 b3 78 88 a4 f4  22 34 7d ad b0 c4 73 0f  |....x..."4}...s.|
>   00000040  c5 bc f6 ea 1d 2d f0 3a  00 09 52 45 41 44 4d 45  |.....-.:..README|
>   00000050  2e 6d 64 00 54 52 45 45  00 00 00 3a 00 31 37 20  |.md.TREE...:.17 |
>   00000060  31 0a da 7f 67 25 40 7d  4e ce 9f d3 72 ce 4c e8  |1...g%@}N...r.L.|
>   00000070  40 6d 5d ad e9 79 67 69  74 6c 69 6e 74 00 34 20  |@m]..ygitlint.4 |
>   00000080  30 0a 93 63 25 17 69 e6  d6 92 78 97 55 4b 0f 8b  |0..c%.i...x.UK..|
>   00000090  ff a0 e8 2d 6d 71 32 d1  69 fc f2 38 42 f8 5a 6e  |...-mq2.i..8B.Zn|
>   000000a0  05 35 d6 94 41 c0 9f c7  ba 43                    |.5..A....C|
>   000000aa

to reconstruct the file and check its sha1, and indeed it is fine.
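
For anyone who wants to repeat that check, here is a rough sketch of one
way to do it. It is not necessarily the exact commands used above: the
function names and file names are made up for illustration, and it
assumes a POSIX-ish shell with sha1sum, od, and head -c available. The
first function turns `hexdump -C` text back into bytes, and the second
compares the trailing 20 bytes against the SHA-1 of everything before
them.

```shell
#!/bin/sh
# Sketch only: function names and file names here are illustrative.

# Turn `hexdump -C` text on stdin back into raw bytes on stdout.
# Tolerates a leading "> " quote prefix. Assumes the dump has no "*"
# lines (hexdump prints those when it squeezes repeated rows).
unhexdump () {
	sed -e 's/^[> ]*[0-9a-f]\{8\} *//' -e 's/ *|.*$//' |
	tr -s ' ' '\n' |
	while read -r byte
	do
		test -n "$byte" || continue
		# %b expands \0NNN (octal) into a single raw byte.
		printf '%b' "\\0$(printf '%03o' "0x$byte")"
	done
}

# Succeed if the last 20 bytes of "$1" are the SHA-1 of the bytes
# that precede them.
check_trailer () {
	size=$(wc -c <"$1") &&
	actual=$(head -c "$((size - 20))" "$1" | sha1sum | cut -d' ' -f1) &&
	expect=$(tail -c 20 "$1" | od -An -v -tx1 | tr -d ' \n') &&
	test "$actual" = "$expect"
}

# Usage (file names are placeholders):
#   unhexdump <index.hex >index.bin
#   check_trailer index.bin && echo OK || echo FAIL
```

Converting one byte at a time is slow, but for an index of a few hundred
bytes it does not matter, and it avoids depending on `xxd -r`.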

So this bogus index was probably written by Git itself, rather than
being the result of after-the-fact byte corruption.

-Peff
