From: Brandon Williams <bwilliamseng@gmail.com>
To: git <git@vger.kernel.org>
Cc: Jeff King <peff@peff.net>
Subject: invalid tree and commit object
Date: Fri, 8 May 2020 23:19:38 -0700
Message-ID: <CALN-EhTpiLERuB16-WPZaLub6GdaRHJW8xDeaOEqSFtKe0kCYw@mail.gmail.com>
Hey!
It's been a minute since I've written to the list, but I was recently looking
into the rules fsck uses to identify valid or invalid objects and I believe I
found a case that fsck is currently missing. One of the things fsck
looks for when validating a tree object is that it doesn't contain any
duplicate entries. It even has a nice comment about how `git-write-tree` used
to write out trees with duplicate entries:
/*
 * git-write-tree used to write out a nonsense tree that has
 * entries with the same name, one blob and one tree. Make
 * sure we do not have duplicate entries.
 */
Here's the setup:
tree c63d067eaeed0cbc68b7e4fdf40d267c6b152fe8
tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6
blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689
$ git ls-tree c63d067eaeed0cbc68b7e4fdf40d267c6b152fe8
100644 blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689 hello
100644 blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689 hello.c
040000 tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6 hello
$ git ls-tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6
100644 blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689 hello
# '%' here indicates that there is no newline at the end of the object
$ git cat-file blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689
Hello World%
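For anyone who wants to poke at these objects without cloning the repository, here is a minimal sketch of how they could be serialized and hashed by hand. It assumes a SHA-1 repository and the usual object layout (a "<type> <size>\0" header, tree entries stored as "<mode> <name>\0" plus the 20 raw ID bytes). The helper names `object_id` and `tree_body` are made up for illustration and are not part of Git.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    # A Git object ID (SHA-1 repositories) is the hash of "<type> <size>\0" + body.
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries) -> bytes:
    # Each entry is "<mode> <name>\0" followed by the 20 raw bytes of the object ID.
    # Entries are written in the order given; nothing here sorts or validates them.
    return b"".join(
        f"{mode} {name}".encode() + b"\0" + bytes.fromhex(oid)
        for mode, name, oid in entries
    )


blob_id = object_id("blob", b"Hello World")  # no trailing newline
inner_id = object_id("tree", tree_body([("100644", "hello", blob_id)]))
outer_id = object_id("tree", tree_body([
    ("100644", "hello", blob_id),
    ("100644", "hello.c", blob_id),
    ("40000", "hello", inner_id),  # trees are stored with mode "40000"
]))

print("blob", blob_id)
print("tree", inner_id)
print("tree", outer_id)
```

If those assumptions hold, the printed IDs should line up with the ones listed in the setup. The point is only that nothing in the object format itself prevents two entries from sharing a name.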
fsck currently passes on these objects even though c63d067eae has
a duplicate entry. This seems to be because the duplicate entry check in
`fsck_tree` only compares adjacent entries, and due to the
sorting rules (a tree sorts as if its name ended in "/", so "hello.c" lands
between the two) it's unable to realize that there is both a blob and a tree
with the name "hello".
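To make the ordering problem concrete, here is a small sketch. It assumes tree entries are ordered by comparing name bytes, with a directory compared as if its name had a trailing "/". The function names are hypothetical. The set-based check is only there for contrast and is not meant to describe how fsck does or should detect the conflict.

```python
def tree_sort_key(name: str, is_tree: bool) -> bytes:
    # Assumed ordering rule: compare raw name bytes, with a tree treated
    # as if its name ended in "/".
    return name.encode() + (b"/" if is_tree else b"")


def adjacent_duplicates(ordered):
    # Mimics a check that only compares each entry with its neighbour.
    return [a[0] for a, b in zip(ordered, ordered[1:]) if a[0] == b[0]]


def all_duplicates(ordered):
    # Contrast: remember every name seen so far, regardless of position.
    seen, dups = set(), []
    for name, _is_tree in ordered:
        if name in seen:
            dups.append(name)
        seen.add(name)
    return dups


# (name, is_tree) pairs taken from the ls-tree output of c63d067eae.
entries = [("hello", True), ("hello.c", False), ("hello", False)]
ordered = sorted(entries, key=lambda e: tree_sort_key(*e))

print(ordered)
print(adjacent_duplicates(ordered))
print(all_duplicates(ordered))
```

Because "." (0x2e) sorts before "/" (0x2f), "hello.c" ends up between the blob "hello" and the tree "hello", so the neighbour-only comparison never sees the two "hello" entries side by side.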
I was even able to produce a commit and push it to GitHub[1] (which
didn't complain):
$ git show --pretty=raw 62f1ff6e109f8b77edd7eeb65f6634faa76a93b2
commit 62f1ff6e109f8b77edd7eeb65f6634faa76a93b2
tree c63d067eaeed0cbc68b7e4fdf40d267c6b152fe8
author Brandon Williams <bwilliams.eng@gmail.com> 1589004242 -0700
committer Brandon Williams <bwilliams.eng@gmail.com> 1589004242 -0700
hello
Checking out that commit leaves your working directory in a somewhat
broken and 'unclean' state (although GitHub's UI seems to be able to handle
displaying it properly).
Am I correct in assuming that this object is indeed invalid and should be
rejected by fsck?
-Brandon
[1]: https://github.com/bmwill/invalid-commit