git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Brandon Williams <bwilliamseng@gmail.com>
To: git <git@vger.kernel.org>
Cc: Jeff King <peff@peff.net>
Subject: invalid tree and commit object
Date: Fri, 8 May 2020 23:19:38 -0700	[thread overview]
Message-ID: <CALN-EhTpiLERuB16-WPZaLub6GdaRHJW8xDeaOEqSFtKe0kCYw@mail.gmail.com> (raw)

Hey!

Its been a minute since I've written to the list but I was recently looking
into the rules fsck uses to identify valid or invalid objects and I believe I
found a case that I believe fsck is currently missing. One of the things fsck
looks for when validating a tree object is that it doesn't contain any
duplicate entries. It even has a nice comment about how `git-write-tree` used
to write out trees with duplicate entries:

    /*
     * git-write-tree used to write out a nonsense tree that has
     * entries with the same name, one blob and one tree.  Make
     * sure we do not have duplicate entries.
     */

Here's the setup:
    tree c63d067eaeed0cbc68b7e4fdf40d267c6b152fe8
    tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6
    blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689

    $ git ls-tree c63d067eaeed0cbc68b7e4fdf40d267c6b152fe8
    100644 blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689    hello
    100644 blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689    hello.c
    040000 tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6    hello

    $ git ls-tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6
    100644 blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689    hello

    # '%' here indicates that there is no newline at the end of the object
    $ git cat-file blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689
    Hello World%

fsck currently passes when being passed these objects despite c63d067eae having
a duplicate entry. This seems to be due to the duplicate entry check in
`fsck_tree` only checking if adjacent entries are duplicates but due to the
sorting rules its unable to realize that there is both a blob and a tree with
the name "hello".

I was even able to produce a commit and push it to Github[1] (which
didn't complain)

    $ git show --pretty=raw 62f1ff6e109f8b77edd7eeb65f6634faa76a93b2
    commit 62f1ff6e109f8b77edd7eeb65f6634faa76a93b2
    tree c63d067eaeed0cbc68b7e4fdf40d267c6b152fe8
    author Brandon Williams <bwilliams.eng@gmail.com> 1589004242 -0700
    committer Brandon Williams <bwilliams.eng@gmail.com> 1589004242 -0700

        hello

Checking out that commit leaves your working directory in a somewhat
broken and 'unclean' state (although Github's UI seems to be able to handle
displaying it properly).

Am I correct in assuming that this object is indeed invalid and should be
rejected by fsck?

-Brandon

[1]: https://github.com/bmwill/invalid-commit

             reply	other threads:[~2020-05-09  6:19 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-09  6:19 Brandon Williams [this message]
2020-05-09 10:16 ` invalid tree and commit object René Scharfe
2020-05-09  7:16   ` Johannes Schindelin
2020-05-09 11:51     ` René Scharfe
2020-05-09 17:28   ` Junio C Hamano
2020-05-09 19:24     ` René Scharfe
2020-05-09 20:27       ` Junio C Hamano
2020-05-10  9:07         ` René Scharfe
2020-05-10 16:12           ` René Scharfe
2020-05-11 16:25             ` Junio C Hamano
2020-05-13 16:27               ` Brandon Williams
2020-05-21  9:51               ` René Scharfe
2020-05-21  9:52               ` [PATCH 1/4] fsck: fix a typo in a comment René Scharfe
2020-05-21 10:10                 ` Denton Liu
2020-05-21 11:15                 ` René Scharfe
2020-05-21  9:52               ` [PATCH 2/4] t1450: increase test coverage of in-tree d/f detection René Scharfe
2020-05-21 10:20                 ` Denton Liu
2020-05-21 13:31                   ` René Scharfe
2020-05-21 18:01                     ` Junio C Hamano
2020-05-21  9:52               ` [PATCH 3/4] t1450: demonstrate undetected in-tree d/f conflict René Scharfe
2020-05-21  9:52               ` [PATCH 4/4] fsck: detect more in-tree d/f conflicts René Scharfe
2020-05-10 16:37           ` invalid tree and commit object Junio C Hamano
2020-05-21  9:51             ` René Scharfe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALN-EhTpiLERuB16-WPZaLub6GdaRHJW8xDeaOEqSFtKe0kCYw@mail.gmail.com \
    --to=bwilliamseng@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).