From: "René Scharfe" <l.s.r@web.de>
To: Brandon Williams <bwilliamseng@gmail.com>, git <git@vger.kernel.org>
Cc: Jeff King <peff@peff.net>
Subject: Re: invalid tree and commit object
Date: Sat, 9 May 2020 12:16:08 +0200
Message-ID: <d963242a-72f3-7f42-7c95-ea5148f74804@web.de>
In-Reply-To: <CALN-EhTpiLERuB16-WPZaLub6GdaRHJW8xDeaOEqSFtKe0kCYw@mail.gmail.com>
On 09.05.20 at 08:19, Brandon Williams wrote:
> Here's the setup:
> tree c63d067eaeed0cbc68b7e4fdf40d267c6b152fe8
> tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6
> blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689
>
> $ git ls-tree c63d067eaeed0cbc68b7e4fdf40d267c6b152fe8
> 100644 blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689 hello
> 100644 blob 5e1c309dae7f45e0f39b1bf3ac3cd9db12e7d689 hello.c
> 040000 tree 6241ab2a5314798183b5c4ee8a7b0ccd12c651e6 hello
> Am I correct in assuming that this object is indeed invalid and should be
> rejected by fsck?
I'd say yes twice -- what good is a tree that you can't check out because
it contains a d/f conflict?
So I got curious if such trees might be in popular repos, wrote the patch
below and checked around a bit, but couldn't find any.
Is there a smarter way to check for duplicates? One that doesn't need
allocations? Perhaps by having a version of tree_entry_extract() that
seeks backwards somehow?
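For illustration outside of Git's code base, the sort-and-compare idea used in the patch below can be sketched in plain C. This is only a sketch under assumptions: has_duplicate_names() and cmp_names() are made-up names, not Git functions, and qsort() stands in for the string_list API. Like the patch, it still allocates a scratch array and compares bare names only, so a file and a directory sharing a name count as a duplicate.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array of pointers to C strings. */
static int cmp_names(const void *a, const void *b)
{
	const char *const *x = a;
	const char *const *y = b;
	return strcmp(*x, *y);
}

/*
 * Hypothetical helper: return 1 if any name occurs more than once,
 * 0 if all names are unique, and -1 if the scratch array cannot be
 * allocated. The caller's array is left untouched.
 */
static int has_duplicate_names(const char *const *names, size_t nr)
{
	const char **sorted;
	size_t i;
	int found = 0;

	if (nr < 2)
		return 0;

	sorted = malloc(nr * sizeof(*sorted));
	if (!sorted)
		return -1;
	memcpy(sorted, names, nr * sizeof(*sorted));

	/* After sorting, equal names end up next to each other. */
	qsort(sorted, nr, sizeof(*sorted), cmp_names);
	for (i = 1; i < nr; i++) {
		if (!strcmp(sorted[i - 1], sorted[i])) {
			found = 1;
			break;
		}
	}

	free(sorted);
	return found;
}
```

Called with the three names from the setup above (hello, hello.c, hello), the helper would report a duplicate, because the two hello entries become neighbors once sorted.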
---
fsck.c | 10 ++++++++++
t/t1450-fsck.sh | 16 ++++++++++++++++
2 files changed, 26 insertions(+)
diff --git a/fsck.c b/fsck.c
index 087a7f1ffc..f47b35fee8 100644
--- a/fsck.c
+++ b/fsck.c
@@ -587,6 +587,8 @@ static int fsck_tree(const struct object_id *oid,
struct tree_desc desc;
unsigned o_mode;
const char *o_name;
+ struct string_list names = STRING_LIST_INIT_NODUP;
+ size_t nr;
if (init_tree_desc_gently(&desc, buffer, size)) {
retval += report(options, oid, OBJ_TREE, FSCK_MSG_BAD_TREE, "cannot be parsed as a tree");
@@ -680,8 +682,16 @@ static int fsck_tree(const struct object_id *oid,
o_mode = mode;
o_name = name;
+ string_list_append(&names, name);
}
+ nr = names.nr;
+ string_list_sort(&names);
+ string_list_remove_duplicates(&names, 0);
+ if (names.nr != nr)
+ has_dup_entries = 1;
+ string_list_clear(&names, 0);
+
if (has_null_sha1)
retval += report(options, oid, OBJ_TREE, FSCK_MSG_NULL_SHA1, "contains entries pointing to null sha1");
if (has_full_path)
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 449ebc5657..91a6e34f38 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -257,6 +257,22 @@ test_expect_success 'tree object with duplicate entries' '
test_i18ngrep "error in tree .*contains duplicate file entries" out
'
+test_expect_success 'tree object with duplicate names' '
+ test_when_finished "remove_object \$blob" &&
+ test_when_finished "remove_object \$tree" &&
+ test_when_finished "remove_object \$badtree" &&
+ blob=$(echo blob | git hash-object -w --stdin) &&
+ printf "100644 blob %s\t%s\n" $blob x.2 >tree &&
+ tree=$(git mktree <tree) &&
+ printf "100644 blob %s\t%s\n" $blob x.1 >badtree &&
+ printf "100644 blob %s\t%s\n" $blob x >>badtree &&
+ printf "040000 tree %s\t%s\n" $tree x >>badtree &&
+ badtree=$(git mktree <badtree) &&
+ test_must_fail git fsck 2>out &&
+ test_i18ngrep "$badtree" out &&
+ test_i18ngrep "error in tree .*contains duplicate file entries" out
+'
+
test_expect_success 'unparseable tree object' '
test_oid_cache <<-\EOF &&
junk sha1:twenty-bytes-of-junk
--
2.26.2