git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Mantas Mikulėnas" <grawity@gmail.com>, git@vger.kernel.org
Subject: [PATCH 3/4] fsck: check "tagger" lines
Date: Mon, 25 Feb 2013 13:46:17 -0500	[thread overview]
Message-ID: <20130225184617.GC14438@sigill.intra.peff.net> (raw)
In-Reply-To: <20130225183009.GB13912@sigill.intra.peff.net>

The fsck_tag function does not check very much about tags at
all; it just makes sure that we were able to load the
pointed-to object during the parse_tag phase. This does
check some basic things (the "object" line is OK, and the
pointed-to object exists with the expected type).

We did not, however, check the "tagger" line at all; if they
exist, we should feed them to fsck_ident (and it is OK if
they do not, as early versions of git did not include them).

This patch runs through the whole tag object during
fsck_tag, similar to what we do in fsck_commit. Some of
these checks are technically redundant with just checking
that parse_tag filled in the "tag->tagged" field. However:

  1. We have to parse through those lines anyway to get to
     the tagger line, so we need to sanity check our
     parsing.

  2. We can give more specific errors (e.g., report a
     malformed "object" line).

  3. Previously we depended on implementation details of
     parse_tag for our fsck (e.g., that it would never fill
     in "tagged" if the types did not match). Now our
     exhaustive checks are in one place, which makes it
     easier to verify exactly what fsck is checking.

Signed-off-by: Jeff King <peff@peff.net>
---
Unfortunately, this causes t1050 to fail in an interesting way. It runs
"index-pack --strict" while setting GIT_DIR=nonexistent. As a result,
when we try to read the tag object from disk, we can't find it. We
don't run into the same problem verifying commits and trees, because
those objects leave the raw object data in their "buffer" field.

I'm tempted to call what that test is doing insane, but I wonder if
there is another corner case with running "index-pack --strict" as part
of an incoming push or fetch. I haven't investigated that yet.

 fsck.c          | 62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 t/t1450-fsck.sh | 43 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 104 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index 99c0497..20d55c4 100644
--- a/fsck.c
+++ b/fsck.c
@@ -340,15 +340,75 @@ static int fsck_tag(struct tag *tag, fsck_error error_func)
 	return 0;
 }
 
-static int fsck_tag(struct tag *tag, fsck_error error_func)
+static int fsck_tag_buffer(char *buf, struct tag *tag, fsck_error error_func)
 {
 	struct object *tagged = tag->tagged;
+	unsigned char sha1[20];
+	char *eol;
+
+	buf = skip_prefix(buf, "object ");
+	if (!buf)
+		return error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
+	if (get_sha1_hex(buf, sha1) || buf[40] != '\n')
+		return error_func(&tag->object, FSCK_ERROR, "invalid 'object' line format - bad sha1");
+	buf += 41;
 
+	/*
+	 * We already called parse_tag, so we don't have to bother looking up
+	 * the sha1 again.
+	 */
 	if (!tagged)
 		return error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
+
+	buf = skip_prefix(buf, "type ");
+	if (!buf)
+		return error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'type' line");
+	eol = strchr(buf, '\n');
+	if (!eol)
+		return error_func(&tag->object, FSCK_ERROR, "invalid format - truncation at 'type' line");
+	*eol = '\0';
+	if (type_from_string(buf) != tagged->type)
+		return error_func(&tag->object, FSCK_ERROR, "'type' line does not match type of tagged object");
+	*eol = '\n';
+
+	buf = skip_prefix(eol + 1, "tag ");
+	if (!buf)
+		return error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
+	eol = strchr(buf, '\n');
+	if (!eol)
+		return error_func(&tag->object, FSCK_ERROR, "invalid format - truncation at 'tag' line");
+
+	/*
+	 * A missing tagger is OK, as very old versions of git did not produce
+	 * such a line. But if we do have it, we should verify its contents.
+	 */
+	buf = skip_prefix(eol + 1, "tagger ");
+	if (buf) {
+		int err = fsck_ident(&buf, &tag->object, error_func);
+		if (err)
+			return err;
+	}
+
 	return 0;
 }
 
+static int fsck_tag(struct tag *tag, fsck_error error_func)
+{
+	char *buf;
+	unsigned long size;
+	enum object_type type;
+	int err;
+
+	buf = read_sha1_file(tag->object.sha1, &type, &size);
+	if (!buf)
+		return error_func(&tag->object, FSCK_ERROR, "could not read tag object");
+
+	err = fsck_tag_buffer(buf, tag, error_func);
+
+	free(buf);
+	return err;
+}
+
 int fsck_object(struct object *obj, int strict, fsck_error error_func)
 {
 	if (!obj)
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index d730734..3a3bce6 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -180,6 +180,49 @@ test_expect_success 'tag pointing to something else than its type' '
 	test_must_fail git fsck --tags
 '
 
+test_expect_success 'tag with missing tagger is OK' '
+	sha=$(echo blob | git hash-object -w --stdin) &&
+	test_when_finished "remove_object $sha" &&
+	cat >good-tag <<-EOF &&
+	object $sha
+	type blob
+	tag missing-tagger-ok
+
+	This is totally fine.
+	EOF
+
+	tag=$(git hash-object -t tag -w --stdin <good-tag) &&
+	test_when_finished "remove_object $tag" &&
+	git update-ref refs/tags/good $tag &&
+	test_when_finished "git update-ref -d refs/tags/good" &&
+	git fsck >out 2>&1 &&
+	>expect &&
+	test_cmp expect out
+'
+
+test_expect_success 'tag with bogus tagger is not OK' '
+	sha=$(echo blob | git hash-object -w --stdin) &&
+	test_when_finished "remove_object $sha" &&
+	cat >wrong-tag <<-EOF &&
+	object $sha
+	type blob
+	tag bogus-tagger
+	tagger T A Gger <tagger@example.com> Mon Feb 25 12:32:51 2013 -0500
+
+	That date is bogus (it should be an epoch + timezone). Any bogus ident
+	line should trigger, but this was chosen to match a breakage seen in
+	the wild.
+	EOF
+
+	tag=$(git hash-object -t tag -w --stdin <wrong-tag) &&
+	test_when_finished "remove_object $tag" &&
+	git update-ref refs/tags/wrong $tag &&
+	test_when_finished "git update-ref -d refs/tags/wrong" &&
+	git fsck --tags >out 2>&1 &&
+	cat out &&
+	grep "error in tag $tag.* - bad date" out
+'
+
 test_expect_success 'cleaned up' '
 	git fsck >actual 2>&1 &&
 	test_cmp empty actual
-- 
1.8.1.4.4.g265d2fa

  parent reply	other threads:[~2013-02-25 18:46 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-22 22:30 Crashes while trying to show tag objects with bad timestamps Mantas Mikulėnas
2013-02-22 22:46 ` Jeff King
2013-02-22 22:53   ` Junio C Hamano
2013-02-22 23:04     ` Jeff King
2013-02-22 23:14       ` Mantas Mikulėnas
2013-02-25 18:21         ` Jeff King
2013-02-22 23:20       ` Junio C Hamano
2013-02-25 18:30         ` Jeff King
2013-02-25 18:38           ` [PATCH 1/4] handle malformed dates in ident lines Jeff King
2013-02-25 18:39           ` [PATCH/RFC 2/4] skip_prefix: return a non-const pointer Jeff King
2013-02-25 18:46           ` Jeff King [this message]
2013-02-25 18:50           ` [PATCH 4/4] cat-file: print tags raw for "cat-file -p" Jeff King
2013-02-25 19:33             ` Mantas Mikulėnas
2013-02-22 23:01 ` [RFC/PATCH] hash-object doc: "git hash-object -w" can write invalid objects Jonathan Nieder
2013-02-22 23:07   ` Junio C Hamano
2013-02-22 23:09   ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130225184617.GC14438@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=grawity@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).