git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: David Turner <dturner@twopensource.com>,
	git mailing list <git@vger.kernel.org>,
	Michael Haggerty <mhagger@alum.mit.edu>
Subject: Re: [PATCH] fsck: return non-zero status on missing ref tips
Date: Fri, 12 Sep 2014 00:29:39 -0400	[thread overview]
Message-ID: <20140912042939.GA5968@peff.net> (raw)
In-Reply-To: <20140912033830.GA5507@peff.net>

[+cc mhagger for packed-refs wisdom]

On Thu, Sep 11, 2014 at 11:38:30PM -0400, Jeff King wrote:

> Fsck tries hard to detect missing objects, and will complain
> (and exit non-zero) about any inter-object links that are
> missing. However, it will not exit non-zero for any missing
> ref tips, meaning that a severely broken repository may
> still pass "git fsck && echo ok".
> 
> The problem is that we use for_each_ref to iterate over the
> ref tips, which hides broken tips. It does at least print an
> error from the refs.c code, but fsck does not ever see the
> ref and cannot note the problem in its exit code. We can solve
> this by using for_each_rawref and noting the error ourselves.

There's a possibly related problem with packed-refs that I noticed while
looking at this.

When we call pack-refs, it will refuse to pack any broken loose refs,
and leave them loose. Which is sane. But when we delete a ref, we need
to rewrite the packed-refs file, and we omit any broken packed refs. We
wouldn't have written a broken entry, but we may get broken later (i.e.,
the tip object may go missing after the packed-refs file is written).

If we only have a packed copy of "refs/heads/master" and it is broken,
then deleting any _other_ unrelated ref will cause refs/heads/master to
be dropped from the packed-refs file entirely. We get an error message,
but that's easy to miss, and the pointer to master's sha1 is lost
forever.

This test (on top of the patch I just sent) demonstrates:

diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index b52397a..b0f4545 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -317,6 +317,7 @@ create_repo_missing () {
 		git commit -m two &&
 		unrelated=$(echo unrelated | git hash-object --stdin -w) &&
 		git tag -m foo tag $unrelated &&
+		git pack-refs --all --prune &&
 		sha1=$(git rev-parse --verify "$1") &&
 		path=$(echo $sha1 | sed 's|..|&/|') &&
 		rm .git/objects/$path
@@ -358,4 +359,10 @@ test_expect_success 'fsck notices ref pointing to missing tag' '
 	test_must_fail git -C missing fsck
 '
 
+test_expect_success 'ref deletion does not eat broken refs' '
+	create_repo_missing HEAD &&
+	git -C missing update-ref -d refs/tags/tag &&
+	test_must_fail git -C missing fsck
+'
+
 test_done


The fsck will succeed even though "master" should be broken, because
we appear to have no refs at all.

The fault is in curate_packed_ref_fn, which drops crufty from
packed-refs as we repack. That in turn is representing an older:

         if (!ref_resolves_to_object(entry))
                 return 0; /* Skip broken refs */

which comes from 624cac3 (refs: change the internal reference-iteration
API, 2013-04-22). But that was just maintaining the existing behavior,
which was using a do_for_each_ref iteration without INCLUDE_BROKEN. So I
think this problem has a similar root, though the fix is now slightly
different.

I am tempted to say that we do not need to do curate_each_ref_fn at all.
Any entry with a broken sha1 is either:

  1. A truly broken ref, in which case we should make sure to keep it
     (i.e., it is not cruft at all).

  2. A crufty entry that has been replaced by a loose reference that has
     not yet been packed. Such a crufty entry may point to broken
     objects, and that is OK.

In case 2, we _could_ delete the cruft. But I do not think we need to.
The loose ref will take precedence to anybody who actually does a ref
lookup, so the cruft is not hurting anybody.

Dropping curate_packed_ref_fn (as below) fixes the test above. And
miraculously does not even seem to conflict with ref patches in pu. :)

Am I missing any case that it is actually helping?

diff --git a/refs.c b/refs.c
index a7853cc..83c2bf7 100644
--- a/refs.c
+++ b/refs.c
@@ -2484,70 +2484,11 @@ int pack_refs(unsigned int flags)
 }
 
 /*
- * If entry is no longer needed in packed-refs, add it to the string
- * list pointed to by cb_data.  Reasons for deleting entries:
- *
- * - Entry is broken.
- * - Entry is overridden by a loose ref.
- * - Entry does not point at a valid object.
- *
- * In the first and third cases, also emit an error message because these
- * are indications of repository corruption.
- */
-static int curate_packed_ref_fn(struct ref_entry *entry, void *cb_data)
-{
-	struct string_list *refs_to_delete = cb_data;
-
-	if (entry->flag & REF_ISBROKEN) {
-		/* This shouldn't happen to packed refs. */
-		error("%s is broken!", entry->name);
-		string_list_append(refs_to_delete, entry->name);
-		return 0;
-	}
-	if (!has_sha1_file(entry->u.value.sha1)) {
-		unsigned char sha1[20];
-		int flags;
-
-		if (read_ref_full(entry->name, sha1, 0, &flags))
-			/* We should at least have found the packed ref. */
-			die("Internal error");
-		if ((flags & REF_ISSYMREF) || !(flags & REF_ISPACKED)) {
-			/*
-			 * This packed reference is overridden by a
-			 * loose reference, so it is OK that its value
-			 * is no longer valid; for example, it might
-			 * refer to an object that has been garbage
-			 * collected.  For this purpose we don't even
-			 * care whether the loose reference itself is
-			 * invalid, broken, symbolic, etc.  Silently
-			 * remove the packed reference.
-			 */
-			string_list_append(refs_to_delete, entry->name);
-			return 0;
-		}
-		/*
-		 * There is no overriding loose reference, so the fact
-		 * that this reference doesn't refer to a valid object
-		 * indicates some kind of repository corruption.
-		 * Report the problem, then omit the reference from
-		 * the output.
-		 */
-		error("%s does not point to a valid object!", entry->name);
-		string_list_append(refs_to_delete, entry->name);
-		return 0;
-	}
-
-	return 0;
-}
-
-/*
  * Must be called with packed refs already locked (and sorted)
  */
 static int repack_without_refs(const char **refnames, int n, struct strbuf *err)
 {
 	struct ref_dir *packed;
-	struct string_list refs_to_delete = STRING_LIST_INIT_DUP;
-	struct string_list_item *ref_to_delete;
 	int i, ret;
 
 	/* Look for a packed ref */
@@ -2561,15 +2502,6 @@ static int repack_without_refs(const char **refnames, int n, struct strbuf *err)
 	for (i = 0; i < n; i++)
 		remove_entry(packed, refnames[i]);
 
-	/* Remove any other accumulated cruft */
-	do_for_each_entry_in_dir(packed, 0, curate_packed_ref_fn, &refs_to_delete);
-	for_each_string_list_item(ref_to_delete, &refs_to_delete) {
-		if (remove_entry(packed, ref_to_delete->string) == -1) {
-			rollback_packed_refs();
-			die("internal error");
-		}
-	}
-
 	/* Write what remains */
 	ret = commit_packed_refs();
 	if (ret && err)

  reply	other threads:[~2014-09-12  4:29 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-27 22:10 git fsck exit code? David Turner
2014-08-29 18:53 ` Jeff King
2014-08-29 19:21   ` Junio C Hamano
2014-08-29 20:18     ` David Turner
2014-08-29 20:31       ` Jeff King
2014-08-29 20:47         ` Junio C Hamano
2014-09-09 22:03         ` [PATCH] fsck: exit with non-zero status upon error from fsck_obj() Junio C Hamano
2014-09-09 22:07           ` Jeff King
2014-09-12  3:38             ` [PATCH] fsck: return non-zero status on missing ref tips Jeff King
2014-09-12  4:29               ` Jeff King [this message]
2014-09-12  4:38                 ` Jeff King
2014-09-12  4:58                 ` Junio C Hamano
2014-09-12  5:12                   ` Jeff King
2014-09-15 14:42                   ` Michael Haggerty
2014-09-15 14:57                 ` Michael Haggerty
2014-09-09 22:21           ` [PATCH] fsck: exit with non-zero status upon error from fsck_obj() Junio C Hamano
2014-09-09 22:29             ` Jonathan Nieder
2014-08-31 18:54       ` git fsck exit code? Øyvind A. Holm
2014-09-01 11:54         ` Øyvind A. Holm
2014-09-01 18:17         ` David Turner
2014-09-09 22:09           ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140912042939.GA5968@peff.net \
    --to=peff@peff.net \
    --cc=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).