git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: peff@peff.net, gitster@pobox.com
Subject: [PATCH 6/7] rev-list: let traversal die when --missing is not in use
Date: Thu, 4 Apr 2019 20:37:54 -0700	[thread overview]
Message-ID: <a3a80b4b2a988eb65d85a5acd54c584d047073c7.1554435033.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1554435033.git.me@ttaylorr.com>

From: Jeff King <peff@peff.net>

Commit 7c0fe330d5 (rev-list: handle missing tree objects properly,
2018-10-05) taught the traversal machinery used by git-rev-list to
ignore missing trees, so that rev-list could handle them itself.

However, it does so only by checking via oid_object_info_extended() that
the object exists at all. This can miss several classes of errors that
were previously detected by rev-list:

  - type mismatches (e.g., we expected a tree but got a blob)

  - failure to read the object data (e.g., due to bitrot on disk)

This is especially important because we use "rev-list --objects" as our
connectivity check to admit new objects to the repository, and it will
now miss these cases (though the bitrot one is less important here,
because we'd typically have just hashed and stored the object).

There are a few options to fix this:

 1. we could check these properties in rev-list when we do the existence
    check. This is probably too expensive in practice (perhaps even for
    a type check, but definitely for checking the whole content again,
    which implies loading each object into memory twice).

 2. teach the traversal machinery to differentiate between a missing
    object, and one that could not be loaded as expected. This probably
    wouldn't be too hard to detect type mismatches, but detecting bitrot
    versus a truly missing object would require deep changes to the
    object-loading code.

 3. have the traversal machinery communicate the failure to the caller,
    so that it can decide how to proceed without re-evaluting the object
    itself.

Of those, I think (3) is probably the best path forward. However, this
patch does none of them. In the name of expediently fixing the
regression to a normal "rev-list --objects" that we use for connectivity
checks, this simply restores the pre-7c0fe330d5 behavior of having the
traversal die as soon as it fails to load a tree (when --missing is set
to MA_ERROR, which is the default).

Note that we can't get rid of the object-existence check in
finish_object(), because this also handles blobs (which are not
otherwise checked at all by the traversal code).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/rev-list.c                     | 4 +++-
 t/t6102-rev-list-unexpected-objects.sh | 4 ++--
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 425a5774db..9f31837d30 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -379,7 +379,6 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 	repo_init_revisions(the_repository, &revs, prefix);
 	revs.abbrev = DEFAULT_ABBREV;
 	revs.commit_format = CMIT_FMT_UNSPECIFIED;
-	revs.do_not_die_on_missing_tree = 1;
 
 	/*
 	 * Scan the argument list before invoking setup_revisions(), so that we
@@ -409,6 +408,9 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix)
 		}
 	}
 
+	if (arg_missing_action)
+		revs.do_not_die_on_missing_tree = 1;
+
 	argc = setup_revisions(argc, argv, &revs, &s_r_opt);
 
 	memset(&info, 0, sizeof(info));
diff --git a/t/t6102-rev-list-unexpected-objects.sh b/t/t6102-rev-list-unexpected-objects.sh
index 30976385a8..c8d4b31f8f 100755
--- a/t/t6102-rev-list-unexpected-objects.sh
+++ b/t/t6102-rev-list-unexpected-objects.sh
@@ -29,7 +29,7 @@ test_expect_success 'setup unexpected non-tree entry' '
 	broken_tree="$(git hash-object -w --literally -t tree broken-tree)"
 '
 
-test_expect_failure 'traverse unexpected non-tree entry (lone)' '
+test_expect_success 'traverse unexpected non-tree entry (lone)' '
 	test_must_fail git rev-list --objects $broken_tree
 '
 
@@ -64,7 +64,7 @@ test_expect_success 'setup unexpected non-tree root' '
 		broken-commit)"
 '
 
-test_expect_failure 'traverse unexpected non-tree root (lone)' '
+test_expect_success 'traverse unexpected non-tree root (lone)' '
 	test_must_fail git rev-list --objects $broken_commit
 '
 
-- 
2.21.0.203.g358da99528


  parent reply	other threads:[~2019-04-05  3:37 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-05  3:37 [PATCH 0/7] harden unexpected object types checks Taylor Blau
2019-04-05  3:37 ` [PATCH 1/7] t: move 'hex2oct' into test-lib-functions.sh Taylor Blau
2019-04-05  3:37 ` [PATCH 2/7] t: introduce tests for unexpected object types Taylor Blau
2019-04-05 10:50   ` SZEDER Gábor
2019-04-05 18:24     ` Jeff King
2019-04-05 18:42       ` SZEDER Gábor
2019-04-05 18:52         ` Jeff King
2019-04-07 21:00           ` Ævar Arnfjörð Bjarmason
2019-04-09  2:29             ` Taylor Blau
2019-04-09  9:14               ` Ævar Arnfjörð Bjarmason
2019-04-10  1:59                 ` Taylor Blau
2019-04-08  5:27           ` Junio C Hamano
2019-04-05 19:25       ` Eric Sunshine
2019-04-05 20:53         ` Jeff King
2019-04-06  5:33           ` Taylor Blau
2019-04-08  6:44         ` Junio C Hamano
2019-04-09  2:30           ` Taylor Blau
2019-04-09  3:28             ` Eric Sunshine
2019-04-09  5:08               ` Taylor Blau
2019-04-09  8:02                 ` Eric Sunshine
2019-04-10  1:54                   ` Taylor Blau
2019-04-06  5:31       ` Taylor Blau
2019-04-05 18:31   ` Jeff King
2019-04-06  5:23     ` Taylor Blau
2019-04-05  3:37 ` [PATCH 3/7] list-objects.c: handle unexpected non-blob entries Taylor Blau
2019-04-05  3:37 ` [PATCH 4/7] list-objects.c: handle unexpected non-tree entries Taylor Blau
2019-04-05  3:37 ` [PATCH 5/7] get_commit_tree(): return NULL for broken tree Taylor Blau
2019-04-05  3:37 ` Taylor Blau [this message]
2019-04-05 18:41   ` [PATCH 6/7] rev-list: let traversal die when --missing is not in use Jeff King
2019-04-06  5:36     ` Taylor Blau
2019-04-07 13:41       ` Jeff King
2019-04-09  2:11         ` Taylor Blau
2019-04-05  3:37 ` [PATCH 7/7] rev-list: detect broken root trees Taylor Blau
2019-04-10  2:13 ` [PATCH v2 0/7] harden unexpected object types checks Taylor Blau
2019-04-10  2:13   ` [PATCH v2 1/7] t: move 'hex2oct' into test-lib-functions.sh Taylor Blau
2019-04-10  2:13   ` [PATCH v2 2/7] t: introduce tests for unexpected object types Taylor Blau
2019-04-10  2:13   ` [PATCH v2 3/7] list-objects.c: handle unexpected non-blob entries Taylor Blau
2019-04-10  2:13   ` [PATCH v2 4/7] list-objects.c: handle unexpected non-tree entries Taylor Blau
2019-04-10  2:13   ` [PATCH v2 5/7] get_commit_tree(): return NULL for broken tree Taylor Blau
2019-04-10  2:13   ` [PATCH v2 6/7] rev-list: let traversal die when --missing is not in use Taylor Blau
2019-04-10  2:13   ` [PATCH v2 7/7] rev-list: detect broken root trees Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a3a80b4b2a988eb65d85a5acd54c584d047073c7.1554435033.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).