From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Git Mailing List <git@vger.kernel.org>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH] has_uncommitted_changes(): fall back to empty tree
Date: Wed, 11 Jul 2018 16:41:02 +0200 [thread overview]
Message-ID: <87in5ldedd.fsf@evledraar.gmail.com> (raw)
In-Reply-To: <20180711141406.GE23835@sigill.intra.peff.net>
On Wed, Jul 11 2018, Jeff King wrote:
> On Wed, Jul 11, 2018 at 09:34:02AM -0400, Jeff King wrote:
>
>> I do worry that other callers of run_diff_index() might have similar
>> problems, though. Grepping around, the other callers seem to fall into
>> one of three categories:
>>
>> - they resolve the object themselves and put it in the pending list
>> (and often fallback to the empty tree, which is more or less what the
>> patch above is doing)
>>
>> - they resolve the object themselves and avoid calling run_diff_index()
>> if it's not valid
>>
>> - they use setup_revisions(), which will barf on the broken object
>>
>> So I think this may be sufficient. We probably should also add an
>> assertion to run_diff_index(), since that's better than segfaulting.
>
> Here's a patch to do that. I tweaked it slightly from what I showed
> earlier to use the empty tree, which matches what other code (e.g.,
> git-diff) would do.
>
> -- >8 --
> Subject: has_uncommitted_changes(): fall back to empty tree
>
> If has_uncommitted_changes() can't resolve HEAD (e.g.,
> because it's unborn or corrupt), then we end up calling
> run_diff_index() with an empty revs.pending array. This
> causes a segfault, as run_diff_index() blindly looks at the
> first pending item.
>
> Fixing this raises a question of fault: should
> run_diff_index() handle this case, or is the caller wrong to
> pass an empty pending list?
>
> Looking at the other callers of run_diff_index(), they
> handle this in one of three ways:
>
> - they resolve the object themselves, and avoid doing the
> diff if it's not valid
>
> - they resolve the object themselves, and fall back to the
> empty tree
>
> - they use setup_revisions(), which will die() if the
> object isn't valid
>
> Since this is the only broken caller, that argues that the
> fix should go there. Falling back to the empty tree makes
> sense here, as we'd claim uncommitted changes if and only if
> the index is non-empty. This may be a little funny in the
> case of corruption (the corrupt HEAD probably _isn't_
> empty), but:
>
> - we don't actually know the reason here that HEAD didn't
> resolve (the much more likely case is that we have an
> unborn HEAD, in which case the empty tree comparison is
> the right thing)
>
> - this matches how other code, like "git diff", behaves
>
> While we're thinking about it, let's add an assertion to
> run_diff_index(). It should always be passed a single
> object, and as this bug shows, it's easy to get it wrong
> (and an assertion is easier to hunt down than a segfault, or
> a quietly ignored extra tree).
>
> Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> diff-lib.c | 3 +++
> t/t5520-pull.sh | 12 ++++++++++++
> wt-status.c | 10 ++++++++++
> 3 files changed, 25 insertions(+)
>
> diff --git a/diff-lib.c b/diff-lib.c
> index a9f38eb5a3..732f684a49 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -520,6 +520,9 @@ int run_diff_index(struct rev_info *revs, int cached)
> struct object_array_entry *ent;
> uint64_t start = getnanotime();
>
> + if (revs->pending.nr != 1)
> + BUG("run_diff_index must be passed exactly one tree");
> +
> ent = revs->pending.objects;
> if (diff_cache(revs, &ent->item->oid, ent->name, cached))
> exit(128);
> diff --git a/t/t5520-pull.sh b/t/t5520-pull.sh
> index 59c4b778d3..68aa5f0340 100755
> --- a/t/t5520-pull.sh
> +++ b/t/t5520-pull.sh
> @@ -618,6 +618,18 @@ test_expect_success 'pull --rebase fails on unborn branch with staged changes' '
> )
> '
>
> +test_expect_success 'pull --rebase fails on corrupt HEAD' '
> + test_when_finished "rm -rf corrupt" &&
> + git init corrupt &&
> + (
> + cd corrupt &&
> + test_commit one &&
> + obj=$(git rev-parse --verify HEAD | sed "s#^..#&/#") &&
> + rm -f .git/objects/$obj &&
> + test_must_fail git pull --rebase
> + )
> +'
> +
> test_expect_success 'setup for detecting upstreamed changes' '
> mkdir src &&
> (cd src &&
> diff --git a/wt-status.c b/wt-status.c
> index d1c05145a4..d89c41ba10 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -2340,7 +2340,17 @@ int has_uncommitted_changes(int ignore_submodules)
> if (ignore_submodules)
> rev_info.diffopt.flags.ignore_submodules = 1;
> rev_info.diffopt.flags.quick = 1;
> +
> add_head_to_pending(&rev_info);
> + if (!rev_info.pending.nr) {
> + /*
> + * We have no head (or it's corrupt); use the empty tree,
> + * which will complain if the index is non-empty.
> + */
> + struct tree *tree = lookup_tree(the_hash_algo->empty_tree);
> + add_pending_object(&rev_info, &tree->object, "");
> + }
> +
> diff_setup_done(&rev_info.diffopt);
> result = run_diff_index(&rev_info, 1);
> return diff_result_code(&rev_info.diffopt, result);
Thanks a lot. I've tested this and it looks good to me and fixes the
bug.
> [From upthread]: It took me a minute to reproduce this. It needs "pull
> --rebase" if you don't have that setup in your config.
Yeah, sorry about that. I tested without --rebase while simplifying the
testcase, but forgot that I have pull.rebase=true in my ~/.gitconfig, so
of course that didn't imply --no-rebase.
next prev parent reply other threads:[~2018-07-11 14:41 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-07-11 11:00 BUG: Segfault on "git pull" on "bad object HEAD" Ævar Arnfjörð Bjarmason
2018-07-11 13:34 ` Jeff King
2018-07-11 14:14 ` [PATCH] has_uncommitted_changes(): fall back to empty tree Jeff King
2018-07-11 14:41 ` Ævar Arnfjörð Bjarmason [this message]
2018-07-11 15:00 ` Jeff King
2018-07-11 17:09 ` BUG: Segfault on "git pull" on "bad object HEAD" Junio C Hamano
2018-07-11 15:56 ` Duy Nguyen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87in5ldedd.fsf@evledraar.gmail.com \
--to=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).