* BUG: Segfault on "git pull" on "bad object HEAD"
From: Ævar Arnfjörð Bjarmason @ 2018-07-11 11:00 UTC
To: Git Mailing List; +Cc: Junio C Hamano

This segfaults, but should print an error instead, have a repo with a
corrupt HEAD:

    (
        rm -rf /tmp/git &&
        git clone --single-branch --branch todo git@github.com:git/git.git /tmp/git &&
        echo 1111111111111111111111111111111111111111 >/tmp/git/.git/refs/heads/todo &&
        git -C /tmp/git pull
    )

On this repository e.g. "git log" will print "fatal: bad object HEAD",
but for some reason "git pull" makes it this far:

    $ git pull
    Segmentation fault

The immediate reason is that in run_diff_index() we have this:

    ent = revs->pending.objects;

And that in this case that's NULL:

    (gdb) bt
    #0  0x000055555565993f in run_diff_index (revs=0x7fffffffcb90, cached=1) at diff-lib.c:524
    #1  0x00005555557633da in has_uncommitted_changes (ignore_submodules=1) at wt-status.c:2345
    #2  0x00005555557634c9 in require_clean_work_tree (action=0x555555798f18 "pull with rebase", hint=0x555555798efb "please commit or stash them.", ignore_submodules=1, gently=0) at wt-status.c:2370
    #3  0x00005555555dbdee in cmd_pull (argc=0, argv=0x7fffffffd868, prefix=0x0) at builtin/pull.c:885
    #4  0x000055555556c9da in run_builtin (p=0x555555a2de50 <commands+1872>, argc=1, argv=0x7fffffffd868) at git.c:417
    #5  0x000055555556cce2 in handle_builtin (argc=1, argv=0x7fffffffd868) at git.c:633
    #6  0x000055555556ce8a in run_argv (argcp=0x7fffffffd71c, argv=0x7fffffffd710) at git.c:685
    #7  0x000055555556d03f in cmd_main (argc=1, argv=0x7fffffffd868) at git.c:762
    #8  0x0000555555611786 in main (argc=3, argv=0x7fffffffd858) at common-main.c:45
    (gdb) p revs
    $4 = (struct rev_info *) 0x7fffffffcb90
    (gdb) p revs->pending
    $5 = {nr = 0, alloc = 0, objects = 0x0}
    (gdb)

This has been an issue since at least v2.8.0 (didn't test back
further). I'm not familiar with the status / diff code, so I'm not sure
where the assertion should be added.

This came up in the wild due to a user with a corrupt repo (don't know
how it got corrupt) trying "git pull" and seeing git segfault.
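The state gdb prints above (nr = 0, objects = 0x0) can be modelled outside of git. What follows is a minimal sketch and not git's code: struct entry_sketch, struct pending_sketch and first_pending_name() are hypothetical stand-ins for the pending array and for the unchecked "ent = revs->pending.objects" access. It only illustrates checking the count before touching the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the pending array. */
struct entry_sketch {
	const char *name;
};

/* Hypothetical stand-in mirroring the three fields printed by gdb. */
struct pending_sketch {
	unsigned int nr;
	unsigned int alloc;
	struct entry_sketch *objects;
};

/*
 * Return the name of the first pending entry, or NULL when the array
 * is empty. Checking nr first avoids dereferencing the NULL objects
 * pointer that the backtrace shows.
 */
const char *first_pending_name(const struct pending_sketch *pending)
{
	assert(pending != NULL);
	if (pending->nr == 0)
		return NULL;
	return pending->objects[0].name;
}
```

With an empty array the helper returns NULL, so a caller gets the chance to report an error rather than crash.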
* Re: BUG: Segfault on "git pull" on "bad object HEAD"
From: Jeff King @ 2018-07-11 13:34 UTC
To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List, Junio C Hamano

On Wed, Jul 11, 2018 at 01:00:57PM +0200, Ævar Arnfjörð Bjarmason wrote:

> This segfaults, but should print an error instead, have a repo with a
> corrupt HEAD:
>
>     (
>         rm -rf /tmp/git &&
>         git clone --single-branch --branch todo git@github.com:git/git.git /tmp/git &&
>         echo 1111111111111111111111111111111111111111 >/tmp/git/.git/refs/heads/todo &&
>         git -C /tmp/git pull
>     )

It took me a minute to reproduce this. It needs "pull --rebase" if you
don't have that setup in your config.

> The immediate reason is that in run_diff_index() we have this:
>
>     ent = revs->pending.objects;
>
> And that in this case that's NULL:
>
>     (gdb) bt
>     #0  0x000055555565993f in run_diff_index (revs=0x7fffffffcb90, cached=1) at diff-lib.c:524
>     #1  0x00005555557633da in has_uncommitted_changes (ignore_submodules=1) at wt-status.c:2345

These two are the interesting functions. has_uncommitted_changes() calls
add_head_to_pending(). So it could realize then that there is no valid
HEAD to compare against. But as you note, it's run_diff_index() that
blindly dereferences revs->pending.objects without seeing if it's
non-empty.

Normally setup_revisions() would barf on a bad object, but the manual
add_head_to_pending() quietly returns (as it must for some cases, like
unborn branches).

So I feel like the right answer here is probably this:

diff --git a/wt-status.c b/wt-status.c
index d1c05145a4..5fcaa3d0f8 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -2340,7 +2340,16 @@ int has_uncommitted_changes(int ignore_submodules)
 	if (ignore_submodules)
 		rev_info.diffopt.flags.ignore_submodules = 1;
 	rev_info.diffopt.flags.quick = 1;
+
 	add_head_to_pending(&rev_info);
+	if (!rev_info.pending.nr) {
+		/*
+		 * We have no head (or it's corrupt), but the index is not
+		 * unborn; declare it as uncommitted changes.
+		 */
+		return 1;
+	}
+
 	diff_setup_done(&rev_info.diffopt);
 	result = run_diff_index(&rev_info, 1);
 	return diff_result_code(&rev_info.diffopt, result);

That does quietly paper over the corruption, but it does the
conservative thing, and a follow-up "git status" would yield "bad
object: HEAD".

I do worry that other callers of run_diff_index() might have similar
problems, though. Grepping around, the other callers seem to fall into
one of three categories:

  - they resolve the object themselves and put it in the pending list
    (and often fallback to the empty tree, which is more or less what the
    patch above is doing)

  - they resolve the object themselves and avoid calling run_diff_index()
    if it's not valid

  - they use setup_revisions(), which will barf on the broken object

So I think this may be sufficient. We probably should also add an
assertion to run_diff_index(), since that's better than segfaulting.

-Peff
* [PATCH] has_uncommitted_changes(): fall back to empty tree
From: Jeff King @ 2018-07-11 14:14 UTC
To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List, Junio C Hamano

On Wed, Jul 11, 2018 at 09:34:02AM -0400, Jeff King wrote:

> I do worry that other callers of run_diff_index() might have similar
> problems, though. Grepping around, the other callers seem to fall into
> one of three categories:
>
>   - they resolve the object themselves and put it in the pending list
>     (and often fallback to the empty tree, which is more or less what the
>     patch above is doing)
>
>   - they resolve the object themselves and avoid calling run_diff_index()
>     if it's not valid
>
>   - they use setup_revisions(), which will barf on the broken object
>
> So I think this may be sufficient. We probably should also add an
> assertion to run_diff_index(), since that's better than segfaulting.

Here's a patch to do that. I tweaked it slightly from what I showed
earlier to use the empty tree, which matches what other code (e.g.,
git-diff) would do.

-- >8 --
Subject: has_uncommitted_changes(): fall back to empty tree

If has_uncommitted_changes() can't resolve HEAD (e.g.,
because it's unborn or corrupt), then we end up calling
run_diff_index() with an empty revs.pending array. This
causes a segfault, as run_diff_index() blindly looks at the
first pending item.

Fixing this raises a question of fault: should
run_diff_index() handle this case, or is the caller wrong to
pass an empty pending list?

Looking at the other callers of run_diff_index(), they
handle this in one of three ways:

  - they resolve the object themselves, and avoid doing the
    diff if it's not valid

  - they resolve the object themselves, and fall back to the
    empty tree

  - they use setup_revisions(), which will die() if the
    object isn't valid

Since this is the only broken caller, that argues that the
fix should go there. Falling back to the empty tree makes
sense here, as we'd claim uncommitted changes if and only if
the index is non-empty. This may be a little funny in the
case of corruption (the corrupt HEAD probably _isn't_
empty), but:

  - we don't actually know the reason here that HEAD didn't
    resolve (the much more likely case is that we have an
    unborn HEAD, in which case the empty tree comparison is
    the right thing)

  - this matches how other code, like "git diff", behaves

While we're thinking about it, let's add an assertion to
run_diff_index(). It should always be passed a single
object, and as this bug shows, it's easy to get it wrong
(and an assertion is easier to hunt down than a segfault, or
a quietly ignored extra tree).

Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
---
 diff-lib.c      |  3 +++
 t/t5520-pull.sh | 12 ++++++++++++
 wt-status.c     | 10 ++++++++++
 3 files changed, 25 insertions(+)

diff --git a/diff-lib.c b/diff-lib.c
index a9f38eb5a3..732f684a49 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -520,6 +520,9 @@ int run_diff_index(struct rev_info *revs, int cached)
 	struct object_array_entry *ent;
 	uint64_t start = getnanotime();
 
+	if (revs->pending.nr != 1)
+		BUG("run_diff_index must be passed exactly one tree");
+
 	ent = revs->pending.objects;
 	if (diff_cache(revs, &ent->item->oid, ent->name, cached))
 		exit(128);
diff --git a/t/t5520-pull.sh b/t/t5520-pull.sh
index 59c4b778d3..68aa5f0340 100755
--- a/t/t5520-pull.sh
+++ b/t/t5520-pull.sh
@@ -618,6 +618,18 @@ test_expect_success 'pull --rebase fails on unborn branch with staged changes' '
 	)
 '
 
+test_expect_success 'pull --rebase fails on corrupt HEAD' '
+	test_when_finished "rm -rf corrupt" &&
+	git init corrupt &&
+	(
+		cd corrupt &&
+		test_commit one &&
+		obj=$(git rev-parse --verify HEAD | sed "s#^..#&/#") &&
+		rm -f .git/objects/$obj &&
+		test_must_fail git pull --rebase
+	)
+'
+
 test_expect_success 'setup for detecting upstreamed changes' '
 	mkdir src &&
 	(cd src &&
diff --git a/wt-status.c b/wt-status.c
index d1c05145a4..d89c41ba10 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -2340,7 +2340,17 @@ int has_uncommitted_changes(int ignore_submodules)
 	if (ignore_submodules)
 		rev_info.diffopt.flags.ignore_submodules = 1;
 	rev_info.diffopt.flags.quick = 1;
+
 	add_head_to_pending(&rev_info);
+	if (!rev_info.pending.nr) {
+		/*
+		 * We have no head (or it's corrupt); use the empty tree,
+		 * which will complain if the index is non-empty.
+		 */
+		struct tree *tree = lookup_tree(the_hash_algo->empty_tree);
+		add_pending_object(&rev_info, &tree->object, "");
+	}
+
 	diff_setup_done(&rev_info.diffopt);
 	result = run_diff_index(&rev_info, 1);
 	return diff_result_code(&rev_info.diffopt, result);
-- 
2.18.0.400.g702e398724
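The claim in the commit message that, with the empty-tree fallback, "we'd claim uncommitted changes if and only if the index is non-empty" can be checked on a toy model. This is a sketch under loud assumptions, not git's diff machinery: struct tree_sketch and index_differs_from_tree() are hypothetical, a "tree" is just a sorted list of path names, and file contents are ignored entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a tree: a sorted list of path names. */
struct tree_sketch {
	size_t nr;
	const char **paths;
};

/* Stand-in for the empty tree that the patch falls back to. */
static const struct tree_sketch empty_tree_sketch = { 0, NULL };

/*
 * Return 1 if the index paths differ from the tree paths, 0 otherwise.
 * A NULL head_tree models "HEAD did not resolve" and is replaced by
 * the empty tree before comparing.
 */
int index_differs_from_tree(const char **index_paths, size_t index_nr,
			    const struct tree_sketch *head_tree)
{
	size_t i;

	assert(index_paths != NULL || index_nr == 0);
	if (head_tree == NULL)
		head_tree = &empty_tree_sketch;
	if (index_nr != head_tree->nr)
		return 1;
	for (i = 0; i < index_nr; i++)
		if (strcmp(index_paths[i], head_tree->paths[i]) != 0)
			return 1;
	return 0;
}
```

In this model an unresolved HEAD with an empty index compares equal, while any staged path makes the comparison report a difference, which matches the behaviour the commit message describes for the unborn-branch case.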
* Re: [PATCH] has_uncommitted_changes(): fall back to empty tree
From: Ævar Arnfjörð Bjarmason @ 2018-07-11 14:41 UTC
To: Jeff King; +Cc: Git Mailing List, Junio C Hamano

On Wed, Jul 11 2018, Jeff King wrote:

> On Wed, Jul 11, 2018 at 09:34:02AM -0400, Jeff King wrote:
>
>> I do worry that other callers of run_diff_index() might have similar
>> problems, though. Grepping around, the other callers seem to fall into
>> one of three categories:
>>
>>   - they resolve the object themselves and put it in the pending list
>>     (and often fallback to the empty tree, which is more or less what the
>>     patch above is doing)
>>
>>   - they resolve the object themselves and avoid calling run_diff_index()
>>     if it's not valid
>>
>>   - they use setup_revisions(), which will barf on the broken object
>>
>> So I think this may be sufficient. We probably should also add an
>> assertion to run_diff_index(), since that's better than segfaulting.
>
> Here's a patch to do that. I tweaked it slightly from what I showed
> earlier to use the empty tree, which matches what other code (e.g.,
> git-diff) would do.
>
> -- >8 --
> Subject: has_uncommitted_changes(): fall back to empty tree
>
> If has_uncommitted_changes() can't resolve HEAD (e.g.,
> because it's unborn or corrupt), then we end up calling
> run_diff_index() with an empty revs.pending array. This
> causes a segfault, as run_diff_index() blindly looks at the
> first pending item.
>
> Fixing this raises a question of fault: should
> run_diff_index() handle this case, or is the caller wrong to
> pass an empty pending list?
>
> Looking at the other callers of run_diff_index(), they
> handle this in one of three ways:
>
>   - they resolve the object themselves, and avoid doing the
>     diff if it's not valid
>
>   - they resolve the object themselves, and fall back to the
>     empty tree
>
>   - they use setup_revisions(), which will die() if the
>     object isn't valid
>
> Since this is the only broken caller, that argues that the
> fix should go there. Falling back to the empty tree makes
> sense here, as we'd claim uncommitted changes if and only if
> the index is non-empty. This may be a little funny in the
> case of corruption (the corrupt HEAD probably _isn't_
> empty), but:
>
>   - we don't actually know the reason here that HEAD didn't
>     resolve (the much more likely case is that we have an
>     unborn HEAD, in which case the empty tree comparison is
>     the right thing)
>
>   - this matches how other code, like "git diff", behaves
>
> While we're thinking about it, let's add an assertion to
> run_diff_index(). It should always be passed a single
> object, and as this bug shows, it's easy to get it wrong
> (and an assertion is easier to hunt down than a segfault, or
> a quietly ignored extra tree).
>
> Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  diff-lib.c      |  3 +++
>  t/t5520-pull.sh | 12 ++++++++++++
>  wt-status.c     | 10 ++++++++++
>  3 files changed, 25 insertions(+)
>
> diff --git a/diff-lib.c b/diff-lib.c
> index a9f38eb5a3..732f684a49 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -520,6 +520,9 @@ int run_diff_index(struct rev_info *revs, int cached)
>  	struct object_array_entry *ent;
>  	uint64_t start = getnanotime();
>
> +	if (revs->pending.nr != 1)
> +		BUG("run_diff_index must be passed exactly one tree");
> +
>  	ent = revs->pending.objects;
>  	if (diff_cache(revs, &ent->item->oid, ent->name, cached))
>  		exit(128);
> diff --git a/t/t5520-pull.sh b/t/t5520-pull.sh
> index 59c4b778d3..68aa5f0340 100755
> --- a/t/t5520-pull.sh
> +++ b/t/t5520-pull.sh
> @@ -618,6 +618,18 @@ test_expect_success 'pull --rebase fails on unborn branch with staged changes' '
>  	)
>  '
>
> +test_expect_success 'pull --rebase fails on corrupt HEAD' '
> +	test_when_finished "rm -rf corrupt" &&
> +	git init corrupt &&
> +	(
> +		cd corrupt &&
> +		test_commit one &&
> +		obj=$(git rev-parse --verify HEAD | sed "s#^..#&/#") &&
> +		rm -f .git/objects/$obj &&
> +		test_must_fail git pull --rebase
> +	)
> +'
> +
>  test_expect_success 'setup for detecting upstreamed changes' '
>  	mkdir src &&
>  	(cd src &&
> diff --git a/wt-status.c b/wt-status.c
> index d1c05145a4..d89c41ba10 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -2340,7 +2340,17 @@ int has_uncommitted_changes(int ignore_submodules)
>  	if (ignore_submodules)
>  		rev_info.diffopt.flags.ignore_submodules = 1;
>  	rev_info.diffopt.flags.quick = 1;
> +
>  	add_head_to_pending(&rev_info);
> +	if (!rev_info.pending.nr) {
> +		/*
> +		 * We have no head (or it's corrupt); use the empty tree,
> +		 * which will complain if the index is non-empty.
> +		 */
> +		struct tree *tree = lookup_tree(the_hash_algo->empty_tree);
> +		add_pending_object(&rev_info, &tree->object, "");
> +	}
> +
>  	diff_setup_done(&rev_info.diffopt);
>  	result = run_diff_index(&rev_info, 1);
>  	return diff_result_code(&rev_info.diffopt, result);

Thanks a lot. I've tested this and it looks good to me and fixes the
bug.

> [From upthread]: It took me a minute to reproduce this. It needs "pull
> --rebase" if you don't have that setup in your config.

Yeah, sorry about that. I tested without --rebase while simplifying the
testcase, but forgot that I have pull.rebase=true in my ~/.gitconfig, so
of course that didn't imply --no-rebase.
* Re: [PATCH] has_uncommitted_changes(): fall back to empty tree
From: Jeff King @ 2018-07-11 15:00 UTC
To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List, Junio C Hamano

On Wed, Jul 11, 2018 at 04:41:02PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > [From upthread]: It took me a minute to reproduce this. It needs "pull
> > --rebase" if you don't have that setup in your config.
>
> Yeah, sorry about that. I tested without --rebase while simplifying the
> testcase, but forgot that I have pull.rebase=true in my ~/.gitconfig, so
> of course that didn't imply --no-rebase.

In the grand scheme of things, it was still a pretty good bug report. :)

Having the stack trace meant that the head-scratching was "why am I not
calling the problem function", rather than "why am I not reproducing",
which is much easier to track down.

-Peff
* Re: BUG: Segfault on "git pull" on "bad object HEAD"
From: Junio C Hamano @ 2018-07-11 17:09 UTC
To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Git Mailing List

Jeff King <peff@peff.net> writes:

> So I feel like the right answer here is probably this:
>
> diff --git a/wt-status.c b/wt-status.c
> index d1c05145a4..5fcaa3d0f8 100644
> --- a/wt-status.c
> +++ b/wt-status.c
> @@ -2340,7 +2340,16 @@ int has_uncommitted_changes(int ignore_submodules)
>  	if (ignore_submodules)
>  		rev_info.diffopt.flags.ignore_submodules = 1;
>  	rev_info.diffopt.flags.quick = 1;
> +
>  	add_head_to_pending(&rev_info);
> +	if (!rev_info.pending.nr) {
> +		/*
> +		 * We have no head (or it's corrupt), but the index is not
> +		 * unborn; declare it as uncommitted changes.
> +		 */
> +		return 1;
> +	}
> +
>  	diff_setup_done(&rev_info.diffopt);
>  	result = run_diff_index(&rev_info, 1);
>  	return diff_result_code(&rev_info.diffopt, result);
>
> That does quietly paper over the corruption, but it does the
> conservative thing, and a follow-up "git status" would yield "bad
> object: HEAD".

Sounds quite sensible.
* Re: BUG: Segfault on "git pull" on "bad object HEAD"
From: Duy Nguyen @ 2018-07-11 15:56 UTC
To: Ævar Arnfjörð Bjarmason; +Cc: Git Mailing List, Junio C Hamano

On Wed, Jul 11, 2018 at 1:02 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> This segfaults, but should print an error instead, have a repo with a
> corrupt HEAD:
>
>     (
>         rm -rf /tmp/git &&
>         git clone --single-branch --branch todo git@github.com:git/git.git /tmp/git &&
>         echo 1111111111111111111111111111111111111111 >/tmp/git/.git/refs/heads/todo &&
>         git -C /tmp/git pull
>     )
>
> On this repository e.g. "git log" will print "fatal: bad object HEAD",
> but for some reason "git pull" makes it this far:
>
>     $ git pull
>     Segmentation fault
>
> The immediate reason is that in run_diff_index() we have this:
>
>     ent = revs->pending.objects;
>
> And that in this case that's NULL:

Probably because add_head_to_pending() in has_uncommitted_changes() does
not add anything to the "pending" list because HEAD is broken. I think
if we make add_head_to_pending() return a boolean, then we can check
that if no HEAD is added, there's no point to run_diff_index and
has_uncommitted_changes() can return 0 immediately.

A new BUG() could still be added in run_diff_index() though, to check
if revs->pending.nr is non-zero before attempting to access
revs->pending.objects.

>
>     (gdb) bt
>     #0  0x000055555565993f in run_diff_index (revs=0x7fffffffcb90, cached=1) at diff-lib.c:524
>     #1  0x00005555557633da in has_uncommitted_changes (ignore_submodules=1) at wt-status.c:2345
>     #2  0x00005555557634c9 in require_clean_work_tree (action=0x555555798f18 "pull with rebase", hint=0x555555798efb "please commit or stash them.", ignore_submodules=1, gently=0) at wt-status.c:2370
>     #3  0x00005555555dbdee in cmd_pull (argc=0, argv=0x7fffffffd868, prefix=0x0) at builtin/pull.c:885
>     #4  0x000055555556c9da in run_builtin (p=0x555555a2de50 <commands+1872>, argc=1, argv=0x7fffffffd868) at git.c:417
>     #5  0x000055555556cce2 in handle_builtin (argc=1, argv=0x7fffffffd868) at git.c:633
>     #6  0x000055555556ce8a in run_argv (argcp=0x7fffffffd71c, argv=0x7fffffffd710) at git.c:685
>     #7  0x000055555556d03f in cmd_main (argc=1, argv=0x7fffffffd868) at git.c:762
>     #8  0x0000555555611786 in main (argc=3, argv=0x7fffffffd858) at common-main.c:45
>     (gdb) p revs
>     $4 = (struct rev_info *) 0x7fffffffcb90
>     (gdb) p revs->pending
>     $5 = {nr = 0, alloc = 0, objects = 0x0}
>     (gdb)
>
> This has been an issue since at least v2.8.0 (didn't test back
> further). I'm not familiar with the status / diff code, so I'm not sure
> where the assertion should be added.
>
> This came up in the wild due to a user with a corrupt repo (don't know
> how it got corrupt) trying "git pull" and seeing git segfault.

-- 
Duy
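Duy's alternative (have add_head_to_pending() report whether it added anything, return 0 early, and keep a BUG()-style check in run_diff_index()) could look roughly like the sketch below. Every name ending in _sketch is hypothetical, the real add_head_to_pending() had no return value at the time of this thread, and the two boolean parameters stand in for state that git would read from the repository.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical pending list reduced to its entry count. */
struct pending_sketch {
	unsigned int nr;
};

/* Hypothetical variant that reports whether HEAD was added. */
bool add_head_to_pending_sketch(struct pending_sketch *pending,
				bool head_is_valid)
{
	assert(pending != NULL);
	if (!head_is_valid)
		return false;
	pending->nr++;
	return true;
}

/* Stand-in for run_diff_index() with the suggested non-zero check. */
int run_diff_index_sketch(const struct pending_sketch *pending,
			  bool index_differs)
{
	if (pending->nr == 0) {
		fprintf(stderr, "BUG: run_diff_index called with no pending object\n");
		abort();
	}
	return index_differs ? 1 : 0;
}

/* Early return of 0 when no HEAD could be added, as suggested above. */
int has_uncommitted_changes_sketch(bool head_is_valid, bool index_differs)
{
	struct pending_sketch pending = { 0 };

	if (!add_head_to_pending_sketch(&pending, head_is_valid))
		return 0;
	return run_diff_index_sketch(&pending, index_differs);
}
```

Note the difference in outcome from the proposals elsewhere in the thread: this variant reports "no uncommitted changes" for a broken HEAD, whereas Jeff King's first diff returns 1 and his final patch compares the index against the empty tree.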