From: Jeff King <peff@peff.net>
To: Derrick Stolee <stolee@gmail.com>
Cc: Taylor Blau <me@ttaylorr.com>, git@vger.kernel.org, gitster@pobox.com
Subject: Re: [PATCH 3/3] commit-graph.c: handle corrupt/missing trees
Date: Fri, 6 Sep 2019 13:37:07 -0400
Message-ID: <20190906173707.GF23181@sigill.intra.peff.net>
In-Reply-To: <ad34871d-bdc4-5b52-eff4-da03c6be1004@gmail.com>

On Fri, Sep 06, 2019 at 12:51:56PM -0400, Derrick Stolee wrote:
> > This one in theory benefits lots of other callsites, too, since it means
> > we'll actually return NULL instead of nonsense like "8". But grepping
> > around for calls to this function, I found literally zero of them
> > actually bother checking for a NULL result. So there are probably dozens
> > of similar segfaults waiting to happen in other code paths.
> > Discouraging.
> >
> > This is sort-of attributable to my 834876630b (get_commit_tree(): return
> > NULL for broken tree, 2019-04-09). Before then it was a BUG(). However,
> > that state was relatively short-lived. Before 7b8a21dba1 (commit-graph:
> > lazy-load trees for commits, 2018-04-06), we'd have similarly returned
> > NULL (and anyway, BUG() is clearly wrong since it's a data error).
> >
> > None of which argues against your patches, but it's kind of sad that the
> > issue is present in so many code paths. I wonder if we could be handling
> > this in a more central way, but I don't see how short of dying.
>
> This is due to the mechanical conversion from using commit->tree->oid to
> get_commit_tree_oid(commit). Those consumers were not checking if the
> tree pointer was NULL, either, but they probably assumed that the
> parse_commit() call would have failed earlier. Now that we are using this
> method (for performance reasons to avoid creating too many 'struct tree's)
> it makes sense to convert some of them to checking the return value more
> carefully.

Right, none of this is new at all. We have historically been very loose
about assuming that things like commit->tree were valid. And they
_usually_ are. Even if we're missing the object on disk, lookup_tree()
is happy to assign it a struct (unless the object was already seen as
another type!). I think turning that case into an error from
parse_commit() would cover a lot of cases easily, without forcing each
caller to check for NULL.

-Peff
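
As a rough illustration of the suggestion above (a toy model, not git's
actual code): the sketch below keeps an in-memory object table where a
lookup hands out a struct for any unseen ID without touching the disk, but
returns NULL when the ID was already recorded as a different type. The
parse step then reports that conflict as an error once, so callers that
checked the parse result would not each need their own NULL check. All
names here (`toy_lookup`, `toy_parse_commit`, the fixed-size table) are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for object types; not git's real definitions. */
enum object_type { OBJ_NONE, OBJ_COMMIT, OBJ_TREE, OBJ_BLOB };

struct object {
	char oid[41];             /* hex object ID (hypothetical fixed size) */
	enum object_type type;
};

struct commit {
	struct object *tree;      /* NULL if the tree could not be set up */
};

#define MAX_OBJECTS 64
struct object objects[MAX_OBJECTS];
size_t nr_objects;

/*
 * Return the in-memory entry for `oid`, creating it with `type` if the ID
 * has never been seen. Return NULL if it was already seen as another type.
 * Nothing here reads from disk, so a missing object still gets a struct.
 */
struct object *toy_lookup(const char *oid, enum object_type type)
{
	size_t i;

	for (i = 0; i < nr_objects; i++) {
		if (!strcmp(objects[i].oid, oid))
			return objects[i].type == type ? &objects[i] : NULL;
	}

	assert(nr_objects < MAX_OBJECTS);
	assert(strlen(oid) < sizeof(objects[0].oid));
	strcpy(objects[nr_objects].oid, oid);
	objects[nr_objects].type = type;
	return &objects[nr_objects++];
}

/*
 * Toy parse step: record the commit's tree and fail up front when the
 * tree ID is already known as a non-tree, instead of leaving a NULL
 * pointer for some later caller to dereference.
 */
int toy_parse_commit(struct commit *c, const char *tree_oid)
{
	c->tree = toy_lookup(tree_oid, OBJ_TREE);
	return c->tree ? 0 : -1;
}
```

In this model, a caller that bails out when `toy_parse_commit()` returns
nonzero never sees a NULL `tree`. It does not address objects that are
simply absent on disk, which the message notes still get a struct.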