git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Ramsay Jones <ramsay@ramsayjones.plus.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Jason@zx2c4.com, GIT Mailing-list <git@vger.kernel.org>
Subject: Re: [PATCH] fsck: check skiplist for object in fsck_blob()
Date: Thu, 28 Jun 2018 13:45:01 -0400	[thread overview]
Message-ID: <20180628174501.GC31766@sigill.intra.peff.net> (raw)
In-Reply-To: <0a18acbd-0124-1c92-0046-05b8b035dd28@ramsayjones.plus.com>

On Thu, Jun 28, 2018 at 05:56:18PM +0100, Ramsay Jones wrote:

> > Yeah, this solution seems sensible. Given that we would never report any
> > error for that blob, there is no point in even looking at it. I wonder
> > if we ought to do the same for other types, too. Is there any point in
> > opening a tree that is in the skiplist?
> 
> Note that the 'blob' object has already been 'loaded' at this
> point anyway (and the basic 'object' checks have been done).

Yeah, you're right. If we wanted to avoid that, we should prevent it
from entering the gitmodules_found list in the first place.

Of course, we'd generally still load it once anyway in order to check
the sha1. So really, the most we could do is prevent it from being
loaded a _second_ time for no reason during the fsck_finish() stage.

But for the reasons I gave in the fsck_finish() patches (like pack
ordering), we will quite often end up hitting it in the first pass
anyway. So it's probably not worth spending too much time trying to
optimize it.

And thinking on that, my "should we do this for trees" is pretty dumb,
too. We load them and compute their sha1 anyway (I _think_ bitrot checks
like that are totally independent of skiplist). So all we'd be saving is
walking the buffer entries.

> I did think about this, briefly, but decided that we should
> only 'skip' the leaf nodes (blobs). (So, when processing
> commits, trees and tags, we should not report an error for
> that object-id, but that should not stop us from checking
> the tree object associated with a commit, just because of
> a problem with the commit message).
> 
> [Oh, wait - Hmm, each object could be checked independently
> of all others and not used for any object traversal right?
> (e.g. using packfile index). I saw fsck_walk() and didn't
> look any further ... Ah, broken link check, ... I obviously
> need to read the code some more ... :-D ]

Yes, I've been confused by this code before. I'm still not sure I
totally understand it. ;)

> > One thing we could do is turn the parse failure into a noop. The main
> > point of the check is to protect people against the malicious
> > .gitmodules bug. If the file can't be parsed, then it also can't be an
> > attack vector (assuming the parser used for the fsck check and the
> > parser used by the victim behave identically).
> 
> Hmm, yeah, but I would have to provide a means of suppressing
> the config parser error messages. Something I wanted to avoid. :(

Yes, though I don't think it's too bad. We already have a "die_on_error"
flag in the config code. I think it just needs to become a tristate:
die/error/silent (and probably get passed in via config_options, since I
think we tie it right now to the file/blob source).

> Junio, do you want me to address the above 'rejected push'
> issue in this patch, or with a follow-up patch? (It should
> be a pretty rare problem - famous last words!)

Hmm, if we end up doing the config thing above, then this patch would
become unnecessary.

And I think doing that would help _all_ cases, even ones without a
skiplist. They don't need to see random config error messages either,
even if we do eventually report an fsck error.

-Peff

  parent reply	other threads:[~2018-06-28 17:45 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-27 18:39 [PATCH] fsck: check skiplist for object in fsck_blob() Ramsay Jones
2018-06-28 11:49 ` Jeff King
2018-06-28 16:39   ` Junio C Hamano
2018-06-28 17:30     ` Jeff King
2018-06-28 16:56   ` Ramsay Jones
2018-06-28 17:28     ` Junio C Hamano
2018-06-28 17:45     ` Jeff King [this message]
2018-06-28 18:53       ` Ramsay Jones
2018-06-28 22:03         ` Jeff King
2018-06-28 22:05           ` [PATCH 1/4] config: turn die_on_error into caller-facing enum Jeff King
2018-06-28 22:05           ` [PATCH 2/4] config: add CONFIG_ERROR_SILENT handler Jeff King
2018-06-28 22:05           ` [PATCH 3/4] config: add options parameter to git_config_from_mem Jeff King
2018-06-28 22:06           ` [PATCH 4/4] fsck: silence stderr when parsing .gitmodules Jeff King
2018-06-28 22:12             ` Jeff King
2018-06-29  1:14               ` Ramsay Jones
2018-06-29  1:10           ` [PATCH] fsck: check skiplist for object in fsck_blob() Ramsay Jones
2018-07-03 14:34             ` Jeff King
2018-07-04  0:12               ` Ramsay Jones
2018-07-07  1:32                 ` Jeff King
2018-07-11 19:31                   ` Ramsay Jones
2018-07-13 19:37                     ` Ramsay Jones
2018-07-13 19:41                       ` Jeff King
2018-07-13 19:46                         ` Jeff King
2018-07-13 20:08                           ` Ramsay Jones
2018-07-13 19:38                     ` Jeff King
2018-07-13 19:39                       ` [PATCH 1/2] fsck: split ".gitmodules too large" error from parse failure Jeff King
2018-07-13 19:39                       ` [PATCH 2/2] fsck: downgrade gitmodulesParse default to "info" Jeff King
2018-07-13 20:21                         ` Stefan Beller
2018-07-16 18:04                         ` Junio C Hamano
2018-07-16 18:30                           ` Jeff King
2018-07-16 21:08                             ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180628174501.GC31766@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=Jason@zx2c4.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=ramsay@ramsayjones.plus.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).