From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Patrick Steinhardt <ps@pks.im>, git@vger.kernel.org
Subject: Re: [PATCH 0/8] Speed up connectivity checks via quarantine dir
Date: Fri, 21 May 2021 05:30:59 -0400 [thread overview]
Message-ID: <YKd90xMgfOcuJfpI@coredump.intra.peff.net> (raw)
In-Reply-To: <xmqqr1i1t6zl.fsf@gitster.g>
On Fri, May 21, 2021 at 06:45:02AM +0900, Junio C Hamano wrote:
> > 2. This tightening is actually important if we want to avoid letting
> > people _intentionally_ introduce the unreachable-but-incomplete
> > scenario. Without it, an easy denial-of-service corruption against
> > a repository you can push to is:
> >
> > - push an update to change a ref from X to Y. Include all objects
> > necessary for X..Y, but _also_ include a tree T which points to
> > a missing blob B. This will be accepted by the current rules
> > (but not by your patch).
> >
> > - push an update to change the ref from Y to C, where C is a
> > commit whose root tree is T. Your patch allows this (because we
> > already have T in the repository). But the resulting repository
> > is corrupt (the ref now points to an incomplete object graph).
>
> Hmph, the last step of that attack would not work with our current
> check; is this the same new hole the series brings in as you
> explained earlier for a case where a newly pushed tree/commit starts
> to reference a left-over dangling tree already in the repository
> whose content blobs are missing?
Right. The current state is immune to this attack; the rule is "refs
must be complete, but nothing else has to be". The state after Patrick's
series is (I think) likewise immune; the rule is "all objects must be
complete".
I'm not sure if that was intentional, or a lucky save because the series
tries to optimize both parts (both which objects we decide to check, as
well as which we consider we "already have"). But it's quite subtle. :)
> > If we wanted to keep the existing rule (requiring that any objects that
> > sender didn't provide are actually reachable from the current refs),
> > then we'd want to be able to check reachability quickly. And there I'd
> > probably turn to reachability bitmaps.
>
> True. As we are not "performance is king---a code that corrupts
> repositories as quickly as possible is an improvement" kind of
> project, we should keep the existing "an object can become part of
> DAG referred by ref tips only when the objects it refers to all
> exist in the object store, because we want to keep the invariant: an
> object that is reachable from a ref is guaranteed to have everything
> reachable from it in the object store" rule, and find a way to make
> it fast to enforce that rule somehow.
That's my feeling, too. A rule of "you are not allowed to introduce
objects which directly reference a missing object" would be likewise
correct, if consistently enforced. But it makes me nervous to switch
from one to the other, especially in just one place.
And I guess in that sense, this series isn't immune as I said earlier.
It looks like a push from a shallow clone would use the old "reachable
from refs" check even after this series.
-Peff
next prev parent reply other threads:[~2021-05-21 9:31 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-19 19:13 [PATCH 0/8] Speed up connectivity checks via quarantine dir Patrick Steinhardt
2021-05-19 19:13 ` [PATCH 1/8] perf: fix when running with TEST_OUTPUT_DIRECTORY Patrick Steinhardt
2021-05-20 2:03 ` Chris Torek
2021-05-19 19:13 ` [PATCH 2/8] p5400: add perf tests for git-receive-pack(1) Patrick Steinhardt
2021-05-20 2:09 ` Chris Torek
2021-05-20 17:04 ` Jeff King
2021-05-21 15:03 ` SZEDER Gábor
2021-05-19 19:13 ` [PATCH 3/8] tmp-objdir: expose function to retrieve path Patrick Steinhardt
2021-05-20 0:16 ` Elijah Newren
2021-05-19 19:13 ` [PATCH 4/8] packfile: have `for_each_file_in_pack_dir()` return error codes Patrick Steinhardt
2021-05-19 19:13 ` [PATCH 5/8] object-file: allow reading loose objects without reading their contents Patrick Steinhardt
2021-05-19 19:13 ` [PATCH 6/8] connected: implement connectivity check via temporary object dirs Patrick Steinhardt
2021-05-19 19:13 ` [PATCH 7/8] receive-pack: skip connectivity checks on delete-only commands Patrick Steinhardt
2021-05-21 18:53 ` Felipe Contreras
2021-05-27 14:38 ` Jeff King
2021-05-19 19:13 ` [PATCH 8/8] receive-pack: check connectivity via quarantined objects Patrick Steinhardt
2021-05-20 2:19 ` [PATCH 0/8] Speed up connectivity checks via quarantine dir Chris Torek
2021-05-20 16:50 ` Jeff King
2021-05-20 21:45 ` Junio C Hamano
2021-05-21 9:30 ` Jeff King [this message]
2021-05-21 9:42 ` Patrick Steinhardt
2021-05-21 11:20 ` Ævar Arnfjörð Bjarmason
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YKd90xMgfOcuJfpI@coredump.intra.peff.net \
--to=peff@peff.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=ps@pks.im \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).