mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <>
To: Junio C Hamano <>
Cc: Patrick Steinhardt <>,
Subject: Re: [PATCH 0/8] Speed up connectivity checks via quarantine dir
Date: Fri, 21 May 2021 05:30:59 -0400	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <xmqqr1i1t6zl.fsf@gitster.g>

On Fri, May 21, 2021 at 06:45:02AM +0900, Junio C Hamano wrote:

> >   2. This tightening is actually important if we want to avoid letting
> >      people _intentionally_ introduce the unreachable-but-incomplete
> >      scenario. Without it, an easy denial-of-service corruption against
> >      a repository you can push to is:
> >
> >        - push an update to change a ref from X to Y. Include all objects
> > 	 necessary for X..Y, but _also_ include a tree T which points to
> > 	 a missing blob B. This will be accepted by the current rules
> > 	 (but not by your patch).
> >
> >        - push an update to change the ref from Y to C, where C is a
> > 	 commit whose root tree is T. Your patch allows this (because we
> > 	 already have T in the repository). But the resulting repository
> > 	 is corrupt (the ref now points to an incomplete object graph).
> Hmph, the last step of that attack would not work with our current
> check; is this the same new hole the series brings in as you
> explained earlier for a case where a newly pushed tree/commit starts
> to reference a left-over dangling tree already in the repository
> whose content blobs are missing?

Right. The current state is immune to this attack; the rule is "refs
must be complete, but nothing else has to be". The state after Patrick's
series is (I think) likewise immune; the rule is "all objects must be

I'm not sure if that was intentional, or a lucky save because the series
tries to optimize both parts (both which objects we decide to check, as
well as which we consider we "already have"). But it's quite subtle. :)

> > If we wanted to keep the existing rule (requiring that any objects that
> > sender didn't provide are actually reachable from the current refs),
> > then we'd want to be able to check reachability quickly. And there I'd
> > probably turn to reachability bitmaps.
> True.  As we are not "performance is king---a code that corrupts
> repositories as quickly as possible is an improvement" kind of
> project, we should keep the existing "an object can become part of
> DAG referred by ref tips only when the objects it refers to all
> exist in the object store, because we want to keep the invariant: an
> object that is reachable from a ref is guaranteed to have everything
> reachable from it in the object store" rule, and find a way to make
> it fast to enforce that rule somehow.

That's my feeling, too. A rule of "you are not allowed to introduce
objects which directly reference a missing object" would be likewise
correct, if consistently enforced. But it makes me nervous to switch
from one to the other, especially in just one place.

And I guess in that sense, this series isn't immune as I said earlier.
It looks like a push from a shallow clone would use the old "reachable
from refs" check even after this series.


  reply	other threads:[~2021-05-21  9:31 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-19 19:13 [PATCH 0/8] Speed up connectivity checks via quarantine dir Patrick Steinhardt
2021-05-19 19:13 ` [PATCH 1/8] perf: fix when running with TEST_OUTPUT_DIRECTORY Patrick Steinhardt
2021-05-20  2:03   ` Chris Torek
2021-05-19 19:13 ` [PATCH 2/8] p5400: add perf tests for git-receive-pack(1) Patrick Steinhardt
2021-05-20  2:09   ` Chris Torek
2021-05-20 17:04   ` Jeff King
2021-05-21 15:03   ` SZEDER Gábor
2021-05-19 19:13 ` [PATCH 3/8] tmp-objdir: expose function to retrieve path Patrick Steinhardt
2021-05-20  0:16   ` Elijah Newren
2021-05-19 19:13 ` [PATCH 4/8] packfile: have `for_each_file_in_pack_dir()` return error codes Patrick Steinhardt
2021-05-19 19:13 ` [PATCH 5/8] object-file: allow reading loose objects without reading their contents Patrick Steinhardt
2021-05-19 19:13 ` [PATCH 6/8] connected: implement connectivity check via temporary object dirs Patrick Steinhardt
2021-05-19 19:13 ` [PATCH 7/8] receive-pack: skip connectivity checks on delete-only commands Patrick Steinhardt
2021-05-21 18:53   ` Felipe Contreras
2021-05-27 14:38     ` Jeff King
2021-05-19 19:13 ` [PATCH 8/8] receive-pack: check connectivity via quarantined objects Patrick Steinhardt
2021-05-20  2:19 ` [PATCH 0/8] Speed up connectivity checks via quarantine dir Chris Torek
2021-05-20 16:50 ` Jeff King
2021-05-20 21:45   ` Junio C Hamano
2021-05-21  9:30     ` Jeff King [this message]
2021-05-21  9:42   ` Patrick Steinhardt
2021-05-21 11:20   ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

  List information:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).