From: Ævar Arnfjörð Bjarmason
To: Glen Choo
Cc: Glen Choo via GitGitGadget, git@vger.kernel.org, Taylor Blau, "brian m.
 carlson", Derrick Stolee, Junio C Hamano, Emily Shaffer, Jonathan Tan
Subject: Re: [PATCH v6 0/5] config: introduce discovery.bare and protected config
Date: Wed, 13 Jul 2022 01:53:24 +0200
References: <220701.861qv5d8v7.gmgdl@evledraar.gmail.com> <220709.86zghj8d6i.gmgdl@evledraar.gmail.com>
Message-ID: <220713.86v8s14y3w.gmgdl@evledraar.gmail.com>
X-Mailing-List: git@vger.kernel.org

On Tue, Jul 12 2022, Glen Choo wrote:

> Thanks for following up. I'm concerned that this thread will be
> unproductive if all we're doing is reiterating our own opinions. I'm ok
> if the conclusion is "agree to disagree", but let's not spend too much
> time talking circles around one another (myself included, of course:)).

Yes, I have not been following up here to merely repeat what's been said
before, but...

> Ævar Arnfjörð Bjarmason writes:
>
>> On Fri, Jul 01 2022, Glen Choo wrote:
>>>> The "more narrow" and "more secure" go hand-in-hand, since if you work
>>>> on such servers you'd turn this to "always" because you want to read
>>>> such config, but then be left vulnerable to the actual (and much rarer)
>>>> exploit we're trying to prevent.
>>>
>>> The point that we're not defending bare repo users is fair, but maybe
>>> the group we're trying to protect isn't really dedicated Git-serving
>>> servers. This exploit requires you to have a bare repo inside the
>>> working tree of a non-bare repo. So I think this is less of an issue for
>>> a server, and more for "mixed-use" environments with both regular and
>>> bare clones.
>>
>> Yes, but this is only something that's even a question because of an
>> artificial limitation your proposal here suffers from.
>>
>> I.e.
>> in trying to detect nefarious repos where you've got "looks like
>> bare" content *tracked* in another repo you're conflating it with *any
>> bare repo*.
>>
>> And the only reason we're doing that seems to me to be a premature
>> optimization.
>
> Right, I hear you. Besides performance,[...]

...have been following up because it's still genuinely unclear to me
what data or design constraints led to this solution. I.e. in [1] you
noted ("[...]" interjection is mine):

    "I don't see how we could implement this [the "walk-up" method]
    without imposing a big penalty to all bare repo users[...]."

[Continued below]

> let me offer the perspective
> that I should have led with in the previous email. In this thread and
> the original "embedded bare repo" one [1], there is a huge diversity of
> opinion on what the default behavior should be, e.g.:

I read that thread over again, and some of the highlights were:

 * brian asking if we can't basically do the "walk up" method:
   https://lore.kernel.org/git/Yk9hONuCIVIq6ieV@camp.crustytoothpaste.net/

 * Taylor wondering how much we need to worry about this attack (among
   other things) & worrying about legitimate "bare repo" workflows being
   broken: https://lore.kernel.org/git/YloTQH35r2xVdPm1@nand.local/ &
   https://lore.kernel.org/git/Ylobp7sntKeWTLDX@nand.local/

But most importantly, here's something I hadn't noticed before:

 * Emily talking about the supposed slowness of the "walk up" method:
   https://lore.kernel.org/git/CAJoAoZkgnnvdymuBsM9Ja3+eYSnyohr=FQZMVX_uzZ_pkQhgaw@mail.gmail.com/

   I.e.:

    "wantonly scanning up the filesystem for any gitdir above the
    current one is really expensive. When I tried that approach for the
    purposes of including some shared config between superproject and
    submodules, it slowed down the Git test suite by something like
    3-5x."

Which I'm now 99.99% certain based on past context[2] is a misstatement
or misrecollection about an early version of
submodule.superprojectGitDir v.s. what setup.c would do. I.e.
that 3-5x slowness referred to git-submodule.sh shelling out to "git
rev-parse"; it's not a reference to the expense of the few syscalls we'd
need to make to discover a parent git directory.

Did you hear about the directory walking being a performance concern
from Emily, or was it an independent discovery?

It seems as though this might have come about because of a
misrecollection about the git-rev-parse(1)/git-submodule.sh v.s. setup.c
performance with reference to submodule.superprojectGitDir, and that
we've now got a design that's optimized to avoid a performance problem
that doesn't exist, at the cost of accuracy.

And, not to reiterate, my concern isn't the performance per se, but
rather that performance concerns seem to have driven the choice of one
design over another.

> - How do we detect an embedded bare repo (fsck check? walk up [and check
>   if it's tracked]?)
> - What to do when we detect one (ignore the config? block the repo?)
> - How to preserve workflows that rely on embedded bare repos (some kind
>   of (global|per-repo) exception list? allow the repo but not the
>   config?)
>
> And rightfully so! There are a lot of options here, so we want to make
> sure we get the defaults right. But at the same time that implies a
> pretty slow, difficult process.

I saw some implementation discussion about how we'd do this with fsck,
which is one thing, but I don't really see the trickiness or ambiguity
on the client side.

I.e. we know when we'd "find a repo", so that's the criterion we'd use
to decide whether to ignore such a contained repo.

The only trickiness seems to come about if the approach we pick is one
where we conflate embedded bare repos with non-embedded bare repos.

> On the other hand, I haven't seen nearly as much disagreement on "just
> refuse to work with bare repos" because it's so restrictive that it
> probably won't be the default. So it'll have no effect on most users,
> but still confers protection for the subset of users who can benefit
> from it.
> For those who want the problem fixed _today_ (e.g. my
> employer), this seems like simple, low-hanging fruit that buys time for
> us to find a good default.
>
> FWIW, when time permits I'd be happy to work on that good default (which
> will probably be some variant of "walk up"), and to pay off the tech
> debt introduced by this implementation (I have some ideas about how we
> could improve the config API to achieve this [2]). Hopefully that helps
> allay some of your concerns?

It really just seems like a dead end to me, sorry. I.e. we know what the
security problem is, but the side-effects of this approach are such that
we'll probably never turn it on by default.

So that'll mean that the vast majority of users who could benefit from
the security mitigation won't even know about the config, or if they do
might not have it turned on.

And yes, we might end up with a better design later, but then we'll
still have to support this config mechanism, potentially deprecate it
etc.

> [1] https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
> [2] https://lore.kernel.org/git/kl6lr13fi9qn.fsf@chooglen-macbookpro.roam.corp.google.com

1. https://lore.kernel.org/git/kl6lee1z8mcm.fsf@chooglen-macbookpro.roam.corp.google.com/
2. https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@evledraar.gmail.com/
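
P.S. To make "the few syscalls we'd need to make to discover a parent
git directory" a bit more concrete, here is a rough sketch of a "walk
up" check in Python. It is an illustration under stated assumptions, not
what setup.c does: the function names and the `ceiling` parameter are
made up for this example, "looks like a bare repo" is approximated by
the presence of HEAD, objects/ and refs/, and the "is it tracked?" part
(which would need an index lookup) is left out entirely.

```python
import os


def looks_like_bare_repo(path):
    """Rough stand-in (an assumption, not git's real check):
    a HEAD file plus objects/ and refs/ directories."""
    return (
        os.path.isfile(os.path.join(path, "HEAD"))
        and os.path.isdir(os.path.join(path, "objects"))
        and os.path.isdir(os.path.join(path, "refs"))
    )


def find_enclosing_worktree(path, ceiling=None):
    """Walk up from the parent of `path` and return the first ancestor
    that contains a ".git" entry (directory or file), else None.

    `ceiling` is a hypothetical parameter: that directory is still
    checked, but nothing above it is."""
    current = os.path.dirname(os.path.abspath(path))
    ceiling = os.path.abspath(ceiling) if ceiling else None
    while True:
        # One existence check per directory level.
        if os.path.lexists(os.path.join(current, ".git")):
            return current
        parent = os.path.dirname(current)
        if current == ceiling or parent == current:
            return None
        current = parent


def is_embedded_bare_repo(path, ceiling=None):
    """True if `path` looks bare and sits inside some worktree."""
    return (
        looks_like_bare_repo(path)
        and find_enclosing_worktree(path, ceiling) is not None
    )
```

In this sketch each step up the tree costs a single existence check, so
a repo nested N directories deep needs at most roughly N of them; a
standalone bare repo outside any worktree is never flagged.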