From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS31976 209.132.180.0/23 X-Spam-Status: No, score=-3.2 required=3.0 tests=AWL,BAYES_00, FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS, RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD shortcircuit=no autolearn=ham autolearn_force=no version=3.4.0 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by dcvr.yhbt.net (Postfix) with ESMTP id 0857620958 for ; Wed, 29 Mar 2017 20:04:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932797AbdC2UDj (ORCPT ); Wed, 29 Mar 2017 16:03:39 -0400 Received: from mout.gmx.net ([212.227.17.22]:60029 "EHLO mout.gmx.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752944AbdC2UCs (ORCPT ); Wed, 29 Mar 2017 16:02:48 -0400 Received: from virtualbox ([37.201.193.73]) by mail.gmx.com (mrgmx101 [212.227.17.168]) with ESMTPSA (Nemesis) id 0Metpl-1cVSJk2m2j-00OZW6; Wed, 29 Mar 2017 22:02:33 +0200 Date: Wed, 29 Mar 2017 22:02:17 +0200 (CEST) From: Johannes Schindelin X-X-Sender: virtualbox@virtualbox To: Junio C Hamano cc: git@vger.kernel.org Subject: Re: [PATCH 0/7] PREVIEW: Introduce DC_AND_OPENSSL_SHA1 make flag In-Reply-To: Message-ID: References: User-Agent: Alpine 2.20 (DEB 67 2015-01-07) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Provags-ID: V03:K0:JNnFdxNRYJnh3mzYyIady1ZlLY06ZKTsZ3SNHw7Pr8yPQj52bgP lxR9yMWwLXqHtaeLFy1d7L82zbVMpnknRFrsFkJ10FZbvs3N1oaPQteGwbZWA6hxOAbJV1k oWQOAlBEA7OVu910qNyaVycs565cGg9No3nT1tieO1hb8sfIb1hP7Jal7hoiUB4g8rUZiHN vNbO6tiM7+m6F98IKtw4g== X-UI-Out-Filterresults: notjunk:1;V01:K0:10VzTFl0tSg=:Tluoe5wHN5X2vukSOH3D1q 653puGstJiP1DlC59oQQgaf7+ZSWw76k2VPxY8H3a3cA2mTgpLZV/LtGMa2PVXvD3kSgb2aO1 7Zug/BNoDDyQyx3vdTR4pJcmoBiia3toXI7B+mSHf7ZgYPb1BGQpMPfb15UXK2Asg4Rh1XyGl RLStn9fSs6q8tuJLoQvHcju964leNlYpZwuEac+2+zdD4kHlgVpzsteXipWPi2qSISZlr69Lk 7ayjOtC36KbhilCU/z/IkLAcfVTxVzh85wrLQpzhSwCbjhIF2ZXsWcCX04ixO+o/dp0fyu4vg bTTm4FPSP2ualeiVyxw2hq1/2uzYoFo+yLluhFVfDupmkeq4YFkf4n+RoeTvJZGsoPNHYTqw3 2s2HOQ7Zm23pGRbW5501lz9SsFmG89VC9DeG99kpvVe0jyeZ8+Bbfz4BfQDD5JLFSBgSZwSFK yVrjZEGAcw5hoENdc00i9l23l8ceSzw1k2Xz47APeCgPyI7Q9ZMHqtR9VOkjSe9GCJxV9qtfy aa2KdSx59KD/VBp3epL3MLLuh1N++UVaxeGU/EwOXxjCGc9u3DuN+7lLXRbFF54bGXMqPEMsh v6td/gG6KHyNCc2C3TTtE/Yr26+BhZYUODg9b4ek6tX22RO9kdRgH3rCkEJ6URPornougp1QV 1doOot7lnVTOu1dzBKzyQ1QXbmRsnCpSLef2Qxm6oIGVYJmANGwlmDDx+WFrpmAty50sqE9bm pU4pWuTl77pWTzwNuULwRSRpLFINx4LH/s3C86pyIt81e2mfxYrEweGRQyE/DwuaGFN9dB2ks Auy11IF Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Hi Junio, On Fri, 24 Mar 2017, Junio C Hamano wrote: > Johannes Schindelin writes: > > > - the most important part will be the patch turning core.enableSHA1DC > > into a tristate: "externalOnly" or "smart" or "auto" or something > > indicating that it switches on collision detection only for commands > > that accept objects from an outside source into the local repository, > > such as fetch, fast-import, etc > > There are different uses of SHA-1 hashing in Git, and I do agree > that depending on the use, some of them do not need the overhead for > the collision-attack detection. Indeed. I guess I should have clarified very clearly what I intended to accomplish with this preview. Let me first summarize the background: - Git originally used SHA-1 as a convenient catch-all hashing algorithm, with the security of the hash merely being a nice afterthought. Unfortunately, SHA-1 was hardcoded. Mistakes were made. They always are. - After the SHAttered blog post became public, Linus first made the case that it matters not all that much: the development of the Linux kernel is based on trust, and nobody would pull from a person they do not trust. This approach does obviously not extend to most other projects. - By switching the default to DC_SHA1 already in `master`, you now took the exact opposite position: it *always* matters, even when you trust people, and the 6x slowdown demonstrated by my perf test is something that everybody needs to accept, even if it is spent for no benefit in return. Between these two extremes ("collision attacks do not matter if you run your project based on trust" vs "we always mistrust everybody, including the user's own source code that they stage for commit"), I think there are many shades of green, and I think it would be delusional to believe that we can predict the trust model for each and every Git user [*1*], baking it into a single compile time setting. That is why I wanted to implement a tristate config so that users can adapt Git to their particular scenario. That way, maintainers of precompiled Git packages do not have to dictate to Git users what trust model they should use. One scenario seems to be common, and it is certainly one that I have a direct interest in supporting: inside a company, where the server as well as the developers are implicitly trusted not to fiddle with collision attacks (because there are much easier ways to hide malicious code, let's be frank). And in this scenario, slowing down the SHA-1 computation half an order of magnitude by trying to detect collision attacks is simply unacceptable, because there is no benefit, only cost to it. An even more common scenario is when a developer works on a local repository, adds a couple of files, then runs rebase, then merges, etc. In *none* of these cases does the developer distrust any of the objects flying about. Like above, forcing the developer to accept half an order of magnitude slow down of the SHA-1 computation is something I would consider disrespectful of those developers' time. Note: in this scenario, any object coming from elsewhere would most likely be subject to the collision detection, as the developer may not trust anybody but themselves. In other words: I disagree that, say, `git add` should use the collision detecting SHA-1. I also suspect that you had a much more elaborate (maybe even fragile) strategy than mine in mind when you tried to determine which code paths would need collision detection and which ones would not: we have *no* context in the object-oriented sense whenever we call the object hashing functions. Meaning that you would have to introduce such a context, or to add some sort of thread local state. I have to admit that I do not like either way. The approach I chose instead was to make the switch global, per command. Obviously, the next step is to identify the Git commands which accept objects from external sources (`clone`, `fetch`, `fast-import` and all the remote helpers come to mind) and let them set a global flag before asking git_default_config() to parse the core.enableSHA1DC setting, so that the special value `external` could trigger the collision detecting code for those, and only those, commands. That would still err on the safe side if these commands are used to hash objects originating from the same machine, but that is a price I would be willing to pay for the simplicity of this approach. Does my explanation manage to illustrate in a plausible way why I chose the approach that I did? Ciao, Johannes Footnote *1*: It is equally delusional, of course, to claim that every Git user can and should configure and compile Git for themselves.