From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS3215 2.6.0.0/16 X-Spam-Status: No, score=-3.9 required=3.0 tests=AWL,BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from out1.vger.email (out1.vger.email [IPv6:2620:137:e000::1:20]) by dcvr.yhbt.net (Postfix) with ESMTP id 86F531F54E for ; Thu, 8 Sep 2022 18:42:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232167AbiIHSki (ORCPT ); Thu, 8 Sep 2022 14:40:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43210 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232173AbiIHSke (ORCPT ); Thu, 8 Sep 2022 14:40:34 -0400 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C0546CAC52 for ; Thu, 8 Sep 2022 11:40:29 -0700 (PDT) Received: (qmail 28460 invoked by uid 109); 8 Sep 2022 18:40:29 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Thu, 08 Sep 2022 18:40:29 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 23497 invoked by uid 111); 8 Sep 2022 18:40:29 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Thu, 08 Sep 2022 14:40:29 -0400 Authentication-Results: peff.net; auth=none Date: Thu, 8 Sep 2022 14:40:28 -0400 From: Jeff King To: =?utf-8?B?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason Cc: git@vger.kernel.org, Junio C Hamano Subject: Re: [PATCH] tests: opt "git --config-env" test out of SANITIZE=leak Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Thu, Sep 08, 2022 at 05:42:54PM +0200, Ævar Arnfjörð Bjarmason wrote: > This is arguably not a compiler issue or bug, as the very notion of a > variable being "in scope" as far as the leak checker is concerned is > fuzzy at best in the face of compiler optimizations. > > A similar issue came up before related to the config.c code in > [2]. There the consensus seemed to be to avoid hacks in the C code to > work around these compiler edge cases. In this case however skipping > one test is easy enough. We can deal with these "!SANITIZE_LEAK" > issues later, when we have fewer overall leaks. IMHO this is the wrong approach. It is playing whack-a-mole with individual false positives, when we know the root cause is compiler optimizations. We should be invoking the compiler in a way that gives us the most reliable result. In the long run, when all leaks are plugged, we'd want to ditch the whole SANITIZE_LEAK system anyway. So we are better off preventing false positives than trying to gloss over them. > > Maybe this is a known problem, but it was certainly surprising to me. > > It is surprising, but it's a (at least to me) known "issue". See > e.g. https://lore.kernel.org/git/patch-v3-5.6-9a44204c4c9-20211022T175227Z-avarab@gmail.com/ > for a previous discussion between us. I took the consensus there to be > that we shouldn't bend over backwards to worry about this. Thanks, I'd forgotten about that conversation. I stand by my comments there that we shouldn't be changing the code. But I really think just using -O0 for the leak-check is a better solution than trying to label false positives. > But I see this one slipped through because I locally changed my > default compiler from "clang" to "gcc" (again) a while ago, and thus > missed testing clang at higher optimization levels. Usually it's GCC > that's unhappy about this sort of thing. FWIW, I found this using clang-14, because the gcc on my system (12.2.0) does not seem to find actual leaks. It couldn't see any of the ones in t1060 that I fixed yesterday. > > But if you run this in a debugger, you'll find that under -O2 the "key" > > variable has been optimized out! So LSan is producing the wrong result > > due to the optimization. It doesn't know that "key" is conceptually > > still reachable. > > [...] > > I don't know if we (or the compiler implementors) can call it the > "wrong" result. Just as some warnings only pop up depending on > optimization levels, this behavior also depends on it. I think it's different than the case of warnings. In those cases the optimizations let the compiler see more of what's going on in the code, which lets them find a warn-able offense that they wouldn't otherwise see (e.g., inlining a function lets the compiler see an invariant). Those are real problems in the code that -O0 simply doesn't catch. But this is different: there is no problem in the code. It is just that the optimization of removing the variable does not work well with how the leak-checker works. There is nothing we can do in the code to fix it; the C conceptual model of variable scoping is all we have to work with, and according to that, there is no leak. C's "as if" rule means that the compiler can change the program as long as the behavior is the same. And it is, from the perspective of the rest of the program. The issue is that the leak-checker is itself a sort of undefined behavior; it expects to poke at random parts of the stack at any moment and find a consistent state, which is not something a normal C program can do. > I've seen quite a few of these around die() codepaths. Presumably some > re-arranging/eliminaton of the code around that NORETURN codepath... The compiler probably does this kind of re-arranging and elimination all the time. The reason it comes up around die() is that it means we invoke the leak-checker mid-function. Even though the "key" variable here was eliminated, the compiler would need to arrange that the pointer value is _somewhere_ so that when we hit the end of the function we can call that free(). But when we exit in the middle, we end up doing the leak-check in a funny state. It may be that this is influenced by the NORETURN flag. Presumably "key" is held in a register, and the call to die(), knowing that it does not need to return, may choose not to preserve that register. So turning NORETURN into a noop _might_ help, but only because it's giving the optimizer less information to work with (and it might not if the optimizer can figure it out on its own via inline functions, for example). The root cause is the optimizer, so disabling it is the most complete fix. -Peff