Date: Mon, 25 Sep 2017 12:08:35 -0400
From: Jeff King
To: Martin Ågren
Cc: Git Mailing List
Subject: Re: [PATCH v2 0/6] reroll ma/plugleaks; more `object_array`-fixes
Message-ID: <20170925160835.aoomjaqrn2o2aosi@sigill.intra.peff.net>
References: <20170920200229.bc4yniz6otng2zyz@sigill.intra.peff.net>
 <20170923043701.4s3xuytp5hdjwmsy@sigill.intra.peff.net>
 <20170923161316.yntnccqrhv5d3jza@sigill.intra.peff.net>
 <20170923163817.7ltmkav2ytk7n43k@sigill.intra.peff.net>

On Sun, Sep 24, 2017 at 09:59:28PM +0200, Martin Ågren wrote:

> > Anyway, doing:
> >
> >   ASAN_OPTIONS=detect_leaks=1:abort_on_error=0:exitcode=0:log_path=/tmp/lsan/output \
> >   make SANITIZE=address,leak test
> >
> > should pass the whole suite and give you a host of files to analyze.
>
> Thanks. My reading of the documentation was off. Turns out exitcode=0
> does not set the exit code to 0, but rather turns it off. Duh.

Actually, the docs are quite confusing. The LSan "exitcode" option
defaults to 23, and that is the exit code you get. So I think it
probably is interpreted as the code to exit with, or 0 for "use the
original exit code".

> > I'm not sure of the best way to count things.
>
> Right. It's a tricky problem. And in the end, all we find out is where
> we allocate and how we got there. Exactly where we lose/leak that piece
> of allocated memory is another question...

For hunting a particular trace, I think you have to walk up the list of
called functions and see where the pointers go out of scope. I'm not
sure how to make that easier (in theory, compiler instrumentation like
LSan could do it by performing a leak check when pointers go out of
scope, but it would also need to know about copies you've made of
pointers, so I imagine it would be extremely slow to run).

But at least on the topic of "how many unique leaks are there", I wrote
the script below to try to give some basic answers. It just finds the
first non-boring entry in each stack trace and reports that, where
"boring" really means "this function is not expected to free, but hands
off memory ownership to somebody else".

You can use it to do:

  perl leaks.pl /tmp/lsan/output.* | sort | uniq -c | sort -rn | head

to see places that leak a lot.
These are either boring calls that need to be annotated, or are
high-value targets for de-leaking. I notice ref-filter.c has quite a few
high entries on the list. I'm not sure yet which case it falls into. :)

The other interesting thing is seeing how many "unique" leaks there are:

  perl leaks.pl /tmp/lsan/output.* | sort -u | wc -l

I get a bit over 800 with a run of the test suite, which is a lot, but
fewer than I expected. And I'm sure quite a few of them are really
"duplicates" that can be eliminated in chunks.

So I don't know how useful any of that will be, but it at least should
give _some_ metric that should be diminishing as we fix leaks.

-- >8 --
#!/usr/bin/perl

use strict;
use warnings;

my $boring = join('|',
	# These are allocation functions that get called from a lot of places.
	qw(
		__interceptor_strdup
		__interceptor_calloc
		realloc
		malloc
		xstrdup
		xcalloc
		strbuf_
		xmemdupz
		xstrvfmt
		xstrfmt
		xstrndup
	),
	# These are really just the revision machinery not getting cleaned up;
	# for many we'd probably want to just UNLEAK() at the apex caller
	qw(
		add_rev_cmdline
		add_object_array_with_path
		add_pending_object_with_path
		add_pending_object_with_mode
		add_pending_object
		handle_revision_arg
		setup_revisions
	),
	# More allocators that drop memory ownership
	qw(
		alloc_ref_with_prefix
		alloc_ref
		copy_ref
		commit_list_insert
		copy_pathspec
	),
);
# Note that '^' binds only to the first alternative; the others can match
# anywhere in the frame text (so "malloc" also covers e.g. "xmalloc").
my $boring_re = qr/^$boring/;

my $skipping;
while (<>) {
	# A stack frame line looks like "#<n> 0x<addr> in <function> <location>".
	if (/^\s*#[0-9]+ 0x[0-9a-f]+ in (.*)/) {
		next if $skipping; # we already reported this trace
		next if $1 =~ $boring_re;
		print $1, "\n";
		$skipping = 1;
	} else {
		# Any non-frame line ends the current trace.
		$skipping = 0;
	}
}
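
The two pipelines above can be folded into one helper to make the
"diminishing metric" easy to record after each run. This is only a
sketch: `leak_summary` is a name invented here (it is not part of Git or
of the script above), and it assumes its standard input holds one leak
site per line, as printed by leaks.pl.

```shell
# Hypothetical helper, not part of git.git: summarize leak sites read
# from stdin (one per line, e.g. the output of leaks.pl).
# Usage: perl leaks.pl /tmp/lsan/output.* | leak_summary [N]
leak_summary () (
	# The function body is a subshell, so "top" and "tmp" do not leak
	# into the caller's environment.
	top=${1:-10}
	tmp=$(mktemp) || exit 1
	cat >"$tmp"

	# Number of distinct leak sites; same as "sort -u | wc -l".
	unique=$(sort -u "$tmp" | wc -l | tr -d ' ')
	printf 'unique leak sites: %s\n' "$unique"

	# The N most frequent sites; same as "sort | uniq -c | sort -rn | head".
	sort "$tmp" | uniq -c | sort -rn | head -n "$top"

	rm -f "$tmp"
)
```

For example, `perl leaks.pl /tmp/lsan/output.* | leak_summary 20` would
print the unique-site count followed by the 20 most frequent sites.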