From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS3215 2.6.0.0/16 X-Spam-Status: No, score=-3.8 required=3.0 tests=AWL,BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from out1.vger.email (out1.vger.email [IPv6:2620:137:e000::1:20]) by dcvr.yhbt.net (Postfix) with ESMTP id B79C71F506 for ; Thu, 22 Sep 2022 10:14:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230220AbiIVKON (ORCPT ); Thu, 22 Sep 2022 06:14:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53334 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231303AbiIVKNs (ORCPT ); Thu, 22 Sep 2022 06:13:48 -0400 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 33A2AD98C1 for ; Thu, 22 Sep 2022 03:13:37 -0700 (PDT) Received: (qmail 5661 invoked by uid 109); 22 Sep 2022 10:13:37 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Thu, 22 Sep 2022 10:13:37 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 983 invoked by uid 111); 22 Sep 2022 10:13:39 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Thu, 22 Sep 2022 06:13:39 -0400 Authentication-Results: peff.net; auth=none Date: Thu, 22 Sep 2022 06:13:36 -0400 From: Jeff King To: John Cai Cc: git , Christian Couder Subject: [PATCH 2/3] fsck: turn off save_commit_buffer Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org When parsing a commit, the default behavior is to stuff the original buffer into a commit_slab (which takes ownership of it). But for a tool like fsck, this isn't useful. While we may look at the buffer further as part of fsck_commit(), we'll always do so through a separate pointer; attaching the buffer to the slab doesn't help. Worse, it means we have to remember to free the commit buffer in all call paths. We do so in fsck_obj(), which covers a regular "git fsck". But with "--connectivity-only", we forget to do so in both traverse_one_object(), which covers reachable objects, and mark_unreachable_referents(), which covers unreachable ones. As a result, that mode ends up storing an uncompressed copy of every commit on the heap at once. We could teach the code paths for --connectivity-only to also free commit buffers. But there's an even easier fix: we can just turn off the save_commit_buffer flag, and then we won't attach them to the commits in the first place. This reduces the peak heap of running "git fsck --connectivity-only" in a clone of linux.git from ~2GB to ~1GB. According to massif, the remaining memory goes where you'd expect: the object structs themselves, the obj_hash containing them, and the delta base cache. Note that we'll leave the call to free commit buffers in fsck_obj() for now; it's not quite redundant because of a related bug that we'll fix in a subsequent commit. Signed-off-by: Jeff King --- builtin/fsck.c | 1 + 1 file changed, 1 insertion(+) diff --git a/builtin/fsck.c b/builtin/fsck.c index 34e575a170..b45de003d4 100644 --- a/builtin/fsck.c +++ b/builtin/fsck.c @@ -855,6 +855,7 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) errors_found = 0; read_replace_refs = 0; + save_commit_buffer = 0; argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0); -- 2.38.0.rc1.583.ga560cd8328