From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net X-Spam-Level: X-Spam-ASN: AS53758 23.128.96.0/24 X-Spam-Status: No, score=-3.9 required=3.0 tests=AWL,BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI, SPF_HELO_PASS,SPF_PASS shortcircuit=no autolearn=ham autolearn_force=no version=3.4.2 Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by dcvr.yhbt.net (Postfix) with ESMTP id 1D1CE1F4B4 for ; Mon, 5 Apr 2021 22:17:48 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241421AbhDEWRY (ORCPT ); Mon, 5 Apr 2021 18:17:24 -0400 Received: from cloud.peff.net ([104.130.231.41]:42074 "EHLO cloud.peff.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241313AbhDEWRX (ORCPT ); Mon, 5 Apr 2021 18:17:23 -0400 Received: (qmail 3093 invoked by uid 109); 5 Apr 2021 22:17:16 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Mon, 05 Apr 2021 22:17:16 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 22116 invoked by uid 111); 5 Apr 2021 22:17:16 -0000 Received: from coredump.intra.peff.net (HELO sigill.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Mon, 05 Apr 2021 18:17:16 -0400 Authentication-Results: peff.net; auth=none Date: Mon, 5 Apr 2021 18:17:15 -0400 From: Jeff King To: ZheNing Hu via GitGitGadget Cc: git@vger.kernel.org, Junio C Hamano , Christian Couder , Hariom Verma , ZheNing Hu Subject: Re: [PATCH] [GSOC] ref-filter: use single strbuf for all output Message-ID: References: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org On Mon, Apr 05, 2021 at 02:01:19PM +0000, ZheNing Hu via GitGitGadget wrote: > When we use `git for-each-ref`, every ref will call > `show_ref_array_item()` and allocate its own final strbuf > and error strbuf. Instead, we can provide two single strbuf: > final_buf and error_buf that get reused for each output. > > When run it 100 times: > > $ git for-each-ref > > on git.git : > > 3.19s user > 3.88s system > 35% cpu > 20.199 total > > to: > > 2.89s user > 4.00s system > 34% cpu > 19.741 total That's a bigger performance improvement than I'd expect from this. I'm having trouble reproducing it here (I get the same time before and after). I also notice that you don't seem to be CPU bound, and we spend most of our time on system CPU (so program startup stuff, not the loop you're optimizing). I think a more interesting test is timing a single invocation with a large number of refs. If you are planning to do a lot of work on the formatting code, it might be worth adding such a test into t/perf (both to show off results, but also to catch any regressions after we make things faster). > This patch learned Jeff King's optimization measures in git > cat-file(79ed0a5): using a single strbuf for all objects output Instead > of allocating a large number of small strbuf for every object. I do think this is a good direction (for all the reasons laid out in 79ed0a5), though it wasn't actually the part I was most worried about for ref-filter performance. The bigger issue in ref-filter is that each individual atom will generally have its own allocated string, and then we'll format all of the atoms into the final strbuf output. In most cases we could avoid those intermediate copies entirely. I do think this would be a useful optimization to have in addition, though. As for the patch itself, I looked over the review that Eric gave, and he already said everything I would have. :) -Peff