From: ZheNing Hu <adlternative@gmail.com>
To: "René Scharfe." <l.s.r@web.de>
Cc: ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com>,
Git List <git@vger.kernel.org>, Jeff King <peff@peff.net>,
Junio C Hamano <gitster@pobox.com>,
Christian Couder <chriscool@tuxfamily.org>,
Hariom Verma <hariom18599@gmail.com>,
Eric Sunshine <sunshine@sunshineco.com>,
Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH] [GSOC] ref-filter: get rid of show_ref_array_item
Date: Sun, 18 Apr 2021 19:22:05 +0800 [thread overview]
Message-ID: <CAOLTT8RY_8Ui7AAfU6K7O03DqbMTh0m5GJpmuwOyiV8Ghn+cQg@mail.gmail.com> (raw)
In-Reply-To: <4c4eded7-3bb3-7ae9-6455-468b9522978c@web.de>
René Scharfe. <l.s.r@web.de> 于2021年4月17日周六 下午5:11写道:
>
> Am 09.04.21 um 15:35 schrieb ZheNing Hu via GitGitGadget:
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > When we use `git for-each-ref`, every ref will call
> > `show_ref_array_item()` and allocate its own final strbuf.
> > But we can reuse the final strbuf for each step ref's output.
> > Since `show_ref_array_item()` is not used in many places,
> > we can directly delete `show_ref_array_item()` and use the
> > same logic code to replace it. In this way, the caller can
> > clearly see how the loop work.
>
> Inlining an exported function that is not providing the right level of
> abstraction is a bold move that simplifies the API and can unlock
> improvements at the former call sites, like the possibility to reuse an
> allocated buffer in this case. OK.
>
> > The performance for `git for-each-ref` on the Git repository
> > itself with performance testing tool `hyperfine` changes from
> > 23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms.
>
> I see a speedup as well, but it's within the noise.
>
Yes, the performance improvement is very small under a large number
of refs. It was almost completely drowned out by the noise.
> > At the same time, we apply this optimization to `git tag -l`
> > and `git branch -l`, the `git branch -l` performance upgrade
> > from 5.8 ms ± 0.8 ms to 2.7 ms ± 0.2 ms and `git tag -l`
> > performance upgrade from 5.9 ms ± 0.4 ms to 5.4 ms ± 0.4 ms.
>
> On my system there's no measurable change with these commands.
>
In our case, git branch -l has made obvious progress, but it may be because
the number of branches is far less than tags.
> Nevertheless I think reusing the buffer across the loops is a good
> idea.
>
> > Since the number of tags in git.git is much more than branches,
> > so this shows that the optimization will be more obvious in
> > those repositories that contain a small number of objects.
> >
> > This approach is similar to the one used by 79ed0a5
> > (cat-file: use a single strbuf for all output, 2018-08-14)
> > to speed up the cat-file builtin.
> >
> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> > ---
> > [GSOC] ref-filter: get rid of show_ref_array_item
> >
> > Now git for-each-ref can reuse final buf for all refs output, the
> > performance is slightly improved, This optimization is also applied to
> > git tag -l and git branch -l.
> >
> > Thanks.
> >
> > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-928%2Fadlternative%2Fref-filter-reuse-buf-v1
> > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-928/adlternative/ref-filter-reuse-buf-v1
> > Pull-Request: https://github.com/gitgitgadget/git/pull/928
> >
> > builtin/branch.c | 8 ++++----
> > builtin/for-each-ref.c | 13 +++++++++++--
> > builtin/tag.c | 13 +++++++++++--
> > ref-filter.c | 24 +++++++++---------------
> > ref-filter.h | 2 --
> > 5 files changed, 35 insertions(+), 25 deletions(-)
> >
> > diff --git a/builtin/branch.c b/builtin/branch.c
> > index bcc00bcf182d..5c797e992aa4 100644
> > --- a/builtin/branch.c
> > +++ b/builtin/branch.c
> > @@ -411,6 +411,8 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
> > {
> > int i;
> > struct ref_array array;
> > + struct strbuf out = STRBUF_INIT;
> > + struct strbuf err = STRBUF_INIT;
> > int maxwidth = 0;
> > const char *remote_prefix = "";
> > char *to_free = NULL;
> > @@ -440,8 +442,7 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
> > ref_array_sort(sorting, &array);
> >
> > for (i = 0; i < array.nr; i++) {
> > - struct strbuf out = STRBUF_INIT;
> > - struct strbuf err = STRBUF_INIT;
> > + strbuf_reset(&out);
> > if (format_ref_array_item(array.items[i], format, &out, &err))
>
> This function didn't call show_ref_array_item() to begin with, so
> strictly speaking it's not related to change in the title. It is a
> preexisting example of show_ref_array_item() not being flexible enough,
> though. I think it makes sense to have separate patches for inlining
> the function verbatim and reusing the output buffer when
> format_ref_array_item() is called in a loop.
>
I agree with you. I will divide this into a separate patch.
> > die("%s", err.buf);
> > if (column_active(colopts)) {
> > @@ -452,10 +453,9 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
> > fwrite(out.buf, 1, out.len, stdout);
> > putchar('\n');
> > }
> > - strbuf_release(&err);
> > - strbuf_release(&out);
> > }
> >
> > + strbuf_release(&out);
>
> err is no longer released, and it is also not reset in the loop.
> That change is not mentioned in the commit message, but it should.
> Why is it safe? Probably because format_ref_array_item() only
> populates it if it also returns non-zero and then we end up dying
> anyway.
>
> That makes leak checking harder, though -- it's not easy to see if
> err hasn't simply been forgotten to be released. I'd just retain
> the strbuf_release() call at the end of the function -- it
> shouldn't have a measurable performance impact and documents that
> this function is cleaning up after itself. Thoughts?
>
Makes sense. Perhaps future changes will forget the release of err buffer.
I will add `strbuf_release()` here.
Thanks.
--
ZheNing Hu
next prev parent reply other threads:[~2021-04-18 11:28 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-09 13:35 [PATCH] [GSOC] ref-filter: get rid of show_ref_array_item ZheNing Hu via GitGitGadget
2021-04-16 11:28 ` ZheNing Hu
2021-04-17 9:11 ` René Scharfe.
2021-04-18 11:22 ` ZheNing Hu [this message]
[not found] ` <xmqqeef47s5k.fsf@gitster.g>
2021-04-21 5:51 ` ZheNing Hu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAOLTT8RY_8Ui7AAfU6K7O03DqbMTh0m5GJpmuwOyiV8Ghn+cQg@mail.gmail.com \
--to=adlternative@gmail.com \
--cc=chriscool@tuxfamily.org \
--cc=git@vger.kernel.org \
--cc=gitgitgadget@gmail.com \
--cc=gitster@pobox.com \
--cc=hariom18599@gmail.com \
--cc=l.s.r@web.de \
--cc=peff@peff.net \
--cc=stolee@gmail.com \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).