git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "René Scharfe." <l.s.r@web.de>
To: ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: Jeff King <peff@peff.net>, Junio C Hamano <gitster@pobox.com>,
	Christian Couder <chriscool@tuxfamily.org>,
	Hariom Verma <hariom18599@gmail.com>,
	Eric Sunshine <sunshine@sunshineco.com>,
	Derrick Stolee <stolee@gmail.com>,
	ZheNing Hu <adlternative@gmail.com>
Subject: Re: [PATCH] [GSOC] ref-filter: get rid of show_ref_array_item
Date: Sat, 17 Apr 2021 11:11:14 +0200	[thread overview]
Message-ID: <4c4eded7-3bb3-7ae9-6455-468b9522978c@web.de> (raw)
In-Reply-To: <pull.928.git.1617975348494.gitgitgadget@gmail.com>

Am 09.04.21 um 15:35 schrieb ZheNing Hu via GitGitGadget:
> From: ZheNing Hu <adlternative@gmail.com>
>
> When we use `git for-each-ref`, every ref will call
> `show_ref_array_item()` and allocate its own final strbuf.
> But we can reuse the final strbuf for each step ref's output.
> Since `show_ref_array_item()` is not used in many places,
> we can directly delete `show_ref_array_item()` and use the
> same logic code to replace it. In this way, the caller can
> clearly see how the loop work.

Inlining an exported function that is not providing the right level of
abstraction is a bold move that simplifies the API and can unlock
improvements at the former call sites, like the possibility to reuse an
allocated buffer in this case.  OK.

> The performance for `git for-each-ref` on the Git repository
> itself with performance testing tool `hyperfine` changes from
> 23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms.

I see a speedup as well, but it's within the noise.

> At the same time, we apply this optimization to `git tag -l`
> and `git branch -l`, the `git branch -l` performance upgrade
> from 5.8 ms ± 0.8 ms to 2.7 ms ± 0.2 ms and `git tag -l`
> performance upgrade from 5.9 ms ± 0.4 ms to 5.4 ms ± 0.4 ms.

On my system there's no measurable change with these commands.

Nevertheless I think reusing the buffer across the loops is a good
idea.

> Since the number of tags in git.git is much more than branches,
> so this shows that the optimization will be more obvious in
> those repositories that contain a small number of objects.
>
> This approach is similar to the one used by 79ed0a5
> (cat-file: use a single strbuf for all output, 2018-08-14)
> to speed up the cat-file builtin.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>     [GSOC] ref-filter: get rid of show_ref_array_item
>
>     Now git for-each-ref can reuse final buf for all refs output, the
>     performance is slightly improved, This optimization is also applied to
>     git tag -l and git branch -l.
>
>     Thanks.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-928%2Fadlternative%2Fref-filter-reuse-buf-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-928/adlternative/ref-filter-reuse-buf-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/928
>
>  builtin/branch.c       |  8 ++++----
>  builtin/for-each-ref.c | 13 +++++++++++--
>  builtin/tag.c          | 13 +++++++++++--
>  ref-filter.c           | 24 +++++++++---------------
>  ref-filter.h           |  2 --
>  5 files changed, 35 insertions(+), 25 deletions(-)
>
> diff --git a/builtin/branch.c b/builtin/branch.c
> index bcc00bcf182d..5c797e992aa4 100644
> --- a/builtin/branch.c
> +++ b/builtin/branch.c
> @@ -411,6 +411,8 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
>  {
>  	int i;
>  	struct ref_array array;
> +	struct strbuf out = STRBUF_INIT;
> +	struct strbuf err = STRBUF_INIT;
>  	int maxwidth = 0;
>  	const char *remote_prefix = "";
>  	char *to_free = NULL;
> @@ -440,8 +442,7 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
>  	ref_array_sort(sorting, &array);
>
>  	for (i = 0; i < array.nr; i++) {
> -		struct strbuf out = STRBUF_INIT;
> -		struct strbuf err = STRBUF_INIT;
> +		strbuf_reset(&out);
>  		if (format_ref_array_item(array.items[i], format, &out, &err))

This function didn't call show_ref_array_item() to begin with, so
strictly speaking it's not related to change in the title.  It is a
preexisting example of show_ref_array_item() not being flexible enough,
though.  I think it makes sense to have separate patches for inlining
the function verbatim and reusing the output buffer when
format_ref_array_item() is called in a loop.

>  			die("%s", err.buf);
>  		if (column_active(colopts)) {
> @@ -452,10 +453,9 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
>  			fwrite(out.buf, 1, out.len, stdout);
>  			putchar('\n');
>  		}
> -		strbuf_release(&err);
> -		strbuf_release(&out);
>  	}
>
> +	strbuf_release(&out);

err is no longer released, and it is also not reset in the loop.
That change is not mentioned in the commit message, but it should.
Why is it safe?  Probably because format_ref_array_item() only
populates it if it also returns non-zero and then we end up dying
anyway.

That makes leak checking harder, though -- it's not easy to see if
err hasn't simply been forgotten to be released.  I'd just retain
the strbuf_release() call at the end of the function -- it
shouldn't have a measurable performance impact and documents that
this function is cleaning up after itself.  Thoughts?

>  	ref_array_clear(&array);
>  	free(to_free);
>  }
> diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
> index cb9c81a04606..157cc8623949 100644
> --- a/builtin/for-each-ref.c
> +++ b/builtin/for-each-ref.c
> @@ -22,6 +22,8 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
>  	struct ref_array array;
>  	struct ref_filter filter;
>  	struct ref_format format = REF_FORMAT_INIT;
> +	struct strbuf output = STRBUF_INIT;
> +	struct strbuf err = STRBUF_INIT;
>
>  	struct option opts[] = {
>  		OPT_BIT('s', "shell", &format.quote_style,
> @@ -80,8 +82,15 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
>
>  	if (!maxcount || array.nr < maxcount)
>  		maxcount = array.nr;
> -	for (i = 0; i < maxcount; i++)
> -		show_ref_array_item(array.items[i], &format);
> +	for (i = 0; i < maxcount; i++) {
> +		strbuf_reset(&output);
> +		if (format_ref_array_item(array.items[i], &format, &output, &err))
> +			die("%s", err.buf);
> +		fwrite(output.buf, 1, output.len, stdout);
> +		putchar('\n');
> +	}
> +
> +	strbuf_release(&output);
>  	ref_array_clear(&array);
>  	return 0;
>  }
> diff --git a/builtin/tag.c b/builtin/tag.c
> index d403417b5625..b172f3861413 100644
> --- a/builtin/tag.c
> +++ b/builtin/tag.c
> @@ -39,6 +39,8 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting,
>  		     struct ref_format *format)
>  {
>  	struct ref_array array;
> +	struct strbuf output = STRBUF_INIT;
> +	struct strbuf err = STRBUF_INIT;
>  	char *to_free = NULL;
>  	int i;
>
> @@ -63,8 +65,15 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting,
>  	filter_refs(&array, filter, FILTER_REFS_TAGS);
>  	ref_array_sort(sorting, &array);
>
> -	for (i = 0; i < array.nr; i++)
> -		show_ref_array_item(array.items[i], format);
> +	for (i = 0; i < array.nr; i++) {
> +		strbuf_reset(&output);
> +		if (format_ref_array_item(array.items[i], format, &output, &err))
> +			die("%s", err.buf);
> +		fwrite(output.buf, 1, output.len, stdout);
> +		putchar('\n');
> +	}
> +
> +	strbuf_release(&output);
>  	ref_array_clear(&array);
>  	free(to_free);
>
> diff --git a/ref-filter.c b/ref-filter.c
> index f0bd32f71416..9e38e42fb7ae 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -2435,27 +2435,21 @@ int format_ref_array_item(struct ref_array_item *info,
>  	return 0;
>  }
>
> -void show_ref_array_item(struct ref_array_item *info,
> -			 const struct ref_format *format)
> -{
> -	struct strbuf final_buf = STRBUF_INIT;
> -	struct strbuf error_buf = STRBUF_INIT;
> -
> -	if (format_ref_array_item(info, format, &final_buf, &error_buf))
> -		die("%s", error_buf.buf);
> -	fwrite(final_buf.buf, 1, final_buf.len, stdout);
> -	strbuf_release(&error_buf);
> -	strbuf_release(&final_buf);
> -	putchar('\n');
> -}
> -
>  void pretty_print_ref(const char *name, const struct object_id *oid,
>  		      const struct ref_format *format)
>  {
>  	struct ref_array_item *ref_item;
> +	struct strbuf output = STRBUF_INIT;
> +	struct strbuf err = STRBUF_INIT;
> +
>  	ref_item = new_ref_array_item(name, oid);
>  	ref_item->kind = ref_kind_from_refname(name);
> -	show_ref_array_item(ref_item, format);
> +	if (format_ref_array_item(ref_item, format, &output, &err))
> +		die("%s", err.buf);
> +	fwrite(output.buf, 1, output.len, stdout);
> +	putchar('\n');
> +
> +	strbuf_release(&output);
>  	free_array_item(ref_item);
>  }
>
> diff --git a/ref-filter.h b/ref-filter.h
> index 19ea4c413409..baf72a718965 100644
> --- a/ref-filter.h
> +++ b/ref-filter.h
> @@ -119,8 +119,6 @@ int format_ref_array_item(struct ref_array_item *info,
>  			  const struct ref_format *format,
>  			  struct strbuf *final_buf,
>  			  struct strbuf *error_buf);
> -/*  Print the ref using the given format and quote_style */
> -void show_ref_array_item(struct ref_array_item *info, const struct ref_format *format);
>  /*  Parse a single sort specifier and add it to the list */
>  void parse_ref_sorting(struct ref_sorting **sorting_tail, const char *atom);
>  /*  Callback function for parsing the sort option */
>
> base-commit: 2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48
>

  parent reply	other threads:[~2021-04-17  9:11 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-09 13:35 [PATCH] [GSOC] ref-filter: get rid of show_ref_array_item ZheNing Hu via GitGitGadget
2021-04-16 11:28 ` ZheNing Hu
2021-04-17  9:11 ` René Scharfe. [this message]
2021-04-18 11:22   ` ZheNing Hu
     [not found]   ` <xmqqeef47s5k.fsf@gitster.g>
2021-04-21  5:51     ` ZheNing Hu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4c4eded7-3bb3-7ae9-6455-468b9522978c@web.de \
    --to=l.s.r@web.de \
    --cc=adlternative@gmail.com \
    --cc=chriscool@tuxfamily.org \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=hariom18599@gmail.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).