git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] [GSOC] ref-filter: get rid of show_ref_array_item
@ 2021-04-09 13:35 ZheNing Hu via GitGitGadget
  2021-04-16 11:28 ` ZheNing Hu
  2021-04-17  9:11 ` René Scharfe.
  0 siblings, 2 replies; 5+ messages in thread
From: ZheNing Hu via GitGitGadget @ 2021-04-09 13:35 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Junio C Hamano, Christian Couder, Hariom Verma,
	Eric Sunshine, Derrick Stolee, René Scharfe, ZheNing Hu,
	ZheNing Hu

From: ZheNing Hu <adlternative@gmail.com>

When we use `git for-each-ref`, every ref will call
`show_ref_array_item()` and allocate its own final strbuf.
But we can reuse the final strbuf for each step ref's output.
Since `show_ref_array_item()` is not used in many places,
we can directly delete `show_ref_array_item()` and use the
same logic code to replace it. In this way, the caller can
clearly see how the loop work.

The performance for `git for-each-ref` on the Git repository
itself with performance testing tool `hyperfine` changes from
23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms.

At the same time, we apply this optimization to `git tag -l`
and `git branch -l`, the `git branch -l` performance upgrade
from 5.8 ms ± 0.8 ms to 2.7 ms ± 0.2 ms and `git tag -l`
performance upgrade from 5.9 ms ± 0.4 ms to 5.4 ms ± 0.4 ms.
Since the number of tags in git.git is much more than branches,
so this shows that the optimization will be more obvious in
those repositories that contain a small number of objects.

This approach is similar to the one used by 79ed0a5
(cat-file: use a single strbuf for all output, 2018-08-14)
to speed up the cat-file builtin.

Signed-off-by: ZheNing Hu <adlternative@gmail.com>
---
    [GSOC] ref-filter: get rid of show_ref_array_item
    
    Now git for-each-ref can reuse final buf for all refs output, the
    performance is slightly improved, This optimization is also applied to
    git tag -l and git branch -l.
    
    Thanks.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-928%2Fadlternative%2Fref-filter-reuse-buf-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-928/adlternative/ref-filter-reuse-buf-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/928

 builtin/branch.c       |  8 ++++----
 builtin/for-each-ref.c | 13 +++++++++++--
 builtin/tag.c          | 13 +++++++++++--
 ref-filter.c           | 24 +++++++++---------------
 ref-filter.h           |  2 --
 5 files changed, 35 insertions(+), 25 deletions(-)

diff --git a/builtin/branch.c b/builtin/branch.c
index bcc00bcf182d..5c797e992aa4 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -411,6 +411,8 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
 {
 	int i;
 	struct ref_array array;
+	struct strbuf out = STRBUF_INIT;
+	struct strbuf err = STRBUF_INIT;
 	int maxwidth = 0;
 	const char *remote_prefix = "";
 	char *to_free = NULL;
@@ -440,8 +442,7 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
 	ref_array_sort(sorting, &array);
 
 	for (i = 0; i < array.nr; i++) {
-		struct strbuf out = STRBUF_INIT;
-		struct strbuf err = STRBUF_INIT;
+		strbuf_reset(&out);
 		if (format_ref_array_item(array.items[i], format, &out, &err))
 			die("%s", err.buf);
 		if (column_active(colopts)) {
@@ -452,10 +453,9 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
 			fwrite(out.buf, 1, out.len, stdout);
 			putchar('\n');
 		}
-		strbuf_release(&err);
-		strbuf_release(&out);
 	}
 
+	strbuf_release(&out);
 	ref_array_clear(&array);
 	free(to_free);
 }
diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index cb9c81a04606..157cc8623949 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -22,6 +22,8 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 	struct ref_array array;
 	struct ref_filter filter;
 	struct ref_format format = REF_FORMAT_INIT;
+	struct strbuf output = STRBUF_INIT;
+	struct strbuf err = STRBUF_INIT;
 
 	struct option opts[] = {
 		OPT_BIT('s', "shell", &format.quote_style,
@@ -80,8 +82,15 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
 
 	if (!maxcount || array.nr < maxcount)
 		maxcount = array.nr;
-	for (i = 0; i < maxcount; i++)
-		show_ref_array_item(array.items[i], &format);
+	for (i = 0; i < maxcount; i++) {
+		strbuf_reset(&output);
+		if (format_ref_array_item(array.items[i], &format, &output, &err))
+			die("%s", err.buf);
+		fwrite(output.buf, 1, output.len, stdout);
+		putchar('\n');
+	}
+
+	strbuf_release(&output);
 	ref_array_clear(&array);
 	return 0;
 }
diff --git a/builtin/tag.c b/builtin/tag.c
index d403417b5625..b172f3861413 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -39,6 +39,8 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting,
 		     struct ref_format *format)
 {
 	struct ref_array array;
+	struct strbuf output = STRBUF_INIT;
+	struct strbuf err = STRBUF_INIT;
 	char *to_free = NULL;
 	int i;
 
@@ -63,8 +65,15 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting,
 	filter_refs(&array, filter, FILTER_REFS_TAGS);
 	ref_array_sort(sorting, &array);
 
-	for (i = 0; i < array.nr; i++)
-		show_ref_array_item(array.items[i], format);
+	for (i = 0; i < array.nr; i++) {
+		strbuf_reset(&output);
+		if (format_ref_array_item(array.items[i], format, &output, &err))
+			die("%s", err.buf);
+		fwrite(output.buf, 1, output.len, stdout);
+		putchar('\n');
+	}
+
+	strbuf_release(&output);
 	ref_array_clear(&array);
 	free(to_free);
 
diff --git a/ref-filter.c b/ref-filter.c
index f0bd32f71416..9e38e42fb7ae 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -2435,27 +2435,21 @@ int format_ref_array_item(struct ref_array_item *info,
 	return 0;
 }
 
-void show_ref_array_item(struct ref_array_item *info,
-			 const struct ref_format *format)
-{
-	struct strbuf final_buf = STRBUF_INIT;
-	struct strbuf error_buf = STRBUF_INIT;
-
-	if (format_ref_array_item(info, format, &final_buf, &error_buf))
-		die("%s", error_buf.buf);
-	fwrite(final_buf.buf, 1, final_buf.len, stdout);
-	strbuf_release(&error_buf);
-	strbuf_release(&final_buf);
-	putchar('\n');
-}
-
 void pretty_print_ref(const char *name, const struct object_id *oid,
 		      const struct ref_format *format)
 {
 	struct ref_array_item *ref_item;
+	struct strbuf output = STRBUF_INIT;
+	struct strbuf err = STRBUF_INIT;
+
 	ref_item = new_ref_array_item(name, oid);
 	ref_item->kind = ref_kind_from_refname(name);
-	show_ref_array_item(ref_item, format);
+	if (format_ref_array_item(ref_item, format, &output, &err))
+		die("%s", err.buf);
+	fwrite(output.buf, 1, output.len, stdout);
+	putchar('\n');
+
+	strbuf_release(&output);
 	free_array_item(ref_item);
 }
 
diff --git a/ref-filter.h b/ref-filter.h
index 19ea4c413409..baf72a718965 100644
--- a/ref-filter.h
+++ b/ref-filter.h
@@ -119,8 +119,6 @@ int format_ref_array_item(struct ref_array_item *info,
 			  const struct ref_format *format,
 			  struct strbuf *final_buf,
 			  struct strbuf *error_buf);
-/*  Print the ref using the given format and quote_style */
-void show_ref_array_item(struct ref_array_item *info, const struct ref_format *format);
 /*  Parse a single sort specifier and add it to the list */
 void parse_ref_sorting(struct ref_sorting **sorting_tail, const char *atom);
 /*  Callback function for parsing the sort option */

base-commit: 2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] [GSOC] ref-filter: get rid of show_ref_array_item
  2021-04-09 13:35 [PATCH] [GSOC] ref-filter: get rid of show_ref_array_item ZheNing Hu via GitGitGadget
@ 2021-04-16 11:28 ` ZheNing Hu
  2021-04-17  9:11 ` René Scharfe.
  1 sibling, 0 replies; 5+ messages in thread
From: ZheNing Hu @ 2021-04-16 11:28 UTC (permalink / raw)
  To: Jeff King
  Cc: Git List, Junio C Hamano, Christian Couder, Hariom Verma,
	Eric Sunshine, Derrick Stolee, René Scharfe

ZheNing Hu via GitGitGadget <gitgitgadget@gmail.com> 于2021年4月9日周五 下午9:35写道:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> When we use `git for-each-ref`, every ref will call
> `show_ref_array_item()` and allocate its own final strbuf.
> But we can reuse the final strbuf for each step ref's output.
> Since `show_ref_array_item()` is not used in many places,
> we can directly delete `show_ref_array_item()` and use the
> same logic code to replace it. In this way, the caller can
> clearly see how the loop work.
>
> The performance for `git for-each-ref` on the Git repository
> itself with performance testing tool `hyperfine` changes from
> 23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms.
>
> At the same time, we apply this optimization to `git tag -l`
> and `git branch -l`, the `git branch -l` performance upgrade
> from 5.8 ms ± 0.8 ms to 2.7 ms ± 0.2 ms and `git tag -l`
> performance upgrade from 5.9 ms ± 0.4 ms to 5.4 ms ± 0.4 ms.
> Since the number of tags in git.git is much more than branches,
> so this shows that the optimization will be more obvious in
> those repositories that contain a small number of objects.
>
> This approach is similar to the one used by 79ed0a5
> (cat-file: use a single strbuf for all output, 2018-08-14)
> to speed up the cat-file builtin.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>     [GSOC] ref-filter: get rid of show_ref_array_item
>
>     Now git for-each-ref can reuse final buf for all refs output, the
>     performance is slightly improved, This optimization is also applied to
>     git tag -l and git branch -l.
>
>     Thanks.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-928%2Fadlternative%2Fref-filter-reuse-buf-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-928/adlternative/ref-filter-reuse-buf-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/928
>
>  builtin/branch.c       |  8 ++++----
>  builtin/for-each-ref.c | 13 +++++++++++--
>  builtin/tag.c          | 13 +++++++++++--
>  ref-filter.c           | 24 +++++++++---------------
>  ref-filter.h           |  2 --
>  5 files changed, 35 insertions(+), 25 deletions(-)
>
> diff --git a/builtin/branch.c b/builtin/branch.c
> index bcc00bcf182d..5c797e992aa4 100644
> --- a/builtin/branch.c
> +++ b/builtin/branch.c
> @@ -411,6 +411,8 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
>  {
>         int i;
>         struct ref_array array;
> +       struct strbuf out = STRBUF_INIT;
> +       struct strbuf err = STRBUF_INIT;
>         int maxwidth = 0;
>         const char *remote_prefix = "";
>         char *to_free = NULL;
> @@ -440,8 +442,7 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
>         ref_array_sort(sorting, &array);
>
>         for (i = 0; i < array.nr; i++) {
> -               struct strbuf out = STRBUF_INIT;
> -               struct strbuf err = STRBUF_INIT;
> +               strbuf_reset(&out);
>                 if (format_ref_array_item(array.items[i], format, &out, &err))
>                         die("%s", err.buf);
>                 if (column_active(colopts)) {
> @@ -452,10 +453,9 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
>                         fwrite(out.buf, 1, out.len, stdout);
>                         putchar('\n');
>                 }
> -               strbuf_release(&err);
> -               strbuf_release(&out);
>         }
>
> +       strbuf_release(&out);
>         ref_array_clear(&array);
>         free(to_free);
>  }
> diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
> index cb9c81a04606..157cc8623949 100644
> --- a/builtin/for-each-ref.c
> +++ b/builtin/for-each-ref.c
> @@ -22,6 +22,8 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
>         struct ref_array array;
>         struct ref_filter filter;
>         struct ref_format format = REF_FORMAT_INIT;
> +       struct strbuf output = STRBUF_INIT;
> +       struct strbuf err = STRBUF_INIT;
>
>         struct option opts[] = {
>                 OPT_BIT('s', "shell", &format.quote_style,
> @@ -80,8 +82,15 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
>
>         if (!maxcount || array.nr < maxcount)
>                 maxcount = array.nr;
> -       for (i = 0; i < maxcount; i++)
> -               show_ref_array_item(array.items[i], &format);
> +       for (i = 0; i < maxcount; i++) {
> +               strbuf_reset(&output);
> +               if (format_ref_array_item(array.items[i], &format, &output, &err))
> +                       die("%s", err.buf);
> +               fwrite(output.buf, 1, output.len, stdout);
> +               putchar('\n');
> +       }
> +
> +       strbuf_release(&output);
>         ref_array_clear(&array);
>         return 0;
>  }
> diff --git a/builtin/tag.c b/builtin/tag.c
> index d403417b5625..b172f3861413 100644
> --- a/builtin/tag.c
> +++ b/builtin/tag.c
> @@ -39,6 +39,8 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting,
>                      struct ref_format *format)
>  {
>         struct ref_array array;
> +       struct strbuf output = STRBUF_INIT;
> +       struct strbuf err = STRBUF_INIT;
>         char *to_free = NULL;
>         int i;
>
> @@ -63,8 +65,15 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting,
>         filter_refs(&array, filter, FILTER_REFS_TAGS);
>         ref_array_sort(sorting, &array);
>
> -       for (i = 0; i < array.nr; i++)
> -               show_ref_array_item(array.items[i], format);
> +       for (i = 0; i < array.nr; i++) {
> +               strbuf_reset(&output);
> +               if (format_ref_array_item(array.items[i], format, &output, &err))
> +                       die("%s", err.buf);
> +               fwrite(output.buf, 1, output.len, stdout);
> +               putchar('\n');
> +       }
> +
> +       strbuf_release(&output);
>         ref_array_clear(&array);
>         free(to_free);
>
> diff --git a/ref-filter.c b/ref-filter.c
> index f0bd32f71416..9e38e42fb7ae 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -2435,27 +2435,21 @@ int format_ref_array_item(struct ref_array_item *info,
>         return 0;
>  }
>
> -void show_ref_array_item(struct ref_array_item *info,
> -                        const struct ref_format *format)
> -{
> -       struct strbuf final_buf = STRBUF_INIT;
> -       struct strbuf error_buf = STRBUF_INIT;
> -
> -       if (format_ref_array_item(info, format, &final_buf, &error_buf))
> -               die("%s", error_buf.buf);
> -       fwrite(final_buf.buf, 1, final_buf.len, stdout);
> -       strbuf_release(&error_buf);
> -       strbuf_release(&final_buf);
> -       putchar('\n');
> -}
> -
>  void pretty_print_ref(const char *name, const struct object_id *oid,
>                       const struct ref_format *format)
>  {
>         struct ref_array_item *ref_item;
> +       struct strbuf output = STRBUF_INIT;
> +       struct strbuf err = STRBUF_INIT;
> +
>         ref_item = new_ref_array_item(name, oid);
>         ref_item->kind = ref_kind_from_refname(name);
> -       show_ref_array_item(ref_item, format);
> +       if (format_ref_array_item(ref_item, format, &output, &err))
> +               die("%s", err.buf);
> +       fwrite(output.buf, 1, output.len, stdout);
> +       putchar('\n');
> +
> +       strbuf_release(&output);
>         free_array_item(ref_item);
>  }
>
> diff --git a/ref-filter.h b/ref-filter.h
> index 19ea4c413409..baf72a718965 100644
> --- a/ref-filter.h
> +++ b/ref-filter.h
> @@ -119,8 +119,6 @@ int format_ref_array_item(struct ref_array_item *info,
>                           const struct ref_format *format,
>                           struct strbuf *final_buf,
>                           struct strbuf *error_buf);
> -/*  Print the ref using the given format and quote_style */
> -void show_ref_array_item(struct ref_array_item *info, const struct ref_format *format);
>  /*  Parse a single sort specifier and add it to the list */
>  void parse_ref_sorting(struct ref_sorting **sorting_tail, const char *atom);
>  /*  Callback function for parsing the sort option */
>
> base-commit: 2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48
> --
> gitgitgadget

The patch seems to have fallen into the crack.
Jeff and Junio, willing to help?

Thanks!
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] [GSOC] ref-filter: get rid of show_ref_array_item
  2021-04-09 13:35 [PATCH] [GSOC] ref-filter: get rid of show_ref_array_item ZheNing Hu via GitGitGadget
  2021-04-16 11:28 ` ZheNing Hu
@ 2021-04-17  9:11 ` René Scharfe.
  2021-04-18 11:22   ` ZheNing Hu
       [not found]   ` <xmqqeef47s5k.fsf@gitster.g>
  1 sibling, 2 replies; 5+ messages in thread
From: René Scharfe. @ 2021-04-17  9:11 UTC (permalink / raw)
  To: ZheNing Hu via GitGitGadget, git
  Cc: Jeff King, Junio C Hamano, Christian Couder, Hariom Verma,
	Eric Sunshine, Derrick Stolee, ZheNing Hu

Am 09.04.21 um 15:35 schrieb ZheNing Hu via GitGitGadget:
> From: ZheNing Hu <adlternative@gmail.com>
>
> When we use `git for-each-ref`, every ref will call
> `show_ref_array_item()` and allocate its own final strbuf.
> But we can reuse the final strbuf for each step ref's output.
> Since `show_ref_array_item()` is not used in many places,
> we can directly delete `show_ref_array_item()` and use the
> same logic code to replace it. In this way, the caller can
> clearly see how the loop work.

Inlining an exported function that is not providing the right level of
abstraction is a bold move that simplifies the API and can unlock
improvements at the former call sites, like the possibility to reuse an
allocated buffer in this case.  OK.

> The performance for `git for-each-ref` on the Git repository
> itself with performance testing tool `hyperfine` changes from
> 23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms.

I see a speedup as well, but it's within the noise.

> At the same time, we apply this optimization to `git tag -l`
> and `git branch -l`, the `git branch -l` performance upgrade
> from 5.8 ms ± 0.8 ms to 2.7 ms ± 0.2 ms and `git tag -l`
> performance upgrade from 5.9 ms ± 0.4 ms to 5.4 ms ± 0.4 ms.

On my system there's no measurable change with these commands.

Nevertheless I think reusing the buffer across the loops is a good
idea.

> Since the number of tags in git.git is much more than branches,
> so this shows that the optimization will be more obvious in
> those repositories that contain a small number of objects.
>
> This approach is similar to the one used by 79ed0a5
> (cat-file: use a single strbuf for all output, 2018-08-14)
> to speed up the cat-file builtin.
>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
>     [GSOC] ref-filter: get rid of show_ref_array_item
>
>     Now git for-each-ref can reuse final buf for all refs output, the
>     performance is slightly improved, This optimization is also applied to
>     git tag -l and git branch -l.
>
>     Thanks.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-928%2Fadlternative%2Fref-filter-reuse-buf-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-928/adlternative/ref-filter-reuse-buf-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/928
>
>  builtin/branch.c       |  8 ++++----
>  builtin/for-each-ref.c | 13 +++++++++++--
>  builtin/tag.c          | 13 +++++++++++--
>  ref-filter.c           | 24 +++++++++---------------
>  ref-filter.h           |  2 --
>  5 files changed, 35 insertions(+), 25 deletions(-)
>
> diff --git a/builtin/branch.c b/builtin/branch.c
> index bcc00bcf182d..5c797e992aa4 100644
> --- a/builtin/branch.c
> +++ b/builtin/branch.c
> @@ -411,6 +411,8 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
>  {
>  	int i;
>  	struct ref_array array;
> +	struct strbuf out = STRBUF_INIT;
> +	struct strbuf err = STRBUF_INIT;
>  	int maxwidth = 0;
>  	const char *remote_prefix = "";
>  	char *to_free = NULL;
> @@ -440,8 +442,7 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
>  	ref_array_sort(sorting, &array);
>
>  	for (i = 0; i < array.nr; i++) {
> -		struct strbuf out = STRBUF_INIT;
> -		struct strbuf err = STRBUF_INIT;
> +		strbuf_reset(&out);
>  		if (format_ref_array_item(array.items[i], format, &out, &err))

This function didn't call show_ref_array_item() to begin with, so
strictly speaking it's not related to change in the title.  It is a
preexisting example of show_ref_array_item() not being flexible enough,
though.  I think it makes sense to have separate patches for inlining
the function verbatim and reusing the output buffer when
format_ref_array_item() is called in a loop.

>  			die("%s", err.buf);
>  		if (column_active(colopts)) {
> @@ -452,10 +453,9 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
>  			fwrite(out.buf, 1, out.len, stdout);
>  			putchar('\n');
>  		}
> -		strbuf_release(&err);
> -		strbuf_release(&out);
>  	}
>
> +	strbuf_release(&out);

err is no longer released, and it is also not reset in the loop.
That change is not mentioned in the commit message, but it should.
Why is it safe?  Probably because format_ref_array_item() only
populates it if it also returns non-zero and then we end up dying
anyway.

That makes leak checking harder, though -- it's not easy to see if
err hasn't simply been forgotten to be released.  I'd just retain
the strbuf_release() call at the end of the function -- it
shouldn't have a measurable performance impact and documents that
this function is cleaning up after itself.  Thoughts?

>  	ref_array_clear(&array);
>  	free(to_free);
>  }
> diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
> index cb9c81a04606..157cc8623949 100644
> --- a/builtin/for-each-ref.c
> +++ b/builtin/for-each-ref.c
> @@ -22,6 +22,8 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
>  	struct ref_array array;
>  	struct ref_filter filter;
>  	struct ref_format format = REF_FORMAT_INIT;
> +	struct strbuf output = STRBUF_INIT;
> +	struct strbuf err = STRBUF_INIT;
>
>  	struct option opts[] = {
>  		OPT_BIT('s', "shell", &format.quote_style,
> @@ -80,8 +82,15 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix)
>
>  	if (!maxcount || array.nr < maxcount)
>  		maxcount = array.nr;
> -	for (i = 0; i < maxcount; i++)
> -		show_ref_array_item(array.items[i], &format);
> +	for (i = 0; i < maxcount; i++) {
> +		strbuf_reset(&output);
> +		if (format_ref_array_item(array.items[i], &format, &output, &err))
> +			die("%s", err.buf);
> +		fwrite(output.buf, 1, output.len, stdout);
> +		putchar('\n');
> +	}
> +
> +	strbuf_release(&output);
>  	ref_array_clear(&array);
>  	return 0;
>  }
> diff --git a/builtin/tag.c b/builtin/tag.c
> index d403417b5625..b172f3861413 100644
> --- a/builtin/tag.c
> +++ b/builtin/tag.c
> @@ -39,6 +39,8 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting,
>  		     struct ref_format *format)
>  {
>  	struct ref_array array;
> +	struct strbuf output = STRBUF_INIT;
> +	struct strbuf err = STRBUF_INIT;
>  	char *to_free = NULL;
>  	int i;
>
> @@ -63,8 +65,15 @@ static int list_tags(struct ref_filter *filter, struct ref_sorting *sorting,
>  	filter_refs(&array, filter, FILTER_REFS_TAGS);
>  	ref_array_sort(sorting, &array);
>
> -	for (i = 0; i < array.nr; i++)
> -		show_ref_array_item(array.items[i], format);
> +	for (i = 0; i < array.nr; i++) {
> +		strbuf_reset(&output);
> +		if (format_ref_array_item(array.items[i], format, &output, &err))
> +			die("%s", err.buf);
> +		fwrite(output.buf, 1, output.len, stdout);
> +		putchar('\n');
> +	}
> +
> +	strbuf_release(&output);
>  	ref_array_clear(&array);
>  	free(to_free);
>
> diff --git a/ref-filter.c b/ref-filter.c
> index f0bd32f71416..9e38e42fb7ae 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -2435,27 +2435,21 @@ int format_ref_array_item(struct ref_array_item *info,
>  	return 0;
>  }
>
> -void show_ref_array_item(struct ref_array_item *info,
> -			 const struct ref_format *format)
> -{
> -	struct strbuf final_buf = STRBUF_INIT;
> -	struct strbuf error_buf = STRBUF_INIT;
> -
> -	if (format_ref_array_item(info, format, &final_buf, &error_buf))
> -		die("%s", error_buf.buf);
> -	fwrite(final_buf.buf, 1, final_buf.len, stdout);
> -	strbuf_release(&error_buf);
> -	strbuf_release(&final_buf);
> -	putchar('\n');
> -}
> -
>  void pretty_print_ref(const char *name, const struct object_id *oid,
>  		      const struct ref_format *format)
>  {
>  	struct ref_array_item *ref_item;
> +	struct strbuf output = STRBUF_INIT;
> +	struct strbuf err = STRBUF_INIT;
> +
>  	ref_item = new_ref_array_item(name, oid);
>  	ref_item->kind = ref_kind_from_refname(name);
> -	show_ref_array_item(ref_item, format);
> +	if (format_ref_array_item(ref_item, format, &output, &err))
> +		die("%s", err.buf);
> +	fwrite(output.buf, 1, output.len, stdout);
> +	putchar('\n');
> +
> +	strbuf_release(&output);
>  	free_array_item(ref_item);
>  }
>
> diff --git a/ref-filter.h b/ref-filter.h
> index 19ea4c413409..baf72a718965 100644
> --- a/ref-filter.h
> +++ b/ref-filter.h
> @@ -119,8 +119,6 @@ int format_ref_array_item(struct ref_array_item *info,
>  			  const struct ref_format *format,
>  			  struct strbuf *final_buf,
>  			  struct strbuf *error_buf);
> -/*  Print the ref using the given format and quote_style */
> -void show_ref_array_item(struct ref_array_item *info, const struct ref_format *format);
>  /*  Parse a single sort specifier and add it to the list */
>  void parse_ref_sorting(struct ref_sorting **sorting_tail, const char *atom);
>  /*  Callback function for parsing the sort option */
>
> base-commit: 2e36527f23b7f6ae15e6f21ac3b08bf3fed6ee48
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] [GSOC] ref-filter: get rid of show_ref_array_item
  2021-04-17  9:11 ` René Scharfe.
@ 2021-04-18 11:22   ` ZheNing Hu
       [not found]   ` <xmqqeef47s5k.fsf@gitster.g>
  1 sibling, 0 replies; 5+ messages in thread
From: ZheNing Hu @ 2021-04-18 11:22 UTC (permalink / raw)
  To: René Scharfe.
  Cc: ZheNing Hu via GitGitGadget, Git List, Jeff King, Junio C Hamano,
	Christian Couder, Hariom Verma, Eric Sunshine, Derrick Stolee

René Scharfe. <l.s.r@web.de> 于2021年4月17日周六 下午5:11写道:
>
> Am 09.04.21 um 15:35 schrieb ZheNing Hu via GitGitGadget:
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > When we use `git for-each-ref`, every ref will call
> > `show_ref_array_item()` and allocate its own final strbuf.
> > But we can reuse the final strbuf for each step ref's output.
> > Since `show_ref_array_item()` is not used in many places,
> > we can directly delete `show_ref_array_item()` and use the
> > same logic code to replace it. In this way, the caller can
> > clearly see how the loop work.
>
> Inlining an exported function that is not providing the right level of
> abstraction is a bold move that simplifies the API and can unlock
> improvements at the former call sites, like the possibility to reuse an
> allocated buffer in this case.  OK.
>
> > The performance for `git for-each-ref` on the Git repository
> > itself with performance testing tool `hyperfine` changes from
> > 23.7 ms ± 0.9 ms to 22.2 ms ± 1.0 ms.
>
> I see a speedup as well, but it's within the noise.
>

Yes, the performance improvement is very small under a large number
of refs. It was almost completely drowned out by the noise.

> > At the same time, we apply this optimization to `git tag -l`
> > and `git branch -l`, the `git branch -l` performance upgrade
> > from 5.8 ms ± 0.8 ms to 2.7 ms ± 0.2 ms and `git tag -l`
> > performance upgrade from 5.9 ms ± 0.4 ms to 5.4 ms ± 0.4 ms.
>
> On my system there's no measurable change with these commands.
>

In our case, git branch -l has made obvious progress, but it may be because
the number of branches is far less than tags.

> Nevertheless I think reusing the buffer across the loops is a good
> idea.
>
> > Since the number of tags in git.git is much more than branches,
> > so this shows that the optimization will be more obvious in
> > those repositories that contain a small number of objects.
> >
> > This approach is similar to the one used by 79ed0a5
> > (cat-file: use a single strbuf for all output, 2018-08-14)
> > to speed up the cat-file builtin.
> >
> > Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> > ---
> >     [GSOC] ref-filter: get rid of show_ref_array_item
> >
> >     Now git for-each-ref can reuse final buf for all refs output, the
> >     performance is slightly improved, This optimization is also applied to
> >     git tag -l and git branch -l.
> >
> >     Thanks.
> >
> > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-928%2Fadlternative%2Fref-filter-reuse-buf-v1
> > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-928/adlternative/ref-filter-reuse-buf-v1
> > Pull-Request: https://github.com/gitgitgadget/git/pull/928
> >
> >  builtin/branch.c       |  8 ++++----
> >  builtin/for-each-ref.c | 13 +++++++++++--
> >  builtin/tag.c          | 13 +++++++++++--
> >  ref-filter.c           | 24 +++++++++---------------
> >  ref-filter.h           |  2 --
> >  5 files changed, 35 insertions(+), 25 deletions(-)
> >
> > diff --git a/builtin/branch.c b/builtin/branch.c
> > index bcc00bcf182d..5c797e992aa4 100644
> > --- a/builtin/branch.c
> > +++ b/builtin/branch.c
> > @@ -411,6 +411,8 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
> >  {
> >       int i;
> >       struct ref_array array;
> > +     struct strbuf out = STRBUF_INIT;
> > +     struct strbuf err = STRBUF_INIT;
> >       int maxwidth = 0;
> >       const char *remote_prefix = "";
> >       char *to_free = NULL;
> > @@ -440,8 +442,7 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
> >       ref_array_sort(sorting, &array);
> >
> >       for (i = 0; i < array.nr; i++) {
> > -             struct strbuf out = STRBUF_INIT;
> > -             struct strbuf err = STRBUF_INIT;
> > +             strbuf_reset(&out);
> >               if (format_ref_array_item(array.items[i], format, &out, &err))
>
> This function didn't call show_ref_array_item() to begin with, so
> strictly speaking it's not related to change in the title.  It is a
> preexisting example of show_ref_array_item() not being flexible enough,
> though.  I think it makes sense to have separate patches for inlining
> the function verbatim and reusing the output buffer when
> format_ref_array_item() is called in a loop.
>

I agree with you. I will divide this into a separate patch.

> >                       die("%s", err.buf);
> >               if (column_active(colopts)) {
> > @@ -452,10 +453,9 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
> >                       fwrite(out.buf, 1, out.len, stdout);
> >                       putchar('\n');
> >               }
> > -             strbuf_release(&err);
> > -             strbuf_release(&out);
> >       }
> >
> > +     strbuf_release(&out);
>
> err is no longer released, and it is also not reset in the loop.
> That change is not mentioned in the commit message, but it should.
> Why is it safe?  Probably because format_ref_array_item() only
> populates it if it also returns non-zero and then we end up dying
> anyway.
>
> That makes leak checking harder, though -- it's not easy to see if
> err hasn't simply been forgotten to be released.  I'd just retain
> the strbuf_release() call at the end of the function -- it
> shouldn't have a measurable performance impact and documents that
> this function is cleaning up after itself.  Thoughts?
>

Makes sense. Perhaps future changes will forget the release of err buffer.
I will add `strbuf_release()` here.

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] [GSOC] ref-filter: get rid of show_ref_array_item
       [not found]   ` <xmqqeef47s5k.fsf@gitster.g>
@ 2021-04-21  5:51     ` ZheNing Hu
  0 siblings, 0 replies; 5+ messages in thread
From: ZheNing Hu @ 2021-04-21  5:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: René Scharfe., ZheNing Hu via GitGitGadget, Git List,
	Jeff King, Christian Couder, Hariom Verma, Eric Sunshine,
	Derrick Stolee

Junio C Hamano <gitster@pobox.com> 于2021年4月21日周三 上午2:00写道:
>
> René Scharfe. <l.s.r@web.de> writes:
>
> >> @@ -452,10 +453,9 @@ static void print_ref_list(struct ref_filter *filter, struct ref_sorting *sortin
> >>                      fwrite(out.buf, 1, out.len, stdout);
> >>                      putchar('\n');
> >>              }
> >> -            strbuf_release(&err);
> >> -            strbuf_release(&out);
> >>      }
> >>
> >> +    strbuf_release(&out);
> >
> > err is no longer released, and it is also not reset in the loop.
> > That change is not mentioned in the commit message, but it should.
> > Why is it safe?  Probably because format_ref_array_item() only
> > populates it if it also returns non-zero and then we end up dying
> > anyway.
> >
> > That makes leak checking harder, though -- it's not easy to see if
> > err hasn't simply been forgotten to be released.  I'd just retain
> > the strbuf_release() call at the end of the function -- it
> > shouldn't have a measurable performance impact and documents that
> > this function is cleaning up after itself.  Thoughts?
>
> I should have responded to this comment before it was too late,
> sorry.
>
> I am OK with documenting the assumption that we will die when err
> gets populated without coming out of the loop and not releasing at
> the end (because we would not have anything to release when we got
> there).  I am also OK with resetting(err) in the loop (which will
> be a no-op as err.len would always be 0 at that point, or we would
> have died) and releasing(err) at the end.  I found it a bit funny
> to be not resetting in the loop and releasing at the end, without
> a comment that explains the thought behind it.
>

So a better solution is "without err buffer _reset() and _release()" and explain
the reason for "not needing to be cleaned up" in the commit message?

Thanks.
--
ZheNing Hu

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-04-21  5:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-09 13:35 [PATCH] [GSOC] ref-filter: get rid of show_ref_array_item ZheNing Hu via GitGitGadget
2021-04-16 11:28 ` ZheNing Hu
2021-04-17  9:11 ` René Scharfe.
2021-04-18 11:22   ` ZheNing Hu
     [not found]   ` <xmqqeef47s5k.fsf@gitster.g>
2021-04-21  5:51     ` ZheNing Hu

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).