git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: ZheNing Hu <adlternative@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Christian Couder" <christian.couder@gmail.com>,
	"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com>,
	git <git@vger.kernel.org>, "Hariom Verma" <hariom18599@gmail.com>,
	"Bagas Sanjaya" <bagasdotme@gmail.com>,
	"Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Eric Sunshine" <sunshine@sunshineco.com>
Subject: Re: [PATCH 14/19] [GSOC] cat-file: reuse ref-filter logic
Date: Thu, 15 Jul 2021 09:53:23 +0800	[thread overview]
Message-ID: <CAOLTT8Sa984Eo18QMBeGnMCX3_7sr+9qUYoAR4FS3UF6+CDtGw@mail.gmail.com> (raw)
In-Reply-To: <CAOLTT8RR4+tUuT2yc2PDL9NwCburW8bM_Sht6nhKJ_fYV8fGsQ@mail.gmail.com>

ZheNing Hu <adlternative@gmail.com> 于2021年7月15日周四 上午12:24写道:
>
> Junio C Hamano <gitster@pobox.com> 于2021年7月13日周二 上午4:38写道:
>
> >
> > Christian Couder <christian.couder@gmail.com> writes:
> >
> > More importantly, why is such a fast-path even needed?  Isn't it a
> > sign that the ref-filter implementation is eating more cycles than
> > it should for given set of placeholders?  Do we know where the extra
> > cycles goes?
> >
> > I find it somewhat alarming if we are talking about "fast-path"
> > workaround before understanding why we are seeing slowdown in the
> > first place.
>
> There is no complete conclusion yet, but I try to use time and hyperfine test
> for these commits (t/perf/* is not accurate enough):
>
> ----------------------------------------------------------------------------------------------------------------------------
> |                        subject                                  |
> --batch-check (using hyperfine) |   --batch(using time) |
> ----------------------------------------------------------------------------------------------------------------------------
> |[GSOC] cat-file: use fast path when using default_format         |
>         700ms                |          25.450s      |
> ----------------------------------------------------------------------------------------------------------------------------
> |[GSOC] cat-file: re-implement --textconv, --filters options      |
>         790ms                |          29.933s      |
> ----------------------------------------------------------------------------------------------------------------------------
> |[GSOC] cat-file: reuse err buf in batch_object_write()           |
>         770ms                |          29.153s      |
> ----------------------------------------------------------------------------------------------------------------------------
> |[GSOC] cat-file: reuse ref-filter logic                          |
>         780ms                |          29.412s      |
> ----------------------------------------------------------------------------------------------------------------------------
> |The third batch (upstream/master)                                |
>         640ms                |          26.025s      |
> ----------------------------------------------------------------------------------------------------------------------------
>
> I think we their cost is indeed from "[GSOC] cat-file: reuse ref-filter logic".
> But what causes the loss of performance needs further analysis.
>

Now I think:
There are three main reasons why the performance of cat-file --batch
deteriorates after refactor.

1. Too many copies are used in ref-filter and we cannot avoid these copies
easily because ref-filter needs these copied data to implement atoms %(if),
%(else), %(end)... and the --sort option. The original cat-file
--batch only needs
to output the data to the final string. Its copy times are relatively small.

2. More complex data structure and parsing process are used in ref-filter.
This is why it can provide more and more useful atoms. Therefore, I think the
performance degradation that occurs here is normal.

3. As Ævar Arnfjörð Bjarmason mentioned, oid_object_info_extend() was used
twice in get_object() before. oid_object_info_extend() is the hot
path, we should
try to avoid calling it, So in last version of  "[GSOC] cat-file:
re-implement --textconv,
--filters options", I make the unified processing of --textconv and
--filter avoid calling
oid_object_info_extend() twice.

Thanks.
--
ZheNing Hu

  reply	other threads:[~2021-07-15  1:53 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-12 11:46 [PATCH 00/19] [GSOC] cat-file: reuse ref-filter logic ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 01/19] cat-file: handle trivial --batch format with --batch-all-objects ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 02/19] cat-file: merge two block into one ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 03/19] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 04/19] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 05/19] [GSOC] ref-filter: --format=%(raw) re-support --perl ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 06/19] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 07/19] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 08/19] [GSOC] ref-filter: pass get_object() return value to their callers ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 09/19] [GSOC] ref-filter: introduce free_ref_array_item_value() function ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 10/19] [GSOC] ref-filter: introduce reject_atom() ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 11/19] [GSOC] ref-filter: modify the error message and value in get_object ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 12/19] [GSOC] cat-file: add has_object_file() check ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 13/19] [GSOC] cat-file: change batch_objects parameter name ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 14/19] [GSOC] cat-file: reuse ref-filter logic ZheNing Hu via GitGitGadget
2021-07-12 13:17   ` Christian Couder
2021-07-12 13:26     ` Christian Couder
2021-07-12 13:51       ` ZheNing Hu
2021-07-12 13:49     ` ZheNing Hu
2021-07-12 20:38     ` Junio C Hamano
2021-07-14 16:24       ` ZheNing Hu
2021-07-15  1:53         ` ZheNing Hu [this message]
2021-07-15  9:45           ` Christian Couder
2021-07-15 13:53             ` ZheNing Hu
2021-07-15 14:55     ` ZheNing Hu
2021-07-12 11:46 ` [PATCH 15/19] [GSOC] cat-file: reuse err buf in batch_object_write() ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 16/19] [GSOC] cat-file: re-implement --textconv, --filters options ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 17/19] [GSOC] ref-filter: remove grab_oid() function ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 18/19] [GSOC] cat-file: create p1006-cat-file.sh ZheNing Hu via GitGitGadget
2021-07-12 11:46 ` [PATCH 19/19] [GSOC] cat-file: use fast path when using default_format ZheNing Hu via GitGitGadget
2021-07-12 12:36 ` [PATCH 00/19] [GSOC] cat-file: reuse ref-filter logic Christian Couder
2021-07-12 13:01   ` ZheNing Hu
2021-07-12 13:02 ` Philip Oakley
2021-07-12 13:27   ` ZheNing Hu
2021-07-15 15:40 ` [PATCH v2 00/17] " ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 01/17] [GSOC] ref-filter: add obj-type check in grab contents ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 02/17] [GSOC] ref-filter: add %(raw) atom ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 03/17] [GSOC] ref-filter: --format=%(raw) re-support --perl ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 04/17] [GSOC] ref-filter: use non-const ref_format in *_atom_parser() ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 05/17] [GSOC] ref-filter: add %(rest) atom ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 06/17] [GSOC] ref-filter: pass get_object() return value to their callers ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 07/17] [GSOC] ref-filter: introduce free_ref_array_item_value() function ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 08/17] [GSOC] ref-filter: add cat_file_mode to ref_format ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 09/17] [GSOC] ref-filter: modify the error message and value in get_object ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 10/17] [GSOC] cat-file: add has_object_file() check ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 11/17] [GSOC] cat-file: change batch_objects parameter name ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 12/17] [GSOC] cat-file: create p1006-cat-file.sh ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 13/17] [GSOC] cat-file: reuse ref-filter logic ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 14/17] [GSOC] cat-file: reuse err buf in batch_object_write() ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 15/17] [GSOC] cat-file: re-implement --textconv, --filters options ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 16/17] [GSOC] ref-filter: remove grab_oid() function ZheNing Hu via GitGitGadget
2021-07-15 15:40   ` [PATCH v2 17/17] [GSOC] cat-file: use fast path when using default_format ZheNing Hu via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOLTT8Sa984Eo18QMBeGnMCX3_7sr+9qUYoAR4FS3UF6+CDtGw@mail.gmail.com \
    --to=adlternative@gmail.com \
    --cc=avarab@gmail.com \
    --cc=bagasdotme@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=hariom18599@gmail.com \
    --cc=peff@peff.net \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).