From: ZheNing Hu <adlternative@gmail.com>
To: Christian Couder <christian.couder@gmail.com>
Cc: "Git List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Stefan Beller" <sbeller@google.com>,
	"Hariom verma" <hariom18599@gmail.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Jeff King" <peff@peff.net>
Subject: Re: [GSOC] [QUESTION] ref-filter: can %(raw) implement reuse oi.content?
Date: Thu, 19 Aug 2021 09:39:33 +0800
Message-ID: <CAOLTT8Qx3=C=MwRmKbrp=G5T_rQVcaLbZfzzO60m7P-_k1qh8A@mail.gmail.com>
In-Reply-To: <CAOLTT8RY213BMTq+wx8yS=0QjY55L1BnCgPHQph1uos2oX03gw@mail.gmail.com>

Hi, Christian and Hariom,

I would like to use this patch series as the provisional final version
of my GSoC project:

https://github.com/adlternative/git/commits/cat-file-reuse-ref-filter-logic

The patch series on the ref-filter-opt-code-logic and
ref-filter-opt-perf branches cannot yet carry their optimizations over
to git cat-file --batch. Therefore, the cat-file-reuse-ref-filter-logic
branch is the most effective option for now.

This is the final performance regression test result:
Test                                        upstream/master   this tree
------------------------------------------------------------------------------------
1006.2: cat-file --batch-check              0.06(0.06+0.00)   0.08(0.07+0.00) +33.3%
1006.3: cat-file --batch-check with atoms   0.06(0.04+0.01)   0.06(0.06+0.00) +0.0%
1006.4: cat-file --batch                    0.49(0.47+0.02)   0.48(0.47+0.01) -2.0%
1006.5: cat-file --batch with atoms         0.48(0.44+0.03)   0.47(0.46+0.01) -2.1%

git cat-file --batch shows a performance improvement of about 2%.
git cat-file --batch-check is still 33.3% slower.

In absolute terms, however, the regression in git cat-file
--batch-check is not very big.

upstream/master (225bc32a98):

$ hyperfine --warmup=10  "~/git/bin-wrappers/git cat-file --batch-check --batch-all-objects"
Benchmark #1: ~/git/bin-wrappers/git cat-file --batch-check --batch-all-objects
 Time (mean ± σ):     596.2 ms ±   5.7 ms    [User: 563.0 ms, System: 32.5 ms]
 Range (min … max):   586.9 ms … 607.9 ms    10 runs

cat-file-reuse-ref-filter-logic (709a0c5c12):

$ hyperfine --warmup=10  "~/git/bin-wrappers/git cat-file --batch-check --batch-all-objects"
Benchmark #1: ~/git/bin-wrappers/git cat-file --batch-check --batch-all-objects
 Time (mean ± σ):     601.3 ms ±   5.8 ms    [User: 566.9 ms, System: 33.9 ms]
 Range (min … max):   596.7 ms … 613.3 ms    10 runs

The mean execution time of git cat-file --batch-check differs by only a
few milliseconds (596.2 ms versus 601.3 ms).
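
To put the two measurements side by side, here is a minimal sketch of
the relative-change arithmetic. The helper name percent_change is
hypothetical and is not part of git or its perf framework; the inputs
are the figures quoted above.

```c
/*
 * Hypothetical helper (not part of git): relative change of `after`
 * with respect to `before`, expressed in percent.
 */
static double percent_change(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

With the perf-test timings, percent_change(0.06, 0.08) is about 33.3,
but both inputs are reported to only two decimal places. With the
hyperfine means, percent_change(596.2, 601.3) is below 1, which is why
the absolute gap looks small.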

But look at the execution time changes of git cat-file --batch:

upstream/master (225bc32a98):

$ time ~/git/bin-wrappers/git cat-file --batch --batch-all-objects >/dev/null
/home/adl/git/bin-wrappers/git cat-file --batch --batch-all-objects >  24.61s user 0.30s system 99% cpu 24.908 total

cat-file-reuse-ref-filter-logic (709a0c5c12):

$ time ~/git/bin-wrappers/git cat-file --batch --batch-all-objects >/dev/null
cat-file --batch --batch-all-objects > /dev/null  25.10s user 0.30s system 99% cpu 25.417 total

The total execution time differs by about 0.5 seconds between the two
runs (24.908 s versus 25.417 s). Intuition tells me that the
performance of git cat-file --batch is the more important of the two.

In fact, the original git cat-file code appends the object data it
obtains directly to the output buffer. With the ref-filter logic, the
object data must first be copied into intermediate data (atom_value)
and only then into the output buffer. At present we cannot easily
eliminate the intermediate data, because git for-each-ref --sort
depends on it heavily, but we can reduce the overhead of copying and
allocating memory as much as possible.

I had an idea that I have not implemented yet: delayed evaluation of
part of the data. More specifically, the specific output content would
be formed only when the data is about to be added to the output buffer.
This may be a way to bypass the intermediate data.
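
As a rough illustration of that idea, here is a minimal sketch. Every
name in it (out_buf, lazy_value, emit_eager, emit_lazy) is hypothetical;
out_buf is only a stand-in for a growable buffer, and none of this is
the actual ref-filter or cat-file code. The eager path copies the object
data into an intermediate value before appending it, while the lazy path
only records a pointer and a length and appends at output time.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable output buffer (stand-in, not git's strbuf). */
struct out_buf {
	char *buf;
	size_t len, alloc;
};

/* Append len bytes to the buffer, growing it and keeping it NUL-terminated. */
static void out_buf_add(struct out_buf *out, const void *data, size_t len)
{
	if (out->len + len + 1 > out->alloc) {
		size_t alloc = (out->len + len + 1) * 2;
		char *p = realloc(out->buf, alloc);
		if (!p)
			abort();
		out->buf = p;
		out->alloc = alloc;
	}
	memcpy(out->buf + out->len, data, len);
	out->len += len;
	out->buf[out->len] = '\0';
}

/* Hypothetical intermediate value that borrows the object data. */
struct lazy_value {
	const char *data;	/* points into the object buffer; not owned */
	size_t len;
};

/* Eager path: copy into an intermediate allocation, then append. */
static void emit_eager(struct out_buf *out, const char *obj, size_t len)
{
	char *copy = malloc(len ? len : 1);
	if (!copy)
		abort();
	memcpy(copy, obj, len);		/* first copy: intermediate data */
	out_buf_add(out, copy, len);	/* second copy: output buffer */
	free(copy);
}

/* Lazy path, step 1: remember where the data lives without copying. */
static struct lazy_value lazy_from(const char *obj, size_t len)
{
	struct lazy_value v = { obj, len };
	return v;
}

/* Lazy path, step 2: the only copy happens when the output is formed. */
static void emit_lazy(struct out_buf *out, const struct lazy_value *v)
{
	out_buf_add(out, v->data, v->len);
}
```

Both paths produce the same output; the lazy one skips one allocation
and one copy per object, at the cost of requiring the object buffer to
stay alive until the value has been emitted.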

Optimistically, I think this patch series can be merged given the
current performance of git cat-file --batch. Of course, it still needs
more suggestions from reviewers.

Thanks.
--
ZheNing Hu
