From: ZheNing Hu <adlternative@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Christian Couder <christian.couder@gmail.com>,
Taylor Blau <ttaylorr@github.com>, git <git@vger.kernel.org>,
Christian Couder <chriscool@tuxfamily.org>,
Hariom verma <hariom18599@gmail.com>
Subject: Re: Git in Outreachy?
Date: Sun, 5 Sep 2021 16:58:57 +0800
Message-ID: <CAOLTT8S1Tfu6YWcoHhZcydQYd_yBBCavdqyV_TzoOrEW6zHXGQ@mail.gmail.com>
In-Reply-To: <YTNrehKnfPo3E5RI@coredump.intra.peff.net>
Jeff King <peff@peff.net> wrote on Sat, Sep 4, 2021 at 8:50 PM:
>
> On Sat, Sep 04, 2021 at 03:40:41PM +0800, ZheNing Hu wrote:
>
> > This may be a place to promote my patches: See [1][2][3].
> > It can provide some extra atoms for git cat-file --batch | --batch-check,
> > like %(tree), %(author), %(tagger) etc. Although some performance
> > optimizations have been made, there is still a small performance gap.
> >
> > If the community still expects git cat-file --batch to reuse the logic
> > of ref-filter,
> > I expect it to get the attention of reviewers.
> >
> > The solutions I can think of to further optimize performance are:
> > 1. Delay the evaluation of some ref-filter intermediate data.
> > 2. Make the ref-filter code reentrant so that it can be called from multiple
> > threads to take advantage of multiple cores.
>
> I don't think trying to thread it will help much. For expensive formats,
> where we have to actually open and parse objects, in theory we could do
> that in parallel. But most of our time there is spent in zlib getting
> the object data, and that all needs to be done under a big lock.
>
This big lock is "obj_read_lock()", right? If the work really is limited by
that lock, then I am afraid the parallel approach will not help much.
> For little formats (e.g., just printing "%(refname)"), we need to
> serialize the output anyway. So our unit of work is so tiny, I suspect
> that the threading overhead would be a net negative.
>
Makes sense.
> I was coincidentally looking at ref-filter last week, and it seemed to
> me that a lot of the slowness is because of the over-use of malloc
Agreed. malloc() and data copying are the reasons for the poor performance of
ref-filter.
> (e.g., we allocate a substring for every atom_value, and then form them
> into a separate buffer). If we could parse the original format into a
> form that could be traversed without having to do further allocations,
> just writing directly to a strbuf (or even a file handle), I think that
> would be a big improvement.
>
This patch tries to eliminate some of the malloc() calls and data copying:
https://lore.kernel.org/git/3760ff032bb1dec3812881fd408f8d78ec125477.1629184489.git.gitgitgadget@gmail.com/
It does indeed yield some optimization.
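To make the "parse the format once, then write directly to the output
buffer" idea concrete, here is a rough sketch. It is not ref-filter code and
it is not what the patch above does. It is written in Python only to keep it
short, so it shows the shape of the approach rather than the malloc() savings
themselves. All names (compile_format, write_record, the record dictionaries)
are made up for illustration.

```python
import io
import re

# Hypothetical names throughout; this is not the ref-filter API.
ATOM_RE = re.compile(r"%\(([^)]+)\)")


def compile_format(fmt):
    """Parse the format string once into ("lit", text) / ("atom", name) segments."""
    segments = []
    pos = 0
    for match in ATOM_RE.finditer(fmt):
        if match.start() > pos:
            # Literal text between the previous atom and this one.
            segments.append(("lit", fmt[pos:match.start()]))
        segments.append(("atom", match.group(1)))
        pos = match.end()
    if pos < len(fmt):
        segments.append(("lit", fmt[pos:]))
    return segments


def write_record(out, segments, record):
    """Walk the pre-parsed segments and write each piece straight to `out`."""
    for kind, value in segments:
        out.write(value if kind == "lit" else record[value])


# The format is parsed once, outside the per-object loop.
segments = compile_format("%(objectname) %(refname)\n")

out = io.StringIO()
for record in (
    {"objectname": "1234abcd", "refname": "refs/heads/main"},
    {"objectname": "5678ef00", "refname": "refs/tags/v1.0"},
):
    write_record(out, segments, record)

print(out.getvalue(), end="")
```

In C, the same structure would let the per-object loop append to a single
strbuf (or write to a file handle) instead of building one allocated
substring per atom and then copying it into a separate buffer.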
> I just posted the results of some of my experiments to the list:
>
> https://lore.kernel.org/git/YTNpQ7Od1U%2F5i0R7@coredump.intra.peff.net/
>
> I don't think that gives any kind of useful base to build on, but it
> shows what's possible by skipping past various segments of the
> ref-filter code.
>
> -Peff
Thanks.
--
ZheNing Hu