git@vger.kernel.org mailing list mirror (one of many)
From: ZheNing Hu <adlternative@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Christian Couder <christian.couder@gmail.com>,
	Taylor Blau <ttaylorr@github.com>, git <git@vger.kernel.org>,
	Christian Couder <chriscool@tuxfamily.org>,
	Hariom verma <hariom18599@gmail.com>
Subject: Re: Git in Outreachy?
Date: Sun, 5 Sep 2021 16:58:57 +0800
Message-ID: <CAOLTT8S1Tfu6YWcoHhZcydQYd_yBBCavdqyV_TzoOrEW6zHXGQ@mail.gmail.com>
In-Reply-To: <YTNrehKnfPo3E5RI@coredump.intra.peff.net>

On Sat, Sep 4, 2021 at 8:50 PM, Jeff King <peff@peff.net> wrote:
>
> On Sat, Sep 04, 2021 at 03:40:41PM +0800, ZheNing Hu wrote:
>
> > This may be a place to promote my patches: See [1][2][3].
> > It can provide some extra atoms for git cat-file --batch | --batch-check,
> > like %(tree), %(author), %(tagger), etc. Although some performance
> > optimizations have been made, it still has a small performance gap.
> >
> > If the community still expects git cat-file --batch to reuse the logic
> > of ref-filter, I expect it to get the attention of reviewers.
> >
> > The solutions I can think of to further optimize performance are:
> > 1. Delay the evaluation of some ref-filter intermediate data.
> > 2. Make the ref-filter code reentrant so that it can be called from
> > multiple threads to take advantage of multiple cores.
>
> I don't think trying to thread it will help much. For expensive formats,
> where we have to actually open and parse objects, in theory we could do
> that in parallel. But most of our time there is spent in zlib getting
> the object data, and that all needs to be done under a big lock.
>

This big lock is obj_read_lock(), right? If these locks really are the
limiting factor, then I am afraid the parallel approach will not help much.
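
To make the concern concrete: Amdahl's law bounds the speedup when part of
the work is serialized under a lock. The sketch below is only illustrative.
The 0.8 serialized fraction is an assumed number, not a measurement of git,
and max_speedup() is a hypothetical helper name.

```python
def max_speedup(serial_fraction: float, threads: int) -> float:
    """Amdahl's law: upper bound on speedup with `threads` workers
    when `serial_fraction` of the run time cannot be parallelized."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be in [0, 1]")
    if threads < 1:
        raise ValueError("threads must be >= 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)


# Assumption for illustration: 80% of the time is spent under the lock.
for n in (1, 2, 4, 8):
    print(n, round(max_speedup(0.8, n), 3))
```

Under that assumption the bound can never exceed 1 / 0.8 = 1.25x, however
many threads are added.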

> For little formats (e.g., just printing "%(refname)"), we need to
> serialize the output anyway. So our unit of work is so tiny, I suspect
> that the threading overhead would be a net negative.
>

Makes sense.

> I was coincidentally looking at ref-filter last week, and it seemed to
> me that a lot of the slowness is because of the over-use of malloc

Agreed. malloc() and data copying are the reasons for the poor performance
of ref-filter.

> (e.g., we allocate a substring for every atom_value, and then form them
> into a separate buffer). If we could parse the original format into a
> form that could be traversed without having to do further allocations,
> just writing directly to a strbuf (or even a file handle), I think that
> would be a big improvement.
>

This patch tries to eliminate some of the malloc() calls and data copying:
https://lore.kernel.org/git/3760ff032bb1dec3812881fd408f8d78ec125477.1629184489.git.gitgitgadget@gmail.com/
It shows that some optimization is indeed possible.
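
As a rough illustration of the approach described in the quoted paragraph
(parse the format once, then write each value directly to the output), here
is a minimal sketch. It is Python rather than C for brevity, so it shows
only the structure, not the allocation behaviour. parse_format(),
write_record(), and the segment layout are hypothetical names, not
ref-filter's actual API, and "%%" escapes are not handled.

```python
import io
import re
from typing import Dict, List, TextIO, Tuple

# A segment is (is_atom, text): either a literal run or an atom name.
Segment = Tuple[bool, str]

_ATOM = re.compile(r"%\(([^)]+)\)")


def parse_format(fmt: str) -> List[Segment]:
    """Split the format into literal and atom segments, done once up front."""
    segments: List[Segment] = []
    pos = 0
    for m in _ATOM.finditer(fmt):
        if m.start() > pos:
            segments.append((False, fmt[pos:m.start()]))
        segments.append((True, m.group(1)))
        pos = m.end()
    if pos < len(fmt):
        segments.append((False, fmt[pos:]))
    return segments


def write_record(out: TextIO, segments: List[Segment],
                 values: Dict[str, str]) -> None:
    """Write one record straight to `out`, with no per-atom staging buffer."""
    for is_atom, text in segments:
        out.write(values[text] if is_atom else text)


# Usage: parse once, then reuse the parsed segments for every record.
segments = parse_format("%(objectname) %(refname)\n")
out = io.StringIO()
records = [
    {"objectname": "1234abcd", "refname": "refs/heads/main"},
    {"objectname": "5678ef00", "refname": "refs/tags/v1.0"},
]
for rec in records:
    write_record(out, segments, rec)
```

The point of the split is that the per-record loop does no parsing and
builds no intermediate strings; it only walks the prepared segments.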

> I just posted the results of some of my experiments to the list:
>
>   https://lore.kernel.org/git/YTNpQ7Od1U%2F5i0R7@coredump.intra.peff.net/
>
> I don't think that gives any kind of useful base to build on, but it
> shows what's possible by skipping past various segments of the
> ref-filter code.
>
> -Peff

Thanks.
--
ZheNing Hu
