git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: ZheNing Hu <adlternative@gmail.com>
Cc: Christian Couder <christian.couder@gmail.com>,
	Taylor Blau <ttaylorr@github.com>, git <git@vger.kernel.org>,
	Christian Couder <chriscool@tuxfamily.org>,
	Hariom verma <hariom18599@gmail.com>
Subject: Re: Git in Outreachy?
Date: Sat, 4 Sep 2021 08:50:02 -0400	[thread overview]
Message-ID: <YTNrehKnfPo3E5RI@coredump.intra.peff.net> (raw)
In-Reply-To: <CAOLTT8QufEU5Q64JfQyEOs4FYCsrNX2jgj8PdmYziVtKnRyu4w@mail.gmail.com>

On Sat, Sep 04, 2021 at 03:40:41PM +0800, ZheNing Hu wrote:

> This may be a place to promote my patches: See [1][2][3].
> It can provide some extra atoms for git cat-file --batch | --batch-check,
> like %(tree), %(author), %(tagger) etc. Although some performance
> optimizations have been made, It still has small performance gap.
> 
> If the community still expects git cat-file --batch to reuse the logic
> of ref-filter,
> I expect it to get the attention of reviewers.
> 
> The solutions I can think of to further optimize performance are:
> 1. Delay the evaluation of some ref-filter intermediate data.
> 2. Let ref-filter code reentrant and can be called in multi-threaded  to take
> advantage of multi-core.

I don't think trying to thread it will help much. For expensive formats,
where we have to actually open and parse objects, in theory we could do
that in parallel. But most of our time there is spent in zlib getting
the object data, and that all needs to be done under a big lock.

For little formats (e.g., just printing "%(refname)"), we need to
serialize the output anyway. So our unit of work is so tiny, I suspect
that the threading overhead would be a net negative.

I was coincidentally looking at ref-filter last week, and it seemed to
me that a lot of the slowness is because of the over-use of malloc
(e.g., we allocate a substring for every atom_value, and then form them
into a separate buffer). If we could parse the original format into a
form that could be traversed without having to do further allocations,
just writing directly to a strbuf (or even a file handle), I think that
would be a big improvement.

I just posted the results of some of my experiments to the list:

  https://lore.kernel.org/git/YTNpQ7Od1U%2F5i0R7@coredump.intra.peff.net/

I don't think that gives any kind of useful base to build on, but it
shows what's possible by skipping past various segments of the
ref-filter code.

-Peff

  reply	other threads:[~2021-09-04 12:50 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-03  2:40 Git in Outreachy? Taylor Blau
2021-09-03 18:33 ` Emily Shaffer
2021-09-04  4:30 ` Christian Couder
2021-09-04  7:40   ` ZheNing Hu
2021-09-04 12:50     ` Jeff King [this message]
2021-09-05  8:58       ` ZheNing Hu
2021-09-06 12:36         ` Matheus Tavares Bernardino
2021-09-07  5:50           ` ZheNing Hu
2021-09-04 17:51 ` Taylor Blau
2021-09-18 16:10 ` Taylor Blau
2021-09-20  7:45   ` ZheNing Hu
2021-09-20 14:52     ` Christian Couder
2021-09-20 15:15       ` Christian Couder
2021-09-21  5:41         ` ZheNing Hu
2021-09-21 15:39           ` Christian Couder
2021-09-22 15:01             ` ZheNing Hu
2021-09-21  5:39       ` ZheNing Hu
2021-09-21 15:35         ` Christian Couder
2021-09-22 14:58           ` ZheNing Hu
2021-09-21 21:25   ` Taylor Blau
2021-09-29 14:18     ` Christian Couder
2021-09-29 17:34       ` Taylor Blau
2021-09-29 20:30         ` Taylor Blau
  -- strict thread matches above, loose matches on Subject: below --
2020-08-28  6:56 Jeff King
2020-08-31  6:55 ` Christian Couder
2020-09-03  6:00   ` Jonathan Nieder
2020-09-04 14:14     ` Philip Oakley
2020-09-07 18:49       ` Johannes Schindelin
2020-09-16 15:16         ` Philip Oakley
2020-09-16 18:43           ` Johannes Schindelin
2020-09-17 14:42             ` Philip Oakley
2020-09-09 18:26     ` Taylor Blau
2020-09-10  1:39       ` Jonathan Nieder
2020-09-10  2:19         ` Taylor Blau
2020-09-16  9:12     ` Christian Couder
2020-09-16  6:42   ` Christian Couder
2020-08-31 17:41 ` Junio C Hamano
2020-08-31 18:05 ` Emily Shaffer
2020-09-01 12:51   ` Jeff King
2020-09-03  5:41     ` Jeff King
2020-09-15 17:35       ` Jeff King
2020-09-15 17:55         ` Kaartic Sivaraam
2020-09-15 18:02           ` Jeff King
2020-09-19  8:12         ` Christian Couder
2020-09-19 15:10           ` Phillip Wood
2020-09-16  8:45     ` Christian Couder
2020-09-02  4:00 ` Johannes Schindelin
2020-09-16  9:01   ` Christian Couder
2020-09-16  9:45     ` Phillip Wood
2020-09-17  9:43     ` Christian Couder
2020-09-17 10:14       ` Phillip Wood
2020-09-18  8:37         ` Christian Couder
2020-09-17 15:34       ` Elijah Newren
2020-09-18  8:42         ` Christian Couder
2020-09-27 16:59     ` Kaartic Sivaraam
2020-09-27 21:16       ` Christian Couder
2020-10-29 10:13         ` Christian Couder
2020-09-06 18:56 ` Kaartic Sivaraam
2020-09-07 18:55   ` Johannes Schindelin
2020-09-16  9:35     ` Christian Couder
2020-09-16 20:27       ` Johannes Schindelin
2020-09-19  7:40         ` Christian Couder
2020-09-20 15:06           ` Johannes Schindelin
2020-09-20 16:31   ` Kaartic Sivaraam
2020-09-21  4:22     ` Christian Couder
2020-09-21  7:59       ` Kaartic Sivaraam
2020-09-21 20:56       ` Shourya Shukla

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YTNrehKnfPo3E5RI@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=adlternative@gmail.com \
    --cc=chriscool@tuxfamily.org \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=hariom18599@gmail.com \
    --cc=ttaylorr@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).