git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff Hostetler <Jeff.Hostetler@microsoft.com>
To: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>
Cc: Jeff Hostetler <git@jeffhostetler.com>,
	Johannes Schindelin <johannes.schindelin@gmx.de>,
	"git@vger.kernel.org" <git@vger.kernel.org>,
	Jeff Hostetler <Jeff.Hostetler@microsoft.com>
Subject: RE: [PATCH 0/5] A series of performance enhancements in the memihash and name-cache area
Date: Sun, 19 Feb 2017 00:02:29 +0000	[thread overview]
Message-ID: <MWHPR03MB29582CB216B01C0A825817A08A5F0@MWHPR03MB2958.namprd03.prod.outlook.com> (raw)
In-Reply-To: <xmqqvas8m499.fsf@gitster.mtv.corp.google.com>



From: Junio C Hamano [mailto:jch2355@gmail.com] On Behalf Of Junio C Hamano
> Jeff King <peff@peff.net> writes:
>> On Wed, Feb 15, 2017 at 09:27:53AM -0500, Jeff Hostetler wrote:
>>
>>> I have some informal numbers in a spreadsheet.  I was seeing
>>> a 8-9% speed up on a status on my gigantic repo.
>>> 
>>> I'll try to put together a before/after perf-test to better
>>> demonstrate this.
>>
>> Thanks. What I'm mostly curious about is how much each individual step
>> buys. Sometimes when doing a long optimization series, I find that some
>> of the optimizations make other ones somewhat redundant (e.g., if patch
>> 2 causes us to call the optimized code from patch 3 less often).
>
> I am curious too.
>
> To me 1/5 (reduction of redundant calls), 4/5 (correctly size the
> hash that would grow to a known size anyway) and 5/5 (take advantage
> of the fact that adjacent cache entries are often in the same
> directory) look like no brainers to take, regardless of the others
> (including themselves). 

agreed.

> It is not clear to me if 3/5 (preload-index uses available cores to
> compute hashes) is an unconditional win (an operation that is
> pathspec limited may need hashes for only a small fraction of the
> index---would it still be a win to compute the hash for all entries
> upon loading the index, even if we are using otherwise-idel cores?).

I'm not sure about pathspec cases.  What I was seeing was that during
the call to lazy_name_init_hash() was taking 30% of the time in
"git status" and 40% in "git add <one_file>".  (Again this was on my
giant repo with a 450MB index).
 
> Of course 2/5 is a prerequisite step for 3/5 and 5/5, so if we want
> either of the latter two, we cannot avoid it.

jeff


  reply	other threads:[~2017-02-19  0:04 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-14 11:31 [PATCH 0/5] A series of performance enhancements in the memihash and name-cache area Johannes Schindelin
2017-02-14 11:31 ` [PATCH 1/5] name-hash: eliminate duplicate memihash call Johannes Schindelin
2017-02-14 11:32 ` [PATCH 2/5] hashmap: allow memihash computation to be continued Johannes Schindelin
2017-02-18  5:35   ` Junio C Hamano
2017-02-20 12:43     ` Johannes Schindelin
2017-02-20 20:27       ` Junio C Hamano
2017-02-14 11:32 ` [PATCH 3/5] name-hash: precompute hash values during preload-index Johannes Schindelin
2017-02-18  5:47   ` Junio C Hamano
2017-02-19  0:19     ` Jeff Hostetler
2017-02-19 21:45       ` Junio C Hamano
2017-02-14 11:32 ` [PATCH 4/5] name-hash: specify initial size for istate.dir_hash table Johannes Schindelin
2017-02-14 11:32 ` [PATCH 5/5] name-hash: remember previous dir_entry during lazy_init_name_hash Johannes Schindelin
2017-02-14 22:03 ` [PATCH 0/5] A series of performance enhancements in the memihash and name-cache area Jeff King
2017-02-15 14:27   ` Jeff Hostetler
2017-02-15 16:44     ` Jeff King
2017-02-18  5:56       ` Junio C Hamano
2017-02-19  0:02         ` Jeff Hostetler [this message]
2017-02-18  5:58     ` Junio C Hamano
2017-02-18  6:29       ` Jeff King
2017-02-18 20:48         ` Junio C Hamano
2017-02-18 23:52           ` Jeff Hostetler
2017-02-19 21:50             ` Junio C Hamano
2017-03-02 21:11     ` Junio C Hamano
2017-03-02 21:18       ` Jeff Hostetler
2017-03-02 21:40         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=MWHPR03MB29582CB216B01C0A825817A08A5F0@MWHPR03MB2958.namprd03.prod.outlook.com \
    --to=jeff.hostetler@microsoft.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johannes.schindelin@gmx.de \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).