git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
From: Ben Peart <peartben@gmail.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Ben Peart <Ben.Peart@microsoft.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH v1] read-cache: speed up index load through parallelization
Date: Fri, 24 Aug 2018 13:28:56 -0400
Message-ID: <3e82ee23-6a04-6714-4d51-68bb2ef4a123@gmail.com> (raw)
In-Reply-To: <20180824155734.GA6170@duynguyen.home>



On 8/24/2018 11:57 AM, Duy Nguyen wrote:
> On Fri, Aug 24, 2018 at 05:37:20PM +0200, Duy Nguyen wrote:
>> Since we're cutting corners to speed things up, could you try
>> something like this?
>>
>> I notice that reading v4 is significantly slower than v2 and
>> apparently strlen() (at least from glibc) is much cleverer and at
>> least gives me a few percentage time saving.
>>
>> diff --git a/read-cache.c b/read-cache.c
>> index 7b1354d759..d10cccaed0 100644
>> --- a/read-cache.c
>> +++ b/read-cache.c
>> @@ -1755,8 +1755,7 @@ static unsigned long expand_name_field(struct
>> strbuf *name, const char *cp_)
>>          if (name->len < len)
>>                  die("malformed name field in the index");
>>          strbuf_remove(name, name->len - len, len);
>> -       for (ep = cp; *ep; ep++)
>> -               ; /* find the end */
>> +       ep = cp + strlen(cp);
>>          strbuf_add(name, cp, ep - cp);
>>          return (const char *)ep + 1 - cp_;
>>   }
> 
> No try this instead. It's half way back to v2 numbers for me (tested
> with "test-tool read-cache 100" on webkit.git). For the record, v4 is
> about 30% slower than v2 in my tests.
> 

Thanks Duy, this helped on my system as well.

Interestingly, simply reading the cache tree extension in read_one() now 
takes about double the CPU on the primary thread as does 
load_cache_entries().

Hmm, that gives me an idea.  I could kick off another thread to load 
that extension in parallel and cut off another ~160 ms.  I'll add that 
to my list of future patches to investigate...

> We could probably do better too. Instead of preparing the string in a
> separate buffer (previous_name_buf), we could just assemble it directly
> to the newly allocated "ce".
> 
> -- 8< --
> diff --git a/read-cache.c b/read-cache.c
> index 7b1354d759..237f60a76c 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1754,9 +1754,8 @@ static unsigned long expand_name_field(struct strbuf *name, const char *cp_)
>   
>   	if (name->len < len)
>   		die("malformed name field in the index");
> -	strbuf_remove(name, name->len - len, len);
> -	for (ep = cp; *ep; ep++)
> -		; /* find the end */
> +	strbuf_setlen(name, name->len - len);
> +	ep = cp + strlen(cp);
>   	strbuf_add(name, cp, ep - cp);
>   	return (const char *)ep + 1 - cp_;
>   }
> -- 8< --
> 

  reply index

Thread overview: 199+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-23 15:41 Ben Peart
2018-08-23 17:31 ` Stefan Beller
2018-08-23 19:44   ` Ben Peart
2018-08-24 18:40   ` Duy Nguyen
2018-08-28 14:53     ` Ben Peart
2018-08-23 18:06 ` Junio C Hamano
2018-08-23 20:33   ` Ben Peart
2018-08-24 15:37     ` Duy Nguyen
2018-08-24 15:57       ` Duy Nguyen
2018-08-24 17:28         ` Ben Peart [this message]
2018-08-25  6:44         ` [PATCH] read-cache.c: optimize reading index format v4 Nguyễn Thái Ngọc Duy
2018-08-27 19:36           ` Junio C Hamano
2018-08-28 19:25             ` Duy Nguyen
2018-08-28 23:54               ` Ben Peart
2018-08-29 17:14               ` Junio C Hamano
2018-09-04 16:08             ` Duy Nguyen
2018-09-02 13:19           ` [PATCH v2 0/1] " Nguyễn Thái Ngọc Duy
2018-09-02 13:19             ` [PATCH v2 1/1] read-cache.c: " Nguyễn Thái Ngọc Duy
2018-09-04 18:58               ` Junio C Hamano
2018-09-04 19:31               ` Junio C Hamano
2018-08-24 18:20       ` [PATCH v1] read-cache: speed up index load through parallelization Duy Nguyen
2018-08-24 18:40         ` Ben Peart
2018-08-24 19:00           ` Duy Nguyen
2018-08-24 19:57             ` Ben Peart
2018-08-29 15:25 ` [PATCH v2 0/3] " Ben Peart
2018-08-29 15:25   ` [PATCH v2 1/3] " Ben Peart
2018-08-29 17:14     ` Junio C Hamano
2018-08-29 21:35       ` Ben Peart
2018-09-03 19:16     ` Duy Nguyen
2018-08-29 15:25   ` [PATCH v2 2/3] read-cache: load cache extensions on worker thread Ben Peart
2018-08-29 17:12     ` Junio C Hamano
2018-08-29 21:42       ` Ben Peart
2018-08-29 22:19         ` Junio C Hamano
2018-09-03 19:21     ` Duy Nguyen
2018-09-03 19:27       ` Duy Nguyen
2018-08-29 15:25   ` [PATCH v2 3/3] read-cache: micro-optimize expand_name_field() to speed up V4 index parsing Ben Peart
2018-09-06 21:03 ` [PATCH v3 0/4] read-cache: speed up index load through parallelization Ben Peart
2018-09-06 21:03   ` [PATCH v3 1/4] read-cache: optimize expand_name_field() to speed up V4 index parsing Ben Peart
2018-09-06 21:03   ` [PATCH v3 2/4] eoie: add End of Index Entry (EOIE) extension Ben Peart
2018-09-07 17:55     ` Junio C Hamano
2018-09-07 20:23       ` Ben Peart
2018-09-08  6:29         ` Martin Ågren
2018-09-08 14:03           ` Ben Peart
2018-09-08 17:08             ` Martin Ågren
2018-09-06 21:03   ` [PATCH v3 3/4] read-cache: load cache extensions on a worker thread Ben Peart
2018-09-07 21:10     ` Junio C Hamano
2018-09-08 14:56       ` Ben Peart
2018-09-06 21:03   ` [PATCH v3 4/4] read-cache: speed up index load through parallelization Ben Peart
2018-09-07  4:16     ` Torsten Bögershausen
2018-09-07 13:43       ` Ben Peart
2018-09-07 17:21   ` [PATCH v3 0/4] " Junio C Hamano
2018-09-07 18:31     ` Ben Peart
2018-09-08 13:18     ` Duy Nguyen
2018-09-11 23:26 ` [PATCH v4 0/5] " Ben Peart
2018-09-11 23:26   ` [PATCH v4 1/5] eoie: add End of Index Entry (EOIE) extension Ben Peart
2018-09-11 23:26   ` [PATCH v4 2/5] read-cache: load cache extensions on a worker thread Ben Peart
2018-09-11 23:26   ` [PATCH v4 3/5] read-cache: speed up index load through parallelization Ben Peart
2018-09-11 23:26   ` [PATCH v4 4/5] read-cache.c: optimize reading index format v4 Ben Peart
2018-09-11 23:26   ` [PATCH v4 5/5] read-cache: clean up casting and byte decoding Ben Peart
2018-09-12 14:34   ` [PATCH v4 0/5] read-cache: speed up index load through parallelization Ben Peart
2018-09-12 16:18 ` [PATCH v5 " Ben Peart
2018-09-12 16:18   ` [PATCH v5 1/5] eoie: add End of Index Entry (EOIE) extension Ben Peart
2018-09-13 22:44     ` Junio C Hamano
2018-09-15 10:02     ` Duy Nguyen
2018-09-17 14:54       ` Ben Peart
2018-09-17 16:05         ` Duy Nguyen
2018-09-17 17:31           ` Junio C Hamano
2018-09-17 17:38             ` Duy Nguyen
2018-09-17 19:08               ` Junio C Hamano
2018-09-12 16:18   ` [PATCH v5 2/5] read-cache: load cache extensions on a worker thread Ben Peart
2018-09-15 10:22     ` Duy Nguyen
2018-09-15 10:24       ` Duy Nguyen
2018-09-17 16:38         ` Ben Peart
2018-09-15 16:23       ` Duy Nguyen
2018-09-17 17:19         ` Junio C Hamano
2018-09-17 16:26       ` Ben Peart
2018-09-17 16:45         ` Duy Nguyen
2018-09-17 21:32       ` Junio C Hamano
2018-09-12 16:18   ` [PATCH v5 3/5] read-cache: load cache entries on worker threads Ben Peart
2018-09-15 10:31     ` Duy Nguyen
2018-09-17 17:25       ` Ben Peart
2018-09-15 11:07     ` Duy Nguyen
2018-09-15 11:09       ` Duy Nguyen
2018-09-17 18:52         ` Ben Peart
2018-09-15 11:29     ` Duy Nguyen
2018-09-12 16:18   ` [PATCH v5 4/5] read-cache.c: optimize reading index format v4 Ben Peart
2018-09-12 16:18   ` [PATCH v5 5/5] read-cache: clean up casting and byte decoding Ben Peart
2018-09-26 19:54 ` [PATCH v6 0/7] speed up index load through parallelization Ben Peart
2018-09-26 19:54   ` [PATCH v6 1/7] read-cache.c: optimize reading index format v4 Ben Peart
2018-09-26 19:54   ` [PATCH v6 2/7] read-cache: clean up casting and byte decoding Ben Peart
2018-09-26 19:54   ` [PATCH v6 3/7] eoie: add End of Index Entry (EOIE) extension Ben Peart
2018-09-28  0:19     ` SZEDER Gábor
2018-09-28 18:38       ` Ben Peart
2018-09-29  0:51     ` SZEDER Gábor
2018-09-29  5:45     ` Duy Nguyen
2018-09-29 18:24       ` Junio C Hamano
2018-09-26 19:54   ` [PATCH v6 4/7] config: add new index.threads config setting Ben Peart
2018-09-28  0:26     ` SZEDER Gábor
2018-09-28 13:39       ` Ben Peart
2018-09-28 17:07         ` Junio C Hamano
2018-09-28 19:41           ` Ben Peart
2018-09-28 20:30             ` Ramsay Jones
2018-09-28 22:15               ` Junio C Hamano
2018-10-01 13:17                 ` Ben Peart
2018-10-01 15:06                   ` SZEDER Gábor
2018-09-26 19:54   ` [PATCH v6 5/7] read-cache: load cache extensions on a worker thread Ben Peart
2018-09-26 19:54   ` [PATCH v6 6/7] ieot: add Index Entry Offset Table (IEOT) extension Ben Peart
2018-09-26 19:54   ` [PATCH v6 7/7] read-cache: load cache entries on worker threads Ben Peart
2018-09-26 22:06   ` [PATCH v6 0/7] speed up index load through parallelization Junio C Hamano
2018-09-27 17:13   ` Duy Nguyen
2018-10-01 13:45 ` [PATCH v7 " Ben Peart
2018-10-01 13:45   ` [PATCH v7 1/7] read-cache.c: optimize reading index format v4 Ben Peart
2018-10-01 13:45   ` [PATCH v7 2/7] read-cache: clean up casting and byte decoding Ben Peart
2018-10-01 15:10     ` Duy Nguyen
2018-10-01 13:45   ` [PATCH v7 3/7] eoie: add End of Index Entry (EOIE) extension Ben Peart
2018-10-01 15:17     ` SZEDER Gábor
2018-10-02 14:34       ` Ben Peart
2018-10-01 15:30     ` Duy Nguyen
2018-10-02 15:13       ` Ben Peart
2018-10-01 13:45   ` [PATCH v7 4/7] config: add new index.threads config setting Ben Peart
2018-10-01 13:45   ` [PATCH v7 5/7] read-cache: load cache extensions on a worker thread Ben Peart
2018-10-01 15:50     ` Duy Nguyen
2018-10-02 15:00       ` Ben Peart
2018-10-01 13:45   ` [PATCH v7 6/7] ieot: add Index Entry Offset Table (IEOT) extension Ben Peart
2018-10-01 16:27     ` Duy Nguyen
2018-10-02 16:34       ` Ben Peart
2018-10-02 17:02         ` Duy Nguyen
2018-10-01 13:45   ` [PATCH v7 7/7] read-cache: load cache entries on worker threads Ben Peart
2018-10-01 17:09     ` Duy Nguyen
2018-10-02 19:09       ` Ben Peart
2018-10-10 15:59 ` [PATCH v8 0/7] speed up index load through parallelization Ben Peart
2018-10-10 15:59   ` [PATCH v8 1/7] read-cache.c: optimize reading index format v4 Ben Peart
2018-10-10 15:59   ` [PATCH v8 2/7] read-cache: clean up casting and byte decoding Ben Peart
2018-10-10 15:59   ` [PATCH v8 3/7] eoie: add End of Index Entry (EOIE) extension Ben Peart
2018-10-10 15:59   ` [PATCH v8 4/7] config: add new index.threads config setting Ben Peart
2018-10-10 15:59   ` [PATCH v8 5/7] read-cache: load cache extensions on a worker thread Ben Peart
2018-10-10 15:59   ` [PATCH v8 6/7] ieot: add Index Entry Offset Table (IEOT) extension Ben Peart
2018-10-10 15:59   ` [PATCH v8 7/7] read-cache: load cache entries on worker threads Ben Peart
2018-10-19 16:11     ` Jeff King
2018-10-22  2:14       ` Junio C Hamano
2018-10-22 14:40         ` Ben Peart
2018-10-12  3:18   ` [PATCH v8 0/7] speed up index load through parallelization Junio C Hamano
2018-10-14 12:28   ` Duy Nguyen
2018-10-15 17:33     ` Ben Peart
2018-11-13  0:38   ` [PATCH 0/3] Avoid confusing messages from new index extensions (Re: [PATCH v8 0/7] speed up index load through parallelization) Jonathan Nieder
2018-11-13  0:39     ` [PATCH 1/3] eoie: default to not writing EOIE section Jonathan Nieder
2018-11-13  1:05       ` Junio C Hamano
2018-11-13 15:14         ` Ben Peart
2018-11-13 18:25           ` Jonathan Nieder
2018-11-14  1:36             ` Junio C Hamano
2018-11-15  0:19               ` Jonathan Nieder
2018-11-13  0:39     ` [PATCH 2/3] ieot: default to not writing IEOT section Jonathan Nieder
2018-11-13  0:58       ` Jonathan Tan
2018-11-13  1:09       ` Junio C Hamano
2018-11-13  1:12         ` Jonathan Nieder
2018-11-13 15:37           ` Duy Nguyen
2018-11-13 18:09         ` Jonathan Nieder
2018-11-13 15:22       ` Ben Peart
2018-11-13 18:18         ` Jonathan Nieder
2018-11-13 19:15           ` Ben Peart
2018-11-13 21:08             ` Jonathan Nieder
2018-11-14 18:09               ` Ben Peart
2018-11-15  0:05                 ` Jonathan Nieder
2018-11-14  3:05         ` Junio C Hamano
2018-11-20  6:09           ` [PATCH v2 0/5] Avoid confusing messages from new index extensions Jonathan Nieder
2018-11-20  6:11             ` [PATCH 1/5] eoie: default to not writing EOIE section Jonathan Nieder
2018-11-20 13:06               ` Ben Peart
2018-11-20 13:21                 ` SZEDER Gábor
2018-11-21 16:46                   ` Jeff King
2018-11-22  0:47                     ` Junio C Hamano
2018-11-20 15:01               ` Ben Peart
2018-11-20  6:12             ` [PATCH 2/5] ieot: default to not writing IEOT section Jonathan Nieder
2018-11-20 13:07               ` Ben Peart
2018-11-26 19:59                 ` Stefan Beller
2018-11-26 21:47                   ` Ben Peart
2018-11-26 22:02                     ` Stefan Beller
2018-11-27  0:50                   ` Junio C Hamano
2018-11-20  6:12             ` [PATCH 3/5] index: do not warn about unrecognized extensions Jonathan Nieder
2018-11-20  6:14             ` [PATCH 4/5] index: make index.threads=true enable ieot and eoie Jonathan Nieder
2018-11-20 13:24               ` Ben Peart
2018-11-20  6:15             ` [PATCH 5/5] index: offer advice for unknown index extensions Jonathan Nieder
2018-11-20  9:26               ` Ævar Arnfjörð Bjarmason
2018-11-20 13:30                 ` Ben Peart
2018-11-21  0:22                   ` Junio C Hamano
2018-11-21  0:39                     ` Jonathan Nieder
2018-11-21  0:44                       ` Jonathan Nieder
2018-11-21  5:01                       ` Junio C Hamano
2018-11-21  5:04                         ` Jonathan Nieder
2018-11-21  5:15                         ` Junio C Hamano
2018-11-21  5:31                           ` Junio C Hamano
2018-11-21  1:03                     ` Jonathan Nieder
2018-11-21  4:23                       ` Junio C Hamano
2018-11-21  4:57                         ` Jonathan Nieder
2018-11-21  9:30                       ` Ævar Arnfjörð Bjarmason
2018-11-13  0:40     ` [PATCH 3/3] index: do not warn about unrecognized extensions Jonathan Nieder
2018-11-13  1:10       ` Junio C Hamano
2018-11-13 15:25       ` Ben Peart
2018-11-14  3:24       ` Junio C Hamano
2018-11-14 18:19         ` Ben Peart

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3e82ee23-6a04-6714-4d51-68bb2ef4a123@gmail.com \
    --to=peartben@gmail.com \
    --cc=Ben.Peart@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox