From: Ben Peart <peartben@gmail.com> To: git@vger.kernel.org Cc: gitster@pobox.com, pclouds@gmail.com, Ben Peart <benpeart@microsoft.com>, Ben Peart <peartben@gmail.com> Subject: [PATCH v7 5/7] read-cache: load cache extensions on a worker thread Date: Mon, 1 Oct 2018 09:45:54 -0400 Message-ID: <20181001134556.33232-6-peartben@gmail.com> (raw) In-Reply-To: <20181001134556.33232-1-peartben@gmail.com> From: Ben Peart <benpeart@microsoft.com> This patch helps address the CPU cost of loading the index by loading the cache extensions on a worker thread in parallel with loading the cache entries. In some cases, loading the extensions takes longer than loading the cache entries so this patch utilizes the new EOIE to start the thread to load the extensions before loading all the cache entries in parallel. This is possible because the current extensions don't access the cache entries in the index_state structure so are OK that they don't all exist yet. The CACHE_EXT_TREE, CACHE_EXT_RESOLVE_UNDO, and CACHE_EXT_UNTRACKED extensions don't even get a pointer to the index so don't have access to the cache entries. CACHE_EXT_LINK only uses the index_state to initialize the split index. CACHE_EXT_FSMONITOR only uses the index_state to save the fsmonitor last update and dirty flags. I used p0002-read-cache.sh to generate some performance data: Test w/100,000 files reduced the time by 0.53% Test w/1,000,000 files reduced the time by 27.78% Signed-off-by: Ben Peart <peartben@gmail.com> --- read-cache.c | 97 +++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 81 insertions(+), 16 deletions(-) diff --git a/read-cache.c b/read-cache.c index af2605a168..77083ab8bb 100644 --- a/read-cache.c +++ b/read-cache.c @@ -23,6 +23,7 @@ #include "split-index.h" #include "utf8.h" #include "fsmonitor.h" +#include "thread-utils.h" /* Mask for the name length in ce_flags in the on-disk index */ @@ -1890,6 +1891,46 @@ static size_t estimate_cache_size(size_t ondisk_size, unsigned int entries) static size_t read_eoie_extension(const char *mmap, size_t mmap_size); static void write_eoie_extension(struct strbuf *sb, git_hash_ctx *eoie_context, size_t offset); +struct load_index_extensions +{ +#ifndef NO_PTHREADS + pthread_t pthread; +#endif + struct index_state *istate; + const char *mmap; + size_t mmap_size; + unsigned long src_offset; +}; + +static void *load_index_extensions(void *_data) +{ + struct load_index_extensions *p = _data; + unsigned long src_offset = p->src_offset; + + while (src_offset <= p->mmap_size - the_hash_algo->rawsz - 8) { + /* After an array of active_nr index entries, + * there can be arbitrary number of extended + * sections, each of which is prefixed with + * extension name (4-byte) and section length + * in 4-byte network byte order. + */ + uint32_t extsize; + memcpy(&extsize, p->mmap + src_offset + 4, 4); + extsize = ntohl(extsize); + if (read_index_extension(p->istate, + p->mmap + src_offset, + p->mmap + src_offset + 8, + extsize) < 0) { + munmap((void *)p->mmap, p->mmap_size); + die(_("index file corrupt")); + } + src_offset += 8; + src_offset += extsize; + } + + return NULL; +} + /* remember to discard_cache() before reading a different cache! */ int do_read_index(struct index_state *istate, const char *path, int must_exist) { @@ -1900,6 +1941,11 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist) const char *mmap; size_t mmap_size; const struct cache_entry *previous_ce = NULL; + struct load_index_extensions p; + size_t extension_offset = 0; +#ifndef NO_PTHREADS + int nr_threads; +#endif if (istate->initialized) return istate->cache_nr; @@ -1936,6 +1982,30 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist) istate->cache = xcalloc(istate->cache_alloc, sizeof(*istate->cache)); istate->initialized = 1; + p.istate = istate; + p.mmap = mmap; + p.mmap_size = mmap_size; + +#ifndef NO_PTHREADS + nr_threads = git_config_get_index_threads(); + if (!nr_threads) + nr_threads = online_cpus(); + + if (nr_threads > 1) { + extension_offset = read_eoie_extension(mmap, mmap_size); + if (extension_offset) { + int err; + + p.src_offset = extension_offset; + err = pthread_create(&p.pthread, NULL, load_index_extensions, &p); + if (err) + die(_("unable to create load_index_extensions thread: %s"), strerror(err)); + + nr_threads--; + } + } +#endif + if (istate->version == 4) { mem_pool_init(&istate->ce_mem_pool, estimate_cache_size_from_compressed(istate->cache_nr)); @@ -1960,22 +2030,17 @@ int do_read_index(struct index_state *istate, const char *path, int must_exist) istate->timestamp.sec = st.st_mtime; istate->timestamp.nsec = ST_MTIME_NSEC(st); - while (src_offset <= mmap_size - the_hash_algo->rawsz - 8) { - /* After an array of active_nr index entries, - * there can be arbitrary number of extended - * sections, each of which is prefixed with - * extension name (4-byte) and section length - * in 4-byte network byte order. - */ - uint32_t extsize; - extsize = get_be32(mmap + src_offset + 4); - if (read_index_extension(istate, - mmap + src_offset, - mmap + src_offset + 8, - extsize) < 0) - goto unmap; - src_offset += 8; - src_offset += extsize; + /* if we created a thread, join it otherwise load the extensions on the primary thread */ +#ifndef NO_PTHREADS + if (extension_offset) { + int ret = pthread_join(p.pthread, NULL); + if (ret) + die(_("unable to join load_index_extensions thread: %s"), strerror(ret)); + } +#endif + if (!extension_offset) { + p.src_offset = src_offset; + load_index_extensions(&p); } munmap((void *)mmap, mmap_size); return istate->cache_nr; -- 2.18.0.windows.1
next prev parent reply other threads:[~2018-10-01 13:46 UTC|newest] Thread overview: 199+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-08-23 15:41 [PATCH v1] read-cache: speed up index load through parallelization Ben Peart 2018-08-23 17:31 ` Stefan Beller 2018-08-23 19:44 ` Ben Peart 2018-08-24 18:40 ` Duy Nguyen 2018-08-28 14:53 ` Ben Peart 2018-08-23 18:06 ` Junio C Hamano 2018-08-23 20:33 ` Ben Peart 2018-08-24 15:37 ` Duy Nguyen 2018-08-24 15:57 ` Duy Nguyen 2018-08-24 17:28 ` Ben Peart 2018-08-25 6:44 ` [PATCH] read-cache.c: optimize reading index format v4 Nguyễn Thái Ngọc Duy 2018-08-27 19:36 ` Junio C Hamano 2018-08-28 19:25 ` Duy Nguyen 2018-08-28 23:54 ` Ben Peart 2018-08-29 17:14 ` Junio C Hamano 2018-09-04 16:08 ` Duy Nguyen 2018-09-02 13:19 ` [PATCH v2 0/1] " Nguyễn Thái Ngọc Duy 2018-09-02 13:19 ` [PATCH v2 1/1] read-cache.c: " Nguyễn Thái Ngọc Duy 2018-09-04 18:58 ` Junio C Hamano 2018-09-04 19:31 ` Junio C Hamano 2018-08-24 18:20 ` [PATCH v1] read-cache: speed up index load through parallelization Duy Nguyen 2018-08-24 18:40 ` Ben Peart 2018-08-24 19:00 ` Duy Nguyen 2018-08-24 19:57 ` Ben Peart 2018-08-29 15:25 ` [PATCH v2 0/3] " Ben Peart 2018-08-29 15:25 ` [PATCH v2 1/3] " Ben Peart 2018-08-29 17:14 ` Junio C Hamano 2018-08-29 21:35 ` Ben Peart 2018-09-03 19:16 ` Duy Nguyen 2018-08-29 15:25 ` [PATCH v2 2/3] read-cache: load cache extensions on worker thread Ben Peart 2018-08-29 17:12 ` Junio C Hamano 2018-08-29 21:42 ` Ben Peart 2018-08-29 22:19 ` Junio C Hamano 2018-09-03 19:21 ` Duy Nguyen 2018-09-03 19:27 ` Duy Nguyen 2018-08-29 15:25 ` [PATCH v2 3/3] read-cache: micro-optimize expand_name_field() to speed up V4 index parsing Ben Peart 2018-09-06 21:03 ` [PATCH v3 0/4] read-cache: speed up index load through parallelization Ben Peart 2018-09-06 21:03 ` [PATCH v3 1/4] read-cache: optimize expand_name_field() to speed up V4 index parsing Ben Peart 2018-09-06 21:03 ` [PATCH v3 2/4] eoie: add End of Index Entry (EOIE) extension Ben Peart 2018-09-07 17:55 ` Junio C Hamano 2018-09-07 20:23 ` Ben Peart 2018-09-08 6:29 ` Martin Ågren 2018-09-08 14:03 ` Ben Peart 2018-09-08 17:08 ` Martin Ågren 2018-09-06 21:03 ` [PATCH v3 3/4] read-cache: load cache extensions on a worker thread Ben Peart 2018-09-07 21:10 ` Junio C Hamano 2018-09-08 14:56 ` Ben Peart 2018-09-06 21:03 ` [PATCH v3 4/4] read-cache: speed up index load through parallelization Ben Peart 2018-09-07 4:16 ` Torsten Bögershausen 2018-09-07 13:43 ` Ben Peart 2018-09-07 17:21 ` [PATCH v3 0/4] " Junio C Hamano 2018-09-07 18:31 ` Ben Peart 2018-09-08 13:18 ` Duy Nguyen 2018-09-11 23:26 ` [PATCH v4 0/5] " Ben Peart 2018-09-11 23:26 ` [PATCH v4 1/5] eoie: add End of Index Entry (EOIE) extension Ben Peart 2018-09-11 23:26 ` [PATCH v4 2/5] read-cache: load cache extensions on a worker thread Ben Peart 2018-09-11 23:26 ` [PATCH v4 3/5] read-cache: speed up index load through parallelization Ben Peart 2018-09-11 23:26 ` [PATCH v4 4/5] read-cache.c: optimize reading index format v4 Ben Peart 2018-09-11 23:26 ` [PATCH v4 5/5] read-cache: clean up casting and byte decoding Ben Peart 2018-09-12 14:34 ` [PATCH v4 0/5] read-cache: speed up index load through parallelization Ben Peart 2018-09-12 16:18 ` [PATCH v5 " Ben Peart 2018-09-12 16:18 ` [PATCH v5 1/5] eoie: add End of Index Entry (EOIE) extension Ben Peart 2018-09-13 22:44 ` Junio C Hamano 2018-09-15 10:02 ` Duy Nguyen 2018-09-17 14:54 ` Ben Peart 2018-09-17 16:05 ` Duy Nguyen 2018-09-17 17:31 ` Junio C Hamano 2018-09-17 17:38 ` Duy Nguyen 2018-09-17 19:08 ` Junio C Hamano 2018-09-12 16:18 ` [PATCH v5 2/5] read-cache: load cache extensions on a worker thread Ben Peart 2018-09-15 10:22 ` Duy Nguyen 2018-09-15 10:24 ` Duy Nguyen 2018-09-17 16:38 ` Ben Peart 2018-09-15 16:23 ` Duy Nguyen 2018-09-17 17:19 ` Junio C Hamano 2018-09-17 16:26 ` Ben Peart 2018-09-17 16:45 ` Duy Nguyen 2018-09-17 21:32 ` Junio C Hamano 2018-09-12 16:18 ` [PATCH v5 3/5] read-cache: load cache entries on worker threads Ben Peart 2018-09-15 10:31 ` Duy Nguyen 2018-09-17 17:25 ` Ben Peart 2018-09-15 11:07 ` Duy Nguyen 2018-09-15 11:09 ` Duy Nguyen 2018-09-17 18:52 ` Ben Peart 2018-09-15 11:29 ` Duy Nguyen 2018-09-12 16:18 ` [PATCH v5 4/5] read-cache.c: optimize reading index format v4 Ben Peart 2018-09-12 16:18 ` [PATCH v5 5/5] read-cache: clean up casting and byte decoding Ben Peart 2018-09-26 19:54 ` [PATCH v6 0/7] speed up index load through parallelization Ben Peart 2018-09-26 19:54 ` [PATCH v6 1/7] read-cache.c: optimize reading index format v4 Ben Peart 2018-09-26 19:54 ` [PATCH v6 2/7] read-cache: clean up casting and byte decoding Ben Peart 2018-09-26 19:54 ` [PATCH v6 3/7] eoie: add End of Index Entry (EOIE) extension Ben Peart 2018-09-28 0:19 ` SZEDER Gábor 2018-09-28 18:38 ` Ben Peart 2018-09-29 0:51 ` SZEDER Gábor 2018-09-29 5:45 ` Duy Nguyen 2018-09-29 18:24 ` Junio C Hamano 2018-09-26 19:54 ` [PATCH v6 4/7] config: add new index.threads config setting Ben Peart 2018-09-28 0:26 ` SZEDER Gábor 2018-09-28 13:39 ` Ben Peart 2018-09-28 17:07 ` Junio C Hamano 2018-09-28 19:41 ` Ben Peart 2018-09-28 20:30 ` Ramsay Jones 2018-09-28 22:15 ` Junio C Hamano 2018-10-01 13:17 ` Ben Peart 2018-10-01 15:06 ` SZEDER Gábor 2018-09-26 19:54 ` [PATCH v6 5/7] read-cache: load cache extensions on a worker thread Ben Peart 2018-09-26 19:54 ` [PATCH v6 6/7] ieot: add Index Entry Offset Table (IEOT) extension Ben Peart 2018-09-26 19:54 ` [PATCH v6 7/7] read-cache: load cache entries on worker threads Ben Peart 2018-09-26 22:06 ` [PATCH v6 0/7] speed up index load through parallelization Junio C Hamano 2018-09-27 17:13 ` Duy Nguyen 2018-10-01 13:45 ` [PATCH v7 " Ben Peart 2018-10-01 13:45 ` [PATCH v7 1/7] read-cache.c: optimize reading index format v4 Ben Peart 2018-10-01 13:45 ` [PATCH v7 2/7] read-cache: clean up casting and byte decoding Ben Peart 2018-10-01 15:10 ` Duy Nguyen 2018-10-01 13:45 ` [PATCH v7 3/7] eoie: add End of Index Entry (EOIE) extension Ben Peart 2018-10-01 15:17 ` SZEDER Gábor 2018-10-02 14:34 ` Ben Peart 2018-10-01 15:30 ` Duy Nguyen 2018-10-02 15:13 ` Ben Peart 2018-10-01 13:45 ` [PATCH v7 4/7] config: add new index.threads config setting Ben Peart 2018-10-01 13:45 ` Ben Peart [this message] 2018-10-01 15:50 ` [PATCH v7 5/7] read-cache: load cache extensions on a worker thread Duy Nguyen 2018-10-02 15:00 ` Ben Peart 2018-10-01 13:45 ` [PATCH v7 6/7] ieot: add Index Entry Offset Table (IEOT) extension Ben Peart 2018-10-01 16:27 ` Duy Nguyen 2018-10-02 16:34 ` Ben Peart 2018-10-02 17:02 ` Duy Nguyen 2018-10-01 13:45 ` [PATCH v7 7/7] read-cache: load cache entries on worker threads Ben Peart 2018-10-01 17:09 ` Duy Nguyen 2018-10-02 19:09 ` Ben Peart 2018-10-10 15:59 ` [PATCH v8 0/7] speed up index load through parallelization Ben Peart 2018-10-10 15:59 ` [PATCH v8 1/7] read-cache.c: optimize reading index format v4 Ben Peart 2018-10-10 15:59 ` [PATCH v8 2/7] read-cache: clean up casting and byte decoding Ben Peart 2018-10-10 15:59 ` [PATCH v8 3/7] eoie: add End of Index Entry (EOIE) extension Ben Peart 2018-10-10 15:59 ` [PATCH v8 4/7] config: add new index.threads config setting Ben Peart 2018-10-10 15:59 ` [PATCH v8 5/7] read-cache: load cache extensions on a worker thread Ben Peart 2018-10-10 15:59 ` [PATCH v8 6/7] ieot: add Index Entry Offset Table (IEOT) extension Ben Peart 2018-10-10 15:59 ` [PATCH v8 7/7] read-cache: load cache entries on worker threads Ben Peart 2018-10-19 16:11 ` Jeff King 2018-10-22 2:14 ` Junio C Hamano 2018-10-22 14:40 ` Ben Peart 2018-10-12 3:18 ` [PATCH v8 0/7] speed up index load through parallelization Junio C Hamano 2018-10-14 12:28 ` Duy Nguyen 2018-10-15 17:33 ` Ben Peart 2018-11-13 0:38 ` [PATCH 0/3] Avoid confusing messages from new index extensions (Re: [PATCH v8 0/7] speed up index load through parallelization) Jonathan Nieder 2018-11-13 0:39 ` [PATCH 1/3] eoie: default to not writing EOIE section Jonathan Nieder 2018-11-13 1:05 ` Junio C Hamano 2018-11-13 15:14 ` Ben Peart 2018-11-13 18:25 ` Jonathan Nieder 2018-11-14 1:36 ` Junio C Hamano 2018-11-15 0:19 ` Jonathan Nieder 2018-11-13 0:39 ` [PATCH 2/3] ieot: default to not writing IEOT section Jonathan Nieder 2018-11-13 0:58 ` Jonathan Tan 2018-11-13 1:09 ` Junio C Hamano 2018-11-13 1:12 ` Jonathan Nieder 2018-11-13 15:37 ` Duy Nguyen 2018-11-13 18:09 ` Jonathan Nieder 2018-11-13 15:22 ` Ben Peart 2018-11-13 18:18 ` Jonathan Nieder 2018-11-13 19:15 ` Ben Peart 2018-11-13 21:08 ` Jonathan Nieder 2018-11-14 18:09 ` Ben Peart 2018-11-15 0:05 ` Jonathan Nieder 2018-11-14 3:05 ` Junio C Hamano 2018-11-20 6:09 ` [PATCH v2 0/5] Avoid confusing messages from new index extensions Jonathan Nieder 2018-11-20 6:11 ` [PATCH 1/5] eoie: default to not writing EOIE section Jonathan Nieder 2018-11-20 13:06 ` Ben Peart 2018-11-20 13:21 ` SZEDER Gábor 2018-11-21 16:46 ` Jeff King 2018-11-22 0:47 ` Junio C Hamano 2018-11-20 15:01 ` Ben Peart 2018-11-20 6:12 ` [PATCH 2/5] ieot: default to not writing IEOT section Jonathan Nieder 2018-11-20 13:07 ` Ben Peart 2018-11-26 19:59 ` Stefan Beller 2018-11-26 21:47 ` Ben Peart 2018-11-26 22:02 ` Stefan Beller 2018-11-27 0:50 ` Junio C Hamano 2018-11-20 6:12 ` [PATCH 3/5] index: do not warn about unrecognized extensions Jonathan Nieder 2018-11-20 6:14 ` [PATCH 4/5] index: make index.threads=true enable ieot and eoie Jonathan Nieder 2018-11-20 13:24 ` Ben Peart 2018-11-20 6:15 ` [PATCH 5/5] index: offer advice for unknown index extensions Jonathan Nieder 2018-11-20 9:26 ` Ævar Arnfjörð Bjarmason 2018-11-20 13:30 ` Ben Peart 2018-11-21 0:22 ` Junio C Hamano 2018-11-21 0:39 ` Jonathan Nieder 2018-11-21 0:44 ` Jonathan Nieder 2018-11-21 5:01 ` Junio C Hamano 2018-11-21 5:04 ` Jonathan Nieder 2018-11-21 5:15 ` Junio C Hamano 2018-11-21 5:31 ` Junio C Hamano 2018-11-21 1:03 ` Jonathan Nieder 2018-11-21 4:23 ` Junio C Hamano 2018-11-21 4:57 ` Jonathan Nieder 2018-11-21 9:30 ` Ævar Arnfjörð Bjarmason 2018-11-13 0:40 ` [PATCH 3/3] index: do not warn about unrecognized extensions Jonathan Nieder 2018-11-13 1:10 ` Junio C Hamano 2018-11-13 15:25 ` Ben Peart 2018-11-14 3:24 ` Junio C Hamano 2018-11-14 18:19 ` Ben Peart
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style List information: http://vger.kernel.org/majordomo-info.html * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20181001134556.33232-6-peartben@gmail.com \ --to=peartben@gmail.com \ --cc=benpeart@microsoft.com \ --cc=git@vger.kernel.org \ --cc=gitster@pobox.com \ --cc=pclouds@gmail.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
git@vger.kernel.org list mirror (unofficial, one of many) This inbox may be cloned and mirrored by anyone: git clone --mirror https://public-inbox.org/git git clone --mirror http://ou63pmih66umazou.onion/git git clone --mirror http://czquwvybam4bgbro.onion/git git clone --mirror http://hjrcffqmbrq6wope.onion/git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V1 git git/ https://public-inbox.org/git \ git@vger.kernel.org public-inbox-index git Example config snippet for mirrors. Newsgroups are available over NNTP: nntp://news.public-inbox.org/inbox.comp.version-control.git nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git nntp://news.gmane.io/gmane.comp.version-control.git note: .onion URLs require Tor: https://www.torproject.org/ code repositories for the project(s) associated with this inbox: https://80x24.org/mirrors/git.git AGPL code for this site: git clone https://public-inbox.org/public-inbox.git