git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
From: Ben Peart <peartben@gmail.com>
To: Duy Nguyen <pclouds@gmail.com>, Ben Peart <benpeart@microsoft.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Ben Peart <Ben.Peart@microsoft.com>
Subject: Re: [PATCH v5 2/5] read-cache: load cache extensions on a worker thread
Date: Mon, 17 Sep 2018 12:26:20 -0400
Message-ID: <78f62979-18a7-2fc1-6f26-c4f84e19424f@gmail.com> (raw)
In-Reply-To: <CACsJy8ATsS6S5zib2FqJf1stPcGwSTO1qYBSz514Xu2GfJ4Apw@mail.gmail.com>



On 9/15/2018 6:22 AM, Duy Nguyen wrote:
>> +index.threads::
>> +       Specifies the number of threads to spawn when loading the index.
>> +       This is meant to reduce index load time on multiprocessor machines.
>> +       Specifying 0 or 'true' will cause Git to auto-detect the number of
>> +       CPU's and set the number of threads accordingly. Defaults to 'true'.
> 
> I'd rather this variable defaults to 0. Spawning threads have
> associated cost and most projects out there are small enough that this
> multi threading could just add more cost than gain. It only makes
> sense to enable this on huge repos.
> 
> Wait there's no way to disable this parallel reading? Does not sound
> right. And  if ordinary numbers mean the number of threads then 0
> should mean no threading. Auto detection could have a new keyword,
> like 'auto'.
> 

The index.threads setting is patterned after the pack.threads setting 
for consistency.  Specifying 1 (or 'false') will disable multithreading 
but I will call that out explicitly in the documentation to make it more 
obvious.

The THREAD_COST logic is designed to ensure small repos don't incur more 
cost than gain.  If you have data on that logic that shows it isn't 
working properly, I'm happy to change the logic as necessary.

>> --- a/read-cache.c
>> +++ b/read-cache.c
>> @@ -23,6 +23,10 @@
>>   #include "split-index.h"
>>   #include "utf8.h"
>>   #include "fsmonitor.h"
>> +#ifndef NO_PTHREADS
>> +#include <pthread.h>
>> +#include <thread-utils.h>
>> +#endif
> 
> I don't think you're supposed to include system header files after
> "cache.h". Including thread-utils.h should be enough (and it keeps the
> exception of inclduing pthread.h in just one place). Please use
> "pthread-utils.h" instead of <pthread-utils.h> which is usually for
> system header files. And include ptherad-utils.h unconditionally.
> 

Thanks, I'll fix that.

>>
>>   /* Mask for the name length in ce_flags in the on-disk index */
>>
>> @@ -1898,6 +1902,46 @@ static unsigned long read_eoie_extension(void *mmap_, size_t mmap_size);
>>   #endif
>>   static void write_eoie_extension(struct strbuf *sb, git_hash_ctx *eoie_context, unsigned long offset);
>>
>> +struct load_index_extensions
>> +{
>> +#ifndef NO_PTHREADS
>> +       pthread_t pthread;
>> +#endif
>> +       struct index_state *istate;
>> +       void *mmap;
>> +       size_t mmap_size;
>> +       unsigned long src_offset;
>> +};
>> +
>> +static void *load_index_extensions(void *_data)
>> +{
>> +       struct load_index_extensions *p = _data;
>> +       unsigned long src_offset = p->src_offset;
>> +
>> +       while (src_offset <= p->mmap_size - the_hash_algo->rawsz - 8) {
>> +               /* After an array of active_nr index entries,
>> +                * there can be arbitrary number of extended
>> +                * sections, each of which is prefixed with
>> +                * extension name (4-byte) and section length
>> +                * in 4-byte network byte order.
>> +                */
>> +               uint32_t extsize;
>> +               memcpy(&extsize, (char *)p->mmap + src_offset + 4, 4);
>> +               extsize = ntohl(extsize);
>> +               if (read_index_extension(p->istate,
>> +                       (const char *)p->mmap + src_offset,
>> +                       (char *)p->mmap + src_offset + 8,
>> +                       extsize) < 0) {
>> +                       munmap(p->mmap, p->mmap_size);
>> +                       die("index file corrupt");
> 
> _()
> 

You're feedback style can be a bit abrupt and terse.  I _think_ what you 
are trying to say here is that the "die" call should use the _() macro 
around the string.

This is an edit of the previous code that loaded index extensions and 
doesn't change the use of _(). I don't know the rules for when _() 
should be used and didn't have any luck finding where it was documented 
so left it unchanged.

FWIW, in this file alone there are 20 existing instances of die() or 
die_errorno() and only two that use the _() macro.  A quick grep through 
the source code shows thousands of die() calls the vast majority of 
which do not use the _() macro.  This appears to be an area that is 
unclear and inconsistent and could use some attention in a separate patch.


>> +       /* if we created a thread, join it otherwise load the extensions on the primary thread */
>> +#ifndef NO_PTHREADS
>> +       if (extension_offset && pthread_join(p.pthread, NULL))
>> +               die(_("unable to join load_index_extensions_thread"));
> 
> I guess the last _ is a typo and you wanted "unable to join
> load_index_extensions thread". Please use die_errno() instead.
> 

Why should this be die_errorno() here?  All other instances of 
pthread_join() failing in a fatal way use die(), not die_errorno().

>> +#endif
>> +       if (!extension_offset) {
>> +               p.src_offset = src_offset;
>> +               load_index_extensions(&p);
>>          }
>>          munmap(mmap, mmap_size);
>>          return istate->cache_nr;
>> --
>> 2.18.0.windows.1
>>
> 
> 

  parent reply index

Thread overview: 199+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-23 15:41 [PATCH v1] read-cache: speed up index load through parallelization Ben Peart
2018-08-23 17:31 ` Stefan Beller
2018-08-23 19:44   ` Ben Peart
2018-08-24 18:40   ` Duy Nguyen
2018-08-28 14:53     ` Ben Peart
2018-08-23 18:06 ` Junio C Hamano
2018-08-23 20:33   ` Ben Peart
2018-08-24 15:37     ` Duy Nguyen
2018-08-24 15:57       ` Duy Nguyen
2018-08-24 17:28         ` Ben Peart
2018-08-25  6:44         ` [PATCH] read-cache.c: optimize reading index format v4 Nguyễn Thái Ngọc Duy
2018-08-27 19:36           ` Junio C Hamano
2018-08-28 19:25             ` Duy Nguyen
2018-08-28 23:54               ` Ben Peart
2018-08-29 17:14               ` Junio C Hamano
2018-09-04 16:08             ` Duy Nguyen
2018-09-02 13:19           ` [PATCH v2 0/1] " Nguyễn Thái Ngọc Duy
2018-09-02 13:19             ` [PATCH v2 1/1] read-cache.c: " Nguyễn Thái Ngọc Duy
2018-09-04 18:58               ` Junio C Hamano
2018-09-04 19:31               ` Junio C Hamano
2018-08-24 18:20       ` [PATCH v1] read-cache: speed up index load through parallelization Duy Nguyen
2018-08-24 18:40         ` Ben Peart
2018-08-24 19:00           ` Duy Nguyen
2018-08-24 19:57             ` Ben Peart
2018-08-29 15:25 ` [PATCH v2 0/3] " Ben Peart
2018-08-29 15:25   ` [PATCH v2 1/3] " Ben Peart
2018-08-29 17:14     ` Junio C Hamano
2018-08-29 21:35       ` Ben Peart
2018-09-03 19:16     ` Duy Nguyen
2018-08-29 15:25   ` [PATCH v2 2/3] read-cache: load cache extensions on worker thread Ben Peart
2018-08-29 17:12     ` Junio C Hamano
2018-08-29 21:42       ` Ben Peart
2018-08-29 22:19         ` Junio C Hamano
2018-09-03 19:21     ` Duy Nguyen
2018-09-03 19:27       ` Duy Nguyen
2018-08-29 15:25   ` [PATCH v2 3/3] read-cache: micro-optimize expand_name_field() to speed up V4 index parsing Ben Peart
2018-09-06 21:03 ` [PATCH v3 0/4] read-cache: speed up index load through parallelization Ben Peart
2018-09-06 21:03   ` [PATCH v3 1/4] read-cache: optimize expand_name_field() to speed up V4 index parsing Ben Peart
2018-09-06 21:03   ` [PATCH v3 2/4] eoie: add End of Index Entry (EOIE) extension Ben Peart
2018-09-07 17:55     ` Junio C Hamano
2018-09-07 20:23       ` Ben Peart
2018-09-08  6:29         ` Martin Ågren
2018-09-08 14:03           ` Ben Peart
2018-09-08 17:08             ` Martin Ågren
2018-09-06 21:03   ` [PATCH v3 3/4] read-cache: load cache extensions on a worker thread Ben Peart
2018-09-07 21:10     ` Junio C Hamano
2018-09-08 14:56       ` Ben Peart
2018-09-06 21:03   ` [PATCH v3 4/4] read-cache: speed up index load through parallelization Ben Peart
2018-09-07  4:16     ` Torsten Bögershausen
2018-09-07 13:43       ` Ben Peart
2018-09-07 17:21   ` [PATCH v3 0/4] " Junio C Hamano
2018-09-07 18:31     ` Ben Peart
2018-09-08 13:18     ` Duy Nguyen
2018-09-11 23:26 ` [PATCH v4 0/5] " Ben Peart
2018-09-11 23:26   ` [PATCH v4 1/5] eoie: add End of Index Entry (EOIE) extension Ben Peart
2018-09-11 23:26   ` [PATCH v4 2/5] read-cache: load cache extensions on a worker thread Ben Peart
2018-09-11 23:26   ` [PATCH v4 3/5] read-cache: speed up index load through parallelization Ben Peart
2018-09-11 23:26   ` [PATCH v4 4/5] read-cache.c: optimize reading index format v4 Ben Peart
2018-09-11 23:26   ` [PATCH v4 5/5] read-cache: clean up casting and byte decoding Ben Peart
2018-09-12 14:34   ` [PATCH v4 0/5] read-cache: speed up index load through parallelization Ben Peart
2018-09-12 16:18 ` [PATCH v5 " Ben Peart
2018-09-12 16:18   ` [PATCH v5 1/5] eoie: add End of Index Entry (EOIE) extension Ben Peart
2018-09-13 22:44     ` Junio C Hamano
2018-09-15 10:02     ` Duy Nguyen
2018-09-17 14:54       ` Ben Peart
2018-09-17 16:05         ` Duy Nguyen
2018-09-17 17:31           ` Junio C Hamano
2018-09-17 17:38             ` Duy Nguyen
2018-09-17 19:08               ` Junio C Hamano
2018-09-12 16:18   ` [PATCH v5 2/5] read-cache: load cache extensions on a worker thread Ben Peart
2018-09-15 10:22     ` Duy Nguyen
2018-09-15 10:24       ` Duy Nguyen
2018-09-17 16:38         ` Ben Peart
2018-09-15 16:23       ` Duy Nguyen
2018-09-17 17:19         ` Junio C Hamano
2018-09-17 16:26       ` Ben Peart [this message]
2018-09-17 16:45         ` Duy Nguyen
2018-09-17 21:32       ` Junio C Hamano
2018-09-12 16:18   ` [PATCH v5 3/5] read-cache: load cache entries on worker threads Ben Peart
2018-09-15 10:31     ` Duy Nguyen
2018-09-17 17:25       ` Ben Peart
2018-09-15 11:07     ` Duy Nguyen
2018-09-15 11:09       ` Duy Nguyen
2018-09-17 18:52         ` Ben Peart
2018-09-15 11:29     ` Duy Nguyen
2018-09-12 16:18   ` [PATCH v5 4/5] read-cache.c: optimize reading index format v4 Ben Peart
2018-09-12 16:18   ` [PATCH v5 5/5] read-cache: clean up casting and byte decoding Ben Peart
2018-09-26 19:54 ` [PATCH v6 0/7] speed up index load through parallelization Ben Peart
2018-09-26 19:54   ` [PATCH v6 1/7] read-cache.c: optimize reading index format v4 Ben Peart
2018-09-26 19:54   ` [PATCH v6 2/7] read-cache: clean up casting and byte decoding Ben Peart
2018-09-26 19:54   ` [PATCH v6 3/7] eoie: add End of Index Entry (EOIE) extension Ben Peart
2018-09-28  0:19     ` SZEDER Gábor
2018-09-28 18:38       ` Ben Peart
2018-09-29  0:51     ` SZEDER Gábor
2018-09-29  5:45     ` Duy Nguyen
2018-09-29 18:24       ` Junio C Hamano
2018-09-26 19:54   ` [PATCH v6 4/7] config: add new index.threads config setting Ben Peart
2018-09-28  0:26     ` SZEDER Gábor
2018-09-28 13:39       ` Ben Peart
2018-09-28 17:07         ` Junio C Hamano
2018-09-28 19:41           ` Ben Peart
2018-09-28 20:30             ` Ramsay Jones
2018-09-28 22:15               ` Junio C Hamano
2018-10-01 13:17                 ` Ben Peart
2018-10-01 15:06                   ` SZEDER Gábor
2018-09-26 19:54   ` [PATCH v6 5/7] read-cache: load cache extensions on a worker thread Ben Peart
2018-09-26 19:54   ` [PATCH v6 6/7] ieot: add Index Entry Offset Table (IEOT) extension Ben Peart
2018-09-26 19:54   ` [PATCH v6 7/7] read-cache: load cache entries on worker threads Ben Peart
2018-09-26 22:06   ` [PATCH v6 0/7] speed up index load through parallelization Junio C Hamano
2018-09-27 17:13   ` Duy Nguyen
2018-10-01 13:45 ` [PATCH v7 " Ben Peart
2018-10-01 13:45   ` [PATCH v7 1/7] read-cache.c: optimize reading index format v4 Ben Peart
2018-10-01 13:45   ` [PATCH v7 2/7] read-cache: clean up casting and byte decoding Ben Peart
2018-10-01 15:10     ` Duy Nguyen
2018-10-01 13:45   ` [PATCH v7 3/7] eoie: add End of Index Entry (EOIE) extension Ben Peart
2018-10-01 15:17     ` SZEDER Gábor
2018-10-02 14:34       ` Ben Peart
2018-10-01 15:30     ` Duy Nguyen
2018-10-02 15:13       ` Ben Peart
2018-10-01 13:45   ` [PATCH v7 4/7] config: add new index.threads config setting Ben Peart
2018-10-01 13:45   ` [PATCH v7 5/7] read-cache: load cache extensions on a worker thread Ben Peart
2018-10-01 15:50     ` Duy Nguyen
2018-10-02 15:00       ` Ben Peart
2018-10-01 13:45   ` [PATCH v7 6/7] ieot: add Index Entry Offset Table (IEOT) extension Ben Peart
2018-10-01 16:27     ` Duy Nguyen
2018-10-02 16:34       ` Ben Peart
2018-10-02 17:02         ` Duy Nguyen
2018-10-01 13:45   ` [PATCH v7 7/7] read-cache: load cache entries on worker threads Ben Peart
2018-10-01 17:09     ` Duy Nguyen
2018-10-02 19:09       ` Ben Peart
2018-10-10 15:59 ` [PATCH v8 0/7] speed up index load through parallelization Ben Peart
2018-10-10 15:59   ` [PATCH v8 1/7] read-cache.c: optimize reading index format v4 Ben Peart
2018-10-10 15:59   ` [PATCH v8 2/7] read-cache: clean up casting and byte decoding Ben Peart
2018-10-10 15:59   ` [PATCH v8 3/7] eoie: add End of Index Entry (EOIE) extension Ben Peart
2018-10-10 15:59   ` [PATCH v8 4/7] config: add new index.threads config setting Ben Peart
2018-10-10 15:59   ` [PATCH v8 5/7] read-cache: load cache extensions on a worker thread Ben Peart
2018-10-10 15:59   ` [PATCH v8 6/7] ieot: add Index Entry Offset Table (IEOT) extension Ben Peart
2018-10-10 15:59   ` [PATCH v8 7/7] read-cache: load cache entries on worker threads Ben Peart
2018-10-19 16:11     ` Jeff King
2018-10-22  2:14       ` Junio C Hamano
2018-10-22 14:40         ` Ben Peart
2018-10-12  3:18   ` [PATCH v8 0/7] speed up index load through parallelization Junio C Hamano
2018-10-14 12:28   ` Duy Nguyen
2018-10-15 17:33     ` Ben Peart
2018-11-13  0:38   ` [PATCH 0/3] Avoid confusing messages from new index extensions (Re: [PATCH v8 0/7] speed up index load through parallelization) Jonathan Nieder
2018-11-13  0:39     ` [PATCH 1/3] eoie: default to not writing EOIE section Jonathan Nieder
2018-11-13  1:05       ` Junio C Hamano
2018-11-13 15:14         ` Ben Peart
2018-11-13 18:25           ` Jonathan Nieder
2018-11-14  1:36             ` Junio C Hamano
2018-11-15  0:19               ` Jonathan Nieder
2018-11-13  0:39     ` [PATCH 2/3] ieot: default to not writing IEOT section Jonathan Nieder
2018-11-13  0:58       ` Jonathan Tan
2018-11-13  1:09       ` Junio C Hamano
2018-11-13  1:12         ` Jonathan Nieder
2018-11-13 15:37           ` Duy Nguyen
2018-11-13 18:09         ` Jonathan Nieder
2018-11-13 15:22       ` Ben Peart
2018-11-13 18:18         ` Jonathan Nieder
2018-11-13 19:15           ` Ben Peart
2018-11-13 21:08             ` Jonathan Nieder
2018-11-14 18:09               ` Ben Peart
2018-11-15  0:05                 ` Jonathan Nieder
2018-11-14  3:05         ` Junio C Hamano
2018-11-20  6:09           ` [PATCH v2 0/5] Avoid confusing messages from new index extensions Jonathan Nieder
2018-11-20  6:11             ` [PATCH 1/5] eoie: default to not writing EOIE section Jonathan Nieder
2018-11-20 13:06               ` Ben Peart
2018-11-20 13:21                 ` SZEDER Gábor
2018-11-21 16:46                   ` Jeff King
2018-11-22  0:47                     ` Junio C Hamano
2018-11-20 15:01               ` Ben Peart
2018-11-20  6:12             ` [PATCH 2/5] ieot: default to not writing IEOT section Jonathan Nieder
2018-11-20 13:07               ` Ben Peart
2018-11-26 19:59                 ` Stefan Beller
2018-11-26 21:47                   ` Ben Peart
2018-11-26 22:02                     ` Stefan Beller
2018-11-27  0:50                   ` Junio C Hamano
2018-11-20  6:12             ` [PATCH 3/5] index: do not warn about unrecognized extensions Jonathan Nieder
2018-11-20  6:14             ` [PATCH 4/5] index: make index.threads=true enable ieot and eoie Jonathan Nieder
2018-11-20 13:24               ` Ben Peart
2018-11-20  6:15             ` [PATCH 5/5] index: offer advice for unknown index extensions Jonathan Nieder
2018-11-20  9:26               ` Ævar Arnfjörð Bjarmason
2018-11-20 13:30                 ` Ben Peart
2018-11-21  0:22                   ` Junio C Hamano
2018-11-21  0:39                     ` Jonathan Nieder
2018-11-21  0:44                       ` Jonathan Nieder
2018-11-21  5:01                       ` Junio C Hamano
2018-11-21  5:04                         ` Jonathan Nieder
2018-11-21  5:15                         ` Junio C Hamano
2018-11-21  5:31                           ` Junio C Hamano
2018-11-21  1:03                     ` Jonathan Nieder
2018-11-21  4:23                       ` Junio C Hamano
2018-11-21  4:57                         ` Jonathan Nieder
2018-11-21  9:30                       ` Ævar Arnfjörð Bjarmason
2018-11-13  0:40     ` [PATCH 3/3] index: do not warn about unrecognized extensions Jonathan Nieder
2018-11-13  1:10       ` Junio C Hamano
2018-11-13 15:25       ` Ben Peart
2018-11-14  3:24       ` Junio C Hamano
2018-11-14 18:19         ` Ben Peart

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=78f62979-18a7-2fc1-6f26-c4f84e19424f@gmail.com \
    --to=peartben@gmail.com \
    --cc=Ben.Peart@microsoft.com \
    --cc=benpeart@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox