From: Jeff Hostetler <git@jeffhostetler.com>
To: Matheus Tavares <matheus.bernardino@usp.br>
Cc: gerardu@amazon.com, git@vger.kernel.org
Subject: Re: RFC: auto-enabling parallel-checkout on NFS
Date: Thu, 19 Nov 2020 09:04:34 -0500
Message-ID: <212a2def-6811-b6e4-0550-ecae2fe0c02c@jeffhostetler.com>
In-Reply-To: <20201119040117.67914-1-matheus.bernardino@usp.br>
On 11/18/20 11:01 PM, Matheus Tavares wrote:
> Hi, Jeff
>
> On Mon, Nov 16, 2020 at 12:19 PM Jeff Hostetler <git@jeffhostetler.com> wrote:
>>
>> I can't really speak to NFS performance, but I have to wonder if there's
>> not something else affecting the results -- 4 and/or 8 core results are
>> better than 16+ results in some columns. And we get diminishing returns
>> after ~16.
>
> Yeah, that's a good point. I'm not sure yet what's causing the
> diminishing returns, but Geert and I are investigating. Maybe we are
> hitting some limit for parallelism in this scenario.
I seem to recall back when I was working on this problem that
the unzip of each blob was a major pain point. Combine this with
long delta-chains and each worker would need multiple rounds of
read/memmap, unzip, and de-delta before it had the complete blob
and could then smudge and write.
This makes me wonder if repacking the repo with shorter delta-chains
affects the checkout times, and whether it improves the perf when
there are more workers. I'm not saying that this is a solution, but rather
an experiment to see if it changes anything and maybe adjust our
focus.
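Something like the following could drive that experiment. This is only
a rough sketch: the helper names and the depth values are made up for
illustration, and it assumes the repo is a scratch clone that is safe
to repack. The repack flags themselves (-a, -d, -f, --depth, --window)
are the standard ones.

```python
import subprocess
import time


def repack_cmd(depth, window=10):
    """Build a 'git repack' command that recomputes deltas (-f) and
    caps the delta-chain length at `depth`."""
    return ["git", "repack", "-a", "-d", "-f",
            f"--depth={depth}", f"--window={window}"]


def timed(cmd, cwd):
    """Run `cmd` in `cwd` and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, cwd=cwd, check=True)
    return time.perf_counter() - start


# Hypothetical usage, against a scratch clone at `repo`:
#   for depth in (50, 10, 1):
#       timed(repack_cmd(depth), cwd=repo)
#       print(depth, timed(["git", "checkout", "-f", "v5.8"], cwd=repo))
```

The work tree would need to be emptied (and caches dropped) between
runs so that each checkout actually rewrites the files.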
>
>> I'm wondering if during these test runs, you were IO vs CPU bound and if
>> VM was a problem.
>
> I would say we are more IO bound during these tests. While a sequential
> linux-v5.8 checkout usually uses 100% of one core in my laptop's SSD,
> in this setup, it only used 5% to 10%. And even with 64 workers (on a
> single core), CPU usage stays around 60% most of the time.
>
> About memory, the peak PSS was around 1.75GB, with 64 workers, and the
> machine has 10GB of RAM. But are there other numbers that I should keep
> an eye on while running the test?
>
>> I'm wondering if setting thread affinity would help here.
>
> Hmm, I only had one core online during the benchmark, so I think thread
> affinity wouldn't impact the runtime.
I wasn't really thinking about the 64 workers on 1 core case. I was
more thinking about the 64 workers on 64 cores case and wondering
if workers were being randomly bounced from core to core and we were
thrashing.
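To make the idea concrete, here is a minimal sketch of pinning each
worker to one core. It is not how the parallel-checkout code works;
the function names are hypothetical, and os.sched_setaffinity is only
available on Linux, so the helper simply reports failure elsewhere.

```python
import os


def assign_cores(n_workers, cores):
    """Round-robin assignment: worker i gets cores[i % len(cores)]."""
    return [cores[i % len(cores)] for i in range(n_workers)]


def pin_self(core):
    """Pin the calling process to a single core.
    Returns False on platforms without sched_setaffinity."""
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(0, {core})  # pid 0 means the calling process
    return True
```

Each worker would call pin_self() with its assigned core at startup;
comparing runs with and without pinning would show whether migration
between cores matters at all.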
>
> Thanks,
> Matheus
>