Re: RFC: auto-enabling parallel-checkout on NFS

From: Jeff Hostetler <git@jeffhostetler.com>
To: Matheus Tavares <matheus.bernardino@usp.br>
Cc: gerardu@amazon.com, git@vger.kernel.org
Subject: Re: RFC: auto-enabling parallel-checkout on NFS
Date: Thu, 19 Nov 2020 09:04:34 -0500
Message-ID: <212a2def-6811-b6e4-0550-ecae2fe0c02c@jeffhostetler.com>
In-Reply-To: <20201119040117.67914-1-matheus.bernardino@usp.br>

On 11/18/20 11:01 PM, Matheus Tavares wrote:
> Hi, Jeff
> 
> On Mon, Nov 16, 2020 at 12:19 PM Jeff Hostetler <git@jeffhostetler.com> wrote:
>>
>> I can't really speak to NFS performance, but I have to wonder if there's
>> not something else affecting the results -- 4 and/or 8 core results are
>> better than 16+ results in some columns.  And we get diminishing returns
>> after ~16.
> 
> Yeah, that's a good point. I'm not sure yet what's causing the
> diminishing returns, but Geert and I are investigating. Maybe we are
> hitting some limit for parallelism in this scenario.

I seem to recall back when I was working on this problem that
the unzip of each blob was a major pain point.  Combine this with
long delta-chains and each worker would need multiple rounds of
read/memmap, unzip, and de-delta before it had the complete blob
and could then smudge and write.

This makes me wonder if repacking the repo with shorter delta-chains
affects the checkout times, and whether it improves the perf when
there are more workers.  I'm not saying that this is a solution, but
rather an experiment to see if it changes anything and maybe adjust
our focus.
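
Something like the following is roughly what I have in mind for that
experiment. It is an untested sketch, not a proper benchmark: REPO,
DEPTHS and WORKERS are names I made up for the example, the worktree
is assumed to already be at the revision of interest, and
checkout.workers is assumed to be the setting from your
parallel-checkout series. It repacks at each maximum delta depth,
deletes the tracked files, and times restoring them from the index.

```shell
#!/bin/sh
# Sketch only: REPO, DEPTHS and WORKERS are hypothetical knobs for this
# example, and checkout.workers is assumed to exist in the git under test.

# Repack everything into one pack with the given maximum delta depth.
repack_depth () {
	repo=$1 depth=$2
	# -f recomputes deltas so that the --depth limit really applies.
	git -C "$repo" repack -a -d -f -q --depth="$depth"
}

# Delete the tracked files, then time restoring them from the index.
# Prints "<workers> <seconds>" (whole seconds only).
time_checkout () {
	repo=$1 workers=$2

	# Remove every tracked file so checkout has to rewrite all of them.
	(cd "$repo" && git ls-files -z | xargs -0 rm -f) || return 1

	start=$(date +%s)
	git -C "$repo" -c checkout.workers="$workers" checkout -q -- . || return 1
	end=$(date +%s)

	echo "$workers $((end - start))"
}

# Only run when REPO is set, so the functions can also be sourced.
if [ -n "${REPO:-}" ]; then
	for depth in ${DEPTHS:-50 10 4 1}; do
		repack_depth "$REPO" "$depth" || exit 1
		for workers in ${WORKERS:-1 16 64}; do
			echo "depth=$depth $(time_checkout "$REPO" "$workers")"
		done
	done
fi
```

Caching on the NFS client and server will skew repeated runs, so the
numbers would only be comparable if the caches are handled the same
way before each checkout.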

> 
>> I'm wondering if during these test runs, you were IO vs CPU bound and if
>> VM was a problem.
> 
> I would say we are more IO bound during these tests. While a sequential
> linux-v5.8 checkout usually uses 100% of one core on my laptop's SSD,
> in this setup, it only used 5% to 10%. And even with 64 workers (on a
> single core), CPU usage stays around 60% most of the time.
> 
> About memory, the peak PSS was around 1.75GB, with 64 workers, and the
> machine has 10GB of RAM. But are there other numbers that I should keep
> an eye on while running the test?
> 
>> I'm wondering if setting thread affinity would help here.
> 
> Hmm, I only had one core online during the benchmark, so I think thread
> affinity wouldn't impact the runtime.

I wasn't really thinking about the 64 workers on 1 core case.  I was
more thinking about the 64 workers on 64 cores case and wondering
if workers were being randomly bounced from core to core and we were
thrashing.
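
A crude way to test that on Linux might be to pin each worker to its
own core once the checkout is running and compare the timings with an
unpinned run. The sketch below is untested and makes several
assumptions: the workers are direct children of the git process, one
second is long enough for them to start, taskset(1), pgrep(1) and
nproc(1) are available, and REPO and WORKERS are names made up for the
example.

```shell
#!/bin/sh
# Sketch only: assumes the checkout workers are direct children of the
# given parent process. REPO and WORKERS are hypothetical knobs.

# Pin each direct child of <parent> to one core, round-robin over <ncpu>.
pin_children () {
	parent=$1 ncpu=$2 cpu=0
	for pid in $(pgrep -P "$parent"); do
		# A worker may already have exited; ignore that failure.
		taskset -cp "$cpu" "$pid" >/dev/null 2>&1 || :
		cpu=$(( (cpu + 1) % ncpu ))
	done
}

# Only run when REPO is set, so the function can also be sourced.
# The tracked files should be deleted first so there is work to do.
if [ -n "${REPO:-}" ]; then
	git -C "$REPO" -c checkout.workers="${WORKERS:-64}" checkout -q -- . &
	sleep 1		# crude: give the workers time to start
	pin_children "$!" "$(nproc)"
	wait
fi
```

If the pinned run is no faster, core-to-core migration is probably not
what is hurting us at the higher worker counts.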

> 
> Thanks,
> Matheus
>
