git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / Atom feed
From: Matheus Tavares <matheus.bernardino@usp.br>
To: git@vger.kernel.org
Cc: gerardu@amazon.com
Subject: RFC: auto-enabling parallel-checkout on NFS
Date: Sun, 15 Nov 2020 16:43:59 -0300
Message-ID: <20201115194359.67901-1-matheus.bernardino@usp.br> (raw)

Hi, everyone

I've been testing the parallel checkout code on some different machines,
to benchmark its performance against the sequential version. As
discussed in [1], the biggest speedups, on Linux, are usually seen on
SSDs and NFS volumes. (I haven't got the chance to benchmark on
Windows or OSX yet.)

Regarding NFS, I got some 2~3.4x speedups even when the NFS client and
server were both running on single-core machines. Here are some runtimes
for a linux-v5.8 clone (means of 5 cold-cache executions):

    nfs 3.0              nfs 4.0              nfs 4.1
1:  183.708 s ± 3.290 s  205.851 s ± 0.844 s  217.317 s ± 3.047 s
2:  130.510 s ± 3.917 s  139.124 s ± 0.772 s  142.963 s ± 0.765 s
4:   89.611 s ± 1.032 s  102.701 s ± 1.665 s   94.728 s ± 1.014 s
8:   68.097 s ± 0.820 s  104.914 s ± 1.239 s   69.359 s ± 0.619 s
16:  63.999 s ± 0.820 s  104.808 s ± 2.279 s   64.843 s ± 0.587 s
32:  62.316 s ± 2.095 s  102.105 s ± 1.537 s   64.122 s ± 0.374 s
64:  63.699 s ± 0.841 s  103.103 s ± 1.319 s   63.532 s ± 0.734 s

The parallel version was also faster for some smaller checkouts. For
example, the following numbers come from a bat-v0.16.0 clone
(251 files, ~3MB):

     nfs 3.0             nfs 4.0             nfs 4.1
1:   0.853 s ± 0.080 s   0.814 s ± 0.020 s   0.876 s ± 0.065 s
2:   0.671 s ± 0.020 s   0.702 s ± 0.030 s   0.705 s ± 0.030 s
4:   0.530 s ± 0.024 s   0.595 s ± 0.020 s   0.570 s ± 0.030 s
8:   0.470 s ± 0.033 s   0.609 s ± 0.025 s   0.510 s ± 0.031 s
16:  0.469 s ± 0.037 s   0.616 s ± 0.022 s   0.513 s ± 0.030 s
32:  0.487 s ± 0.030 s   0.639 s ± 0.018 s   0.527 s ± 0.028 s
64:  0.520 s ± 0.022 s   0.680 s ± 0.028 s   0.562 s ± 0.026 s

While looking at these numbers with Geert (in CC), he had the idea that
we could try to detect when the checkout path is within an NFS mount,
and auto-enable paralellism for this case. This way, users in NFS would
get the best performance by default. And it seems that using ~16 workers
would produce good results regardless of the NFS version that they might
be running.

The major downside is that detecting the file system type is quite
platform-dependent, so there is no simple and portable solution. (Also,
I'm not sure if the optimal number of workers would be the same on
different OSes). But we decided to give it a try, so this is a
rough prototype that would work for Linux:
https://github.com/matheustavares/git/commit/2e2c787e2a1742fed8c35dba185b7cd208603de9

Any thoughts on this idea? Or alternative suggestions?

Thanks,
Matheus

[1]: https://lore.kernel.org/git/815137685ac3e41444201316f537db9797dcacd2.1604521276.git.matheus.bernardino@usp.br/

             reply	other threads:[~2020-11-15 19:45 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-15 19:43 Matheus Tavares [this message]
2020-11-16 15:19 ` Jeff Hostetler
2020-11-19  4:01   ` Matheus Tavares
2020-11-19 14:04     ` Jeff Hostetler
2020-11-20 12:10       ` Ævar Arnfjörð Bjarmason
2020-11-23 23:18       ` Geert Jansen
2020-11-19  9:01 ` Ævar Arnfjörð Bjarmason
2020-11-19 14:11   ` Jeff Hostetler
2020-11-23 23:37   ` Geert Jansen
2020-11-24 12:58     ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201115194359.67901-1-matheus.bernardino@usp.br \
    --to=matheus.bernardino@usp.br \
    --cc=gerardu@amazon.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 git git/ https://public-inbox.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for the project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git