git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Stefan Zager <szager@chromium.org>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Stefan Zager <szager@chromium.org>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: Make the git codebase thread-safe
Date: Wed, 12 Feb 2014 10:12:20 -0800	[thread overview]
Message-ID: <CAHOQ7J8gvwpwJV2mBPDaARu3cQ54-ZDQ6iGOwKuJRr9Z+XBL7g@mail.gmail.com> (raw)
In-Reply-To: <CACsJy8Bsc6sywL9L5QC-SKKmh9J+CKnoG5i78WfUbAG9BdZ8Rw@mail.gmail.com>

On Tue, Feb 11, 2014 at 6:11 PM, Duy Nguyen <pclouds@gmail.com> wrote:
>
> I have no comments about thread safety improvements (well, not yet).
> If you have investigated about git performance on chromium
> repositories, could you please sum it up? Threading may be an option
> to improve performance, but it's probably not the only option.

Well, the painful operations that we use frequently are pack-objects,
checkout, status, and blame.  Anything on Windows that touches a lot
of files is miserable due to the usual file system slowness on
Windows, and luafv.sys (the UAC file virtualization driver) seems to
make it much worse.

With threading turned on, pack-objects on Windows now takes about
twice as long as on Linux, which is still more than a 2x improvement
over the non-threaded operation.

Checkout is really bad on Windows.  The blink repository is ~200K
files, and a full clean checkout from the index takes about 10 seconds
on Linux, and about 3:30 on Windows.  I used the Very Sleepy profiler
to see where all the time was spent on Windows: 55% of the time was
spent in OpenFile, and 25% in CloseFile (both in win32).  My immediate
goal is to add threading to checkout, so those file system calls can
be done in parallel.

Enabling the fscache speeds up status quite a bit.  I'm optimistic
that parallelizing the stat calls will yield a further improvement.
Beyond that, it may not be possible to do much more without using a
file system watcher daemon, like facebook does with mercurial.
(https://code.facebook.com/posts/218678814984400/scaling-mercurial-at-facebook/)

Blame is something that chromium and blink developers use heavily, and
it is not unusual for a blame invocation on the blink repository to
run for 30 seconds.  It seems like it should be possible to
parallelize blame, but it requires pack file operations to be
thread-safe.

Stefan

  reply	other threads:[~2014-02-12 18:12 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-12  1:54 Make the git codebase thread-safe Stefan Zager
2014-02-12  2:02 ` Robin H. Johnson
2014-02-12  3:43   ` Duy Nguyen
2014-02-12 11:00     ` Karsten Blees
2014-02-12 23:03       ` Mike Hommey
2014-02-13  0:06         ` Karsten Blees
2014-02-12 18:15     ` Stefan Zager
2014-02-12  2:11 ` Duy Nguyen
2014-02-12 18:12   ` Stefan Zager [this message]
2014-02-12 18:33     ` Matthieu Moy
2014-02-12 18:39       ` Stefan Zager
2014-02-12 18:50     ` David Kastrup
2014-02-12 19:02       ` Stefan Zager
2014-02-12 19:15         ` David Kastrup
2014-02-12 23:09           ` Mike Hommey
2014-02-13  6:04             ` David Kastrup
2014-02-13  9:34               ` Mike Hommey
2014-02-13  9:48                 ` Mike Hommey
2014-02-13  8:30           ` David Kastrup
2014-02-12 20:06     ` Junio C Hamano
2014-02-12 20:27       ` Stefan Zager
2014-02-12 23:05         ` Junio C Hamano
2014-02-12 11:59 ` Erik Faye-Lund
2014-02-12 18:20   ` Stefan Zager
2014-02-12 18:27     ` Erik Faye-Lund
2014-02-12 18:34       ` Stefan Zager
2014-02-12 18:37         ` Erik Faye-Lund
2014-02-12 19:22           ` Karsten Blees
2014-02-12 19:30             ` Stefan Zager
2014-02-13  8:27               ` Johannes Sixt
2014-02-13  8:38                 ` David Kastrup
2014-02-13 18:40                 ` Stefan Zager
2014-02-13 18:38             ` Zachary Turner
2014-02-13 22:51               ` Karsten Blees
2014-02-13 22:53                 ` Stefan Zager
2014-02-13 23:09                   ` Zachary Turner
2014-02-14 19:04                     ` Karsten Blees
     [not found]                       ` <CAAErz9g7ND1htfk=yxRJJLbSEgBi4EV_AHC9uDRptugGWFWcXw@mail.gmail.com>
2014-02-14 19:16                         ` Zachary Turner
2014-02-14 23:10                           ` Karsten Blees
2014-02-15  0:45                           ` Duy Nguyen
2014-02-15  0:50                             ` Stefan Zager
2014-02-15  0:56                               ` Duy Nguyen
2014-02-15  1:15                                 ` Zachary Turner
2014-02-15  1:39                                   ` Duy Nguyen
2014-02-18 17:55                                     ` Junio C Hamano
2014-02-18 18:14                                       ` Zachary Turner
2014-02-14 19:52                         ` Stefan Zager
2014-02-14 21:49                       ` Stefan Zager
2014-02-13  1:42 ` brian m. carlson
2019-04-02  0:52 ` Matheus Tavares
2019-04-02  1:07   ` Duy Nguyen
2019-04-02 10:30     ` David Kastrup
2019-04-02 11:35       ` Duy Nguyen
2019-04-02 11:52         ` David Kastrup
2019-04-02 19:06     ` Matheus Tavares Bernardino

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHOQ7J8gvwpwJV2mBPDaARu3cQ54-ZDQ6iGOwKuJRr9Z+XBL7g@mail.gmail.com \
    --to=szager@chromium.org \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).