git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Ben Peart <peartben@gmail.com>
To: git@vger.kernel.org
Subject: Re: [PATCH v13 11/20] index-helper: use watchman to avoid refreshing index with lstat()
Date: Thu, 30 Jun 2016 23:54:50 +0000 (UTC)	[thread overview]
Message-ID: <loom.20160701T013515-311@post.gmane.org> (raw)
In-Reply-To: CACsJy8DjYLQCBRc9CzFSWNqkVnhbAfnxd1mnQh4oEfJwKjPd1A@mail.gmail.com

Duy Nguyen <pclouds <at> gmail.com> writes:

> 
> On Thu, Jun 30, 2016 at 7:55 PM, Ben Peart <peartben <at> gmail.com> wrote:
> > David Turner <novalis <at> novalis.org> writes:
> >
> >>
> >> Hiding watchman behind index-helper means you need both daemons. You
> >> can't run watchman alone. Not so good. But on the other hand, 'git'
> >> binary is not linked to watchman/json libraries, which is good for
> >> packaging. Core git package will run fine without watchman-related
> >> packages. If they need watchman, they can install git-index-helper and
> >> dependencies.
> >>
> >
> > Have you considered splitting index-helper and watchman apart?  Using
> > Watchman to not lstat unchanged entries is a huge perf win with very
> > large repos.
> 
> On large repos (i.e. lots of files/dirs on worktree), the cost of
> reading index will increase proportionally. Yes lstat costs, but I
> suspect index reading (integrity verification actually) may cost more,
> especially on platforms with cheap lstat like linux. On these repos
> you really want to enable all four: index-helper (with watchman),
> split-index (I still need to work out pruning on split-index) and
> untracked cache. There's still a lot more to make git run fast on
> large repos though.
> 

I've found (at least on Windows) that as the repo size gets larger, the
time to read the index becomes a much smaller percentage of the overall
time.  I just captured some perf traces of git status on a large repo we
have.  Of that, 92.5% was spent in git!read_directory and only 4.0% was 
spent in git!read_index.  Of that 4%, 2.6% was git!glk_SHA1_Update.

Given there were no dirty files, Watchman would have made a huge 
improvement in the overall time but index helper would have had
relatively little impact.  We've noticed this same pattern in all our
repos which is what is driving our interest in the Watchman model and
also shows why index-helper is of less interest.

> > It would also be interesting to make the Watchman backend replaceable by
> > using an extensible API.  This has the benefit of not having to link the
> > 'git' binary to the watchman/json libraries.
> 
> 'git' binary is not linked to watchman libraries. git-index-helper is
> a separate binary, by design. In theory you can create a
> 'git-index-helper' replacement binary with something other than
> watchman. I think David documented the protocol well (it may change in
> the future though and we are not prepared for capability progression)
> 
> > Is there any pattern already in git for accomplishing this?

While the current design hides watchman behind index-helper, if we were
to change that model so they were independent, we would be interested
in doing it in such a way that provided some abstraction so that it 
could be replaced with another file watching daemon.


  reply	other threads:[~2016-06-30 23:55 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-26  4:14 [PATCH v13 01/20] read-cache.c: fix constness of verify_hdr() David Turner
2016-06-26  4:14 ` [PATCH v13 02/20] read-cache: allow to keep mmap'd memory after reading David Turner
2016-06-26  4:14 ` [PATCH v13 03/20] pkt-line: add gentle version of packet_write David Turner
2016-06-26  4:14 ` [PATCH v13 04/20] index-helper: new daemon for caching index and related stuff David Turner
     [not found]   ` <CAFUO74nENapwVsM3CUst9AHqy5LcKTFBCnJxGXPk8E952t+X5Q@mail.gmail.com>
2016-06-27 16:53     ` David Turner
2016-06-30 13:06   ` Johannes Schindelin
2016-06-30 15:04     ` Duy Nguyen
2016-07-02 11:20       ` Johannes Schindelin
2016-07-02 12:43         ` Duy Nguyen
2016-06-26  4:14 ` [PATCH v13 05/20] index-helper: add --strict David Turner
2016-06-26  4:14 ` [PATCH v13 06/20] daemonize(): set a flag before exiting the main process David Turner
2016-06-26  4:14 ` [PATCH v13 07/20] index-helper: add --detach David Turner
2016-06-26  4:14 ` [PATCH v13 08/20] index-helper: log warnings David Turner
2016-06-26  4:14 ` [PATCH v13 09/20] read-cache: add watchman 'WAMA' extension David Turner
2016-06-26  4:14 ` [PATCH v13 10/20] watchman: support watchman to reduce index refresh cost David Turner
2016-06-26  4:14 ` [PATCH v13 11/20] index-helper: use watchman to avoid refreshing index with lstat() David Turner
2016-06-30 17:55   ` Ben Peart
2016-06-30 19:14     ` Duy Nguyen
2016-06-30 23:54       ` Ben Peart [this message]
2016-07-03 12:28         ` Duy Nguyen
2016-07-06 16:54           ` Ben Peart
2016-07-08 16:32             ` Duy Nguyen
2016-07-08 16:39             ` Duy Nguyen
2016-06-26  4:14 ` [PATCH v13 12/20] update-index: enable/disable watchman support David Turner
2016-06-26  4:14 ` [PATCH v13 13/20] unpack-trees: preserve index extensions David Turner
2016-06-26  4:14 ` [PATCH v13 14/20] watchman: add a config option to enable the extension David Turner
2016-06-26  4:14 ` [PATCH v13 15/20] index-helper: kill mode David Turner
2016-06-26  4:14 ` [PATCH v13 16/20] index-helper: don't run if already running David Turner
2016-06-26  4:14 ` [PATCH v13 17/20] index-helper: autorun mode David Turner
2016-06-26  4:14 ` [PATCH v13 18/20] index-helper: optionally automatically run David Turner
2016-06-26  4:14 ` [PATCH v13 19/20] trace: measure where the time is spent in the index-heavy operations David Turner
2016-06-26 12:09   ` [PATCH v13 21/20] unix-socket.c: add stub implementation when unix sockets are not supported Nguyễn Thái Ngọc Duy
2016-06-27 12:14     ` Johannes Schindelin
2016-06-27 16:16       ` Duy Nguyen
2016-06-28 10:06         ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=loom.20160701T013515-311@post.gmane.org \
    --to=peartben@gmail.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).