From: Duy Nguyen <pclouds@gmail.com>
To: David Turner <dturner@twopensource.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH v5 09/15] index-helper: use watchman to avoid refreshing index with lstat()
Date: Wed, 20 Apr 2016 07:15:37 +0700
Message-ID: <CACsJy8CM409OH12w3EdVP3UjXoURbWNuqb_coQ=AagdCs+ctaQ@mail.gmail.com>
In-Reply-To: <1461108489-29376-10-git-send-email-dturner@twopensource.com>

Continuing my comment from the --use-watchman patch about watchman not
being supported...

On Wed, Apr 20, 2016 at 6:28 AM, David Turner <dturner@twopensource.com> wrote:
> +static int poke_and_wait_for_reply(int fd)
> +{
> +       struct strbuf buf = STRBUF_INIT;
> +       struct strbuf reply = STRBUF_INIT;
> +       int ret = -1;
> +       fd_set fds;
> +       struct timeval timeout;
> +
> +       timeout.tv_usec = 0;
> +       timeout.tv_sec = 1;
> +
> +       if (fd < 0)
> +               return -1;
> +
> +       strbuf_addf(&buf, "poke %d", getpid());
> +       if (write_in_full(fd, buf.buf, buf.len + 1) != buf.len + 1)
> +               goto done_poke;
> +
> +       /* Now wait for a reply */
> +       FD_ZERO(&fds);
> +       FD_SET(fd, &fds);
> +       if (select(fd + 1, &fds, NULL, NULL, &timeout) == 0)
> +               /* No reply, giving up */
> +               goto done_poke;
> +
> +       if (strbuf_getwholeline_fd(&reply, fd, 0))
> +               goto done_poke;
> +
> +       if (!starts_with(reply.buf, "OK"))
> +               goto done_poke;

... while we could simply check the USE_WATCHMAN macro and reject in
update-index, a better solution is to send "poke %d watchman" and have
index-helper return "OK watchman" (vs. "OK") when watchman is supported
and active. If the user has already requested watchman and index-helper
returns just "OK", then we can warn the user about the likely cause of
the performance degradation. It's related to error reporting, but I
don't think you can simply send errors straight over the unix socket;
it's possible, but a bit more complicated.
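
To make the reply side of that concrete, here is a rough sketch of how
the client could tell the two answers apart. It is untested against the
series; the enum and the function name are made up for illustration,
and only the "OK" / "OK watchman" strings come from the proposal above.

```c
#include <string.h>

/* Hypothetical reply classes; these names are not in the patch. */
enum poke_reply {
	POKE_FAILED = 0,	/* reply does not start with "OK" */
	POKE_OK,		/* helper is alive, watchman not in use */
	POKE_OK_WATCHMAN	/* helper is alive and watchman is active */
};

static enum poke_reply classify_poke_reply(const char *reply)
{
	static const char suffix[] = " watchman";
	size_t len;

	if (strncmp(reply, "OK", 2))
		return POKE_FAILED;
	reply += 2;

	/* Ignore the line terminator, if any. */
	len = strcspn(reply, "\r\n");
	if (len == strlen(suffix) && !strncmp(reply, suffix, len))
		return POKE_OK_WATCHMAN;

	/*
	 * Plain "OK", or "OK" plus something we do not understand:
	 * treat both as a plain OK, like the starts_with() check does.
	 */
	return POKE_OK;
}
```

The caller would warn only when the user asked for watchman and got
POKE_OK back.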

> +static void refresh_by_watchman(struct index_state *istate)
> +{
> +       void *shm = NULL;
> +       int length;
> +       int i;
> +       struct stat st;
> +       int fd = -1;
> +       const char *path = git_path("shm-watchman-%s-%"PRIuMAX,
> +                                   sha1_to_hex(istate->sha1),
> +                                   (uintmax_t)getpid());
> +
> +       fd = open(path, O_RDONLY);
> +       if (fd < 0)
> +               return;
> +
> +       /*
> +        * This watchman data is just for us -- no need to keep it
> +        * around once we've got it open.
> +        */
> +       unlink(path);

This will not play well when multiple processes read and refresh the
index at the same time. They could refresh non-overlapping
subdirectories, and I think it's perfectly ok for them to do so
(writing the index down is a different matter). I don't have a good
answer for this. Perhaps if the shm-watchman-%s-%d file is small enough
(and it should be, since we store it in the index), then we can just
send the content straight over the unix socket. I didn't have this
option with my signal-based communication model.
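
A sketch of what "send the content over the socket" could look like,
assuming a simple length-prefixed frame (4-byte big-endian length, then
the payload). The framing, the size cap and all function names here are
my assumptions, not something the series defines. Each client reads its
own copy from its own connection, so there is no file to unlink.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>	/* callers create the AF_UNIX socket */
#include <unistd.h>

#define MAX_BLOB (1024 * 1024)	/* hypothetical cap, "small enough" */

static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;
	while (len) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		p += n;
		len -= n;
	}
	return 0;
}

static int read_all(int fd, void *buf, size_t len)
{
	char *p = buf;
	while (len) {
		ssize_t n = read(fd, p, len);
		if (n < 0 && errno == EINTR)
			continue;
		if (n <= 0)
			return -1;	/* error or premature EOF */
		p += n;
		len -= n;
	}
	return 0;
}

/* Helper side: length header first, then the payload. */
static int send_blob(int fd, const void *data, uint32_t len)
{
	uint32_t be = htonl(len);
	if (write_all(fd, &be, sizeof(be)) < 0)
		return -1;
	return write_all(fd, data, len);
}

/* Client side: returns a malloc'd buffer, or NULL on failure. */
static void *recv_blob(int fd, uint32_t *len)
{
	uint32_t be;
	void *data;

	if (read_all(fd, &be, sizeof(be)) < 0)
		return NULL;
	*len = ntohl(be);
	if (*len > MAX_BLOB)
		return NULL;
	data = malloc(*len ? *len : 1);
	if (!data)
		return NULL;
	if (read_all(fd, data, *len) < 0) {
		free(data);
		return NULL;
	}
	return data;
}
```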

This is really extra, but if we know in advance that git does not need
refresh(), then we should be able to tell index-helper not to waste
cycles contacting watchman and preparing shm-watchman-%s-%d (the poke
line gets more parameters). Either that, or we decouple watchman
requests from read_cache() requests: only when refresh_index() is
called do we send something to request shm-watchman-%s-%d. The same
goes for read_directory() (i.e. the untracked cache stuff). Hmm?
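
Roughly what I mean by "the poke line gets more parameters", as a
sketch. The flag names and the "untracked" keyword are invented here
for illustration; only "poke %d" and the "watchman" keyword come from
the discussion above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical request flags. */
#define POKE_WANT_WATCHMAN	(1u << 0)
#define POKE_WANT_UNTRACKED	(1u << 1)

/* Client side: returns the line length, or -1 if it does not fit. */
static int format_poke(char *out, size_t size, long pid, unsigned flags)
{
	int n = snprintf(out, size, "poke %ld%s%s", pid,
			 (flags & POKE_WANT_WATCHMAN) ? " watchman" : "",
			 (flags & POKE_WANT_UNTRACKED) ? " untracked" : "");
	return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Helper side: is the keyword present as a space-separated word? */
static int poke_has_word(const char *line, const char *word)
{
	size_t wlen = strlen(word);
	const char *p = line;

	while ((p = strstr(p, word)) != NULL) {
		int starts = (p > line && p[-1] == ' ');
		int ends = (p[wlen] == '\0' || p[wlen] == ' ');
		if (starts && ends)
			return 1;
		p += wlen;
	}
	return 0;
}
```

With no flags set the line is the plain "poke %d" of the current patch,
so index-helper only contacts watchman when the keyword is present.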

Now that I think of it, with watchman backing us, we probably should
just do nothing in update_index_if_able() (or write_locked_index()
when we know only stat info has changed) when watchman is active. The
purpose of update_index_if_able() is to avoid a costly refresh, but we
can already avoid that with watchman. And updating big index files is
always costly (even though it should cost less with split-index). Of
course this can only be done if watchman (inotify, to be precise) can
cover the whole worktree. I'm not sure how watchman behaves when there
are not enough inotify resources to cover the full worktree.
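
The decision I have in mind, as a sketch. The struct and its fields are
hypothetical and none of these names exist in git; in particular I do
not know how we would learn that watchman covers the whole worktree.

```c
/* Hypothetical inputs to the write-back decision. */
struct refresh_ctx {
	int only_stat_changed;	/* nothing but stat info was updated */
	int watchman_active;	/* index-helper reports watchman in use */
	int covers_worktree;	/* watchman watches the whole worktree */
};

/* Returns 1 if the refreshed index should be written back to disk. */
static int should_write_index(const struct refresh_ctx *ctx)
{
	/* Real content changes must always be written. */
	if (!ctx->only_stat_changed)
		return 1;
	/* Watchman makes the next refresh cheap, so skip the write. */
	if (ctx->watchman_active && ctx->covers_worktree)
		return 0;
	return 1;
}
```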
-- 
Duy
