git@vger.kernel.org mailing list mirror (one of many)
From: David Turner <dturner@twopensource.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH v5 09/15] index-helper: use watchman to avoid refreshing index with lstat()
Date: Tue, 19 Apr 2016 21:01:38 -0400
Message-ID: <1461114098.5540.158.camel@twopensource.com>
In-Reply-To: <CACsJy8CM409OH12w3EdVP3UjXoURbWNuqb_coQ=AagdCs+ctaQ@mail.gmail.com>

On Wed, 2016-04-20 at 07:15 +0700, Duy Nguyen wrote:
> Continuing my comment from the --use-watchman patch about watchman not
> being supported...
> 
> On Wed, Apr 20, 2016 at 6:28 AM, David Turner <dturner@twopensource.com> wrote:
> > +static int poke_and_wait_for_reply(int fd)
> > +{
> > +       struct strbuf buf = STRBUF_INIT;
> > +       struct strbuf reply = STRBUF_INIT;
> > +       int ret = -1;
> > +       fd_set fds;
> > +       struct timeval timeout;
> > +
> > +       timeout.tv_usec = 0;
> > +       timeout.tv_sec = 1;
> > +
> > +       if (fd < 0)
> > +               return -1;
> > +
> > +       strbuf_addf(&buf, "poke %d", getpid());
> > +       if (write_in_full(fd, buf.buf, buf.len + 1) != buf.len + 1)
> > +               goto done_poke;
> > +
> > +       /* Now wait for a reply */
> > +       FD_ZERO(&fds);
> > +       FD_SET(fd, &fds);
> > +       if (select(fd + 1, &fds, NULL, NULL, &timeout) == 0)
> > +               /* No reply, giving up */
> > +               goto done_poke;
> > +
> > +       if (strbuf_getwholeline_fd(&reply, fd, 0))
> > +               goto done_poke;
> > +
> > +       if (!starts_with(reply.buf, "OK"))
> > +               goto done_poke;
> 
> ... while we could simply check the USE_WATCHMAN macro and reject in
> update-index, a better solution is sending "poke %d watchman" and
> returning "OK watchman" (vs "OK") when watchman is supported and
> active. If the user already requests watchman and index-helper returns
> just "OK" then we can warn the user about the reason for the possible
> performance degradation. It's related to the error reporting, but I
> don't think you can send straight errors over a unix socket. It's
> possible but it's a bit more complicated.

Do you mean that we should do this here?  Or in update-index
--watchman?  If the former, I agree.  If the latter, I'm not sure; maybe
you'll be setting up your index before you've started the index helper?
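
For the "here" case, the client side could be sketched roughly as
below.  This is an illustration only: format_poke() and
classify_poke_reply() are hypothetical names, not functions from this
series, and the strings are just the "poke %d watchman" / "OK watchman"
ones proposed above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical reply classification; the names are illustrative only. */
enum poke_reply {
	POKE_REPLY_BAD = -1,		/* does not start with "OK" */
	POKE_REPLY_OK = 0,		/* helper answered, no watchman */
	POKE_REPLY_OK_WATCHMAN = 1	/* helper answered, watchman active */
};

/*
 * Build "poke <pid>" or "poke <pid> watchman" into out.
 * Returns the message length, or -1 if it does not fit.
 */
static int format_poke(char *out, size_t outlen, long pid, int want_watchman)
{
	int n = snprintf(out, outlen, "poke %ld%s", pid,
			 want_watchman ? " watchman" : "");

	return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}

/*
 * Classify the helper's reply.  Unknown text after "OK" is treated as
 * a plain "OK", mirroring the starts_with() check in the quoted hunk.
 */
static enum poke_reply classify_poke_reply(const char *reply)
{
	static const char ok[] = "OK";
	static const char wm[] = " watchman";
	const char *rest;

	if (strncmp(reply, ok, strlen(ok)))
		return POKE_REPLY_BAD;

	rest = reply + strlen(ok);
	if (!strncmp(rest, wm, strlen(wm))) {
		char c = rest[strlen(wm)];

		if (c == '\0' || c == '\n')
			return POKE_REPLY_OK_WATCHMAN;
	}
	return POKE_REPLY_OK;
}
```

A caller that asked for watchman and got back POKE_REPLY_OK would then
be the place to warn about the possible slowdown.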

> > +static void refresh_by_watchman(struct index_state *istate)
> > +{
> > +       void *shm = NULL;
> > +       int length;
> > +       int i;
> > +       struct stat st;
> > +       int fd = -1;
> > +       const char *path = git_path("shm-watchman-%s-%"PRIuMAX,
> > +                                   sha1_to_hex(istate->sha1),
> > +                                   (uintmax_t)getpid());
> > +
> > +       fd = open(path, O_RDONLY);
> > +       if (fd < 0)
> > +               return;
> > +
> > +       /*
> > +        * This watchman data is just for us -- no need to keep it
> > +        * around once we've got it open.
> > +        */
> > +       unlink(path);
> 
> This will not play well when multiple processes read and refresh the
> index at the same time. 

Multiple processes will have different pids, right?  And the pid is
included in the filename.  Am I missing something?
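
To illustrate what I mean, a small standalone sketch, assuming only
that the name is built from the index hash and the reader's pid as in
the quoted hunk.  shm_watchman_name() is a hypothetical helper (the
patch itself uses git_path()), and it builds just the basename.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "shm-watchman-<index hash>-<pid>".
 * Returns 0 on success, -1 if the name does not fit in out.
 */
static int shm_watchman_name(char *out, size_t outlen,
			     const char *index_hex, uintmax_t pid)
{
	int n = snprintf(out, outlen, "shm-watchman-%s-%" PRIuMAX,
			 index_hex, pid);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Two readers of the same index get the same hash component but
different pid components, so each one opens and unlinks its own file.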

> This is really extra. But if we know in advance that git does not
> need refresh(), then we should be able to tell index-helper not to
> waste cycles contacting watchman and preparing shm-watchman-%s-%d
> (the poke line gets more parameters). Either that, or we decouple
> watchman requests from read_cache() requests. Only when
> refresh_index() is called do we send something to request
> shm-watchman-%s-%d. The same for read_directory() (i.e. untracked
> cache stuff). Hmm?

It's true that we could decouple watchman requests.  I'll look and see
if that's reasonable.

> Now that I think of it, with watchman backing us, we probably should
> just do nothing in update_index_if_able() (or write_locked_index()
> when we know only stat info is changed) when watchman is active. The
> purpose of update_index_if_able() is to avoid costly refresh, but we
> can already avoid that with watchman. And updating big index files is
> always costly (even though it should cost less with split-index). 

That sounds like a change we could make in a separate series.  It's not
a bad idea, but if our goal is to get the basic version out, we should
start there.

> Of course this can only be done if watchman (inotify to be precise) can
> cover the whole worktree. I'm not sure how watchman behaves when there
> are not enough inotify resources to cover the full worktree.

It will detect this case and will either manually recrawl (in the event
of a max_queued_events overflow) or return an error (in the event of
too many watched directories).
