git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Some rough edges of core.fsmonitor
@ 2018-01-27  0:28 Ævar Arnfjörð Bjarmason
  2018-01-27  1:36 ` Duy Nguyen
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-01-27  0:28 UTC (permalink / raw)
  To: git; +Cc: Ben Peart, Alex Vandiver, Christian Couder, David Turner

I just got around to testing this since it landed, for context some
previous poking of mine in [1].

Issues / stuff I've noticed:

1) We end up invalidating the untracked cache because stuff in .git/
changed. For example:

    01:09:24.975524 fsmonitor.c:173         fsmonitor process '.git/hooks/fsmonitor-watchman' returned success
    01:09:24.975548 fsmonitor.c:138         fsmonitor_refresh_callback '.git'
    01:09:24.975556 fsmonitor.c:138         fsmonitor_refresh_callback '.git/config'
    01:09:24.975568 fsmonitor.c:138         fsmonitor_refresh_callback '.git/index'
    01:09:25.122726 fsmonitor.c:91          write fsmonitor extension successful

Am I missing something or should we do something like:

    diff --git a/fsmonitor.c b/fsmonitor.c
    index 0af7c4edba..5067b89bda 100644
    --- a/fsmonitor.c
    +++ b/fsmonitor.c
    @@ -118,7 +118,12 @@ static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *que

     static void fsmonitor_refresh_callback(struct index_state *istate, const char *name)
     {
    -       int pos = index_name_pos(istate, name, strlen(name));
    +       int pos;
    +
    +       if (!strcmp(name, ".git") || starts_with(name, ".git/"))
    +               return;
    +
    +       pos = index_name_pos(istate, name, strlen(name));

            if (pos >= 0) {
                    struct cache_entry *ce = istate->cache[pos];

With that patch applied status on a large repo[2] goes from a consistent
~180-200ms to ~140-150ms, since we're not invalidating some of the UC
structure

2) We re-write out the index even though we know nothing changed

When you first run with core.fsmonitor it needs to
mark_fsmonitor_clean() for every path, but is there a reason for why we
wouldn't supply the equivalent of GIT_OPTIONAL_LOCKS=0 if all paths are
marked and we know from the hook that nothing changed? Why write out the
index again?

3) A lot of time spend reading the index (or something..)

While the hook itself takes ~20ms (and watchman itself 1/4 of that)
status as a whole takes much longer. gprof reveals:

    Each sample counts as 0.01 seconds.
      %   cumulative   self              self     total
     time   seconds   seconds    calls  ms/call  ms/call  name
     15.38      0.02     0.02   221690     0.00     0.00  memihash
     15.38      0.04     0.02   221689     0.00     0.00  create_from_disk
      7.69      0.05     0.01  2216897     0.00     0.00  git_bswap32
      7.69      0.06     0.01   222661     0.00     0.00  ce_path_match
      7.69      0.07     0.01   221769     0.00     0.00  hashmap_add
      7.69      0.08     0.01    39941     0.00     0.00  prep_exclude
      7.69      0.09     0.01    39940     0.00     0.00  strbuf_addch
      7.69      0.10     0.01        1    10.00    10.00  read_one
      7.69      0.11     0.01        1    10.00    10.00  refresh_index
      7.69      0.12     0.01        1    10.00    10.00  tweak_fsmonitor
      7.69      0.13     0.01                             preload_thread

The index is 24M in this case, I guess it's unpacking it, but I wonder
if this couldn't be much faster if we saved away the result of the last
"status" in something that's quick to access, and then if nothing
changed we just report that, and no need to re-write the index (or just
write the "it was clean at this time" part).

4) core.fsmonitor=false behaves unexpectedly

The code that reads this variable just treats it as a string, so we do a
bunch of work for nothing (and nothing warns) if this is set and 'false'
is executed. Any reason we couldn't do our standard boolean parsing
here? You couldn't call your hook 0/1/true/false, but that doesn't seem
like a big loss.

1. https://public-inbox.org/git/CACBZZX5a6Op7dH_g9WOFBnejh2zgNK4b34ygxA8daNDqvitFVA@mail.gmail.com/
2. https://github.com/avar/2015-04-03-1M-git

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2018-02-14  8:00 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-27  0:28 Some rough edges of core.fsmonitor Ævar Arnfjörð Bjarmason
2018-01-27  1:36 ` Duy Nguyen
2018-01-27  1:39   ` [PATCH] trace: measure where the time is spent in the index-heavy operations Nguyễn Thái Ngọc Duy
2018-01-27 11:58     ` Thomas Gummerer
2018-01-27 12:27       ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2018-01-27 11:43   ` Some rough edges of core.fsmonitor Ævar Arnfjörð Bjarmason
2018-01-27 12:39     ` Duy Nguyen
2018-01-27 13:09       ` Duy Nguyen
2018-01-27 19:01         ` Ævar Arnfjörð Bjarmason
2018-01-30 22:41           ` Ben Peart
2018-01-29  9:40     ` Duy Nguyen
2018-01-29 23:16       ` Ben Peart
2018-02-01 10:40         ` Duy Nguyen
2018-01-28 20:44 ` Johannes Schindelin
2018-01-28 22:28   ` Ævar Arnfjörð Bjarmason
2018-01-30  1:21     ` Ben Peart
2018-01-31 10:15       ` Duy Nguyen
2018-02-04  9:38         ` [PATCH] dir.c: ignore paths containing .git when invalidating untracked cache Nguyễn Thái Ngọc Duy
2018-02-05 17:44           ` Ben Peart
2018-02-06 12:02             ` Duy Nguyen
2018-02-07  9:21           ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2018-02-07  9:21             ` Nguyễn Thái Ngọc Duy
2018-02-07 16:59               ` Ben Peart
2018-02-13 10:00                 ` Duy Nguyen
2018-02-13 17:57                   ` Junio C Hamano
2018-02-14  1:24                     ` Duy Nguyen
2018-02-14  8:00                       ` Junio C Hamano
2018-01-30 22:57 ` Some rough edges of core.fsmonitor Ben Peart
2018-01-30 23:16   ` Ævar Arnfjörð Bjarmason
2018-01-31 16:12     ` Ben Peart

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).