git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Ben Peart <peartben@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>
Cc: git <git@vger.kernel.org>, Ben Peart <benpeart@microsoft.com>,
	Alex Vandiver <alexmv@dropbox.com>,
	Christian Couder <christian.couder@gmail.com>,
	David Turner <dturner@twopensource.com>
Subject: Re: Some rough edges of core.fsmonitor
Date: Mon, 29 Jan 2018 20:21:11 -0500	[thread overview]
Message-ID: <56b8d8a3-9483-330d-64a4-ec0295b254ac@gmail.com> (raw)
In-Reply-To: <CACBZZX74WcK4qX5O1E_6nxv7f4Vpns3O-7dcURbs+QneKsA87Q@mail.gmail.com>



On 1/28/2018 5:28 PM, Ævar Arnfjörð Bjarmason wrote:
> On Sun, Jan 28, 2018 at 9:44 PM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>> Hi,
>>
>> On Sat, 27 Jan 2018, Ævar Arnfjörð Bjarmason wrote:
>>
>>> I just got around to testing this since it landed, for context some
>>> previous poking of mine in [1].
>>>
>>> Issues / stuff I've noticed:
>>>
>>> 1) We end up invalidating the untracked cache because stuff in .git/
>>> changed. For example:
>>>
>>>      01:09:24.975524 fsmonitor.c:173         fsmonitor process '.git/hooks/fsmonitor-watchman' returned success
>>>      01:09:24.975548 fsmonitor.c:138         fsmonitor_refresh_callback '.git'
>>>      01:09:24.975556 fsmonitor.c:138         fsmonitor_refresh_callback '.git/config'
>>>      01:09:24.975568 fsmonitor.c:138         fsmonitor_refresh_callback '.git/index'
>>>      01:09:25.122726 fsmonitor.c:91          write fsmonitor extension successful
>>>
>>> Am I missing something or should we do something like:
>>>
>>>      diff --git a/fsmonitor.c b/fsmonitor.c
>>>      index 0af7c4edba..5067b89bda 100644
>>>      --- a/fsmonitor.c
>>>      +++ b/fsmonitor.c
>>>      @@ -118,7 +118,12 @@ static int query_fsmonitor(int version, uint64_t last_update, struct strbuf *que
>>>
>>>       static void fsmonitor_refresh_callback(struct index_state *istate, const char *name)
>>>       {
>>>      -       int pos = index_name_pos(istate, name, strlen(name));
>>>      +       int pos;
>>>      +
>>>      +       if (!strcmp(name, ".git") || starts_with(name, ".git/"))
>>>      +               return;
>>>      +
>>>      +       pos = index_name_pos(istate, name, strlen(name));
>>
>> I would much rather have the fsmonitor hook already exclude those.
> 
> As documented the fsmonitor-watchman hook we ship doesn't work as
> described in githooks(5), unless "files in the working directory" is
> taken to include .git/, but I haven't seen that ever used.
> 
> On the other hand relying on arbitrary user-provided hooks to do the
> right thing at the cost of silent performance degradation is bad. If
> we're going to expect the hook to remove these we should probably
> warn/die here if it does send us .git/* files.
> 

I'm not sure how often something is modified in the git directory when 
nothing was modified in the working directory but this seems like a nice 
optimization.

We can't just blindly ignore changes under ".git" as the git directory 
may have been moved somewhere else.  Instead we'd need to use get_git_dir().

Rather than assuming the hook will optimize for this particular case, I 
think a better solution would be to update 
untracked_cache_invalidate_path() so that it doesn't invalidate the 
untracked cache and mark the index as dirty when it was asked to 
invalidate a path under GIT_DIR.  I can't think of a case when that 
would be the desired behavior.

Somewhat off topic but related to the overall performance discussion: 
I've also thought the untracked cache shouldn't mark the index as dirty 
except in the case where the extension is being added or removed.  We've 
observed that it causes unnecessary index writes that actually slows 
down overall performance.

Since it is a cache, it does not require the index to be written out for 
correctness, it can simply update the cache again the next time it is 
needed. This is typically faster than the cost of the index write so 
makes things faster overall.  I adopted this same model with the 
fsmonitor extension.

>> If you *must* add these comparisons, you have to use fspathcmp() and
>> fspathncmp() instead (because case-insensitivity).
> 
> Thanks.
> 

  reply	other threads:[~2018-01-30  1:21 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-27  0:28 Some rough edges of core.fsmonitor Ævar Arnfjörð Bjarmason
2018-01-27  1:36 ` Duy Nguyen
2018-01-27  1:39   ` [PATCH] trace: measure where the time is spent in the index-heavy operations Nguyễn Thái Ngọc Duy
2018-01-27 11:58     ` Thomas Gummerer
2018-01-27 12:27       ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2018-01-27 11:43   ` Some rough edges of core.fsmonitor Ævar Arnfjörð Bjarmason
2018-01-27 12:39     ` Duy Nguyen
2018-01-27 13:09       ` Duy Nguyen
2018-01-27 19:01         ` Ævar Arnfjörð Bjarmason
2018-01-30 22:41           ` Ben Peart
2018-01-29  9:40     ` Duy Nguyen
2018-01-29 23:16       ` Ben Peart
2018-02-01 10:40         ` Duy Nguyen
2018-01-28 20:44 ` Johannes Schindelin
2018-01-28 22:28   ` Ævar Arnfjörð Bjarmason
2018-01-30  1:21     ` Ben Peart [this message]
2018-01-31 10:15       ` Duy Nguyen
2018-02-04  9:38         ` [PATCH] dir.c: ignore paths containing .git when invalidating untracked cache Nguyễn Thái Ngọc Duy
2018-02-05 17:44           ` Ben Peart
2018-02-06 12:02             ` Duy Nguyen
2018-02-07  9:21           ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2018-02-07  9:21             ` Nguyễn Thái Ngọc Duy
2018-02-07 16:59               ` Ben Peart
2018-02-13 10:00                 ` Duy Nguyen
2018-02-13 17:57                   ` Junio C Hamano
2018-02-14  1:24                     ` Duy Nguyen
2018-02-14  8:00                       ` Junio C Hamano
2018-01-30 22:57 ` Some rough edges of core.fsmonitor Ben Peart
2018-01-30 23:16   ` Ævar Arnfjörð Bjarmason
2018-01-31 16:12     ` Ben Peart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56b8d8a3-9483-330d-64a4-ec0295b254ac@gmail.com \
    --to=peartben@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=alexmv@dropbox.com \
    --cc=avarab@gmail.com \
    --cc=benpeart@microsoft.com \
    --cc=christian.couder@gmail.com \
    --cc=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).