git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff Hostetler <git@jeffhostetler.com>
To: Tao Klerks <tao@klerks.biz>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: git <git@vger.kernel.org>,
	Jeff Hostetler <jeffhost@microsoft.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: Windows: core.useBuiltinFSMonitor without core.untrackedcache - performance hazard?
Date: Mon, 21 Jun 2021 14:41:36 -0400	[thread overview]
Message-ID: <81153d02-8e7a-be59-e709-e90cd5906f3a@jeffhostetler.com> (raw)
In-Reply-To: <CAPMMpogeWeQujG0UL80REOsaBJipxhQyOpBTuWD9U9_jg=FMMA@mail.gmail.com>



On 6/21/21 8:50 AM, Tao Klerks wrote:
> Hi Johannes,
> 
> Thanks for the detailed reply!
> 
> My apologies for the subsequent delay - I've been trying to understand
> the behavior so that I can describe it in more detail, and that
> required me to learn a little C along the way :)
> 
> On Fri, Jun 11, 2021 at 11:49 AM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>> _sometimes_, the mtime of a directory seems not to
>> be updated immediately after an item in it was modified/added/deleted. And
>> that mtime is precisely what the untracked cache depends on.
>>
>> The funny thing is: while the output of `git status` will therefore at
>> first fail to pick up on, say, a new untracked file, running `git status`
>> _immediately_ afterwards _will succeed_ to see that untracked file.
> 
> (I have nothing to add here - I don't understand what
> synchronous-acknowledgement or ordering guarantees we rely on to
> determine when we *expect* a change to be available to the fsmonitor,
> nor to I have any familiarity with the underlying APIs we use)
> 
>> On Thu, 10 Jun 2021, Tao Klerks wrote:
>>>   - There is also a lingering "problem" with "git status -uall", with
>>> both "core.useBuiltinFSMonitor" and "core.fsmonitor", but that seems
>>> far less trivial to address
>>
>> Interesting. I guess the untracked cache might become too clunky with many
>> untracked files? Or is there something else going on?
> 
> I think I understand this now, but I don't think I can explain it
> particularly well/succinctly.
> 
> I've attached a sort of "truth table" of performance data for a
> particular repository, running *warm* git status calls (no cold-cache
> testing at all) with various config and command-line options, and
> reporting the durations of various phases/processes captured using
> GIT_TRACE2_PERF=1.
> 
> The claimed "problem" with "git status -uall", when both
> "core.useBuiltinFSMonitor" and "core.fsmonitor" are enabled, only
> exists from the perspective of someone who's got core.fscache enabled
> all the time:
> * core.fscache works (in the Windows port only) by doing slightly more
> expensive work up-front on first directory query within a
> request/process lifetime, and then intercepting subsequent filesystem
> "queries"/operations
> * The context within which most of this more-expensive work typically
> occurs, in a "git status" request, is an explicitly and intentionally
> multi-threaded filesystem "lstat"-checking process in preload-index.c
> (always 20 threads, for a large repo)
> * There are two sets of filesystem-iteration in a simple/naive git
> status call with core.preloadindex enabled as by default - the
> lstat-checking multithreaded loops for files in the index, and the
> recursive directory scanning for untracked files that happens later
> * That second (not-multithreaded) set of work, with fscache enabled,
> gets to reuse a bunch of cached fs data from the first one. A walk
> that cost 7s without fscache now costs only 2.5s, for example.
> * With fsmonitor enabled (and warm), the first walk simply doesn't
> happen, so fscache stops making any difference to that
> untracked-file-searching directory walk; it goes back to taking 7s;
> every directory is queried once in series, so fscache has no impact at
> all.
> * Because the preload-index lstat-querying loop is parallelized, with
> fscache the initial cache population happens fast-ish - the total time
> taken for git status is eg 5s (2.5s of parallelized & accelerated
> lstat-querying and 2.5s of
> untracked-folder-iterating-and-processing-from-fscache)
> * So, with fscache enabled, turning on fsmonitor actually makes "git
> status" *slower* - it changes the "lstat + untracked" time from "2.5s
> + 2.5s" to "0s + 7s"
> * We can hide/mitigate that by enabling the untracked cache - but that
> "fails out" in all sorts of conditions specified in dir.c
> validate_untracked_cache(), including specifying "-uall"/"-u" to "git
> status
> * -> it is only in the context of fscache being enabled, and already
> speeding up the filesystem work, that fsmonitor can make things worse
> by making the first directory-querying loop happen in a non-parallel
> area of code, and thereby cancel fscache's impact.
> * -> fsmonitor never makes things worse on linux, since there is no
> fscache and so untracked folder iteration never benefits from any
> multithreading
> * -> when the untracked cache does apply, fsmonitor can *help* it,
> avoiding the need for any sort of directory walk at all, on the basis
> that "nothing in the filesystem appears to have changed"
> 
> Based on this understanding, it looks like there are at least three
> possible directions to be explored:
> 
> 1. Making the untracked cache less sensitive to configuration in
> dir.c#validate_untracked_cache(), at the cost of doing more work in
> the cold cases & saving more data in the index file (specific
> untracked files, and .git folder names or something)
>   ** This would result in peak performance, with no
> filesystem-iterating at all in the ideal case
>   ** This would apply / add substantial value in Windows and Linux
>   ** This is probably the most complex change - dir.c is not easy to
> understand/navigate
> 
> 2. Making the untracked directory-search happen in a multithreaded way
>   ** This would raise the performance with "-uall" to approximately the
> same as it was before fsmonitor was introduced on windows, and speed
> it up slightly on linux
>   ** This change would probably not be worthwhile, - its impact would
> not be huge except in very specific cases, and it would still
> introduce non-trivial complexity
> 
> 3. Forcing preload-index to actually "do its multithreaded work", even
> when fsmonitor is there, if we know that the untracked cached cannot
> be used and fsmonitor is enabled
>   ** This would raise the performance with "-uall" to approximately the
> same as it was before fsmonitor was introduced on windows
>   ** This change would probably be pretty easy - the main challenge is
> how to get untracked-cache-applicability information at preload-index
> time, since these happen in completely different parts of the codebase
> 
> I mocked up a local hack for the third option, and confirmed that
> forcing preload-index to ignore it does indeed bring "git status
> -uall"  performance back to the same level as before enabling
> fsmonitor.
> 
>>> I just started testing the new "core.useBuiltinFSMonitor" option in
>>> the new installer, and it's amazing, thanks Ben, Alex, Johannes and
>>> Kevin!
>>
>> Not to forget Jeff Hostetler, who essentially spent the past half year on
>> it on his own.
> 
> Argh, thank you for the correction, and thank you Jeff also for all
> your work on this.
> 
>>> For context, this is in a repo with 200,000 or so files, within 40,000
>>> folders (avg path depth 4 I think?), with a reasonably-intricate set
>>> of .gitignore patterns.
> 
>> My `.gitignore` consists of only ~40 heavily commented lines (containing
>> five lines with wildcards), but I do have a `.git/info/exclude` that
>> contains a set of generated file/directory lists, i.e. without any
>> wildcards. This `exclude` file is ~26k lines long.
>>
>> My guess is that the amount of work to match the untracked vs ignored
>> files is dominating the entire operation, by a lot.
> 
> In my case, as per the "truth table" referenced above, there are two
> kinds of things in play:
> 1. Filesystem operations are the main thing that matters in Windows
> 2. Some smaller amount of overhead (0.5-1s in my case) is associated
> with other work (pattern-matching etc) during untracked file
> detection, even with untracked cache enabled. The only way to avoid
> that work altogether is to have fsmonitor *and* untracked cache
> working, so that untracked cache can "trust" the fsmonitor results to
> avoid having to do any work at all. In this ideal situation fscache
> makes no difference, as there is no fs iteration.
> 
>>> I don't know whether this is the right place to report Windows-centric
>>> concerns, if not, my apologies.
>>
>> I would not necessarily call them "Windows-centric", even if yes, at the
>> moment the built-in FSMonitor is most easily enabled on Windows (because I
>> added that experimental option in Git for Windows' installer, after
>> integrating the experimental feature).
> 
> Right - now that I understand the situation better, there are three
> specific ways in which I consider these to be windows-centric
> concerns:
> * In my experience / in my context at least, the "naive" impact of
> file operation performance differences results in a more than 10X "git
> status" reduction in performance for large repos (over linux); various
> optional and/or windows-specific strategies significantly close that
> gap
> * core.fscache is a windows-specific strategy for dealing with this,
> and interacts with other features/strategies in potentially-surprising
> ways
> * The built-in FSMonitor is still only *easily* available in Windows
> 
> But, to your point, most of the *solutions* need not be
> windows-centric at all. The "best" one, making untracked cache a
> little more forgiving, would definitely have tangible performance
> benefits on linux (and presumably OSX).
> 
> Thanks,
> Tao
> 

Nice summary.  Thanks!

We're currently looking at a problem that we believe is in the
untracked-cache code.  This is causing some of our Scalar tests
to fail on Windows when the untracked-cache is turned on.  This
is independent of whether FSMonitor or FSCache is turned on.
We're still tracking this down.

And yes, the best possible solution is to turn on FSMonitor *and*
the untracked-cache, so that the "untracked" status code doesn't
have to do anything.  So I want to look at tracking down the above
problem before doing anything else.

Thanks
Jeff




  reply	other threads:[~2021-06-21 18:41 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-10 10:24 Windows: core.useBuiltinFSMonitor without core.untrackedcache - performance hazard? Tao Klerks
2021-06-11  9:49 ` Johannes Schindelin
2021-06-21 12:50   ` Tao Klerks
2021-06-21 18:41     ` Jeff Hostetler [this message]
2021-06-21 20:52       ` Tao Klerks
2021-06-24 18:51         ` Tao Klerks
2021-06-24  5:25       ` Tao Klerks
2021-06-24 13:10         ` Jeff Hostetler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=81153d02-8e7a-be59-e709-e90cd5906f3a@jeffhostetler.com \
    --to=git@jeffhostetler.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=jeffhost@microsoft.com \
    --cc=tao@klerks.biz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).