From: Tao Klerks <tao@klerks.biz>
To: git@vger.kernel.org
Subject: Question about fsmonitor and --untracked-files=all
Date: Tue, 22 Sep 2020 13:35:42 +0200
Message-ID: <CAPMMpoj+UhKCW_k34-cGkiWFghOOu13GhPgA0V-y4ZpLVppuiA@mail.gmail.com>
Hi folks,
I've got a couple questions about the "fsmonitor" functionality,
untracked files, and multithreading.
Background:
In a repo with:
* A couple hundred thousand tracked files, and a couple hundred
thousand .gitignored files, across a few thousand directories
* The --untracked-cache setting, tested and working
* core.fsmonitor set up with watchman (with the sample integration
script from January)
* Git version 2.27.0.windows.1
"git status" takes about 2s
"git status --untracked-files=all" takes about 20s
When I turn off "core.fsmonitor", the numbers change to something like:
"git status": 8s
"git status --untracked-files=all": 9s
Using Windows' "procmon" to observe git.exe's behavior from outside, I
think I've understood a couple of things that surprise me:
1. when you specify "--untracked-files=all", git scans the entire
folder tree regardless of the "fsmonitor" hook
2. when you specify the "fsmonitor" hook, git does any
filesystem-scanning in a single-threaded fashion (as opposed to
multi-threaded without "fsmonitor" / normally)
These two things combine so that with "fsmonitor" set, normal
command-line git status performance is great, but the performance in
tools that eagerly look for untracked files (like "Git Extensions" on
Windows) actually suffers: it takes roughly twice as long (compare the
20s and 9s figures above) to run the 'git -c
diff.ignoreSubModules=none status --porcelain=2 -z
--untracked-files=all' command that this UI wants (and blocks on, when
you go to a commit dialog).
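For anyone who wants to reproduce the comparison, here is a minimal
sketch of a timing script. It is not part of the original measurements:
the helper names (median_seconds, time_command, git_config), the
three-run median, and the warm-up run are all assumptions made for
illustration. It times the three status invocations mentioned above
under whatever configuration the repository currently has, so it would
need to be run once with core.fsmonitor set and once with it unset.

```python
import statistics
import subprocess
import sys
import time

# The three invocations discussed above; the last is the one Git Extensions runs.
COMMANDS = {
    "status": ["git", "status"],
    "status --untracked-files=all": ["git", "status", "--untracked-files=all"],
    "Git Extensions status": [
        "git", "-c", "diff.ignoreSubModules=none", "status",
        "--porcelain=2", "-z", "--untracked-files=all",
    ],
}


def median_seconds(samples):
    """Median of a list of timings, so one slow outlier does not dominate."""
    return statistics.median(samples)


def time_command(argv, cwd=None, runs=3):
    """Run argv `runs` times in cwd and return the median wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, cwd=cwd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return median_seconds(samples)


def git_config(repo, key):
    """Return the value of a git config key in repo, or None if it is unset."""
    result = subprocess.run(["git", "config", "--get", key], cwd=repo,
                            capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else None


def main(repo):
    # Record the settings the timings were taken under.
    for key in ("core.fsmonitor", "core.untrackedCache"):
        print(f"{key} = {git_config(repo, key)}")
    for label, argv in COMMANDS.items():
        time_command(argv, cwd=repo, runs=1)  # warm-up run, result discarded
        print(f"{label}: {time_command(argv, cwd=repo):.2f}s")


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("usage: python time_status.py /path/to/repo")
```

The script name time_status.py is hypothetical. Wall-clock timing from
outside the process says nothing about where the time goes, so it
complements rather than replaces the procmon observations.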
Questions:
1. Is there a reason "--untracked-files=all" causes a full directory
tree scan even with the "fsmonitor" hook active, or is this
accidental?
2. Assuming that the full directory tree scan is indeed necessary even
with "fsmonitor" (when requesting all untracked files), could it be
made multithreaded?
(my apologies for the simplistic "outside-in" observations; I don't
feel qualified to attempt to understand the git source code)
Thanks for any help understanding the optimization opportunities here!
Tao Klerks