From: "Tao Klerks via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Tao Klerks <tao@klerks.biz>
Subject: [PATCH v2 0/3] Empty untracked cache performance issue
Date: Fri, 25 Feb 2022 17:10:11 +0000 [thread overview]
Message-ID: <pull.986.v2.git.1645809015.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.986.git.1624559401.gitgitgadget@gmail.com>
This patchset addresses a performance issue with untracked cache. When a new
untracked cache structure is added to the index but remains empty,
subsequent "git status" calls populate it but do not write the index - so
they perform as though the index were disabled.
This situation can be caused in several different ways:
* Running "git update-index --untracked-cache" on a repo that did not have
the untracked cache
* Modifying the git configuration to enable untracked cache, but then
immediately running "git status -uall", causing the untracked cache to be
created, but not used/populated (and the index written).
* (likely others)
The patchset includes fixes to t7519, which otherwise starts failing because
it wasn't testing what it intended to.
Tao Klerks (3):
t7519: avoid file to index mtime race for untracked cache
t7519: populate untracked cache before test
untracked-cache: write index when populating empty untracked cache
dir.c | 10 +++++++---
t/t7519-status-fsmonitor.sh | 7 +++++++
2 files changed, 14 insertions(+), 3 deletions(-)
base-commit: 4c53a8c20f8984adb226293a3ffd7b88c3f4ac1a
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-986%2FTaoK%2Ftaok-empty-untracked-cache-bug-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-986/TaoK/taok-empty-untracked-cache-bug-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/986
Range-diff vs v1:
1: c4d2a3cab4b ! 1: 9421b71540d Add a second's delay to t7519 for untracked cache
@@ Metadata
Author: Tao Klerks <tao@klerks.biz>
## Commit message ##
- Add a second's delay to t7519 for untracked cache
+ t7519: avoid file to index mtime race for untracked cache
In t7519 there is a test that writes files to disk, and immediately
- writes the untracked index. Because of mtime-comparison logic
- that uses a 1-second resolution, this means the cached entries
- are not trusted/used under some circumstances (see
- read-cache.c#is_racy_stat()).
+ writes the untracked index. Because of mtime-comparison logic that
+ uses a 1-second resolution, this means the cached entries are not
+ trusted/used under some circumstances
+ (see read-cache.c#is_racy_stat()).
- Untracked cache tests in t7063 use a 1-second delay to avoid
- this issue. We should do the same here.
+ Untracked cache tests in t7063 use a 1-second delay to avoid this
+ issue, but we don't want to introduce arbitrary slowdowns, so instead
+ use test-tool chmtime to backdate the files slightly.
- This change doesn't actually affect the outcome of the test,
- but does enhance its validity, and becomes relevant after
- later changes
+ This change doesn't actually affect the outcome of the test, but does
+ enhance its validity, and becomes relevant after later changes.
Signed-off-by: Tao Klerks <tao@klerks.biz>
## t/t7519-status-fsmonitor.sh ##
-@@ t/t7519-status-fsmonitor.sh: clean_repo () {
- git clean -fd
- }
-
-+avoid_racy() {
-+ sleep 1
-+}
-+
- dirty_repo () {
- : >untracked &&
- : >dir1/untracked &&
@@ t/t7519-status-fsmonitor.sh: test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR'
+ cd dot-git &&
+ mkdir -p .git/hooks &&
+ : >tracked &&
++ test-tool chmtime =-60 tracked &&
+ : >modified &&
++ test-tool chmtime =-60 modified &&
+ mkdir dir1 &&
+ : >dir1/tracked &&
++ test-tool chmtime =-60 dir1/tracked &&
+ : >dir1/modified &&
++ test-tool chmtime =-60 dir1/modified &&
+ mkdir dir2 &&
+ : >dir2/tracked &&
++ test-tool chmtime =-60 dir2/tracked &&
: >dir2/modified &&
++ test-tool chmtime =-60 dir2/modified &&
write_integration_script &&
git config core.fsmonitor .git/hooks/fsmonitor-test &&
-+ avoid_racy &&
git update-index --untracked-cache &&
- git update-index --fsmonitor &&
- GIT_TRACE2_PERF="$TRASH_DIRECTORY/trace-before" \
2: d63faad03a4 ! 2: d29a68e65a0 In t7519, populate untracked cache before test
@@ Metadata
Author: Tao Klerks <tao@klerks.biz>
## Commit message ##
- In t7519, populate untracked cache before test
+ t7519: populate untracked cache before test
- In its current state, the t7519 test dealing with untracked
- cache assumes that
- "git update-index --untracked-cache" will *populate*
- the untracked cache. This is not correct - it will only add
- an empty untracked cache structure to the index.
+ In its current state, the t7519 test dealing with untracked cache
+ assumes that "git update-index --untracked-cache" will *populate* the
+ untracked cache. This is not correct - it will only add an empty
+ untracked cache structure to the index.
- If we're going to compare two git status runs with
- something interesting happening in-between, we
- need to ensure that the index is in a stable/steady
- state *before* that first run.
+ If we're going to compare two git status runs with something
+ interesting happening in-between, we need to ensure that the index is
+ in a stable/steady state *before* that first run.
- We achieve this by adding another prior "git status"
- run.
+ Achieve this by adding another prior "git status" run.
- At this stage this change does nothing, because there
- is a bug, addressed in the next patch. whereby once
- the empty untracked cache structure is added by the
- update-index invocation, the untracked cache gets
- updated in every subsequent "git status" call, but the
- index with these updates does not get written down.
+ At this stage this change does nothing, because there is a bug,
+ addressed in the next patch, whereby once the empty untracked cache
+ structure is added by the update-index invocation, the untracked cache
+ gets updated in every subsequent "git status" call, but the index with
+ these updates does not get written down.
- That bug actually invalidates this entire test case -
- but we're fixing that next.
+ That bug actually invalidates this entire test case - but we're fixing
+ that next.
Signed-off-by: Tao Klerks <tao@klerks.biz>
## t/t7519-status-fsmonitor.sh ##
@@ t/t7519-status-fsmonitor.sh: test_expect_success UNTRACKED_CACHE 'ignore .git changes when invalidating UNTR'
- avoid_racy &&
+ git config core.fsmonitor .git/hooks/fsmonitor-test &&
git update-index --untracked-cache &&
git update-index --fsmonitor &&
+ git status &&
3: 627f1952fd8 ! 3: 190b27e518a Write index when populating empty untracked cache
@@ Metadata
Author: Tao Klerks <tao@klerks.biz>
## Commit message ##
- Write index when populating empty untracked cache
+ untracked-cache: write index when populating empty untracked cache
- It is expected that an empty/unpopulated untracked
- cache structure can be written to the index - by update-
- index, or by a "git status" call that sees the untracked cache
- should be enabled, but is running with options that make
- it non-applicable in that run.
+ It is expected that an empty/unpopulated untracked cache structure can
+ be written to the index - by update-index, or by a "git status" call
+ that sees the untracked cache should be enabled and is not, but is
+ running with options that make the untracked cache non-applicable in
+ that run (eg a pathspec).
- Currently, if that happens, then subsequent "git status"
- calls end up populating the untracked cache, but not
- writing the index (not saving their work) - so the
- performance outcome is almost identical to the cache
- being altogether disabled.
+ Currently, if that happens, then subsequent "git status" calls end up
+ populating the untracked cache, but not writing the index (not saving
+ their work) - so the performance outcome is almost identical to the
+ cache being altogether disabled.
- This continues until the index gets written with the cache
- populated, for some *other* reason.
+ This continues until the index gets written with the untracked cache
+ populated, for some *other* reason, such as a working tree change.
- In this change, we detect the "existing cache is empty
- and it looks like we are using it" condition, and queues
- an index write when this happens.
+ Detect the condition where an empty untracked cache exists in the
+ index and we will collect the list of untracked paths, and queue an
+ index write under that condition, so that the collected untracked
+ paths can be written out to the untracked cache extension in the
+ index.
- This change depends on previous fixes to t7519 for the
- "ignore .git changes when invalidating UNTR" test case to
- pass - before this fix, the test never actually did anything
- as it was not set up correctly.
+ This change depends on previous fixes to t7519 for the "ignore .git
+ changes when invalidating UNTR" test case to pass - before this fix,
+ the test never actually did anything as it was not set up correctly.
Signed-off-by: Tao Klerks <tao@klerks.biz>
@@ dir.c: static struct untracked_cache_dir *validate_untracked_cache(struct dir_st
- if (!dir->untracked->root)
+ if (!dir->untracked->root) {
++ /* Untracked cache existed but is not initialized; fix that */
FLEX_ALLOC_STR(dir->untracked->root, name, "");
-+ /*
-+ * If we've had to initialize the root, then what we had was an
-+ * empty uninitialized untracked cache structure. We will be
-+ * populating it now, so we should trigger an index write.
-+ */
+ istate->cache_changed |= UNTRACKED_CHANGED;
+ }
--
gitgitgadget
next prev parent reply other threads:[~2022-02-25 17:10 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-24 18:29 [PATCH 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget
2021-06-24 18:29 ` [PATCH 1/3] Add a second's delay to t7519 for untracked cache Tao Klerks via GitGitGadget
2021-06-29 4:22 ` Junio C Hamano
2021-06-24 18:30 ` [PATCH 2/3] In t7519, populate untracked cache before test Tao Klerks via GitGitGadget
2021-06-24 18:30 ` [PATCH 3/3] Write index when populating empty untracked cache Tao Klerks via GitGitGadget
2021-06-29 4:42 ` Junio C Hamano
2022-02-24 17:52 ` Tao Klerks
2022-02-24 20:35 ` Junio C Hamano
2022-02-25 17:10 ` Tao Klerks via GitGitGadget [this message]
2022-02-25 17:10 ` [PATCH v2 1/3] t7519: avoid file to index mtime race for " Tao Klerks via GitGitGadget
2022-02-25 19:07 ` Junio C Hamano
2022-02-27 22:12 ` Tao Klerks
2022-02-25 17:10 ` [PATCH v2 2/3] t7519: populate untracked cache before test Tao Klerks via GitGitGadget
2022-02-25 17:10 ` [PATCH v2 3/3] untracked-cache: write index when populating empty untracked cache Tao Klerks via GitGitGadget
2022-02-25 19:12 ` Junio C Hamano
2022-02-27 21:56 ` [PATCH v3 0/3] Empty untracked cache performance issue Tao Klerks via GitGitGadget
2022-02-27 21:56 ` [PATCH v3 1/3] t7519: avoid file to index mtime race for untracked cache Tao Klerks via GitGitGadget
2022-02-27 21:57 ` [PATCH v3 2/3] t7519: populate untracked cache before test Tao Klerks via GitGitGadget
2022-02-27 21:57 ` [PATCH v3 3/3] untracked-cache: write index when populating empty untracked cache Tao Klerks via GitGitGadget
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=pull.986.v2.git.1645809015.gitgitgadget@gmail.com \
--to=gitgitgadget@gmail.com \
--cc=git@vger.kernel.org \
--cc=tao@klerks.biz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).