* Re: [PATCH v2] index: add trace2 region for clear skip worktree
2022-10-28 0:46 ` [PATCH v2] " Anh Le via GitGitGadget
@ 2022-10-28 15:49 ` Derrick Stolee
2022-10-28 17:17 ` Junio C Hamano
2022-10-28 16:50 ` Jeff Hostetler
2022-10-31 0:56 ` [PATCH v3] " Anh Le via GitGitGadget
2 siblings, 1 reply; 18+ messages in thread
From: Derrick Stolee @ 2022-10-28 15:49 UTC (permalink / raw)
To: Anh Le via GitGitGadget, git
Cc: Timothy Jones, Jeff Hostetler, Jeff Hostetler, Anh Le
On 10/27/2022 8:46 PM, Anh Le via GitGitGadget wrote:
> From: Anh Le <anh@canva.com>
>
> In a large repository using sparse checkout, checking
> whether a file marked with skip worktree is present
> on disk and its skip worktree bit should be cleared
> can take a considerable amount of time. Add a trace2
> region to surface this information, keeping a count of how many paths
> have been checked and separately
> keep counts for after a full index is
> materialised.
You have some strange line wrapping here. You can wrap them better
using several editors, but here is brief technique if your core.editor
is vim (do within 'git commit --amend'):
1. Use "SHIFT+V" to get into visual mode.
2. Use up and down arrows to select the entire paragraph.
3. Type "G" then "Q" to word wrap the selection.
4. Save and close the editor.
> int i;
> + int path_counts[2] = {0, 0};
> + int restarted = 0;
This count mechanism is a good one. Nice and simple.
>
> if (!core_apply_sparse_checkout ||
> sparse_expect_files_outside_of_patterns)
> return;
>
> + trace2_region_enter("index", "clear_skip_worktree_from_present_files", istate->repo);
> restart:
> for (i = 0; i < istate->cache_nr; i++) {
> struct cache_entry *ce = istate->cache[i];
>
> - if (ce_skip_worktree(ce) &&
> - path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
> - if (S_ISSPARSEDIR(ce->ce_mode)) {
> - ensure_full_index(istate);
> - goto restart;
> + if (ce_skip_worktree(ce)) {
> + path_counts[restarted]++;
> + if (path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
> + if (S_ISSPARSEDIR(ce->ce_mode)) {
> + ensure_full_index(istate);
> + restarted = 1;
> + goto restart;
> + }
> + ce->ce_flags &= ~CE_SKIP_WORKTREE;
> }
> - ce->ce_flags &= ~CE_SKIP_WORKTREE;
> }
> }
> +
> + if (path_counts[0] > 0) {
> + trace2_data_intmax("index", istate->repo, "clear_skip_worktree_from_present_files/path_count", path_counts[0]);
> + }
> + if (restarted) {
> + trace2_data_intmax("index", istate->repo, "clear_skip_worktree_from_present_files/full_index/path_count", path_counts[1]);
> + }
A few style things:
1. Use "if (path_counts[0])" to detect a non-zero value.
2. Don't use { } around a single-line block.
3. Your lines are quite long, due a lot from your long keys.
Shorten the keys (maybe "sparse_path_count" and "restarted_count"
is enough context) and consider using a line break in
the middle of the parameters.
> + trace2_region_leave("index", "clear_skip_worktree_from_present_files", istate->repo);
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2] index: add trace2 region for clear skip worktree
2022-10-28 15:49 ` Derrick Stolee
@ 2022-10-28 17:17 ` Junio C Hamano
2022-10-30 23:28 ` Anh Le
0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2022-10-28 17:17 UTC (permalink / raw)
To: Derrick Stolee
Cc: Anh Le via GitGitGadget, git, Timothy Jones, Jeff Hostetler,
Jeff Hostetler, Anh Le
Derrick Stolee <derrickstolee@github.com> writes:
> A few style things:
>
> 1. Use "if (path_counts[0])" to detect a non-zero value.
> 2. Don't use { } around a single-line block.
> 3. Your lines are quite long, due a lot from your long keys.
> Shorten the keys (maybe "sparse_path_count" and "restarted_count"
> is enough context) and consider using a line break in
> the middle of the parameters.
All good suggestions. Let me add another one.
4. Call an array you primarily access its individual elements in
singular. That way, you can say path_count[0] (i.e. the count
for the zeroth round). An array that exists mostly to be
passed around as a whole to represent a "set of things", on the
other hand, can be called in plural.
Taking them all together, perhaps something like this is what you
meant?
sparse-index.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git c/sparse-index.c w/sparse-index.c
index dbf647949c..8c269dab80 100644
--- c/sparse-index.c
+++ w/sparse-index.c
@@ -493,22 +493,25 @@ void clear_skip_worktree_from_present_files(struct index_state *istate)
int dir_found = 1;
int i;
- int path_counts[2] = {0, 0};
+ int path_count[2] = {0, 0};
int restarted = 0;
if (!core_apply_sparse_checkout ||
sparse_expect_files_outside_of_patterns)
return;
- trace2_region_enter("index", "clear_skip_worktree_from_present_files", istate->repo);
+ trace2_region_enter("index", "clear_skip_worktree_from_present_files",
+ istate->repo);
restart:
for (i = 0; i < istate->cache_nr; i++) {
struct cache_entry *ce = istate->cache[i];
if (ce_skip_worktree(ce)) {
- path_counts[restarted]++;
+ path_count[restarted]++;
if (path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
if (S_ISSPARSEDIR(ce->ce_mode)) {
+ if (restarted)
+ BUG("ensure-full-index did not fully flatten?");
ensure_full_index(istate);
restarted = 1;
goto restart;
@@ -518,13 +521,14 @@ void clear_skip_worktree_from_present_files(struct index_state *istate)
}
}
- if (path_counts[0] > 0) {
- trace2_data_intmax("index", istate->repo, "clear_skip_worktree_from_present_files/path_count", path_counts[0]);
- }
- if (restarted) {
- trace2_data_intmax("index", istate->repo, "clear_skip_worktree_from_present_files/full_index/path_count", path_counts[1]);
- }
- trace2_region_leave("index", "clear_skip_worktree_from_present_files", istate->repo);
+ if (path_count[0])
+ trace2_data_intmax("index", istate->repo,
+ "sparse_path_count", path_count[0]);
+ if (restarted)
+ trace2_data_intmax("index", istate->repo,
+ "sparse_path_count_full", path_count[1]);
+ trace2_region_leave("index", "clear_skip_worktree_from_present_files",
+ istate->repo);
}
/*
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v2] index: add trace2 region for clear skip worktree
2022-10-28 17:17 ` Junio C Hamano
@ 2022-10-30 23:28 ` Anh Le
0 siblings, 0 replies; 18+ messages in thread
From: Anh Le @ 2022-10-30 23:28 UTC (permalink / raw)
To: Junio C Hamano
Cc: Derrick Stolee, Anh Le via GitGitGadget, git, Timothy Jones,
Jeff Hostetler, Jeff Hostetler
Thank you everyone, lots of things for me to take notes of. I'll send
in another patch.
Regards,
Anh
On Sat, Oct 29, 2022 at 4:17 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Derrick Stolee <derrickstolee@github.com> writes:
>
> > A few style things:
> >
> > 1. Use "if (path_counts[0])" to detect a non-zero value.
> > 2. Don't use { } around a single-line block.
> > 3. Your lines are quite long, due a lot from your long keys.
> > Shorten the keys (maybe "sparse_path_count" and "restarted_count"
> > is enough context) and consider using a line break in
> > the middle of the parameters.
>
> All good suggestions. Let me add another one.
>
> 4. Call an array you primarily access its individual elements in
> singular. That way, you can say path_count[0] (i.e. the count
> for the zeroth round). An array that exists mostly to be
> passed around as a whole to represent a "set of things", on the
> other hand, can be called in plural.
>
> Taking them all together, perhaps something like this is what you
> meant?
>
> sparse-index.c | 24 ++++++++++++++----------
> 1 file changed, 14 insertions(+), 10 deletions(-)
>
> diff --git c/sparse-index.c w/sparse-index.c
> index dbf647949c..8c269dab80 100644
> --- c/sparse-index.c
> +++ w/sparse-index.c
> @@ -493,22 +493,25 @@ void clear_skip_worktree_from_present_files(struct index_state *istate)
> int dir_found = 1;
>
> int i;
> - int path_counts[2] = {0, 0};
> + int path_count[2] = {0, 0};
> int restarted = 0;
>
> if (!core_apply_sparse_checkout ||
> sparse_expect_files_outside_of_patterns)
> return;
>
> - trace2_region_enter("index", "clear_skip_worktree_from_present_files", istate->repo);
> + trace2_region_enter("index", "clear_skip_worktree_from_present_files",
> + istate->repo);
> restart:
> for (i = 0; i < istate->cache_nr; i++) {
> struct cache_entry *ce = istate->cache[i];
>
> if (ce_skip_worktree(ce)) {
> - path_counts[restarted]++;
> + path_count[restarted]++;
> if (path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
> if (S_ISSPARSEDIR(ce->ce_mode)) {
> + if (restarted)
> + BUG("ensure-full-index did not fully flatten?");
> ensure_full_index(istate);
> restarted = 1;
> goto restart;
> @@ -518,13 +521,14 @@ void clear_skip_worktree_from_present_files(struct index_state *istate)
> }
> }
>
> - if (path_counts[0] > 0) {
> - trace2_data_intmax("index", istate->repo, "clear_skip_worktree_from_present_files/path_count", path_counts[0]);
> - }
> - if (restarted) {
> - trace2_data_intmax("index", istate->repo, "clear_skip_worktree_from_present_files/full_index/path_count", path_counts[1]);
> - }
> - trace2_region_leave("index", "clear_skip_worktree_from_present_files", istate->repo);
> + if (path_count[0])
> + trace2_data_intmax("index", istate->repo,
> + "sparse_path_count", path_count[0]);
> + if (restarted)
> + trace2_data_intmax("index", istate->repo,
> + "sparse_path_count_full", path_count[1]);
> + trace2_region_leave("index", "clear_skip_worktree_from_present_files",
> + istate->repo);
> }
>
> /*
>
--
**
** <https://www.canva.com/>Empowering the world to design
We're hiring,
apply here <https://www.canva.com/careers/>! Check out the latest news and
learnings from our team on the Canva Newsroom
<https://www.canva.com/newsroom/news/>.
<https://twitter.com/canva>
<https://facebook.com/canva> <https://au.linkedin.com/company/canva>
<https://twitter.com/canva> <https://facebook.com/canva>
<https://www.linkedin.com/company/canva>
<https://instagram.com/canva>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v2] index: add trace2 region for clear skip worktree
2022-10-28 0:46 ` [PATCH v2] " Anh Le via GitGitGadget
2022-10-28 15:49 ` Derrick Stolee
@ 2022-10-28 16:50 ` Jeff Hostetler
2022-10-31 0:56 ` [PATCH v3] " Anh Le via GitGitGadget
2 siblings, 0 replies; 18+ messages in thread
From: Jeff Hostetler @ 2022-10-28 16:50 UTC (permalink / raw)
To: Anh Le via GitGitGadget, git; +Cc: Timothy Jones, Jeff Hostetler, Anh Le
On 10/27/22 8:46 PM, Anh Le via GitGitGadget wrote:
[...]
> diff --git a/sparse-index.c b/sparse-index.c
> index e4a54ce1943..dbf647949c1 100644
> --- a/sparse-index.c
> +++ b/sparse-index.c
> @@ -493,24 +493,38 @@ void clear_skip_worktree_from_present_files(struct index_state *istate)
> int dir_found = 1;
>
> int i;
> + int path_counts[2] = {0, 0};
> + int restarted = 0;
>
> if (!core_apply_sparse_checkout ||
> sparse_expect_files_outside_of_patterns)
> return;
>
> + trace2_region_enter("index", "clear_skip_worktree_from_present_files", istate->repo);
> restart:
> for (i = 0; i < istate->cache_nr; i++) {
> struct cache_entry *ce = istate->cache[i];
>
> - if (ce_skip_worktree(ce) &&
> - path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
> - if (S_ISSPARSEDIR(ce->ce_mode)) {
> - ensure_full_index(istate);
> - goto restart;
> + if (ce_skip_worktree(ce)) {
> + path_counts[restarted]++;
> + if (path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
> + if (S_ISSPARSEDIR(ce->ce_mode)) {
> + ensure_full_index(istate);
> + restarted = 1;
> + goto restart;
> + }
> + ce->ce_flags &= ~CE_SKIP_WORKTREE;
> }
> - ce->ce_flags &= ~CE_SKIP_WORKTREE;
> }
> }
> +
> + if (path_counts[0] > 0) {
> + trace2_data_intmax("index", istate->repo, "clear_skip_worktree_from_present_files/path_count", path_counts[0]);
> + }
> + if (restarted) {
> + trace2_data_intmax("index", istate->repo, "clear_skip_worktree_from_present_files/full_index/path_count", path_counts[1]);
> + }
> + trace2_region_leave("index", "clear_skip_worktree_from_present_files", istate->repo);
> }
This looks good. And I think it sets us up to later
answer some of our earlier perf questions after you
get some data.
For example, if the path_count[0] phase is expensive,
does the restart always need to restart the loop at i=0?
Granted, the `ensure_full_index()` may create a new cache-entry
array, so the value of `i` prior to the call may not correspond
to the same cell after the call at the `goto`, but we could
cache the strdup the pathname of cache[i] before the call and
do a find on the new cache[] afterwards to get the corresponding
i-prime value and let the restart start the loop on i-prime.
That's just an idea, but let's wait for the data to see if that
would be helpful.
Thanks!
Jeff
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v3] index: add trace2 region for clear skip worktree
2022-10-28 0:46 ` [PATCH v2] " Anh Le via GitGitGadget
2022-10-28 15:49 ` Derrick Stolee
2022-10-28 16:50 ` Jeff Hostetler
@ 2022-10-31 0:56 ` Anh Le via GitGitGadget
2022-10-31 22:34 ` Taylor Blau
2022-11-03 23:04 ` [PATCH v4 0/2] " Anh Le via GitGitGadget
2 siblings, 2 replies; 18+ messages in thread
From: Anh Le via GitGitGadget @ 2022-10-31 0:56 UTC (permalink / raw)
To: git
Cc: Timothy Jones, Jeff Hostetler, Jeff Hostetler, Derrick Stolee,
Anh Le, Anh Le
From: Anh Le <anh@canva.com>
In a large repository using sparse checkout, checking whether a file marked
with skip worktree is present on disk and its skip worktree bit should be
cleared can take a considerable amount of time. Add a trace2 region to
surface this information, keeping a count of how many paths have been
checked and separately keep counts for after a full index is materialised.
Signed-off-by: Anh Le <anh@canva.com>
---
index: add trace2 region for clear skip worktree
In large repository using sparse checkout, checking whether a file
marked with skip worktree is present on disk and its skip worktree bit
should be cleared can take a considerable amount of time. Add a trace2
region to surface this information.
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1368%2FHaizzz%2Fmaster-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1368/Haizzz/master-v3
Pull-Request: https://github.com/git/git/pull/1368
Range-diff vs v2:
1: effe6b5b912 ! 1: d0d9f258b08 index: add trace2 region for clear skip worktree
@@ Metadata
## Commit message ##
index: add trace2 region for clear skip worktree
- In a large repository using sparse checkout, checking
- whether a file marked with skip worktree is present
- on disk and its skip worktree bit should be cleared
- can take a considerable amount of time. Add a trace2
- region to surface this information, keeping a count of how many paths
- have been checked and separately
- keep counts for after a full index is
- materialised.
+ In a large repository using sparse checkout, checking whether a file marked
+ with skip worktree is present on disk and its skip worktree bit should be
+ cleared can take a considerable amount of time. Add a trace2 region to
+ surface this information, keeping a count of how many paths have been
+ checked and separately keep counts for after a full index is materialised.
Signed-off-by: Anh Le <anh@canva.com>
@@ sparse-index.c: void clear_skip_worktree_from_present_files(struct index_state *
int dir_found = 1;
int i;
-+ int path_counts[2] = {0, 0};
++ int path_count[2] = {0, 0};
+ int restarted = 0;
if (!core_apply_sparse_checkout ||
sparse_expect_files_outside_of_patterns)
return;
-+ trace2_region_enter("index", "clear_skip_worktree_from_present_files", istate->repo);
++ trace2_region_enter("index", "clear_skip_worktree_from_present_files",
++ istate->repo);
restart:
for (i = 0; i < istate->cache_nr; i++) {
struct cache_entry *ce = istate->cache[i];
@@ sparse-index.c: void clear_skip_worktree_from_present_files(struct index_state *
- ensure_full_index(istate);
- goto restart;
+ if (ce_skip_worktree(ce)) {
-+ path_counts[restarted]++;
++ path_count[restarted]++;
+ if (path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
+ if (S_ISSPARSEDIR(ce->ce_mode)) {
++ if (restarted)
++ BUG("ensure-full-index did not fully flatten?");
+ ensure_full_index(istate);
+ restarted = 1;
+ goto restart;
@@ sparse-index.c: void clear_skip_worktree_from_present_files(struct index_state *
}
}
+
-+ if (path_counts[0] > 0) {
-+ trace2_data_intmax("index", istate->repo, "clear_skip_worktree_from_present_files/path_count", path_counts[0]);
-+ }
-+ if (restarted) {
-+ trace2_data_intmax("index", istate->repo, "clear_skip_worktree_from_present_files/full_index/path_count", path_counts[1]);
-+ }
-+ trace2_region_leave("index", "clear_skip_worktree_from_present_files", istate->repo);
++ if (path_count[0])
++ trace2_data_intmax("index", istate->repo,
++ "sparse_path_count", path_count[0]);
++ if (restarted)
++ trace2_data_intmax("index", istate->repo,
++ "sparse_path_count_full", path_count[1]);
++ trace2_region_leave("index", "clear_skip_worktree_from_present_files",
++ istate->repo);
}
/*
sparse-index.c | 30 ++++++++++++++++++++++++------
1 file changed, 24 insertions(+), 6 deletions(-)
diff --git a/sparse-index.c b/sparse-index.c
index e4a54ce1943..8301129bf8f 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -493,24 +493,42 @@ void clear_skip_worktree_from_present_files(struct index_state *istate)
int dir_found = 1;
int i;
+ int path_count[2] = {0, 0};
+ int restarted = 0;
if (!core_apply_sparse_checkout ||
sparse_expect_files_outside_of_patterns)
return;
+ trace2_region_enter("index", "clear_skip_worktree_from_present_files",
+ istate->repo);
restart:
for (i = 0; i < istate->cache_nr; i++) {
struct cache_entry *ce = istate->cache[i];
- if (ce_skip_worktree(ce) &&
- path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
- if (S_ISSPARSEDIR(ce->ce_mode)) {
- ensure_full_index(istate);
- goto restart;
+ if (ce_skip_worktree(ce)) {
+ path_count[restarted]++;
+ if (path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
+ if (S_ISSPARSEDIR(ce->ce_mode)) {
+ if (restarted)
+ BUG("ensure-full-index did not fully flatten?");
+ ensure_full_index(istate);
+ restarted = 1;
+ goto restart;
+ }
+ ce->ce_flags &= ~CE_SKIP_WORKTREE;
}
- ce->ce_flags &= ~CE_SKIP_WORKTREE;
}
}
+
+ if (path_count[0])
+ trace2_data_intmax("index", istate->repo,
+ "sparse_path_count", path_count[0]);
+ if (restarted)
+ trace2_data_intmax("index", istate->repo,
+ "sparse_path_count_full", path_count[1]);
+ trace2_region_leave("index", "clear_skip_worktree_from_present_files",
+ istate->repo);
}
/*
base-commit: 1fc3c0ad407008c2f71dd9ae1241d8b75f8ef886
--
gitgitgadget
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v3] index: add trace2 region for clear skip worktree
2022-10-31 0:56 ` [PATCH v3] " Anh Le via GitGitGadget
@ 2022-10-31 22:34 ` Taylor Blau
2022-11-03 23:04 ` [PATCH v4 0/2] " Anh Le via GitGitGadget
1 sibling, 0 replies; 18+ messages in thread
From: Taylor Blau @ 2022-10-31 22:34 UTC (permalink / raw)
To: Anh Le via GitGitGadget
Cc: git, Timothy Jones, Jeff Hostetler, Jeff Hostetler,
Derrick Stolee, Anh Le
On Mon, Oct 31, 2022 at 12:56:31AM +0000, Anh Le via GitGitGadget wrote:
> From: Anh Le <anh@canva.com>
>
> In a large repository using sparse checkout, checking whether a file marked
> with skip worktree is present on disk and its skip worktree bit should be
> cleared can take a considerable amount of time. Add a trace2 region to
> surface this information, keeping a count of how many paths have been
> checked and separately keep counts for after a full index is materialised.
I think the text you wrote here is fine, but it did take me a couple of
passes to make sure I understood it correctly.
Perhaps the following might be clearer:
When using sparse checkout, clear_skip_worktree_from_present_files()
must enumerate index entries to find ones with the SKIP_WORKTREE bit
to determine whether those index entries exist on disk (in which
case their SKIP_WORKTREE bit should be removed).
In a large repository, this may take considerable time depending on
the size of the index.
Add a trace2 region to surface this information, keeping a count o
fhow many paths have been checked. Separately, keep counts after a
full index is materialized.
Is that accurate?
> diff --git a/sparse-index.c b/sparse-index.c
> index e4a54ce1943..8301129bf8f 100644
> --- a/sparse-index.c
> +++ b/sparse-index.c
> @@ -493,24 +493,42 @@ void clear_skip_worktree_from_present_files(struct index_state *istate)
> int dir_found = 1;
>
> int i;
> + int path_count[2] = {0, 0};
> + int restarted = 0;
Looks good.
> if (!core_apply_sparse_checkout ||
> sparse_expect_files_outside_of_patterns)
> return;
>
> + trace2_region_enter("index", "clear_skip_worktree_from_present_files",
> + istate->repo);
Here and below, there is some funky indentation. When I apply the patch
locally, I find that I need to squash this in:
--- >8 ---
diff --git a/sparse-index.c b/sparse-index.c
index 8301129bf8..23f997e103 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -501,7 +501,7 @@ void clear_skip_worktree_from_present_files(struct index_state *istate)
return;
trace2_region_enter("index", "clear_skip_worktree_from_present_files",
- istate->repo);
+ istate->repo);
restart:
for (i = 0; i < istate->cache_nr; i++) {
struct cache_entry *ce = istate->cache[i];
--- 8< ---
> restart:
> for (i = 0; i < istate->cache_nr; i++) {
> struct cache_entry *ce = istate->cache[i];
>
> - if (ce_skip_worktree(ce) &&
> - path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
> - if (S_ISSPARSEDIR(ce->ce_mode)) {
> - ensure_full_index(istate);
> - goto restart;
> + if (ce_skip_worktree(ce)) {
> + path_count[restarted]++;
> + if (path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
> + if (S_ISSPARSEDIR(ce->ce_mode)) {
> + if (restarted)
> + BUG("ensure-full-index did not fully flatten?");
> + ensure_full_index(istate);
> + restarted = 1;
> + goto restart;
> + }
> + ce->ce_flags &= ~CE_SKIP_WORKTREE;
> }
> - ce->ce_flags &= ~CE_SKIP_WORKTREE;
Looks fine, though the "if (restarted) BUG(...)" is new. It's not wrong,
per-se, but seeing it in a separate commit either before or after the
main change would be appreciated.
> }
> }
> +
> + if (path_count[0])
> + trace2_data_intmax("index", istate->repo,
> + "sparse_path_count", path_count[0]);
> + if (restarted)
> + trace2_data_intmax("index", istate->repo,
> + "sparse_path_count_full", path_count[1]);
> + trace2_region_leave("index", "clear_skip_worktree_from_present_files",
> + istate->repo);
Same note about indentation here.
Thanks,
Taylor
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v4 0/2] index: add trace2 region for clear skip worktree
2022-10-31 0:56 ` [PATCH v3] " Anh Le via GitGitGadget
2022-10-31 22:34 ` Taylor Blau
@ 2022-11-03 23:04 ` Anh Le via GitGitGadget
2022-11-03 23:05 ` [PATCH v4 1/2] " Anh Le via GitGitGadget
` (2 more replies)
1 sibling, 3 replies; 18+ messages in thread
From: Anh Le via GitGitGadget @ 2022-11-03 23:04 UTC (permalink / raw)
To: git
Cc: Timothy Jones, Jeff Hostetler, Jeff Hostetler, Derrick Stolee,
Taylor Blau, Anh Le
In large repository using sparse checkout, checking whether a file marked
with skip worktree is present on disk and its skip worktree bit should be
cleared can take a considerable amount of time. Add a trace2 region to
surface this information.
Anh Le (2):
index: add trace2 region for clear skip worktree
index: raise a bug if the index is materialised more than once
sparse-index.c | 30 ++++++++++++++++++++++++------
1 file changed, 24 insertions(+), 6 deletions(-)
base-commit: 1fc3c0ad407008c2f71dd9ae1241d8b75f8ef886
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1368%2FHaizzz%2Fmaster-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1368/Haizzz/master-v4
Pull-Request: https://github.com/git/git/pull/1368
Range-diff vs v3:
1: d0d9f258b08 ! 1: 33e9b2afd91 index: add trace2 region for clear skip worktree
@@ Metadata
## Commit message ##
index: add trace2 region for clear skip worktree
- In a large repository using sparse checkout, checking whether a file marked
- with skip worktree is present on disk and its skip worktree bit should be
- cleared can take a considerable amount of time. Add a trace2 region to
- surface this information, keeping a count of how many paths have been
- checked and separately keep counts for after a full index is materialised.
+ When using sparse checkout, clear_skip_worktree_from_present_files() must
+ enumerate index entries to find ones with the SKIP_WORKTREE bit to
+ determine whether those index entries exist on disk (in which case their
+ SKIP_WORKTREE bit should be removed).
+
+ In a large repository, this may take considerable time depending on the
+ size of the index.
+
+ Add a trace2 region to surface this information, keeping a count of how
+ many paths have been checked. Separately, keep counts after a full index is
+ materialized.
Signed-off-by: Anh Le <anh@canva.com>
@@ sparse-index.c: void clear_skip_worktree_from_present_files(struct index_state *
return;
+ trace2_region_enter("index", "clear_skip_worktree_from_present_files",
-+ istate->repo);
++ istate->repo);
restart:
for (i = 0; i < istate->cache_nr; i++) {
struct cache_entry *ce = istate->cache[i];
@@ sparse-index.c: void clear_skip_worktree_from_present_files(struct index_state *
+ path_count[restarted]++;
+ if (path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
+ if (S_ISSPARSEDIR(ce->ce_mode)) {
-+ if (restarted)
-+ BUG("ensure-full-index did not fully flatten?");
+ ensure_full_index(istate);
+ restarted = 1;
+ goto restart;
@@ sparse-index.c: void clear_skip_worktree_from_present_files(struct index_state *
+
+ if (path_count[0])
+ trace2_data_intmax("index", istate->repo,
-+ "sparse_path_count", path_count[0]);
++ "sparse_path_count", path_count[0]);
+ if (restarted)
+ trace2_data_intmax("index", istate->repo,
-+ "sparse_path_count_full", path_count[1]);
++ "sparse_path_count_full", path_count[1]);
+ trace2_region_leave("index", "clear_skip_worktree_from_present_files",
-+ istate->repo);
++ istate->repo);
}
/*
-: ----------- > 2: 91ad7973307 index: raise a bug if the index is materialised more than once
--
gitgitgadget
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH v4 1/2] index: add trace2 region for clear skip worktree
2022-11-03 23:04 ` [PATCH v4 0/2] " Anh Le via GitGitGadget
@ 2022-11-03 23:05 ` Anh Le via GitGitGadget
2022-11-03 23:05 ` [PATCH v4 2/2] index: raise a bug if the index is materialised more than once Anh Le via GitGitGadget
2022-11-05 0:29 ` [PATCH v4 0/2] index: add trace2 region for clear skip worktree Taylor Blau
2 siblings, 0 replies; 18+ messages in thread
From: Anh Le via GitGitGadget @ 2022-11-03 23:05 UTC (permalink / raw)
To: git
Cc: Timothy Jones, Jeff Hostetler, Jeff Hostetler, Derrick Stolee,
Taylor Blau, Anh Le, Anh Le
From: Anh Le <anh@canva.com>
When using sparse checkout, clear_skip_worktree_from_present_files() must
enumerate index entries to find ones with the SKIP_WORKTREE bit to
determine whether those index entries exist on disk (in which case their
SKIP_WORKTREE bit should be removed).
In a large repository, this may take considerable time depending on the
size of the index.
Add a trace2 region to surface this information, keeping a count of how
many paths have been checked. Separately, keep counts after a full index is
materialized.
Signed-off-by: Anh Le <anh@canva.com>
---
sparse-index.c | 28 ++++++++++++++++++++++------
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/sparse-index.c b/sparse-index.c
index e4a54ce1943..8713a15611d 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -493,24 +493,40 @@ void clear_skip_worktree_from_present_files(struct index_state *istate)
int dir_found = 1;
int i;
+ int path_count[2] = {0, 0};
+ int restarted = 0;
if (!core_apply_sparse_checkout ||
sparse_expect_files_outside_of_patterns)
return;
+ trace2_region_enter("index", "clear_skip_worktree_from_present_files",
+ istate->repo);
restart:
for (i = 0; i < istate->cache_nr; i++) {
struct cache_entry *ce = istate->cache[i];
- if (ce_skip_worktree(ce) &&
- path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
- if (S_ISSPARSEDIR(ce->ce_mode)) {
- ensure_full_index(istate);
- goto restart;
+ if (ce_skip_worktree(ce)) {
+ path_count[restarted]++;
+ if (path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
+ if (S_ISSPARSEDIR(ce->ce_mode)) {
+ ensure_full_index(istate);
+ restarted = 1;
+ goto restart;
+ }
+ ce->ce_flags &= ~CE_SKIP_WORKTREE;
}
- ce->ce_flags &= ~CE_SKIP_WORKTREE;
}
}
+
+ if (path_count[0])
+ trace2_data_intmax("index", istate->repo,
+ "sparse_path_count", path_count[0]);
+ if (restarted)
+ trace2_data_intmax("index", istate->repo,
+ "sparse_path_count_full", path_count[1]);
+ trace2_region_leave("index", "clear_skip_worktree_from_present_files",
+ istate->repo);
}
/*
--
gitgitgadget
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH v4 2/2] index: raise a bug if the index is materialised more than once
2022-11-03 23:04 ` [PATCH v4 0/2] " Anh Le via GitGitGadget
2022-11-03 23:05 ` [PATCH v4 1/2] " Anh Le via GitGitGadget
@ 2022-11-03 23:05 ` Anh Le via GitGitGadget
2022-11-05 0:29 ` [PATCH v4 0/2] index: add trace2 region for clear skip worktree Taylor Blau
2 siblings, 0 replies; 18+ messages in thread
From: Anh Le via GitGitGadget @ 2022-11-03 23:05 UTC (permalink / raw)
To: git
Cc: Timothy Jones, Jeff Hostetler, Jeff Hostetler, Derrick Stolee,
Taylor Blau, Anh Le, Anh Le
From: Anh Le <anh@canva.com>
If clear_skip_worktree_from_present_files() encounter a sparse directory,
it fully materialise the index which should expand any sparse directories
and start going through each entries again. If this happens more than once,
raise it with a BUG.
Signed-off-by: Anh Le <anh@canva.com>
---
sparse-index.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/sparse-index.c b/sparse-index.c
index 8713a15611d..8c269dab803 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -510,6 +510,8 @@ restart:
path_count[restarted]++;
if (path_found(ce->name, &last_dirname, &dir_len, &dir_found)) {
if (S_ISSPARSEDIR(ce->ce_mode)) {
+ if (restarted)
+ BUG("ensure-full-index did not fully flatten?");
ensure_full_index(istate);
restarted = 1;
goto restart;
--
gitgitgadget
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/2] index: add trace2 region for clear skip worktree
2022-11-03 23:04 ` [PATCH v4 0/2] " Anh Le via GitGitGadget
2022-11-03 23:05 ` [PATCH v4 1/2] " Anh Le via GitGitGadget
2022-11-03 23:05 ` [PATCH v4 2/2] index: raise a bug if the index is materialised more than once Anh Le via GitGitGadget
@ 2022-11-05 0:29 ` Taylor Blau
2022-11-07 20:50 ` Derrick Stolee
2 siblings, 1 reply; 18+ messages in thread
From: Taylor Blau @ 2022-11-05 0:29 UTC (permalink / raw)
To: Anh Le via GitGitGadget
Cc: git, Timothy Jones, Jeff Hostetler, Jeff Hostetler,
Derrick Stolee, Taylor Blau, Anh Le
On Thu, Nov 03, 2022 at 11:04:59PM +0000, Anh Le via GitGitGadget wrote:
> In large repository using sparse checkout, checking whether a file marked
> with skip worktree is present on disk and its skip worktree bit should be
> cleared can take a considerable amount of time. Add a trace2 region to
> surface this information.
>
> Anh Le (2):
> index: add trace2 region for clear skip worktree
> index: raise a bug if the index is materialised more than once
>
> sparse-index.c | 30 ++++++++++++++++++++++++------
> 1 file changed, 24 insertions(+), 6 deletions(-)
This version is looking good. Thanks, will queue.
Thanks,
Taylor
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH v4 0/2] index: add trace2 region for clear skip worktree
2022-11-05 0:29 ` [PATCH v4 0/2] index: add trace2 region for clear skip worktree Taylor Blau
@ 2022-11-07 20:50 ` Derrick Stolee
0 siblings, 0 replies; 18+ messages in thread
From: Derrick Stolee @ 2022-11-07 20:50 UTC (permalink / raw)
To: Taylor Blau, Anh Le via GitGitGadget
Cc: git, Timothy Jones, Jeff Hostetler, Jeff Hostetler, Anh Le
On 11/4/22 8:29 PM, Taylor Blau wrote:
> On Thu, Nov 03, 2022 at 11:04:59PM +0000, Anh Le via GitGitGadget wrote:
>> In large repository using sparse checkout, checking whether a file marked
>> with skip worktree is present on disk and its skip worktree bit should be
>> cleared can take a considerable amount of time. Add a trace2 region to
>> surface this information.
>>
>> Anh Le (2):
>> index: add trace2 region for clear skip worktree
>> index: raise a bug if the index is materialised more than once
>>
>> sparse-index.c | 30 ++++++++++++++++++++++++------
>> 1 file changed, 24 insertions(+), 6 deletions(-)
>
> This version is looking good. Thanks, will queue.
I also took a full re-read of this version and agree
that it is ready to merge.
Thanks,
-Stolee
^ permalink raw reply [flat|nested] 18+ messages in thread