* [RFC PATCH 0/1] dir: consider worktree config in path recursion
@ 2022-05-05 20:32 Goss Geppert
2022-05-05 20:32 ` [RFC PATCH 1/1] " Goss Geppert
` (7 more replies)
0 siblings, 8 replies; 31+ messages in thread
From: Goss Geppert @ 2022-05-05 20:32 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee
Please see the commit message for a description. Effectively, since
8d92fb2927 (dir: replace exponential algorithm with a linear one,
2020-04-01) the following no longer does what it used to do:
```
test_repo="$(mktemp -d /tmp/git-test-XXXXXXX)"
cd "$test_repo"
>test-file
git init
git config core.worktree /tmp
git add test-file
```
In terms of desired behavior, it seems that if it is possible to track
files that are outside the default worktree, then it should also be
possible to track files inside the default worktree location.
Personally, I was using this setup to track dotfiles and some other
system related snippets and liked that design more than alternatives I'm
aware of. I ran into this problem because I was recently unable to add a
file to an existing repository with this configuration.
There are a couple open items but I wanted to get some feedback to
confirm that the behavior of this patch is desirable or if I'm
overlooking something entirely.
TODOs:
[ ] add test
[ ] add signoffs
Goss Geppert (1):
dir: consider worktree config in path recursion
dir.c | 37 ++++++++++++++++++++++++++++++-------
1 file changed, 30 insertions(+), 7 deletions(-)
base-commit: 0f828332d5ac36fc63b7d8202652efa152809856
--
2.36.0
^ permalink raw reply [flat|nested] 31+ messages in thread
* [RFC PATCH 1/1] dir: consider worktree config in path recursion 2022-05-05 20:32 [RFC PATCH 0/1] dir: consider worktree config in path recursion Goss Geppert @ 2022-05-05 20:32 ` Goss Geppert 2022-05-07 3:26 ` Elijah Newren 2022-05-06 17:02 ` [RFC PATCH 0/1] " Junio C Hamano ` (6 subsequent siblings) 7 siblings, 1 reply; 31+ messages in thread From: Goss Geppert @ 2022-05-05 20:32 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee Since 8d92fb2927 (dir: replace exponential algorithm with a linear one, 2020-04-01) the following no longer works: 1) Initialize a repository. 2) Set the `core.worktree` location to a parent directory of the default worktree. 3) Add a file located in the default worktree location to the index (i.e. anywhere in the immediate parent directory of the gitdir). This commit adds a check to determine whether a nested repository that is encountered in recursing a path is actually `the_repository`. If so, simply treat the directory as if it doesn't contain a nested repository. Prior to this commit, the `add` operation was silently ignored. --- dir.c | 37 ++++++++++++++++++++++++++++++------- 1 file changed, 30 insertions(+), 7 deletions(-) diff --git a/dir.c b/dir.c index f2b0f24210..cef39f43d8 100644 --- a/dir.c +++ b/dir.c @@ -1861,7 +1861,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, */ enum path_treatment state; int matches_how = 0; - int nested_repo = 0, check_only, stop_early; + int check_only, stop_early; int old_ignored_nr, old_untracked_nr; /* The "len-1" is to strip the final '/' */ enum exist_status status = directory_exists_in_index(istate, dirname, len-1); @@ -1893,16 +1893,39 @@ static enum path_treatment treat_directory(struct dir_struct *dir, if ((dir->flags & DIR_SKIP_NESTED_GIT) || !(dir->flags & DIR_NO_GITLINKS)) { + /* + * Determine if `dirname` is a nested repo by confirming that: + * 1) we are in a nonbare repository, and + * 2) `dirname` is not an immediate parent of `the_repository->gitdir`, + * which could occur if the `worktree` location was manually + * configured by the user + */ + int nested_repo; struct strbuf sb = STRBUF_INIT; strbuf_addstr(&sb, dirname); nested_repo = is_nonbare_repository_dir(&sb); + + if (nested_repo) { + char *real_dirname, *real_gitdir; + strbuf_reset(&sb); + strbuf_addstr(&sb, dirname); + strbuf_complete(&sb, '/'); + strbuf_addstr(&sb, ".git"); + real_dirname = real_pathdup(sb.buf, 0); + real_gitdir = real_pathdup(the_repository->gitdir, 0); + + nested_repo = !!strcmp(real_dirname, real_gitdir); + free(real_gitdir); + free(real_dirname); + } strbuf_release(&sb); - } - if (nested_repo) { - if ((dir->flags & DIR_SKIP_NESTED_GIT) || - (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) - return path_none; - return excluded ? path_excluded : path_untracked; + + if (nested_repo) { + if ((dir->flags & DIR_SKIP_NESTED_GIT) || + (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) + return path_none; + return excluded ? path_excluded : path_untracked; + } } if (!(dir->flags & DIR_SHOW_OTHER_DIRECTORIES)) { -- 2.36.0 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [RFC PATCH 1/1] dir: consider worktree config in path recursion 2022-05-05 20:32 ` [RFC PATCH 1/1] " Goss Geppert @ 2022-05-07 3:26 ` Elijah Newren 2022-05-07 17:59 ` oss dev 0 siblings, 1 reply; 31+ messages in thread From: Elijah Newren @ 2022-05-07 3:26 UTC (permalink / raw) To: Goss Geppert Cc: Git Mailing List, Junio C Hamano, christian w, Derrick Stolee On Thu, May 5, 2022 at 1:32 PM Goss Geppert <gg.oss.dev@gmail.com> wrote: > > Since 8d92fb2927 (dir: replace exponential algorithm with a linear one, > 2020-04-01) the following no longer works: > > 1) Initialize a repository. > 2) Set the `core.worktree` location to a parent directory of the > default worktree. > 3) Add a file located in the default worktree location to the index > (i.e. anywhere in the immediate parent directory of the gitdir). > > This commit adds a check to determine whether a nested repository that > is encountered in recursing a path is actually `the_repository`. If so, > simply treat the directory as if it doesn't contain a nested repository. > > Prior to this commit, the `add` operation was silently ignored. > --- > dir.c | 37 ++++++++++++++++++++++++++++++------- > 1 file changed, 30 insertions(+), 7 deletions(-) > > diff --git a/dir.c b/dir.c > index f2b0f24210..cef39f43d8 100644 > --- a/dir.c > +++ b/dir.c > @@ -1861,7 +1861,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > */ > enum path_treatment state; > int matches_how = 0; > - int nested_repo = 0, check_only, stop_early; > + int check_only, stop_early; This part of the patch, along with two other parts below, is orthogonal to the actual fix being made. It probably makes the code clearer, but should be done in an independent cleanup commit. > int old_ignored_nr, old_untracked_nr; > /* The "len-1" is to strip the final '/' */ > enum exist_status status = directory_exists_in_index(istate, dirname, len-1); > @@ -1893,16 +1893,39 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > > if ((dir->flags & DIR_SKIP_NESTED_GIT) || > !(dir->flags & DIR_NO_GITLINKS)) { > + /* > + * Determine if `dirname` is a nested repo by confirming that: > + * 1) we are in a nonbare repository, and > + * 2) `dirname` is not an immediate parent of `the_repository->gitdir`, > + * which could occur if the `worktree` location was manually > + * configured by the user > + */ This is a good comment, but it'd be really nice to be able to point a user at a testcase in the testsuite for them to inspect further (e.g. "see t####, 'TESTCASE DESCRIPTION' for an case where this matters"). > + int nested_repo; Just this line immediately above this comment is part of the orthogonal cleanup. > struct strbuf sb = STRBUF_INIT; > strbuf_addstr(&sb, dirname); > nested_repo = is_nonbare_repository_dir(&sb); > + > + if (nested_repo) { > + char *real_dirname, *real_gitdir; > + strbuf_reset(&sb); > + strbuf_addstr(&sb, dirname); > + strbuf_complete(&sb, '/'); > + strbuf_addstr(&sb, ".git"); > + real_dirname = real_pathdup(sb.buf, 0); > + real_gitdir = real_pathdup(the_repository->gitdir, 0); I haven't thought it through, but is there a reason you can't just compare the_repository->gitdir to sb.buf at this point? Why is real_pathdup needed? > + > + nested_repo = !!strcmp(real_dirname, real_gitdir); > + free(real_gitdir); > + free(real_dirname); > + } Everything up to here other than the two parts I mentioned as being orthogonal (both relating to where "nested_repo" is defined), makes up the actual fix. > strbuf_release(&sb); > - } > - if (nested_repo) { > - if ((dir->flags & DIR_SKIP_NESTED_GIT) || > - (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) > - return path_none; > - return excluded ? path_excluded : path_untracked; > + > + if (nested_repo) { > + if ((dir->flags & DIR_SKIP_NESTED_GIT) || > + (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) > + return path_none; > + return excluded ? path_excluded : path_untracked; > + } This final block is part of the orthogonal code cleanup that belongs in a separate commit. > } > > if (!(dir->flags & DIR_SHOW_OTHER_DIRECTORIES)) { > -- > 2.36.0 Thanks for finding, reporting, and sending in a patch. Could you split up the patch as indicated above, and add a testcase to the patch with the fix, and include your Signed-off-by trailer on the commits? ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [RFC PATCH 1/1] dir: consider worktree config in path recursion 2022-05-07 3:26 ` Elijah Newren @ 2022-05-07 17:59 ` oss dev 0 siblings, 0 replies; 31+ messages in thread From: oss dev @ 2022-05-07 17:59 UTC (permalink / raw) To: Elijah Newren Cc: Git Mailing List, Junio C Hamano, christian w, Derrick Stolee On Fri, May 6, 2022 at 11:26 PM Elijah Newren <newren@gmail.com> wrote: > I haven't thought it through, but is there a reason you can't just > compare the_repository->gitdir to sb.buf at this point? Why is > real_pathdup needed? I used `real_pathdup` here because `the_repository->gitdir` is not necessarily normalized as required for the `strcmp`. For example, when using option `--git-dir=./././xyz/.git` while outside the worktree, the path will remain as is (see [1] and [2]). [1]: https://github.com/git/git/blob/e8005e4871f130c4e402ddca2032c111252f070a/setup.c#L962 [2]: https://github.com/git/git/blob/e8005e4871f130c4e402ddca2032c111252f070a/environment.c#L353 > Thanks for finding, reporting, and sending in a patch. Could you > split up the patch as indicated above, and add a testcase to the patch > with the fix, and include your Signed-off-by trailer on the commits? WIll do, thx. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [RFC PATCH 0/1] dir: consider worktree config in path recursion 2022-05-05 20:32 [RFC PATCH 0/1] dir: consider worktree config in path recursion Goss Geppert 2022-05-05 20:32 ` [RFC PATCH 1/1] " Goss Geppert @ 2022-05-06 17:02 ` Junio C Hamano 2022-05-06 20:00 ` oss dev 2022-05-10 17:15 ` [PATCH v2 0/2] " Goss Geppert ` (5 subsequent siblings) 7 siblings, 1 reply; 31+ messages in thread From: Junio C Hamano @ 2022-05-06 17:02 UTC (permalink / raw) To: Goss Geppert; +Cc: git, christian w, Elijah Newren, Derrick Stolee Goss Geppert <gg.oss.dev@gmail.com> writes: > Please see the commit message for a description. Effectively, since > 8d92fb2927 (dir: replace exponential algorithm with a linear one, > 2020-04-01) the following no longer does what it used to do: > > ``` > test_repo="$(mktemp -d /tmp/git-test-XXXXXXX)" > cd "$test_repo" > >test-file > git init > git config core.worktree /tmp > git add test-file And this is supposed to add a new entry to the index at "git-test-123456/test-file"? If it added a new index entry 'test-file', that means "git config core.worktree" is completely ignored, because we are not treating "/tmp" as the root of the working tree as it is instructed and that sounds like a bug. Is that what you mean by "no longer does what it used to do"? Thanks for raising an issue (although it is not quite clear yet to me what the issue is, exactly ;-). ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [RFC PATCH 0/1] dir: consider worktree config in path recursion 2022-05-06 17:02 ` [RFC PATCH 0/1] " Junio C Hamano @ 2022-05-06 20:00 ` oss dev 0 siblings, 0 replies; 31+ messages in thread From: oss dev @ 2022-05-06 20:00 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, christian w, Elijah Newren, Derrick Stolee On Fri, May 6, 2022 at 1:02 PM Junio C Hamano <gitster@pobox.com> wrote: > And this is supposed to add a new entry to the index at > "git-test-123456/test-file"? Correct, I expected an index entry at "git-test-123456/test-file". If I run the following: ``` test_repo="$(mktemp -d /tmp/git-test-XXXXXXX)" cd "$test_repo" >test-file >/tmp/test-file git init git config core.worktree /tmp git add test-file git add /tmp/test-file ``` .. then with current git I get an index entry for "/tmp/test-file" but not "/tmp/git-test-123456/test-file". That is, I can add files outside the standard worktree location but not inside it. A playground example of the prior behavior (git 2.25.1) is here: https://www.onlinegdb.com/OXt2cvL8S ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v2 0/2] dir: consider worktree config in path recursion 2022-05-05 20:32 [RFC PATCH 0/1] dir: consider worktree config in path recursion Goss Geppert 2022-05-05 20:32 ` [RFC PATCH 1/1] " Goss Geppert 2022-05-06 17:02 ` [RFC PATCH 0/1] " Junio C Hamano @ 2022-05-10 17:15 ` Goss Geppert 2022-05-10 17:15 ` [PATCH v2 1/2] " Goss Geppert 2022-05-10 17:15 ` [PATCH v2 2/2] dir: minor refactoring / clean-up Goss Geppert 2022-05-20 19:28 ` [PATCH v3 0/3] dir: traverse into repository Goss Geppert ` (4 subsequent siblings) 7 siblings, 2 replies; 31+ messages in thread From: Goss Geppert @ 2022-05-10 17:15 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee Updated for requested revisions: * Split patch between fix and clean-up * Added a test * Included my sign-off It wasn't clear to me where the testcases should be placed--let me know if a different location is more appropriate. Goss Geppert (2): dir: consider worktree config in path recursion dir: minor refactoring / clean-up dir.c | 38 ++++++++++++--- t/t2205-add-worktree-config.sh | 89 ++++++++++++++++++++++++++++++++++ 2 files changed, 120 insertions(+), 7 deletions(-) create mode 100755 t/t2205-add-worktree-config.sh Range-diff against v1: 1: b9ca67e61d ! 1: 83bf9f40ac dir: consider worktree config in path recursion @@ Commit message Prior to this commit, the `add` operation was silently ignored. + Signed-off-by: Goss Geppert <ggossdev@gmail.com> + ## dir.c ## @@ dir.c: static enum path_treatment treat_directory(struct dir_struct *dir, - */ - enum path_treatment state; - int matches_how = 0; -- int nested_repo = 0, check_only, stop_early; -+ int check_only, stop_early; - int old_ignored_nr, old_untracked_nr; - /* The "len-1" is to strip the final '/' */ - enum exist_status status = directory_exists_in_index(istate, dirname, len-1); -@@ dir.c: static enum path_treatment treat_directory(struct dir_struct *dir, if ((dir->flags & DIR_SKIP_NESTED_GIT) || !(dir->flags & DIR_NO_GITLINKS)) { @@ dir.c: static enum path_treatment treat_directory(struct dir_struct *dir, + * 1) we are in a nonbare repository, and + * 2) `dirname` is not an immediate parent of `the_repository->gitdir`, + * which could occur if the `worktree` location was manually -+ * configured by the user ++ * configured by the user; see t2205 testcases 1a-1d for examples ++ * where this matters + */ -+ int nested_repo; struct strbuf sb = STRBUF_INIT; strbuf_addstr(&sb, dirname); nested_repo = is_nonbare_repository_dir(&sb); @@ dir.c: static enum path_treatment treat_directory(struct dir_struct *dir, + free(real_dirname); + } strbuf_release(&sb); -- } -- if (nested_repo) { -- if ((dir->flags & DIR_SKIP_NESTED_GIT) || -- (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) -- return path_none; -- return excluded ? path_excluded : path_untracked; -+ -+ if (nested_repo) { -+ if ((dir->flags & DIR_SKIP_NESTED_GIT) || -+ (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) -+ return path_none; -+ return excluded ? path_excluded : path_untracked; -+ } } - - if (!(dir->flags & DIR_SHOW_OTHER_DIRECTORIES)) { + if (nested_repo) { + + ## t/t2205-add-worktree-config.sh (new) ## +@@ ++#!/bin/sh ++ ++test_description='directory traversal respects worktree config ++ ++This test configures the repository`s worktree to be two levels above the ++`.git` directory and checks whether we are able to add to the index those files ++that are in either (1) the manually configured worktree directory or (2) the ++standard worktree location with respect to the `.git` directory (i.e. ensuring ++that the encountered `.git` directory is not treated as belonging to a foreign ++nested repository)' ++ ++. ./test-lib.sh ++ ++test_expect_success '1a: setup' ' ++ test_create_repo test1 && ++ git --git-dir="test1/.git" config core.worktree "$(pwd)" && ++ ++ mkdir -p outside-tracked outside-untracked && ++ mkdir -p test1/inside-tracked test1/inside-untracked && ++ >file-tracked && ++ >file-untracked && ++ >outside-tracked/file && ++ >outside-untracked/file && ++ >test1/file-tracked && ++ >test1/file-untracked && ++ >test1/inside-tracked/file && ++ >test1/inside-untracked/file && ++ ++ cat >expect-tracked-unsorted <<-EOF && ++ ../file-tracked ++ ../outside-tracked/file ++ file-tracked ++ inside-tracked/file ++ EOF ++ ++ cat >expect-untracked-unsorted <<-EOF && ++ ../file-untracked ++ ../outside-untracked/file ++ file-untracked ++ inside-untracked/file ++ EOF ++ ++ cat expect-tracked-unsorted expect-untracked-unsorted >expect-all-unsorted && ++ ++ cat >.gitignore <<-EOF ++ .gitignore ++ actual-* ++ expect-* ++ EOF ++' ++ ++test_expect_success '1b: pre-add all' ' ++ local parent_dir="$(pwd)" && ++ ( ++ cd test1 && ++ git ls-files -o --exclude-standard "$parent_dir" >../actual-all-unsorted ++ ) && ++ sort actual-all-unsorted >actual-all && ++ sort expect-all-unsorted >expect-all && ++ test_cmp expect-all actual-all ++' ++ ++test_expect_success '1c: post-add tracked' ' ++ local parent_dir="$(pwd)" && ++ ( ++ cd test1 && ++ git add file-tracked && ++ git add inside-tracked && ++ git add ../outside-tracked && ++ git add "$parent_dir/file-tracked" && ++ git ls-files "$parent_dir" >../actual-tracked-unsorted ++ ) && ++ sort actual-tracked-unsorted >actual-tracked && ++ sort expect-tracked-unsorted >expect-tracked && ++ test_cmp expect-tracked actual-tracked ++' ++ ++test_expect_success '1d: post-add untracked' ' ++ local parent_dir="$(pwd)" && ++ ( ++ cd test1 && ++ git ls-files -o --exclude-standard "$parent_dir" >../actual-untracked-unsorted ++ ) && ++ sort actual-untracked-unsorted >actual-untracked && ++ sort expect-untracked-unsorted >expect-untracked && ++ test_cmp expect-untracked actual-untracked ++' ++ ++test_done -: ---------- > 2: 0da9804776 dir: minor refactoring / clean-up -- 2.36.0 ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v2 1/2] dir: consider worktree config in path recursion 2022-05-10 17:15 ` [PATCH v2 0/2] " Goss Geppert @ 2022-05-10 17:15 ` Goss Geppert 2022-05-11 16:37 ` Junio C Hamano ` (2 more replies) 2022-05-10 17:15 ` [PATCH v2 2/2] dir: minor refactoring / clean-up Goss Geppert 1 sibling, 3 replies; 31+ messages in thread From: Goss Geppert @ 2022-05-10 17:15 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee Since 8d92fb2927 (dir: replace exponential algorithm with a linear one, 2020-04-01) the following no longer works: 1) Initialize a repository. 2) Set the `core.worktree` location to a parent directory of the default worktree. 3) Add a file located in the default worktree location to the index (i.e. anywhere in the immediate parent directory of the gitdir). This commit adds a check to determine whether a nested repository that is encountered in recursing a path is actually `the_repository`. If so, simply treat the directory as if it doesn't contain a nested repository. Prior to this commit, the `add` operation was silently ignored. Signed-off-by: Goss Geppert <ggossdev@gmail.com> --- dir.c | 22 +++++++++ t/t2205-add-worktree-config.sh | 89 ++++++++++++++++++++++++++++++++++ 2 files changed, 111 insertions(+) create mode 100755 t/t2205-add-worktree-config.sh diff --git a/dir.c b/dir.c index f2b0f24210..a1886e61a3 100644 --- a/dir.c +++ b/dir.c @@ -1893,9 +1893,31 @@ static enum path_treatment treat_directory(struct dir_struct *dir, if ((dir->flags & DIR_SKIP_NESTED_GIT) || !(dir->flags & DIR_NO_GITLINKS)) { + /* + * Determine if `dirname` is a nested repo by confirming that: + * 1) we are in a nonbare repository, and + * 2) `dirname` is not an immediate parent of `the_repository->gitdir`, + * which could occur if the `worktree` location was manually + * configured by the user; see t2205 testcases 1a-1d for examples + * where this matters + */ struct strbuf sb = STRBUF_INIT; strbuf_addstr(&sb, dirname); nested_repo = is_nonbare_repository_dir(&sb); + + if (nested_repo) { + char *real_dirname, *real_gitdir; + strbuf_reset(&sb); + strbuf_addstr(&sb, dirname); + strbuf_complete(&sb, '/'); + strbuf_addstr(&sb, ".git"); + real_dirname = real_pathdup(sb.buf, 0); + real_gitdir = real_pathdup(the_repository->gitdir, 0); + + nested_repo = !!strcmp(real_dirname, real_gitdir); + free(real_gitdir); + free(real_dirname); + } strbuf_release(&sb); } if (nested_repo) { diff --git a/t/t2205-add-worktree-config.sh b/t/t2205-add-worktree-config.sh new file mode 100755 index 0000000000..ca70cf3fe2 --- /dev/null +++ b/t/t2205-add-worktree-config.sh @@ -0,0 +1,89 @@ +#!/bin/sh + +test_description='directory traversal respects worktree config + +This test configures the repository`s worktree to be two levels above the +`.git` directory and checks whether we are able to add to the index those files +that are in either (1) the manually configured worktree directory or (2) the +standard worktree location with respect to the `.git` directory (i.e. ensuring +that the encountered `.git` directory is not treated as belonging to a foreign +nested repository)' + +. ./test-lib.sh + +test_expect_success '1a: setup' ' + test_create_repo test1 && + git --git-dir="test1/.git" config core.worktree "$(pwd)" && + + mkdir -p outside-tracked outside-untracked && + mkdir -p test1/inside-tracked test1/inside-untracked && + >file-tracked && + >file-untracked && + >outside-tracked/file && + >outside-untracked/file && + >test1/file-tracked && + >test1/file-untracked && + >test1/inside-tracked/file && + >test1/inside-untracked/file && + + cat >expect-tracked-unsorted <<-EOF && + ../file-tracked + ../outside-tracked/file + file-tracked + inside-tracked/file + EOF + + cat >expect-untracked-unsorted <<-EOF && + ../file-untracked + ../outside-untracked/file + file-untracked + inside-untracked/file + EOF + + cat expect-tracked-unsorted expect-untracked-unsorted >expect-all-unsorted && + + cat >.gitignore <<-EOF + .gitignore + actual-* + expect-* + EOF +' + +test_expect_success '1b: pre-add all' ' + local parent_dir="$(pwd)" && + ( + cd test1 && + git ls-files -o --exclude-standard "$parent_dir" >../actual-all-unsorted + ) && + sort actual-all-unsorted >actual-all && + sort expect-all-unsorted >expect-all && + test_cmp expect-all actual-all +' + +test_expect_success '1c: post-add tracked' ' + local parent_dir="$(pwd)" && + ( + cd test1 && + git add file-tracked && + git add inside-tracked && + git add ../outside-tracked && + git add "$parent_dir/file-tracked" && + git ls-files "$parent_dir" >../actual-tracked-unsorted + ) && + sort actual-tracked-unsorted >actual-tracked && + sort expect-tracked-unsorted >expect-tracked && + test_cmp expect-tracked actual-tracked +' + +test_expect_success '1d: post-add untracked' ' + local parent_dir="$(pwd)" && + ( + cd test1 && + git ls-files -o --exclude-standard "$parent_dir" >../actual-untracked-unsorted + ) && + sort actual-untracked-unsorted >actual-untracked && + sort expect-untracked-unsorted >expect-untracked && + test_cmp expect-untracked actual-untracked +' + +test_done -- 2.36.0 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v2 1/2] dir: consider worktree config in path recursion 2022-05-10 17:15 ` [PATCH v2 1/2] " Goss Geppert @ 2022-05-11 16:37 ` Junio C Hamano 2022-05-20 19:45 ` oss dev 2022-05-24 14:29 ` Elijah Newren 2022-05-11 23:07 ` Junio C Hamano 2022-05-23 19:23 ` Derrick Stolee 2 siblings, 2 replies; 31+ messages in thread From: Junio C Hamano @ 2022-05-11 16:37 UTC (permalink / raw) To: Goss Geppert; +Cc: git, christian w, Elijah Newren, Derrick Stolee Goss Geppert <gg.oss.dev@gmail.com> writes: > diff --git a/dir.c b/dir.c > index f2b0f24210..a1886e61a3 100644 > --- a/dir.c > +++ b/dir.c > @@ -1893,9 +1893,31 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > if ((dir->flags & DIR_SKIP_NESTED_GIT) || > !(dir->flags & DIR_NO_GITLINKS)) { > + /* > + * Determine if `dirname` is a nested repo by confirming that: > + * 1) we are in a nonbare repository, and > + * 2) `dirname` is not an immediate parent of `the_repository->gitdir`, > + * which could occur if the `worktree` location was manually > + * configured by the user; see t2205 testcases 1a-1d for examples > + * where this matters > + */ > struct strbuf sb = STRBUF_INIT; > strbuf_addstr(&sb, dirname); > nested_repo = is_nonbare_repository_dir(&sb); > + > + if (nested_repo) { > + char *real_dirname, *real_gitdir; > + strbuf_reset(&sb); > + strbuf_addstr(&sb, dirname); > + strbuf_complete(&sb, '/'); > + strbuf_addstr(&sb, ".git"); > + real_dirname = real_pathdup(sb.buf, 0); Is this _complete() necessary? Near the top of this function there is /* The "len-1" is to strip the final '/' */ enum exist_status status = directory_exists_in_index(istate, dirname, len-1); which indicates that we are safe to assume dirname ends with '/'. > + real_gitdir = real_pathdup(the_repository->gitdir, 0); This function is repeatedly called during the traversal. How expensive is it to keep calling real_pathdup() on the constant the_repository->gitdir just in case it might be the same as our true GIT_DIR? > + > + nested_repo = !!strcmp(real_dirname, real_gitdir); > + free(real_gitdir); > + free(real_dirname); > + } > strbuf_release(&sb); > } > if (nested_repo) { > diff --git a/t/t2205-add-worktree-config.sh b/t/t2205-add-worktree-config.sh > new file mode 100755 > index 0000000000..ca70cf3fe2 > --- /dev/null > +++ b/t/t2205-add-worktree-config.sh > @@ -0,0 +1,89 @@ > +#!/bin/sh > + > +test_description='directory traversal respects worktree config > + > +This test configures the repository`s worktree to be two levels above the > +`.git` directory and checks whether we are able to add to the index those files > +that are in either (1) the manually configured worktree directory or (2) the > +standard worktree location with respect to the `.git` directory (i.e. ensuring > +that the encountered `.git` directory is not treated as belonging to a foreign > +nested repository)' > + > +. ./test-lib.sh > + > +test_expect_success '1a: setup' ' > + test_create_repo test1 && > + git --git-dir="test1/.git" config core.worktree "$(pwd)" && > + > + mkdir -p outside-tracked outside-untracked && > + mkdir -p test1/inside-tracked test1/inside-untracked && > + >file-tracked && > + >file-untracked && > + >outside-tracked/file && > + >outside-untracked/file && > + >test1/file-tracked && > + >test1/file-untracked && > + >test1/inside-tracked/file && > + >test1/inside-untracked/file && > + > + cat >expect-tracked-unsorted <<-EOF && > + ../file-tracked > + ../outside-tracked/file > + file-tracked > + inside-tracked/file > + EOF > + > + cat >expect-untracked-unsorted <<-EOF && > + ../file-untracked > + ../outside-untracked/file > + file-untracked > + inside-untracked/file > + EOF > + > + cat expect-tracked-unsorted expect-untracked-unsorted >expect-all-unsorted && > + > + cat >.gitignore <<-EOF > + .gitignore > + actual-* > + expect-* > + EOF > +' > + > +test_expect_success '1b: pre-add all' ' > + local parent_dir="$(pwd)" && > + ( > + cd test1 && > + git ls-files -o --exclude-standard "$parent_dir" >../actual-all-unsorted > + ) && > + sort actual-all-unsorted >actual-all && > + sort expect-all-unsorted >expect-all && > + test_cmp expect-all actual-all > +' > + > +test_expect_success '1c: post-add tracked' ' > + local parent_dir="$(pwd)" && > + ( > + cd test1 && > + git add file-tracked && > + git add inside-tracked && > + git add ../outside-tracked && > + git add "$parent_dir/file-tracked" && > + git ls-files "$parent_dir" >../actual-tracked-unsorted > + ) && > + sort actual-tracked-unsorted >actual-tracked && > + sort expect-tracked-unsorted >expect-tracked && > + test_cmp expect-tracked actual-tracked > +' > + > +test_expect_success '1d: post-add untracked' ' > + local parent_dir="$(pwd)" && > + ( > + cd test1 && > + git ls-files -o --exclude-standard "$parent_dir" >../actual-untracked-unsorted > + ) && > + sort actual-untracked-unsorted >actual-untracked && > + sort expect-untracked-unsorted >expect-untracked && > + test_cmp expect-untracked actual-untracked > +' > + > +test_done ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 1/2] dir: consider worktree config in path recursion 2022-05-11 16:37 ` Junio C Hamano @ 2022-05-20 19:45 ` oss dev 2022-05-24 14:29 ` Elijah Newren 1 sibling, 0 replies; 31+ messages in thread From: oss dev @ 2022-05-20 19:45 UTC (permalink / raw) To: Junio C Hamano Cc: Git Mailing List, christian w, Elijah Newren, Derrick Stolee > Is this _complete() necessary? It is not. Though _complete won't actually change the buffer if the trailing character matches, I've removed it. The '/' is added prior to each call here [1]. [1]: https://github.com/git/git/blob/277cf0bc36094f6dc4297d8c9cef79df045b735d/dir.c#L2329 > This function is repeatedly called during the traversal. > > How expensive is it to keep calling real_pathdup() on the constant > the_repository->gitdir just in case it might be the same as our true > GIT_DIR? Using trace2, `real_pathdup` takes around 40 usecs on my machine compared to 10 usecs for an empty region. Please see the cover letter for v3 of the patch series for additional details on both of these items. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 1/2] dir: consider worktree config in path recursion 2022-05-11 16:37 ` Junio C Hamano 2022-05-20 19:45 ` oss dev @ 2022-05-24 14:29 ` Elijah Newren 2022-05-24 19:45 ` Junio C Hamano 1 sibling, 1 reply; 31+ messages in thread From: Elijah Newren @ 2022-05-24 14:29 UTC (permalink / raw) To: Junio C Hamano Cc: Goss Geppert, Git Mailing List, christian w, Derrick Stolee On Wed, May 11, 2022 at 9:37 AM Junio C Hamano <gitster@pobox.com> wrote: > > Goss Geppert <gg.oss.dev@gmail.com> writes: > > > diff --git a/dir.c b/dir.c > > index f2b0f24210..a1886e61a3 100644 > > --- a/dir.c > > +++ b/dir.c > > @@ -1893,9 +1893,31 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > [...] > > > + real_gitdir = real_pathdup(the_repository->gitdir, 0); > > This function is repeatedly called during the traversal. > > How expensive is it to keep calling real_pathdup() on the constant > the_repository->gitdir just in case it might be the same as our true > GIT_DIR? I agree that treat_directory is called many times, but this real_pathdup() call is inside the "if (nested_repo)" block, so this new real_pathdup() invocation should occur very seldom. Or are you worried about cases where users have *very* large numbers of bare repositories nested under the working directory? Even in that case, which seems pathological to me, I'd suspect the is_nonbare_repository_dir() -> read_gitfile_gently()/is_git_directory() codepath (used to determine the value of nested_repo) would be much more expensive than this call to real_pathdup(), so would it be worth trying to optimize this real_pathdup() call away even in that rare case? ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 1/2] dir: consider worktree config in path recursion 2022-05-24 14:29 ` Elijah Newren @ 2022-05-24 19:45 ` Junio C Hamano 2022-05-25 3:46 ` Elijah Newren 0 siblings, 1 reply; 31+ messages in thread From: Junio C Hamano @ 2022-05-24 19:45 UTC (permalink / raw) To: Elijah Newren; +Cc: Goss Geppert, Git Mailing List, christian w, Derrick Stolee Elijah Newren <newren@gmail.com> writes: > On Wed, May 11, 2022 at 9:37 AM Junio C Hamano <gitster@pobox.com> wrote: >> >> Goss Geppert <gg.oss.dev@gmail.com> writes: >> >> > diff --git a/dir.c b/dir.c >> > index f2b0f24210..a1886e61a3 100644 >> > --- a/dir.c >> > +++ b/dir.c >> > @@ -1893,9 +1893,31 @@ static enum path_treatment treat_directory(struct dir_struct *dir, >> > [...] >> >> > + real_gitdir = real_pathdup(the_repository->gitdir, 0); >> >> This function is repeatedly called during the traversal. >> >> How expensive is it to keep calling real_pathdup() on the constant >> the_repository->gitdir just in case it might be the same as our true >> GIT_DIR? > > I agree that treat_directory is called many times, but this > real_pathdup() call is inside the "if (nested_repo)" block, so this > new real_pathdup() invocation should occur very seldom. Or are you > worried about cases where users have *very* large numbers of bare > repositories nested under the working directory? No. I wasn't worried about anything in particular. I just wanted to get the feel of how deep a thought the patch was backed by by spot checking what was and what was not taken into account when designing the change. I do not care too much when there are very large numbers of things that cause this codepath to be exercised. Strange situations can be left for later optimization only when they turn up in the real world and prove to be a problem. By the way, where is a bare repository involved? did you mean non-bare aka worktree-full repository? ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 1/2] dir: consider worktree config in path recursion 2022-05-24 19:45 ` Junio C Hamano @ 2022-05-25 3:46 ` Elijah Newren 0 siblings, 0 replies; 31+ messages in thread From: Elijah Newren @ 2022-05-25 3:46 UTC (permalink / raw) To: Junio C Hamano Cc: Goss Geppert, Git Mailing List, christian w, Derrick Stolee On Tue, May 24, 2022 at 12:45 PM Junio C Hamano <gitster@pobox.com> wrote: > > Elijah Newren <newren@gmail.com> writes: > > > On Wed, May 11, 2022 at 9:37 AM Junio C Hamano <gitster@pobox.com> wrote: > >> > >> Goss Geppert <gg.oss.dev@gmail.com> writes: > >> > >> > diff --git a/dir.c b/dir.c > >> > index f2b0f24210..a1886e61a3 100644 > >> > --- a/dir.c > >> > +++ b/dir.c > >> > @@ -1893,9 +1893,31 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > >> > > [...] > >> > >> > + real_gitdir = real_pathdup(the_repository->gitdir, 0); > >> > >> This function is repeatedly called during the traversal. > >> > >> How expensive is it to keep calling real_pathdup() on the constant > >> the_repository->gitdir just in case it might be the same as our true > >> GIT_DIR? > > > > I agree that treat_directory is called many times, but this > > real_pathdup() call is inside the "if (nested_repo)" block, so this > > new real_pathdup() invocation should occur very seldom. Or are you > > worried about cases where users have *very* large numbers of bare > > repositories nested under the working directory? > > No. I wasn't worried about anything in particular. I just wanted > to get the feel of how deep a thought the patch was backed by by > spot checking what was and what was not taken into account when > designing the change. > > I do not care too much when there are very large numbers of things > that cause this codepath to be exercised. Strange situations can be > left for later optimization only when they turn up in the real world > and prove to be a problem. Ah, gotcha. > By the way, where is a bare repository involved? did you mean > non-bare aka worktree-full repository? Yes, sorry, I meant non-bare repository. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 1/2] dir: consider worktree config in path recursion 2022-05-10 17:15 ` [PATCH v2 1/2] " Goss Geppert 2022-05-11 16:37 ` Junio C Hamano @ 2022-05-11 23:07 ` Junio C Hamano 2022-05-20 20:01 ` oss dev 2022-05-23 19:23 ` Derrick Stolee 2 siblings, 1 reply; 31+ messages in thread From: Junio C Hamano @ 2022-05-11 23:07 UTC (permalink / raw) To: Goss Geppert; +Cc: git, christian w, Elijah Newren, Derrick Stolee Goss Geppert <gg.oss.dev@gmail.com> writes: > Since 8d92fb2927 (dir: replace exponential algorithm with a linear one, > 2020-04-01) the following no longer works: > > 1) Initialize a repository. > 2) Set the `core.worktree` location to a parent directory of the > default worktree. > 3) Add a file located in the default worktree location to the index > (i.e. anywhere in the immediate parent directory of the gitdir). I think I've mentioned this ealier, but the above only describes the scenario and does not say what behaviour is expected and what behaviour is observed. "no longer works" is OK, but not sufficient. Easily remedied by saying "It used to do X, but now it does Y" after the above description of the scenario to describe what happens and what you want to happen. > + git --git-dir="test1/.git" config core.worktree "$(pwd)" && I wonder if this lets funny paths to be added to the index, e.g. would "git add test1" recursively add everything in that directory? Thanks. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 1/2] dir: consider worktree config in path recursion 2022-05-11 23:07 ` Junio C Hamano @ 2022-05-20 20:01 ` oss dev 0 siblings, 0 replies; 31+ messages in thread From: oss dev @ 2022-05-20 20:01 UTC (permalink / raw) To: Junio C Hamano Cc: Git Mailing List, christian w, Elijah Newren, Derrick Stolee > I think I've mentioned this ealier, but the above only describes the > scenario and does not say what behaviour is expected and what behaviour > is observed. "no longer works" is OK, but not sufficient. I've updated the commit message for the fact that using option `--git-dir` can also cause the traversal to skip over the repository. I've therefore had to be a bit more generic in the description, so let me know if it's not specific enough. > > + git --git-dir="test1/.git" config core.worktree "$(pwd)" && > > I wonder if this lets funny paths to be added to the index, e.g. > would "git add test1" recursively add everything in that directory? Yes, it would recursively add files inside test1. I've added testcase 3c which I believe does what we want but let me know if that doesn't address your question. ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 1/2] dir: consider worktree config in path recursion 2022-05-10 17:15 ` [PATCH v2 1/2] " Goss Geppert 2022-05-11 16:37 ` Junio C Hamano 2022-05-11 23:07 ` Junio C Hamano @ 2022-05-23 19:23 ` Derrick Stolee 2022-05-30 18:48 ` oss dev 2 siblings, 1 reply; 31+ messages in thread From: Derrick Stolee @ 2022-05-23 19:23 UTC (permalink / raw) To: Goss Geppert, git; +Cc: Junio C Hamano, christian w, Elijah Newren On 5/10/22 1:15 PM, Goss Geppert wrote: > Since 8d92fb2927 (dir: replace exponential algorithm with a linear one, > 2020-04-01) the following no longer works: > > 1) Initialize a repository. > 2) Set the `core.worktree` location to a parent directory of the > default worktree. > 3) Add a file located in the default worktree location to the index > (i.e. anywhere in the immediate parent directory of the gitdir). > > This commit adds a check to determine whether a nested repository that > is encountered in recursing a path is actually `the_repository`. If so, > simply treat the directory as if it doesn't contain a nested repository. > > Prior to this commit, the `add` operation was silently ignored. > > Signed-off-by: Goss Geppert <ggossdev@gmail.com> > --- > dir.c | 22 +++++++++ > t/t2205-add-worktree-config.sh | 89 ++++++++++++++++++++++++++++++++++ > 2 files changed, 111 insertions(+) > create mode 100755 t/t2205-add-worktree-config.sh > > diff --git a/dir.c b/dir.c > index f2b0f24210..a1886e61a3 100644 > --- a/dir.c > +++ b/dir.c > @@ -1893,9 +1893,31 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > > if ((dir->flags & DIR_SKIP_NESTED_GIT) || > !(dir->flags & DIR_NO_GITLINKS)) { > + /* > + * Determine if `dirname` is a nested repo by confirming that: > + * 1) we are in a nonbare repository, and > + * 2) `dirname` is not an immediate parent of `the_repository->gitdir`, > + * which could occur if the `worktree` location was manually > + * configured by the user; see t2205 testcases 1a-1d for examples > + * where this matters > + */ > struct strbuf sb = STRBUF_INIT; > strbuf_addstr(&sb, dirname); > nested_repo = is_nonbare_repository_dir(&sb); > + > + if (nested_repo) { > + char *real_dirname, *real_gitdir; > + strbuf_reset(&sb); > + strbuf_addstr(&sb, dirname); > + strbuf_complete(&sb, '/'); > + strbuf_addstr(&sb, ".git"); This could be strbuf_add(&sb, ".git", 4); to avoid a (minor) strlen() computation. > + real_dirname = real_pathdup(sb.buf, 0); > + real_gitdir = real_pathdup(the_repository->gitdir, 0); With regards to Junio's comment about repeating real_pathdup() on the_repository->gitdir, we might be able to short-circuit this by making real_gitdir static and only calling real_pathdup() if real_gitdir is NULL. It would cut the cost of these checks by half for big traversals. > + > + nested_repo = !!strcmp(real_dirname, real_gitdir); > + free(real_gitdir); > + free(real_dirname); > + } ... > + > +test_description='directory traversal respects worktree config > + > +This test configures the repository`s worktree to be two levels above the > +`.git` directory and checks whether we are able to add to the index those files > +that are in either (1) the manually configured worktree directory or (2) the > +standard worktree location with respect to the `.git` directory (i.e. ensuring > +that the encountered `.git` directory is not treated as belonging to a foreign > +nested repository)' The test description should be a single line. The longer paragraph could be a comment before your "setup" test case. > +. ./test-lib.sh > + > +test_expect_success '1a: setup' ' Also, there is no need to number your tests in their names, as the testing infrastructure will apply numbers based on their order in the test file. These labels will grow stale if tests are removed or inserted into the order. > + test_create_repo test1 && > + git --git-dir="test1/.git" config core.worktree "$(pwd)" && > + > + mkdir -p outside-tracked outside-untracked && > + mkdir -p test1/inside-tracked test1/inside-untracked && > + >file-tracked && > + >file-untracked && > + >outside-tracked/file && > + >outside-untracked/file && > + >test1/file-tracked && > + >test1/file-untracked && > + >test1/inside-tracked/file && > + >test1/inside-untracked/file && > + > + cat >expect-tracked-unsorted <<-EOF && > + ../file-tracked > + ../outside-tracked/file > + file-tracked > + inside-tracked/file > + EOF > + > + cat >expect-untracked-unsorted <<-EOF && > + ../file-untracked > + ../outside-untracked/file > + file-untracked > + inside-untracked/file > + EOF > + > + cat expect-tracked-unsorted expect-untracked-unsorted >expect-all-unsorted && > + > + cat >.gitignore <<-EOF > + .gitignore > + actual-* > + expect-* > + EOF This use of .gitignore to ignore the helper files being used during the test is interesting. I was thinking it would be better to create isolated directories where the only activity was that which you were testing: mkdir -p worktree && test_create_repo worktree/contains-git && ...or something like that. The names are more descriptive, and we don't need the .gitignore trick. > +' > + > +test_expect_success '1b: pre-add all' ' > + local parent_dir="$(pwd)" && I was surprised by this "local". It isn't needed at this point in the test script. We use it for helper methods to be sure that we aren't stomping variables from test scripts, but since the test is being executed inside an eval(), this local isn't important. > + ( > + cd test1 && > + git ls-files -o --exclude-standard "$parent_dir" >../actual-all-unsorted You can avoid the sub-shell using "git -C test1 ls-files ..." right? > + ) && > + sort actual-all-unsorted >actual-all && > + sort expect-all-unsorted >expect-all && > + test_cmp expect-all actual-all > +' > + > +test_expect_success '1c: post-add tracked' ' > + local parent_dir="$(pwd)" && > + ( > + cd test1 && > + git add file-tracked && > + git add inside-tracked && > + git add ../outside-tracked && > + git add "$parent_dir/file-tracked" && > + git ls-files "$parent_dir" >../actual-tracked-unsorted > + ) && This sub-shell is important because of the relative paths involved. OK. This also checks to see if _any_ of these "git add" commands fail, as opposed to failing immediately after the first one fails. I think your approach is simpler and should be relatively simple to identify which command did the wrong thing by looking at the test_cmp output. > + sort actual-tracked-unsorted >actual-tracked && > + sort expect-tracked-unsorted >expect-tracked && > + test_cmp expect-tracked actual-tracked > +' > + > +test_expect_success '1d: post-add untracked' ' > + local parent_dir="$(pwd)" && > + ( > + cd test1 && > + git ls-files -o --exclude-standard "$parent_dir" >../actual-untracked-unsorted > + ) && Again, this one is not needed. > + sort actual-untracked-unsorted >actual-untracked && > + sort expect-untracked-unsorted >expect-untracked && > + test_cmp expect-untracked actual-untracked > +' > + Thanks, -Stolee ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 1/2] dir: consider worktree config in path recursion 2022-05-23 19:23 ` Derrick Stolee @ 2022-05-30 18:48 ` oss dev 0 siblings, 0 replies; 31+ messages in thread From: oss dev @ 2022-05-30 18:48 UTC (permalink / raw) To: Derrick Stolee Cc: Git Mailing List, Junio C Hamano, christian w, Elijah Newren > This could be strbuf_add(&sb, ".git", 4); to avoid a (minor) > strlen() computation. I checked gcc's disassembly--the strbuf_addstr is inlined and the strlen is evaluated at compile time. I believe gcc has evaluated strlen at compile time for string literals for quite some time. Also seems common in the rest of the code base to use strbuf_addstr with string literals. > With regards to Junio's comment about repeating real_pathdup() > on the_repository->gitdir, we might be able to short-circuit > this by making real_gitdir static and only calling > real_pathdup() if real_gitdir is NULL. It would cut the cost > of these checks by half for big traversals. Yeah, I couldn't come up with a super clean way of caching real_pathdup (assuming we don't want to leak the memory), but based on subsequent comments it sounds like it was more a kicking-the-tires question than real concern. > The test description should be a single line. The longer paragraph > could be a comment before your "setup" test case. I'm not sure what the current best practice is, but I expected best practice to be reflected here [1] and many other tests use multi-line descriptions (see e.g. t0000 [2] or t1512 [3] for a nice ascii graphic right in the description). [1]: https://git-scm.com/docs/MyFirstContribution#write-new-test [2]: https://github.com/git/git/blob/8ddf593a250e07d388059f7e3f471078e1d2ed5c/t/t0000-basic.sh [3]: https://github.com/git/git/blob/8ddf593a250e07d388059f7e3f471078e1d2ed5c/t/t1512-rev-parse-disambiguation.sh > Also, there is no need to number your tests in their names, as the > testing infrastructure will apply numbers based on their order in > the test file. These labels will grow stale if tests are removed > or inserted into the order. In v1 of the patch, I was asked to include a comment in the code that refers to a test for which the changes that are being made in this patch are meaningful. Grepping for other instances where this was done, similar comments generally refer to numbered testcases (see e.g. [1]). I figured if I don't number it then someone might suggest that similar comments generally refer to numbered test cases. Though I'll concede that referring to any specific tests will likely eventually result in stale references. For example, t6043 referenced by [1] (currently in master) appears to have been moved elsewhere. Personally, I'm indifferent. [1]: https://github.com/git/git/blob/8ddf593a250e07d388059f7e3f471078e1d2ed5c/merge-recursive.c#L1620 > This use of .gitignore to ignore the helper files being used > during the test is interesting. I was thinking it would be better > to create isolated directories where the only activity was that > which you were testing: > > mkdir -p worktree && > test_create_repo worktree/contains-git && > > ...or something like that. The names are more descriptive, and > we don't need the .gitignore trick. While trying to figure out what section this test should go into, I looked through several tests and was likely inspired by e.g. [1] or [2]. I also think there could be a benefit to ensuring that .gitignore works correctly with the new traversal behavior in the nested repo, i.e. the .gitignore becomes part of the test. In principle, I don't mind testing .gitignore more explicitly, but I also think the tests are quite lengthy already for a small patch and doing more with less seems attractive. [1]: https://github.com/git/git/blob/8ddf593a250e07d388059f7e3f471078e1d2ed5c/t/t1501-work-tree.sh#L198-L202 [2]: https://github.com/git/git/blob/8ddf593a250e07d388059f7e3f471078e1d2ed5c/t/t2202-add-addremove.sh#L13-L17 > I was surprised by this "local". It isn't needed at this > point in the test script. We use it for helper methods to > be sure that we aren't stomping variables from test scripts, > but since the test is being executed inside an eval(), this > local isn't important. I tend to default to `local` or `declare` to limit unintended side effects after refactoring, etc., but I don't mind removing it if it's not desired. I'm ok either way, pls confirm. > You can avoid the sub-shell using "git -C test1 ls-files ..." > right? Ok, I can update this and similar instances. > This also checks to see if _any_ of these "git add" > commands fail, as opposed to failing immediately after > the first one fails. I think your approach is simpler and > should be relatively simple to identify which command did > the wrong thing by looking at the test_cmp output. Agreed, I'm combining several `git add`s into one test here -- I didn't want to get too long-winded since it seems like a minor patch. Thanks for the comments; v3 of the patch had already been sent prior to these, so none is incorporated there. I'll include confirmed changes in the next version. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v2 2/2] dir: minor refactoring / clean-up 2022-05-10 17:15 ` [PATCH v2 0/2] " Goss Geppert 2022-05-10 17:15 ` [PATCH v2 1/2] " Goss Geppert @ 2022-05-10 17:15 ` Goss Geppert 2022-05-11 16:51 ` Junio C Hamano 1 sibling, 1 reply; 31+ messages in thread From: Goss Geppert @ 2022-05-10 17:15 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee Improve readability. Signed-off-by: Goss Geppert <ggossdev@gmail.com> --- dir.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/dir.c b/dir.c index a1886e61a3..329db72999 100644 --- a/dir.c +++ b/dir.c @@ -1861,7 +1861,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, */ enum path_treatment state; int matches_how = 0; - int nested_repo = 0, check_only, stop_early; + int check_only, stop_early; int old_ignored_nr, old_untracked_nr; /* The "len-1" is to strip the final '/' */ enum exist_status status = directory_exists_in_index(istate, dirname, len-1); @@ -1901,6 +1901,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, * configured by the user; see t2205 testcases 1a-1d for examples * where this matters */ + int nested_repo; struct strbuf sb = STRBUF_INIT; strbuf_addstr(&sb, dirname); nested_repo = is_nonbare_repository_dir(&sb); @@ -1919,12 +1920,13 @@ static enum path_treatment treat_directory(struct dir_struct *dir, free(real_dirname); } strbuf_release(&sb); - } - if (nested_repo) { - if ((dir->flags & DIR_SKIP_NESTED_GIT) || - (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) - return path_none; - return excluded ? path_excluded : path_untracked; + + if (nested_repo) { + if ((dir->flags & DIR_SKIP_NESTED_GIT) || + (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) + return path_none; + return excluded ? path_excluded : path_untracked; + } } if (!(dir->flags & DIR_SHOW_OTHER_DIRECTORIES)) { -- 2.36.0 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v2 2/2] dir: minor refactoring / clean-up 2022-05-10 17:15 ` [PATCH v2 2/2] dir: minor refactoring / clean-up Goss Geppert @ 2022-05-11 16:51 ` Junio C Hamano 2022-05-20 20:03 ` oss dev 0 siblings, 1 reply; 31+ messages in thread From: Junio C Hamano @ 2022-05-11 16:51 UTC (permalink / raw) To: Goss Geppert; +Cc: git, christian w, Elijah Newren, Derrick Stolee Goss Geppert <gg.oss.dev@gmail.com> writes: > Improve readability. It reads somewhat a subjective opinion, without explaining how/why the change makes it more readable. Perhaps Narrow the scope of the nested_repo variable to the block that uses it. or something? I was hoping that we can/should apply these two patches near where the breakage happened, but unfortunately this part has been updated a bit since then, so merging upwards would involve a bit of conflict resolution. We should be able to manage. Thanks. > Signed-off-by: Goss Geppert <ggossdev@gmail.com> > --- > dir.c | 16 +++++++++------- > 1 file changed, 9 insertions(+), 7 deletions(-) > > diff --git a/dir.c b/dir.c > index a1886e61a3..329db72999 100644 > --- a/dir.c > +++ b/dir.c > @@ -1861,7 +1861,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > */ > enum path_treatment state; > int matches_how = 0; > - int nested_repo = 0, check_only, stop_early; > + int check_only, stop_early; > int old_ignored_nr, old_untracked_nr; > /* The "len-1" is to strip the final '/' */ > enum exist_status status = directory_exists_in_index(istate, dirname, len-1); > @@ -1901,6 +1901,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > * configured by the user; see t2205 testcases 1a-1d for examples > * where this matters > */ > + int nested_repo; > struct strbuf sb = STRBUF_INIT; > strbuf_addstr(&sb, dirname); > nested_repo = is_nonbare_repository_dir(&sb); > @@ -1919,12 +1920,13 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > free(real_dirname); > } > strbuf_release(&sb); > - } > - if (nested_repo) { > - if ((dir->flags & DIR_SKIP_NESTED_GIT) || > - (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) > - return path_none; > - return excluded ? path_excluded : path_untracked; > + > + if (nested_repo) { > + if ((dir->flags & DIR_SKIP_NESTED_GIT) || > + (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) > + return path_none; > + return excluded ? path_excluded : path_untracked; > + } > } > > if (!(dir->flags & DIR_SHOW_OTHER_DIRECTORIES)) { ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v2 2/2] dir: minor refactoring / clean-up 2022-05-11 16:51 ` Junio C Hamano @ 2022-05-20 20:03 ` oss dev 0 siblings, 0 replies; 31+ messages in thread From: oss dev @ 2022-05-20 20:03 UTC (permalink / raw) To: Junio C Hamano Cc: Git Mailing List, christian w, Elijah Newren, Derrick Stolee > > Improve readability. > > It reads somewhat a subjective opinion, without explaining how/why > the change makes it more readable. Perhaps > > Narrow the scope of the nested_repo variable to the block > that uses it. > > or something? I've updated this message. Thanks! ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v3 0/3] dir: traverse into repository 2022-05-05 20:32 [RFC PATCH 0/1] dir: consider worktree config in path recursion Goss Geppert ` (2 preceding siblings ...) 2022-05-10 17:15 ` [PATCH v2 0/2] " Goss Geppert @ 2022-05-20 19:28 ` Goss Geppert 2022-05-20 19:28 ` [PATCH v3 1/3] " Goss Geppert ` (2 more replies) 2022-06-16 23:19 ` [PATCH v4 0/2] dir: traverse into repository Goss Geppert ` (3 subsequent siblings) 7 siblings, 3 replies; 31+ messages in thread From: Goss Geppert @ 2022-05-20 19:28 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee Thanks for your feedback so far. This revision includes a number of changes: * I realized that using option `--git-dir` (or environment variable `GIT_DIR`) when the working directory is outside the repository can also cause the directory traversal to fail to enter `the_repository`. I've added test cases and updated comments / messages to reflect this. * I noticed that `real_pathdup` can fail even for valid paths (and return NULL) if e.g. the path contains too many nested symlinks (currently 32). We'll now `die_on_error`. * I no longer reset the string buffer holding the `dirname` and just reuse it. While the prior call to `is_nonbare_repository_dir` does not treat its argument as const, the function does reset the argument to its effective input state. The `strbuf_addstr` and `strbuf_complete` have also been removed. * I added a separate commit to cache the `real_gitdir` value. On my mystery machine, the call to `real_pathdup`, instrumented with trace2, takes an anecdotal 40 usecs for a somewhat lengthy path [1]. This compares to around 10 usecs for an empty trace2 region. I can remove the commit if you don't think the performance savings are worth the clutter. Let me know if you have further comments or suggestions. [1]: path with one symlink: './repo/link/a/b/././././..//////c/d/./xyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxyxy/././../e/../f/../g/../h/../././///../../..' Goss Geppert (3): dir: traverse into repository dir: cache git_dir's realpath dir: minor refactoring / clean-up dir.c | 90 ++++++++--- t/t2205-add-worktree-config.sh | 274 +++++++++++++++++++++++++++++++++ 2 files changed, 341 insertions(+), 23 deletions(-) create mode 100755 t/t2205-add-worktree-config.sh Range-diff against v2: 1: 26e6a878cd ! 1: 391fd4d8cf dir: consider worktree config in path recursion @@ Metadata Author: Goss Geppert <ggossdev@gmail.com> ## Commit message ## - dir: consider worktree config in path recursion + dir: traverse into repository Since 8d92fb2927 (dir: replace exponential algorithm with a linear one, - 2020-04-01) the following no longer works: + 2020-04-01) traversing into a repository's directory tree when the + traversal began outside the repository's standard location has failed + because the encountered repository was identified as a nested foreign + repository. - 1) Initialize a repository. - 2) Set the `core.worktree` location to a parent directory of the - default worktree. - 3) Add a file located in the default worktree location to the index - (i.e. anywhere in the immediate parent directory of the gitdir). + Prior to this commit, the failure to traverse into a repository's + default worktree location was observable from a user's perspective under + either of the following conditions (there may be others): + + 1) Set the `core.worktree` location to a parent directory of the + default worktree; or + 2) Use the `--git_dir` option while the working directory is outside + the repository's default worktree location + + Under either of these conditions, symptoms of the failure to traverse + into the repository's default worktree location include the inability to + add files to the index or get a list of untracked files via ls-files. This commit adds a check to determine whether a nested repository that is encountered in recursing a path is actually `the_repository`. If so, - simply treat the directory as if it doesn't contain a nested repository. + we simply treat the directory as if it doesn't contain a nested + repository. - Prior to this commit, the `add` operation was silently ignored. + The commit includes test-cases to reduce the likelihood of future + regressions. Signed-off-by: Goss Geppert <ggossdev@gmail.com> @@ dir.c: static enum path_treatment treat_directory(struct dir_struct *dir, + * Determine if `dirname` is a nested repo by confirming that: + * 1) we are in a nonbare repository, and + * 2) `dirname` is not an immediate parent of `the_repository->gitdir`, -+ * which could occur if the `worktree` location was manually -+ * configured by the user; see t2205 testcases 1a-1d for examples -+ * where this matters ++ * which could occur if the git_dir or worktree location was ++ * manually configured by the user; see t2205 testcases 1-3 for ++ * examples where this matters + */ struct strbuf sb = STRBUF_INIT; strbuf_addstr(&sb, dirname); @@ dir.c: static enum path_treatment treat_directory(struct dir_struct *dir, + + if (nested_repo) { + char *real_dirname, *real_gitdir; -+ strbuf_reset(&sb); -+ strbuf_addstr(&sb, dirname); -+ strbuf_complete(&sb, '/'); + strbuf_addstr(&sb, ".git"); -+ real_dirname = real_pathdup(sb.buf, 0); -+ real_gitdir = real_pathdup(the_repository->gitdir, 0); ++ real_dirname = real_pathdup(sb.buf, 1); ++ real_gitdir = real_pathdup(the_repository->gitdir, 1); + + nested_repo = !!strcmp(real_dirname, real_gitdir); + free(real_gitdir); @@ t/t2205-add-worktree-config.sh (new) @@ +#!/bin/sh + -+test_description='directory traversal respects worktree config ++test_description='directory traversal respects user config ++ ++This test verifies the traversal of the directory tree when the traversal begins ++outside the repository. Two instances for which this can occur are tested: + -+This test configures the repository`s worktree to be two levels above the -+`.git` directory and checks whether we are able to add to the index those files -+that are in either (1) the manually configured worktree directory or (2) the -+standard worktree location with respect to the `.git` directory (i.e. ensuring -+that the encountered `.git` directory is not treated as belonging to a foreign -+nested repository)' ++ 1) The user manually sets the worktree. For this instance, the test sets ++ the worktree two levels above the `.git` directory and checks whether we ++ are able to add to the index those files that are in either (1) the ++ manually configured worktree directory or (2) the standard worktree ++ location with respect to the `.git` directory (i.e. ensuring that the ++ encountered `.git` directory is not treated as belonging to a foreign ++ nested repository). ++ 2) The user manually sets the `git_dir` while the working directory is ++ outside the repository. The test checks that files inside the ++ repository can be added to the index. ++ ' + +. ./test-lib.sh + -+test_expect_success '1a: setup' ' -+ test_create_repo test1 && -+ git --git-dir="test1/.git" config core.worktree "$(pwd)" && ++test_expect_success '1a: setup--config worktree' ' ++ mkdir test1 && ++ ( ++ cd test1 && ++ test_create_repo repo && ++ git --git-dir="repo/.git" config core.worktree "$(pwd)" && + + mkdir -p outside-tracked outside-untracked && -+ mkdir -p test1/inside-tracked test1/inside-untracked && ++ mkdir -p repo/inside-tracked repo/inside-untracked && + >file-tracked && + >file-untracked && + >outside-tracked/file && + >outside-untracked/file && -+ >test1/file-tracked && -+ >test1/file-untracked && -+ >test1/inside-tracked/file && -+ >test1/inside-untracked/file && ++ >repo/file-tracked && ++ >repo/file-untracked && ++ >repo/inside-tracked/file && ++ >repo/inside-untracked/file && + + cat >expect-tracked-unsorted <<-EOF && + ../file-tracked @@ t/t2205-add-worktree-config.sh (new) + inside-untracked/file + EOF + ++ cat >expect-all-dir-unsorted <<-EOF && ++ ../file-untracked ++ ../file-tracked ++ ../outside-untracked/ ++ ../outside-tracked/ ++ ./ ++ EOF ++ + cat expect-tracked-unsorted expect-untracked-unsorted >expect-all-unsorted && + + cat >.gitignore <<-EOF @@ t/t2205-add-worktree-config.sh (new) + actual-* + expect-* + EOF ++ ) +' + +test_expect_success '1b: pre-add all' ' ++ ( ++ cd test1 && + local parent_dir="$(pwd)" && + ( -+ cd test1 && ++ cd repo && + git ls-files -o --exclude-standard "$parent_dir" >../actual-all-unsorted + ) && + sort actual-all-unsorted >actual-all && + sort expect-all-unsorted >expect-all && + test_cmp expect-all actual-all ++ ) +' + -+test_expect_success '1c: post-add tracked' ' ++test_expect_success '1c: pre-add dir all' ' ++ ( ++ cd test1 && + local parent_dir="$(pwd)" && + ( -+ cd test1 && ++ cd repo && ++ git ls-files -o --directory --exclude-standard "$parent_dir" >../actual-all-dir-unsorted ++ ) && ++ sort actual-all-dir-unsorted >actual-all && ++ sort expect-all-dir-unsorted >expect-all && ++ test_cmp expect-all actual-all ++ ) ++' ++ ++test_expect_success '1d: post-add tracked' ' ++ ( ++ cd test1 && ++ local parent_dir="$(pwd)" && ++ ( ++ cd repo && + git add file-tracked && + git add inside-tracked && + git add ../outside-tracked && @@ t/t2205-add-worktree-config.sh (new) + sort actual-tracked-unsorted >actual-tracked && + sort expect-tracked-unsorted >expect-tracked && + test_cmp expect-tracked actual-tracked ++ ) +' + -+test_expect_success '1d: post-add untracked' ' ++test_expect_success '1e: post-add untracked' ' ++ ( ++ cd test1 && + local parent_dir="$(pwd)" && + ( -+ cd test1 && ++ cd repo && + git ls-files -o --exclude-standard "$parent_dir" >../actual-untracked-unsorted + ) && + sort actual-untracked-unsorted >actual-untracked && + sort expect-untracked-unsorted >expect-untracked && + test_cmp expect-untracked actual-untracked ++ ) ++' ++ ++test_expect_success '2a: setup--set git-dir' ' ++ mkdir test2 && ++ ( ++ cd test2 && ++ test_create_repo repo && ++ # create two foreign repositories that should remain untracked ++ test_create_repo repo-outside && ++ test_create_repo repo/repo-inside && ++ ++ mkdir -p repo/inside-tracked repo/inside-untracked && ++ >repo/file-tracked && ++ >repo/file-untracked && ++ >repo/inside-tracked/file && ++ >repo/inside-untracked/file && ++ >repo-outside/file && ++ >repo/repo-inside/file && ++ ++ cat >expect-tracked-unsorted <<-EOF && ++ repo/file-tracked ++ repo/inside-tracked/file ++ EOF ++ ++ cat >expect-untracked-unsorted <<-EOF && ++ repo/file-untracked ++ repo/inside-untracked/file ++ repo/repo-inside/ ++ repo-outside/ ++ EOF ++ ++ cat >expect-all-dir-unsorted <<-EOF && ++ repo/ ++ repo-outside/ ++ EOF ++ ++ cat expect-tracked-unsorted expect-untracked-unsorted >expect-all-unsorted && ++ ++ cat >.gitignore <<-EOF ++ .gitignore ++ actual-* ++ expect-* ++ EOF ++ ) ++' ++ ++test_expect_success '2b: pre-add all' ' ++ ( ++ cd test2 && ++ git --git-dir=repo/.git ls-files -o --exclude-standard >actual-all-unsorted && ++ sort actual-all-unsorted >actual-all && ++ sort expect-all-unsorted >expect-all && ++ test_cmp expect-all actual-all ++ ) ++' ++ ++test_expect_success '2c: pre-add dir all' ' ++ ( ++ cd test2 && ++ git --git-dir=repo/.git ls-files -o --directory --exclude-standard >actual-all-dir-unsorted && ++ sort actual-all-dir-unsorted >actual-all && ++ sort expect-all-dir-unsorted >expect-all && ++ test_cmp expect-all actual-all ++ ) ++' ++ ++test_expect_success '2d: post-add tracked' ' ++ ( ++ cd test2 && ++ git --git-dir=repo/.git add repo/file-tracked && ++ git --git-dir=repo/.git add repo/inside-tracked && ++ git --git-dir=repo/.git ls-files >actual-tracked-unsorted && ++ sort actual-tracked-unsorted >actual-tracked && ++ sort expect-tracked-unsorted >expect-tracked && ++ test_cmp expect-tracked actual-tracked ++ ) ++' ++ ++test_expect_success '2e: post-add untracked' ' ++ ( ++ cd test2 && ++ git --git-dir=repo/.git ls-files -o --exclude-standard >actual-untracked-unsorted && ++ sort actual-untracked-unsorted >actual-untracked && ++ sort expect-untracked-unsorted >expect-untracked && ++ test_cmp expect-untracked actual-untracked ++ ) ++' ++ ++test_expect_success '3a: setup--add repo dir' ' ++ mkdir test3 && ++ ( ++ cd test3 && ++ test_create_repo repo && ++ ++ mkdir -p repo/inside-tracked repo/inside-ignored && ++ >repo/file-tracked && ++ >repo/file-ignored && ++ >repo/inside-tracked/file && ++ >repo/inside-ignored/file && ++ ++ cat >.gitignore <<-EOF && ++ .gitignore ++ actual-* ++ expect-* ++ *ignored ++ EOF ++ ++ cat >expect-tracked-unsorted <<-EOF && ++ repo/file-tracked ++ repo/inside-tracked/file ++ EOF ++ ++ cat >expect-ignored-unsorted <<-EOF ++ repo/file-ignored ++ repo/inside-ignored/ ++ .gitignore ++ actual-ignored-unsorted ++ expect-ignored-unsorted ++ expect-tracked-unsorted ++ EOF ++ ) ++' ++ ++test_expect_success '3b: ignored' ' ++ ( ++ cd test3 && ++ git --git-dir=repo/.git ls-files -io --directory --exclude-standard >actual-ignored-unsorted && ++ sort actual-ignored-unsorted >actual-ignored && ++ sort expect-ignored-unsorted >expect-ignored && ++ test_cmp expect-ignored actual-ignored ++ ) ++' ++ ++test_expect_success '3c: add repo' ' ++ ( ++ cd test3 && ++ git --git-dir=repo/.git add repo && ++ git --git-dir=repo/.git ls-files >actual-tracked-unsorted && ++ sort actual-tracked-unsorted >actual-tracked && ++ sort expect-tracked-unsorted >expect-tracked && ++ test_cmp expect-tracked actual-tracked ++ ) +' + +test_done -: ---------- > 2: 16e87a0345 dir: cache git_dir's realpath 2: 2e0a178c78 ! 3: 6de7932731 dir: minor refactoring / clean-up @@ Metadata ## Commit message ## dir: minor refactoring / clean-up - Improve readability. + Narrow the scope of the `nested_repo` variable and conditional return + statement to the block where the variable is set. Signed-off-by: Goss Geppert <ggossdev@gmail.com> @@ dir.c: static enum path_treatment treat_directory(struct dir_struct *dir, /* The "len-1" is to strip the final '/' */ enum exist_status status = directory_exists_in_index(istate, dirname, len-1); @@ dir.c: static enum path_treatment treat_directory(struct dir_struct *dir, - * configured by the user; see t2205 testcases 1a-1d for examples - * where this matters + * manually configured by the user; see t2205 testcases 1-3 for + * examples where this matters */ + int nested_repo; struct strbuf sb = STRBUF_INIT; -- 2.36.0 ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v3 1/3] dir: traverse into repository 2022-05-20 19:28 ` [PATCH v3 0/3] dir: traverse into repository Goss Geppert @ 2022-05-20 19:28 ` Goss Geppert 2022-05-20 19:28 ` [PATCH v3 2/3] dir: cache git_dir's realpath Goss Geppert 2022-05-20 19:28 ` [PATCH v3 3/3] dir: minor refactoring / clean-up Goss Geppert 2 siblings, 0 replies; 31+ messages in thread From: Goss Geppert @ 2022-05-20 19:28 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee Since 8d92fb2927 (dir: replace exponential algorithm with a linear one, 2020-04-01) traversing into a repository's directory tree when the traversal began outside the repository's standard location has failed because the encountered repository was identified as a nested foreign repository. Prior to this commit, the failure to traverse into a repository's default worktree location was observable from a user's perspective under either of the following conditions (there may be others): 1) Set the `core.worktree` location to a parent directory of the default worktree; or 2) Use the `--git_dir` option while the working directory is outside the repository's default worktree location Under either of these conditions, symptoms of the failure to traverse into the repository's default worktree location include the inability to add files to the index or get a list of untracked files via ls-files. This commit adds a check to determine whether a nested repository that is encountered in recursing a path is actually `the_repository`. If so, we simply treat the directory as if it doesn't contain a nested repository. The commit includes test-cases to reduce the likelihood of future regressions. Signed-off-by: Goss Geppert <ggossdev@gmail.com> --- dir.c | 19 +++ t/t2205-add-worktree-config.sh | 274 +++++++++++++++++++++++++++++++++ 2 files changed, 293 insertions(+) create mode 100755 t/t2205-add-worktree-config.sh diff --git a/dir.c b/dir.c index 26c4d141ab..e45e117875 100644 --- a/dir.c +++ b/dir.c @@ -1893,9 +1893,28 @@ static enum path_treatment treat_directory(struct dir_struct *dir, if ((dir->flags & DIR_SKIP_NESTED_GIT) || !(dir->flags & DIR_NO_GITLINKS)) { + /* + * Determine if `dirname` is a nested repo by confirming that: + * 1) we are in a nonbare repository, and + * 2) `dirname` is not an immediate parent of `the_repository->gitdir`, + * which could occur if the git_dir or worktree location was + * manually configured by the user; see t2205 testcases 1-3 for + * examples where this matters + */ struct strbuf sb = STRBUF_INIT; strbuf_addstr(&sb, dirname); nested_repo = is_nonbare_repository_dir(&sb); + + if (nested_repo) { + char *real_dirname, *real_gitdir; + strbuf_addstr(&sb, ".git"); + real_dirname = real_pathdup(sb.buf, 1); + real_gitdir = real_pathdup(the_repository->gitdir, 1); + + nested_repo = !!strcmp(real_dirname, real_gitdir); + free(real_gitdir); + free(real_dirname); + } strbuf_release(&sb); } if (nested_repo) { diff --git a/t/t2205-add-worktree-config.sh b/t/t2205-add-worktree-config.sh new file mode 100755 index 0000000000..056894eb94 --- /dev/null +++ b/t/t2205-add-worktree-config.sh @@ -0,0 +1,274 @@ +#!/bin/sh + +test_description='directory traversal respects user config + +This test verifies the traversal of the directory tree when the traversal begins +outside the repository. Two instances for which this can occur are tested: + + 1) The user manually sets the worktree. For this instance, the test sets + the worktree two levels above the `.git` directory and checks whether we + are able to add to the index those files that are in either (1) the + manually configured worktree directory or (2) the standard worktree + location with respect to the `.git` directory (i.e. ensuring that the + encountered `.git` directory is not treated as belonging to a foreign + nested repository). + 2) The user manually sets the `git_dir` while the working directory is + outside the repository. The test checks that files inside the + repository can be added to the index. + ' + +. ./test-lib.sh + +test_expect_success '1a: setup--config worktree' ' + mkdir test1 && + ( + cd test1 && + test_create_repo repo && + git --git-dir="repo/.git" config core.worktree "$(pwd)" && + + mkdir -p outside-tracked outside-untracked && + mkdir -p repo/inside-tracked repo/inside-untracked && + >file-tracked && + >file-untracked && + >outside-tracked/file && + >outside-untracked/file && + >repo/file-tracked && + >repo/file-untracked && + >repo/inside-tracked/file && + >repo/inside-untracked/file && + + cat >expect-tracked-unsorted <<-EOF && + ../file-tracked + ../outside-tracked/file + file-tracked + inside-tracked/file + EOF + + cat >expect-untracked-unsorted <<-EOF && + ../file-untracked + ../outside-untracked/file + file-untracked + inside-untracked/file + EOF + + cat >expect-all-dir-unsorted <<-EOF && + ../file-untracked + ../file-tracked + ../outside-untracked/ + ../outside-tracked/ + ./ + EOF + + cat expect-tracked-unsorted expect-untracked-unsorted >expect-all-unsorted && + + cat >.gitignore <<-EOF + .gitignore + actual-* + expect-* + EOF + ) +' + +test_expect_success '1b: pre-add all' ' + ( + cd test1 && + local parent_dir="$(pwd)" && + ( + cd repo && + git ls-files -o --exclude-standard "$parent_dir" >../actual-all-unsorted + ) && + sort actual-all-unsorted >actual-all && + sort expect-all-unsorted >expect-all && + test_cmp expect-all actual-all + ) +' + +test_expect_success '1c: pre-add dir all' ' + ( + cd test1 && + local parent_dir="$(pwd)" && + ( + cd repo && + git ls-files -o --directory --exclude-standard "$parent_dir" >../actual-all-dir-unsorted + ) && + sort actual-all-dir-unsorted >actual-all && + sort expect-all-dir-unsorted >expect-all && + test_cmp expect-all actual-all + ) +' + +test_expect_success '1d: post-add tracked' ' + ( + cd test1 && + local parent_dir="$(pwd)" && + ( + cd repo && + git add file-tracked && + git add inside-tracked && + git add ../outside-tracked && + git add "$parent_dir/file-tracked" && + git ls-files "$parent_dir" >../actual-tracked-unsorted + ) && + sort actual-tracked-unsorted >actual-tracked && + sort expect-tracked-unsorted >expect-tracked && + test_cmp expect-tracked actual-tracked + ) +' + +test_expect_success '1e: post-add untracked' ' + ( + cd test1 && + local parent_dir="$(pwd)" && + ( + cd repo && + git ls-files -o --exclude-standard "$parent_dir" >../actual-untracked-unsorted + ) && + sort actual-untracked-unsorted >actual-untracked && + sort expect-untracked-unsorted >expect-untracked && + test_cmp expect-untracked actual-untracked + ) +' + +test_expect_success '2a: setup--set git-dir' ' + mkdir test2 && + ( + cd test2 && + test_create_repo repo && + # create two foreign repositories that should remain untracked + test_create_repo repo-outside && + test_create_repo repo/repo-inside && + + mkdir -p repo/inside-tracked repo/inside-untracked && + >repo/file-tracked && + >repo/file-untracked && + >repo/inside-tracked/file && + >repo/inside-untracked/file && + >repo-outside/file && + >repo/repo-inside/file && + + cat >expect-tracked-unsorted <<-EOF && + repo/file-tracked + repo/inside-tracked/file + EOF + + cat >expect-untracked-unsorted <<-EOF && + repo/file-untracked + repo/inside-untracked/file + repo/repo-inside/ + repo-outside/ + EOF + + cat >expect-all-dir-unsorted <<-EOF && + repo/ + repo-outside/ + EOF + + cat expect-tracked-unsorted expect-untracked-unsorted >expect-all-unsorted && + + cat >.gitignore <<-EOF + .gitignore + actual-* + expect-* + EOF + ) +' + +test_expect_success '2b: pre-add all' ' + ( + cd test2 && + git --git-dir=repo/.git ls-files -o --exclude-standard >actual-all-unsorted && + sort actual-all-unsorted >actual-all && + sort expect-all-unsorted >expect-all && + test_cmp expect-all actual-all + ) +' + +test_expect_success '2c: pre-add dir all' ' + ( + cd test2 && + git --git-dir=repo/.git ls-files -o --directory --exclude-standard >actual-all-dir-unsorted && + sort actual-all-dir-unsorted >actual-all && + sort expect-all-dir-unsorted >expect-all && + test_cmp expect-all actual-all + ) +' + +test_expect_success '2d: post-add tracked' ' + ( + cd test2 && + git --git-dir=repo/.git add repo/file-tracked && + git --git-dir=repo/.git add repo/inside-tracked && + git --git-dir=repo/.git ls-files >actual-tracked-unsorted && + sort actual-tracked-unsorted >actual-tracked && + sort expect-tracked-unsorted >expect-tracked && + test_cmp expect-tracked actual-tracked + ) +' + +test_expect_success '2e: post-add untracked' ' + ( + cd test2 && + git --git-dir=repo/.git ls-files -o --exclude-standard >actual-untracked-unsorted && + sort actual-untracked-unsorted >actual-untracked && + sort expect-untracked-unsorted >expect-untracked && + test_cmp expect-untracked actual-untracked + ) +' + +test_expect_success '3a: setup--add repo dir' ' + mkdir test3 && + ( + cd test3 && + test_create_repo repo && + + mkdir -p repo/inside-tracked repo/inside-ignored && + >repo/file-tracked && + >repo/file-ignored && + >repo/inside-tracked/file && + >repo/inside-ignored/file && + + cat >.gitignore <<-EOF && + .gitignore + actual-* + expect-* + *ignored + EOF + + cat >expect-tracked-unsorted <<-EOF && + repo/file-tracked + repo/inside-tracked/file + EOF + + cat >expect-ignored-unsorted <<-EOF + repo/file-ignored + repo/inside-ignored/ + .gitignore + actual-ignored-unsorted + expect-ignored-unsorted + expect-tracked-unsorted + EOF + ) +' + +test_expect_success '3b: ignored' ' + ( + cd test3 && + git --git-dir=repo/.git ls-files -io --directory --exclude-standard >actual-ignored-unsorted && + sort actual-ignored-unsorted >actual-ignored && + sort expect-ignored-unsorted >expect-ignored && + test_cmp expect-ignored actual-ignored + ) +' + +test_expect_success '3c: add repo' ' + ( + cd test3 && + git --git-dir=repo/.git add repo && + git --git-dir=repo/.git ls-files >actual-tracked-unsorted && + sort actual-tracked-unsorted >actual-tracked && + sort expect-tracked-unsorted >expect-tracked && + test_cmp expect-tracked actual-tracked + ) +' + +test_done -- 2.36.0 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v3 2/3] dir: cache git_dir's realpath 2022-05-20 19:28 ` [PATCH v3 0/3] dir: traverse into repository Goss Geppert 2022-05-20 19:28 ` [PATCH v3 1/3] " Goss Geppert @ 2022-05-20 19:28 ` Goss Geppert 2022-05-24 14:32 ` Elijah Newren 2022-05-20 19:28 ` [PATCH v3 3/3] dir: minor refactoring / clean-up Goss Geppert 2 siblings, 1 reply; 31+ messages in thread From: Goss Geppert @ 2022-05-20 19:28 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee When traversing the directory tree, the realpath of the `git_dir` is required each time a nested repository is encountered to determine how the nested repository should be treated. To prevent excessive resource usage in pathological cases (e.g. many nested repositories, a long path, symlinks, etc.), cache the realpath after the first usage until the traversal is complete. --- dir.c | 63 ++++++++++++++++++++++++++++++++++++++++------------------- 1 file changed, 43 insertions(+), 20 deletions(-) diff --git a/dir.c b/dir.c index e45e117875..45b89560fc 100644 --- a/dir.c +++ b/dir.c @@ -47,10 +47,22 @@ struct cached_dir { struct untracked_cache_dir *ucd; }; +/* + * Support data structure for the recursing into the directory tree. + */ +struct traversal_cache { + /* + * The realpath for the `git_dir` may be required in the recursion and is + * cached for repeated use. + */ + char *real_gitdir; +}; + static enum path_treatment read_directory_recursive(struct dir_struct *dir, struct index_state *istate, const char *path, int len, struct untracked_cache_dir *untracked, - int check_only, int stop_at_first_file, const struct pathspec *pathspec); + int check_only, int stop_at_first_file, const struct pathspec *pathspec, + struct traversal_cache *cache); static int resolve_dtype(int dtype, struct index_state *istate, const char *path, int len); struct dirent *readdir_skip_dot_and_dotdot(DIR *dirp) @@ -1852,7 +1864,8 @@ static enum path_treatment treat_directory(struct dir_struct *dir, struct index_state *istate, struct untracked_cache_dir *untracked, const char *dirname, int len, int baselen, int excluded, - const struct pathspec *pathspec) + const struct pathspec *pathspec, + struct traversal_cache *cache) { /* * WARNING: From this function, you can return path_recurse or you @@ -1906,13 +1919,13 @@ static enum path_treatment treat_directory(struct dir_struct *dir, nested_repo = is_nonbare_repository_dir(&sb); if (nested_repo) { - char *real_dirname, *real_gitdir; + char *real_dirname; strbuf_addstr(&sb, ".git"); real_dirname = real_pathdup(sb.buf, 1); - real_gitdir = real_pathdup(the_repository->gitdir, 1); + if (!cache->real_gitdir) + cache->real_gitdir = real_pathdup(the_repository->gitdir, 1); - nested_repo = !!strcmp(real_dirname, real_gitdir); - free(real_gitdir); + nested_repo = !!strcmp(real_dirname, cache->real_gitdir); free(real_dirname); } strbuf_release(&sb); @@ -1944,7 +1957,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, return path_excluded; if (read_directory_recursive(dir, istate, dirname, len, - untracked, 1, 1, pathspec) == path_excluded) + untracked, 1, 1, pathspec, cache) == path_excluded) return path_excluded; return path_none; @@ -2045,7 +2058,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, untracked = lookup_untracked(dir->untracked, untracked, dirname + baselen, len - baselen); state = read_directory_recursive(dir, istate, dirname, len, untracked, - check_only, stop_early, pathspec); + check_only, stop_early, pathspec, cache); /* There are a variety of reasons we may need to fixup the state... */ if (state == path_excluded) { @@ -2242,7 +2255,8 @@ static enum path_treatment treat_path_fast(struct dir_struct *dir, struct index_state *istate, struct strbuf *path, int baselen, - const struct pathspec *pathspec) + const struct pathspec *pathspec, + struct traversal_cache *cache) { /* * WARNING: From this function, you can return path_recurse or you @@ -2264,7 +2278,7 @@ static enum path_treatment treat_path_fast(struct dir_struct *dir, * with check_only set. */ return read_directory_recursive(dir, istate, path->buf, path->len, - cdir->ucd, 1, 0, pathspec); + cdir->ucd, 1, 0, pathspec, cache); /* * We get path_recurse in the first run when * directory_exists_in_index() returns index_nonexistent. We @@ -2280,13 +2294,14 @@ static enum path_treatment treat_path(struct dir_struct *dir, struct index_state *istate, struct strbuf *path, int baselen, - const struct pathspec *pathspec) + const struct pathspec *pathspec, + struct traversal_cache *cache) { int has_path_in_index, dtype, excluded; if (!cdir->d_name) return treat_path_fast(dir, cdir, istate, path, - baselen, pathspec); + baselen, pathspec, cache); if (is_dot_or_dotdot(cdir->d_name) || !fspathcmp(cdir->d_name, ".git")) return path_none; strbuf_setlen(path, baselen); @@ -2348,7 +2363,7 @@ static enum path_treatment treat_path(struct dir_struct *dir, strbuf_addch(path, '/'); return treat_directory(dir, istate, untracked, path->buf, path->len, - baselen, excluded, pathspec); + baselen, excluded, pathspec, cache); case DT_REG: case DT_LNK: if (pathspec && @@ -2549,7 +2564,8 @@ static void add_path_to_appropriate_result_list(struct dir_struct *dir, static enum path_treatment read_directory_recursive(struct dir_struct *dir, struct index_state *istate, const char *base, int baselen, struct untracked_cache_dir *untracked, int check_only, - int stop_at_first_file, const struct pathspec *pathspec) + int stop_at_first_file, const struct pathspec *pathspec, + struct traversal_cache *cache) { /* * WARNING: Do NOT recurse unless path_recurse is returned from @@ -2572,7 +2588,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir, while (!read_cached_dir(&cdir)) { /* check how the file or directory should be treated */ state = treat_path(dir, untracked, &cdir, istate, &path, - baselen, pathspec); + baselen, pathspec, cache); dir->visited_paths++; if (state > dir_state) @@ -2587,7 +2603,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir, subdir_state = read_directory_recursive(dir, istate, path.buf, path.len, ud, - check_only, stop_at_first_file, pathspec); + check_only, stop_at_first_file, pathspec, cache); if (subdir_state > dir_state) dir_state = subdir_state; @@ -2661,7 +2677,8 @@ int check_dir_entry_contains(const struct dir_entry *out, const struct dir_entry static int treat_leading_path(struct dir_struct *dir, struct index_state *istate, const char *path, int len, - const struct pathspec *pathspec) + const struct pathspec *pathspec, + struct traversal_cache *cache) { struct strbuf sb = STRBUF_INIT; struct strbuf subdir = STRBUF_INIT; @@ -2713,7 +2730,8 @@ static int treat_leading_path(struct dir_struct *dir, strbuf_reset(&subdir); strbuf_add(&subdir, path+prevlen, baselen-prevlen); cdir.d_name = subdir.buf; - state = treat_path(dir, NULL, &cdir, istate, &sb, prevlen, pathspec); + + state = treat_path(dir, NULL, &cdir, istate, &sb, prevlen, pathspec, cache); if (state != path_recurse) break; /* do not recurse into it */ @@ -2986,6 +3004,9 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, const char *path, int len, const struct pathspec *pathspec) { struct untracked_cache_dir *untracked; + struct traversal_cache cache = { + .real_gitdir = NULL, + }; trace2_region_enter("dir", "read_directory", istate->repo); dir->visited_paths = 0; @@ -3003,8 +3024,10 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, * e.g. prep_exclude() */ dir->untracked = NULL; - if (!len || treat_leading_path(dir, istate, path, len, pathspec)) - read_directory_recursive(dir, istate, path, len, untracked, 0, 0, pathspec); + if (!len || treat_leading_path(dir, istate, path, len, pathspec, &cache)) { + read_directory_recursive(dir, istate, path, len, untracked, 0, 0, pathspec, &cache); + } + FREE_AND_NULL(cache.real_gitdir); QSORT(dir->entries, dir->nr, cmp_dir_entry); QSORT(dir->ignored, dir->ignored_nr, cmp_dir_entry); -- 2.36.0 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* Re: [PATCH v3 2/3] dir: cache git_dir's realpath 2022-05-20 19:28 ` [PATCH v3 2/3] dir: cache git_dir's realpath Goss Geppert @ 2022-05-24 14:32 ` Elijah Newren 0 siblings, 0 replies; 31+ messages in thread From: Elijah Newren @ 2022-05-24 14:32 UTC (permalink / raw) To: Goss Geppert Cc: Git Mailing List, Junio C Hamano, christian w, Derrick Stolee On Fri, May 20, 2022 at 12:29 PM Goss Geppert <gg.oss.dev@gmail.com> wrote: > > When traversing the directory tree, the realpath of the `git_dir` is > required each time a nested repository is encountered to determine how > the nested repository should be treated. To prevent excessive resource > usage in pathological cases (e.g. many nested repositories, a long path, > symlinks, etc.), cache the realpath after the first usage until the > traversal is complete. > --- > dir.c | 63 ++++++++++++++++++++++++++++++++++++++++------------------- > 1 file changed, 43 insertions(+), 20 deletions(-) > > diff --git a/dir.c b/dir.c > index e45e117875..45b89560fc 100644 > --- a/dir.c > +++ b/dir.c > @@ -47,10 +47,22 @@ struct cached_dir { > struct untracked_cache_dir *ucd; > }; > > +/* > + * Support data structure for the recursing into the directory tree. > + */ > +struct traversal_cache { > + /* > + * The realpath for the `git_dir` may be required in the recursion and is > + * cached for repeated use. > + */ > + char *real_gitdir; > +}; > + > static enum path_treatment read_directory_recursive(struct dir_struct *dir, > struct index_state *istate, const char *path, int len, > struct untracked_cache_dir *untracked, > - int check_only, int stop_at_first_file, const struct pathspec *pathspec); > + int check_only, int stop_at_first_file, const struct pathspec *pathspec, > + struct traversal_cache *cache); > static int resolve_dtype(int dtype, struct index_state *istate, > const char *path, int len); > struct dirent *readdir_skip_dot_and_dotdot(DIR *dirp) > @@ -1852,7 +1864,8 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > struct index_state *istate, > struct untracked_cache_dir *untracked, > const char *dirname, int len, int baselen, int excluded, > - const struct pathspec *pathspec) > + const struct pathspec *pathspec, > + struct traversal_cache *cache) > { > /* > * WARNING: From this function, you can return path_recurse or you > @@ -1906,13 +1919,13 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > nested_repo = is_nonbare_repository_dir(&sb); > > if (nested_repo) { > - char *real_dirname, *real_gitdir; > + char *real_dirname; > strbuf_addstr(&sb, ".git"); > real_dirname = real_pathdup(sb.buf, 1); > - real_gitdir = real_pathdup(the_repository->gitdir, 1); > + if (!cache->real_gitdir) > + cache->real_gitdir = real_pathdup(the_repository->gitdir, 1); > > - nested_repo = !!strcmp(real_dirname, real_gitdir); > - free(real_gitdir); > + nested_repo = !!strcmp(real_dirname, cache->real_gitdir); > free(real_dirname); > } > strbuf_release(&sb); > @@ -1944,7 +1957,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > return path_excluded; > > if (read_directory_recursive(dir, istate, dirname, len, > - untracked, 1, 1, pathspec) == path_excluded) > + untracked, 1, 1, pathspec, cache) == path_excluded) > return path_excluded; > > return path_none; > @@ -2045,7 +2058,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, > untracked = lookup_untracked(dir->untracked, untracked, > dirname + baselen, len - baselen); > state = read_directory_recursive(dir, istate, dirname, len, untracked, > - check_only, stop_early, pathspec); > + check_only, stop_early, pathspec, cache); > > /* There are a variety of reasons we may need to fixup the state... */ > if (state == path_excluded) { > @@ -2242,7 +2255,8 @@ static enum path_treatment treat_path_fast(struct dir_struct *dir, > struct index_state *istate, > struct strbuf *path, > int baselen, > - const struct pathspec *pathspec) > + const struct pathspec *pathspec, > + struct traversal_cache *cache) > { > /* > * WARNING: From this function, you can return path_recurse or you > @@ -2264,7 +2278,7 @@ static enum path_treatment treat_path_fast(struct dir_struct *dir, > * with check_only set. > */ > return read_directory_recursive(dir, istate, path->buf, path->len, > - cdir->ucd, 1, 0, pathspec); > + cdir->ucd, 1, 0, pathspec, cache); > /* > * We get path_recurse in the first run when > * directory_exists_in_index() returns index_nonexistent. We > @@ -2280,13 +2294,14 @@ static enum path_treatment treat_path(struct dir_struct *dir, > struct index_state *istate, > struct strbuf *path, > int baselen, > - const struct pathspec *pathspec) > + const struct pathspec *pathspec, > + struct traversal_cache *cache) > { > int has_path_in_index, dtype, excluded; > > if (!cdir->d_name) > return treat_path_fast(dir, cdir, istate, path, > - baselen, pathspec); > + baselen, pathspec, cache); > if (is_dot_or_dotdot(cdir->d_name) || !fspathcmp(cdir->d_name, ".git")) > return path_none; > strbuf_setlen(path, baselen); > @@ -2348,7 +2363,7 @@ static enum path_treatment treat_path(struct dir_struct *dir, > strbuf_addch(path, '/'); > return treat_directory(dir, istate, untracked, > path->buf, path->len, > - baselen, excluded, pathspec); > + baselen, excluded, pathspec, cache); > case DT_REG: > case DT_LNK: > if (pathspec && > @@ -2549,7 +2564,8 @@ static void add_path_to_appropriate_result_list(struct dir_struct *dir, > static enum path_treatment read_directory_recursive(struct dir_struct *dir, > struct index_state *istate, const char *base, int baselen, > struct untracked_cache_dir *untracked, int check_only, > - int stop_at_first_file, const struct pathspec *pathspec) > + int stop_at_first_file, const struct pathspec *pathspec, > + struct traversal_cache *cache) > { > /* > * WARNING: Do NOT recurse unless path_recurse is returned from > @@ -2572,7 +2588,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir, > while (!read_cached_dir(&cdir)) { > /* check how the file or directory should be treated */ > state = treat_path(dir, untracked, &cdir, istate, &path, > - baselen, pathspec); > + baselen, pathspec, cache); > dir->visited_paths++; > > if (state > dir_state) > @@ -2587,7 +2603,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir, > subdir_state = > read_directory_recursive(dir, istate, path.buf, > path.len, ud, > - check_only, stop_at_first_file, pathspec); > + check_only, stop_at_first_file, pathspec, cache); > if (subdir_state > dir_state) > dir_state = subdir_state; > > @@ -2661,7 +2677,8 @@ int check_dir_entry_contains(const struct dir_entry *out, const struct dir_entry > static int treat_leading_path(struct dir_struct *dir, > struct index_state *istate, > const char *path, int len, > - const struct pathspec *pathspec) > + const struct pathspec *pathspec, > + struct traversal_cache *cache) > { > struct strbuf sb = STRBUF_INIT; > struct strbuf subdir = STRBUF_INIT; > @@ -2713,7 +2730,8 @@ static int treat_leading_path(struct dir_struct *dir, > strbuf_reset(&subdir); > strbuf_add(&subdir, path+prevlen, baselen-prevlen); > cdir.d_name = subdir.buf; > - state = treat_path(dir, NULL, &cdir, istate, &sb, prevlen, pathspec); > + > + state = treat_path(dir, NULL, &cdir, istate, &sb, prevlen, pathspec, cache); > > if (state != path_recurse) > break; /* do not recurse into it */ > @@ -2986,6 +3004,9 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, > const char *path, int len, const struct pathspec *pathspec) > { > struct untracked_cache_dir *untracked; > + struct traversal_cache cache = { > + .real_gitdir = NULL, > + }; > > trace2_region_enter("dir", "read_directory", istate->repo); > dir->visited_paths = 0; > @@ -3003,8 +3024,10 @@ int read_directory(struct dir_struct *dir, struct index_state *istate, > * e.g. prep_exclude() > */ > dir->untracked = NULL; > - if (!len || treat_leading_path(dir, istate, path, len, pathspec)) > - read_directory_recursive(dir, istate, path, len, untracked, 0, 0, pathspec); > + if (!len || treat_leading_path(dir, istate, path, len, pathspec, &cache)) { > + read_directory_recursive(dir, istate, path, len, untracked, 0, 0, pathspec, &cache); > + } > + FREE_AND_NULL(cache.real_gitdir); > QSORT(dir->entries, dir->nr, cmp_dir_entry); > QSORT(dir->ignored, dir->ignored_nr, cmp_dir_entry); > > -- > 2.36.0 This makes sense as a way to avoid the call to real_pathdup() if that's what we want, but I feel it's a micro-optimization that isn't worth this code churn. I'd rather drop this patch from the series and just use the real_pathdup() you had before (see https://lore.kernel.org/git/CABPp-BGXRzYCvyM38dEUvQ125+VtRu++7L9UiRz98u+1=Lov7A@mail.gmail.com/). Let's see what Junio says, but I'm hoping he agrees this patch isn't needed. ^ permalink raw reply [flat|nested] 31+ messages in thread
* [PATCH v3 3/3] dir: minor refactoring / clean-up 2022-05-20 19:28 ` [PATCH v3 0/3] dir: traverse into repository Goss Geppert 2022-05-20 19:28 ` [PATCH v3 1/3] " Goss Geppert 2022-05-20 19:28 ` [PATCH v3 2/3] dir: cache git_dir's realpath Goss Geppert @ 2022-05-20 19:28 ` Goss Geppert 2 siblings, 0 replies; 31+ messages in thread From: Goss Geppert @ 2022-05-20 19:28 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee Narrow the scope of the `nested_repo` variable and conditional return statement to the block where the variable is set. Signed-off-by: Goss Geppert <ggossdev@gmail.com> --- dir.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/dir.c b/dir.c index 45b89560fc..92bdd4daf8 100644 --- a/dir.c +++ b/dir.c @@ -1874,7 +1874,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, */ enum path_treatment state; int matches_how = 0; - int nested_repo = 0, check_only, stop_early; + int check_only, stop_early; int old_ignored_nr, old_untracked_nr; /* The "len-1" is to strip the final '/' */ enum exist_status status = directory_exists_in_index(istate, dirname, len-1); @@ -1914,6 +1914,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, * manually configured by the user; see t2205 testcases 1-3 for * examples where this matters */ + int nested_repo; struct strbuf sb = STRBUF_INIT; strbuf_addstr(&sb, dirname); nested_repo = is_nonbare_repository_dir(&sb); @@ -1929,12 +1930,13 @@ static enum path_treatment treat_directory(struct dir_struct *dir, free(real_dirname); } strbuf_release(&sb); - } - if (nested_repo) { - if ((dir->flags & DIR_SKIP_NESTED_GIT) || - (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) - return path_none; - return excluded ? path_excluded : path_untracked; + + if (nested_repo) { + if ((dir->flags & DIR_SKIP_NESTED_GIT) || + (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) + return path_none; + return excluded ? path_excluded : path_untracked; + } } if (!(dir->flags & DIR_SHOW_OTHER_DIRECTORIES)) { -- 2.36.0 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v4 0/2] dir: traverse into repository 2022-05-05 20:32 [RFC PATCH 0/1] dir: consider worktree config in path recursion Goss Geppert ` (3 preceding siblings ...) 2022-05-20 19:28 ` [PATCH v3 0/3] dir: traverse into repository Goss Geppert @ 2022-06-16 23:19 ` Goss Geppert 2022-06-22 4:57 ` Elijah Newren [not found] ` <20220616231956.154-1-gg.oss@outlook.com> ` (2 subsequent siblings) 7 siblings, 1 reply; 31+ messages in thread From: Goss Geppert @ 2022-06-16 23:19 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee From: Goss Geppert <ggossdev@gmail.com> This latest version of the patch series contains relatively minor modifications relative to the previous version. Let me know if there is anything else I need to do. Changes compared to v3: * remove the commit that caches the gitdir's realpath as this was deemed a premature optimization given the amount of code churn * use git option `-C` in some of the testcases to avoid creating a subshell Goss Geppert (2): dir: traverse into repository dir: minor refactoring / clean-up dir.c | 35 ++++- t/t2205-add-worktree-config.sh | 265 +++++++++++++++++++++++++++++++++ 2 files changed, 293 insertions(+), 7 deletions(-) create mode 100755 t/t2205-add-worktree-config.sh Range-diff against v3: 1: 0fc8886f1e ! 1: f84cefe731 dir: traverse into repository @@ t/t2205-add-worktree-config.sh (new) + ( + cd test1 && + local parent_dir="$(pwd)" && -+ ( -+ cd repo && -+ git ls-files -o --exclude-standard "$parent_dir" >../actual-all-unsorted -+ ) && ++ git -C repo ls-files -o --exclude-standard "$parent_dir" >actual-all-unsorted && + sort actual-all-unsorted >actual-all && + sort expect-all-unsorted >expect-all && + test_cmp expect-all actual-all @@ t/t2205-add-worktree-config.sh (new) + ( + cd test1 && + local parent_dir="$(pwd)" && -+ ( -+ cd repo && -+ git ls-files -o --directory --exclude-standard "$parent_dir" >../actual-all-dir-unsorted -+ ) && ++ git -C repo ls-files -o --directory --exclude-standard "$parent_dir" >actual-all-dir-unsorted && + sort actual-all-dir-unsorted >actual-all && + sort expect-all-dir-unsorted >expect-all && + test_cmp expect-all actual-all @@ t/t2205-add-worktree-config.sh (new) + ( + cd test1 && + local parent_dir="$(pwd)" && -+ ( -+ cd repo && -+ git ls-files -o --exclude-standard "$parent_dir" >../actual-untracked-unsorted -+ ) && ++ git -C repo ls-files -o --exclude-standard "$parent_dir" >actual-untracked-unsorted && + sort actual-untracked-unsorted >actual-untracked && + sort expect-untracked-unsorted >expect-untracked && + test_cmp expect-untracked actual-untracked 2: a80cbd5517 < -: ---------- dir: cache git_dir's realpath 3: 899c69300c = 2: d4ff1bd40a dir: minor refactoring / clean-up -- 2.36.0 ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [PATCH v4 0/2] dir: traverse into repository 2022-06-16 23:19 ` [PATCH v4 0/2] dir: traverse into repository Goss Geppert @ 2022-06-22 4:57 ` Elijah Newren 0 siblings, 0 replies; 31+ messages in thread From: Elijah Newren @ 2022-06-22 4:57 UTC (permalink / raw) To: Goss Geppert Cc: Git Mailing List, Junio C Hamano, christian w, Derrick Stolee On Thu, Jun 16, 2022 at 4:20 PM Goss Geppert <gg.oss@outlook.com> wrote: > > From: Goss Geppert <ggossdev@gmail.com> > > This latest version of the patch series contains relatively minor > modifications relative to the previous version. Let me know if there > is anything else I need to do. > > Changes compared to v3: > * remove the commit that caches the gitdir's realpath as this was deemed > a premature optimization given the amount of code churn > * use git option `-C` in some of the testcases to avoid creating a > subshell > > Goss Geppert (2): > dir: traverse into repository > dir: minor refactoring / clean-up > > dir.c | 35 ++++- > t/t2205-add-worktree-config.sh | 265 +++++++++++++++++++++++++++++++++ > 2 files changed, 293 insertions(+), 7 deletions(-) > create mode 100755 t/t2205-add-worktree-config.sh > > Range-diff against v3: > 1: 0fc8886f1e ! 1: f84cefe731 dir: traverse into repository > @@ t/t2205-add-worktree-config.sh (new) > + ( > + cd test1 && > + local parent_dir="$(pwd)" && > -+ ( > -+ cd repo && > -+ git ls-files -o --exclude-standard "$parent_dir" >../actual-all-unsorted > -+ ) && > ++ git -C repo ls-files -o --exclude-standard "$parent_dir" >actual-all-unsorted && > + sort actual-all-unsorted >actual-all && > + sort expect-all-unsorted >expect-all && > + test_cmp expect-all actual-all > @@ t/t2205-add-worktree-config.sh (new) > + ( > + cd test1 && > + local parent_dir="$(pwd)" && > -+ ( > -+ cd repo && > -+ git ls-files -o --directory --exclude-standard "$parent_dir" >../actual-all-dir-unsorted > -+ ) && > ++ git -C repo ls-files -o --directory --exclude-standard "$parent_dir" >actual-all-dir-unsorted && > + sort actual-all-dir-unsorted >actual-all && > + sort expect-all-dir-unsorted >expect-all && > + test_cmp expect-all actual-all > @@ t/t2205-add-worktree-config.sh (new) > + ( > + cd test1 && > + local parent_dir="$(pwd)" && > -+ ( > -+ cd repo && > -+ git ls-files -o --exclude-standard "$parent_dir" >../actual-untracked-unsorted > -+ ) && > ++ git -C repo ls-files -o --exclude-standard "$parent_dir" >actual-untracked-unsorted && > + sort actual-untracked-unsorted >actual-untracked && > + sort expect-untracked-unsorted >expect-untracked && > + test_cmp expect-untracked actual-untracked > 2: a80cbd5517 < -: ---------- dir: cache git_dir's realpath > 3: 899c69300c = 2: d4ff1bd40a dir: minor refactoring / clean-up Thanks, this version looks good to me. Reviewed-by: Elijah Newren <newren@gmail.com> ^ permalink raw reply [flat|nested] 31+ messages in thread
[parent not found: <20220616231956.154-1-gg.oss@outlook.com>]
* [PATCH v4 1/2] dir: traverse into repository [not found] ` <20220616231956.154-1-gg.oss@outlook.com> @ 2022-06-16 23:19 ` Goss Geppert 0 siblings, 0 replies; 31+ messages in thread From: Goss Geppert @ 2022-06-16 23:19 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee From: Goss Geppert <ggossdev@gmail.com> Since 8d92fb2927 (dir: replace exponential algorithm with a linear one, 2020-04-01) traversing into a repository's directory tree when the traversal began outside the repository's standard location has failed because the encountered repository was identified as a nested foreign repository. Prior to this commit, the failure to traverse into a repository's default worktree location was observable from a user's perspective under either of the following conditions (there may be others): 1) Set the `core.worktree` location to a parent directory of the default worktree; or 2) Use the `--git_dir` option while the working directory is outside the repository's default worktree location Under either of these conditions, symptoms of the failure to traverse into the repository's default worktree location include the inability to add files to the index or get a list of untracked files via ls-files. This commit adds a check to determine whether a nested repository that is encountered in recursing a path is actually `the_repository`. If so, we simply treat the directory as if it doesn't contain a nested repository. The commit includes test-cases to reduce the likelihood of future regressions. Signed-off-by: Goss Geppert <ggossdev@gmail.com> --- dir.c | 19 +++ t/t2205-add-worktree-config.sh | 265 +++++++++++++++++++++++++++++++++ 2 files changed, 284 insertions(+) create mode 100755 t/t2205-add-worktree-config.sh diff --git a/dir.c b/dir.c index 6ca2ef5f04..c4f41247e0 100644 --- a/dir.c +++ b/dir.c @@ -1893,9 +1893,28 @@ static enum path_treatment treat_directory(struct dir_struct *dir, if ((dir->flags & DIR_SKIP_NESTED_GIT) || !(dir->flags & DIR_NO_GITLINKS)) { + /* + * Determine if `dirname` is a nested repo by confirming that: + * 1) we are in a nonbare repository, and + * 2) `dirname` is not an immediate parent of `the_repository->gitdir`, + * which could occur if the git_dir or worktree location was + * manually configured by the user; see t2205 testcases 1-3 for + * examples where this matters + */ struct strbuf sb = STRBUF_INIT; strbuf_addstr(&sb, dirname); nested_repo = is_nonbare_repository_dir(&sb); + + if (nested_repo) { + char *real_dirname, *real_gitdir; + strbuf_addstr(&sb, ".git"); + real_dirname = real_pathdup(sb.buf, 1); + real_gitdir = real_pathdup(the_repository->gitdir, 1); + + nested_repo = !!strcmp(real_dirname, real_gitdir); + free(real_gitdir); + free(real_dirname); + } strbuf_release(&sb); } if (nested_repo) { diff --git a/t/t2205-add-worktree-config.sh b/t/t2205-add-worktree-config.sh new file mode 100755 index 0000000000..43d950de64 --- /dev/null +++ b/t/t2205-add-worktree-config.sh @@ -0,0 +1,265 @@ +#!/bin/sh + +test_description='directory traversal respects user config + +This test verifies the traversal of the directory tree when the traversal begins +outside the repository. Two instances for which this can occur are tested: + + 1) The user manually sets the worktree. For this instance, the test sets + the worktree two levels above the `.git` directory and checks whether we + are able to add to the index those files that are in either (1) the + manually configured worktree directory or (2) the standard worktree + location with respect to the `.git` directory (i.e. ensuring that the + encountered `.git` directory is not treated as belonging to a foreign + nested repository). + 2) The user manually sets the `git_dir` while the working directory is + outside the repository. The test checks that files inside the + repository can be added to the index. + ' + +. ./test-lib.sh + +test_expect_success '1a: setup--config worktree' ' + mkdir test1 && + ( + cd test1 && + test_create_repo repo && + git --git-dir="repo/.git" config core.worktree "$(pwd)" && + + mkdir -p outside-tracked outside-untracked && + mkdir -p repo/inside-tracked repo/inside-untracked && + >file-tracked && + >file-untracked && + >outside-tracked/file && + >outside-untracked/file && + >repo/file-tracked && + >repo/file-untracked && + >repo/inside-tracked/file && + >repo/inside-untracked/file && + + cat >expect-tracked-unsorted <<-EOF && + ../file-tracked + ../outside-tracked/file + file-tracked + inside-tracked/file + EOF + + cat >expect-untracked-unsorted <<-EOF && + ../file-untracked + ../outside-untracked/file + file-untracked + inside-untracked/file + EOF + + cat >expect-all-dir-unsorted <<-EOF && + ../file-untracked + ../file-tracked + ../outside-untracked/ + ../outside-tracked/ + ./ + EOF + + cat expect-tracked-unsorted expect-untracked-unsorted >expect-all-unsorted && + + cat >.gitignore <<-EOF + .gitignore + actual-* + expect-* + EOF + ) +' + +test_expect_success '1b: pre-add all' ' + ( + cd test1 && + local parent_dir="$(pwd)" && + git -C repo ls-files -o --exclude-standard "$parent_dir" >actual-all-unsorted && + sort actual-all-unsorted >actual-all && + sort expect-all-unsorted >expect-all && + test_cmp expect-all actual-all + ) +' + +test_expect_success '1c: pre-add dir all' ' + ( + cd test1 && + local parent_dir="$(pwd)" && + git -C repo ls-files -o --directory --exclude-standard "$parent_dir" >actual-all-dir-unsorted && + sort actual-all-dir-unsorted >actual-all && + sort expect-all-dir-unsorted >expect-all && + test_cmp expect-all actual-all + ) +' + +test_expect_success '1d: post-add tracked' ' + ( + cd test1 && + local parent_dir="$(pwd)" && + ( + cd repo && + git add file-tracked && + git add inside-tracked && + git add ../outside-tracked && + git add "$parent_dir/file-tracked" && + git ls-files "$parent_dir" >../actual-tracked-unsorted + ) && + sort actual-tracked-unsorted >actual-tracked && + sort expect-tracked-unsorted >expect-tracked && + test_cmp expect-tracked actual-tracked + ) +' + +test_expect_success '1e: post-add untracked' ' + ( + cd test1 && + local parent_dir="$(pwd)" && + git -C repo ls-files -o --exclude-standard "$parent_dir" >actual-untracked-unsorted && + sort actual-untracked-unsorted >actual-untracked && + sort expect-untracked-unsorted >expect-untracked && + test_cmp expect-untracked actual-untracked + ) +' + +test_expect_success '2a: setup--set git-dir' ' + mkdir test2 && + ( + cd test2 && + test_create_repo repo && + # create two foreign repositories that should remain untracked + test_create_repo repo-outside && + test_create_repo repo/repo-inside && + + mkdir -p repo/inside-tracked repo/inside-untracked && + >repo/file-tracked && + >repo/file-untracked && + >repo/inside-tracked/file && + >repo/inside-untracked/file && + >repo-outside/file && + >repo/repo-inside/file && + + cat >expect-tracked-unsorted <<-EOF && + repo/file-tracked + repo/inside-tracked/file + EOF + + cat >expect-untracked-unsorted <<-EOF && + repo/file-untracked + repo/inside-untracked/file + repo/repo-inside/ + repo-outside/ + EOF + + cat >expect-all-dir-unsorted <<-EOF && + repo/ + repo-outside/ + EOF + + cat expect-tracked-unsorted expect-untracked-unsorted >expect-all-unsorted && + + cat >.gitignore <<-EOF + .gitignore + actual-* + expect-* + EOF + ) +' + +test_expect_success '2b: pre-add all' ' + ( + cd test2 && + git --git-dir=repo/.git ls-files -o --exclude-standard >actual-all-unsorted && + sort actual-all-unsorted >actual-all && + sort expect-all-unsorted >expect-all && + test_cmp expect-all actual-all + ) +' + +test_expect_success '2c: pre-add dir all' ' + ( + cd test2 && + git --git-dir=repo/.git ls-files -o --directory --exclude-standard >actual-all-dir-unsorted && + sort actual-all-dir-unsorted >actual-all && + sort expect-all-dir-unsorted >expect-all && + test_cmp expect-all actual-all + ) +' + +test_expect_success '2d: post-add tracked' ' + ( + cd test2 && + git --git-dir=repo/.git add repo/file-tracked && + git --git-dir=repo/.git add repo/inside-tracked && + git --git-dir=repo/.git ls-files >actual-tracked-unsorted && + sort actual-tracked-unsorted >actual-tracked && + sort expect-tracked-unsorted >expect-tracked && + test_cmp expect-tracked actual-tracked + ) +' + +test_expect_success '2e: post-add untracked' ' + ( + cd test2 && + git --git-dir=repo/.git ls-files -o --exclude-standard >actual-untracked-unsorted && + sort actual-untracked-unsorted >actual-untracked && + sort expect-untracked-unsorted >expect-untracked && + test_cmp expect-untracked actual-untracked + ) +' + +test_expect_success '3a: setup--add repo dir' ' + mkdir test3 && + ( + cd test3 && + test_create_repo repo && + + mkdir -p repo/inside-tracked repo/inside-ignored && + >repo/file-tracked && + >repo/file-ignored && + >repo/inside-tracked/file && + >repo/inside-ignored/file && + + cat >.gitignore <<-EOF && + .gitignore + actual-* + expect-* + *ignored + EOF + + cat >expect-tracked-unsorted <<-EOF && + repo/file-tracked + repo/inside-tracked/file + EOF + + cat >expect-ignored-unsorted <<-EOF + repo/file-ignored + repo/inside-ignored/ + .gitignore + actual-ignored-unsorted + expect-ignored-unsorted + expect-tracked-unsorted + EOF + ) +' + +test_expect_success '3b: ignored' ' + ( + cd test3 && + git --git-dir=repo/.git ls-files -io --directory --exclude-standard >actual-ignored-unsorted && + sort actual-ignored-unsorted >actual-ignored && + sort expect-ignored-unsorted >expect-ignored && + test_cmp expect-ignored actual-ignored + ) +' + +test_expect_success '3c: add repo' ' + ( + cd test3 && + git --git-dir=repo/.git add repo && + git --git-dir=repo/.git ls-files >actual-tracked-unsorted && + sort actual-tracked-unsorted >actual-tracked && + sort expect-tracked-unsorted >expect-tracked && + test_cmp expect-tracked actual-tracked + ) +' + +test_done -- 2.36.0 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v4 0/2] dir: traverse into repository (resending) 2022-05-05 20:32 [RFC PATCH 0/1] dir: consider worktree config in path recursion Goss Geppert ` (5 preceding siblings ...) [not found] ` <20220616231956.154-1-gg.oss@outlook.com> @ 2022-06-16 23:44 ` Goss Geppert [not found] ` <20220616234433.225-1-gg.oss@outlook.com> 7 siblings, 0 replies; 31+ messages in thread From: Goss Geppert @ 2022-06-16 23:44 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee From: Goss Geppert <ggossdev@gmail.com> Attempt #2 to send this. It seems sending patch v4 2/2 resulted in an error with the outgoing server. Original message below: This latest version of the patch series contains relatively minor modifications relative to the previous version. Let me know if there is anything else I need to do. Changes compared to v3: * remove the commit that caches the gitdir's realpath as this was deemed a premature optimization given the amount of code churn * use git option `-C` in some of the testcases to avoid creating a subshell Goss Geppert (2): dir: traverse into repository dir: minor refactoring / clean-up dir.c | 35 ++++- t/t2205-add-worktree-config.sh | 265 +++++++++++++++++++++++++++++++++ 2 files changed, 293 insertions(+), 7 deletions(-) create mode 100755 t/t2205-add-worktree-config.sh Range-diff against v3: 1: 0fc8886f1e ! 1: f84cefe731 dir: traverse into repository @@ t/t2205-add-worktree-config.sh (new) + ( + cd test1 && + local parent_dir="$(pwd)" && -+ ( -+ cd repo && -+ git ls-files -o --exclude-standard "$parent_dir" >../actual-all-unsorted -+ ) && ++ git -C repo ls-files -o --exclude-standard "$parent_dir" >actual-all-unsorted && + sort actual-all-unsorted >actual-all && + sort expect-all-unsorted >expect-all && + test_cmp expect-all actual-all @@ t/t2205-add-worktree-config.sh (new) + ( + cd test1 && + local parent_dir="$(pwd)" && -+ ( -+ cd repo && -+ git ls-files -o --directory --exclude-standard "$parent_dir" >../actual-all-dir-unsorted -+ ) && ++ git -C repo ls-files -o --directory --exclude-standard "$parent_dir" >actual-all-dir-unsorted && + sort actual-all-dir-unsorted >actual-all && + sort expect-all-dir-unsorted >expect-all && + test_cmp expect-all actual-all @@ t/t2205-add-worktree-config.sh (new) + ( + cd test1 && + local parent_dir="$(pwd)" && -+ ( -+ cd repo && -+ git ls-files -o --exclude-standard "$parent_dir" >../actual-untracked-unsorted -+ ) && ++ git -C repo ls-files -o --exclude-standard "$parent_dir" >actual-untracked-unsorted && + sort actual-untracked-unsorted >actual-untracked && + sort expect-untracked-unsorted >expect-untracked && + test_cmp expect-untracked actual-untracked 2: a80cbd5517 < -: ---------- dir: cache git_dir's realpath 3: 899c69300c = 2: d4ff1bd40a dir: minor refactoring / clean-up -- 2.36.0 ^ permalink raw reply [flat|nested] 31+ messages in thread
[parent not found: <20220616234433.225-1-gg.oss@outlook.com>]
* [PATCH v4 1/2] dir: traverse into repository [not found] ` <20220616234433.225-1-gg.oss@outlook.com> @ 2022-06-16 23:44 ` Goss Geppert 2022-06-16 23:44 ` [PATCH v4 2/2] dir: minor refactoring / clean-up Goss Geppert 1 sibling, 0 replies; 31+ messages in thread From: Goss Geppert @ 2022-06-16 23:44 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee From: Goss Geppert <ggossdev@gmail.com> Since 8d92fb2927 (dir: replace exponential algorithm with a linear one, 2020-04-01) traversing into a repository's directory tree when the traversal began outside the repository's standard location has failed because the encountered repository was identified as a nested foreign repository. Prior to this commit, the failure to traverse into a repository's default worktree location was observable from a user's perspective under either of the following conditions (there may be others): 1) Set the `core.worktree` location to a parent directory of the default worktree; or 2) Use the `--git_dir` option while the working directory is outside the repository's default worktree location Under either of these conditions, symptoms of the failure to traverse into the repository's default worktree location include the inability to add files to the index or get a list of untracked files via ls-files. This commit adds a check to determine whether a nested repository that is encountered in recursing a path is actually `the_repository`. If so, we simply treat the directory as if it doesn't contain a nested repository. The commit includes test-cases to reduce the likelihood of future regressions. Signed-off-by: Goss Geppert <ggossdev@gmail.com> --- dir.c | 19 +++ t/t2205-add-worktree-config.sh | 265 +++++++++++++++++++++++++++++++++ 2 files changed, 284 insertions(+) create mode 100755 t/t2205-add-worktree-config.sh diff --git a/dir.c b/dir.c index 6ca2ef5f04..c4f41247e0 100644 --- a/dir.c +++ b/dir.c @@ -1893,9 +1893,28 @@ static enum path_treatment treat_directory(struct dir_struct *dir, if ((dir->flags & DIR_SKIP_NESTED_GIT) || !(dir->flags & DIR_NO_GITLINKS)) { + /* + * Determine if `dirname` is a nested repo by confirming that: + * 1) we are in a nonbare repository, and + * 2) `dirname` is not an immediate parent of `the_repository->gitdir`, + * which could occur if the git_dir or worktree location was + * manually configured by the user; see t2205 testcases 1-3 for + * examples where this matters + */ struct strbuf sb = STRBUF_INIT; strbuf_addstr(&sb, dirname); nested_repo = is_nonbare_repository_dir(&sb); + + if (nested_repo) { + char *real_dirname, *real_gitdir; + strbuf_addstr(&sb, ".git"); + real_dirname = real_pathdup(sb.buf, 1); + real_gitdir = real_pathdup(the_repository->gitdir, 1); + + nested_repo = !!strcmp(real_dirname, real_gitdir); + free(real_gitdir); + free(real_dirname); + } strbuf_release(&sb); } if (nested_repo) { diff --git a/t/t2205-add-worktree-config.sh b/t/t2205-add-worktree-config.sh new file mode 100755 index 0000000000..43d950de64 --- /dev/null +++ b/t/t2205-add-worktree-config.sh @@ -0,0 +1,265 @@ +#!/bin/sh + +test_description='directory traversal respects user config + +This test verifies the traversal of the directory tree when the traversal begins +outside the repository. Two instances for which this can occur are tested: + + 1) The user manually sets the worktree. For this instance, the test sets + the worktree two levels above the `.git` directory and checks whether we + are able to add to the index those files that are in either (1) the + manually configured worktree directory or (2) the standard worktree + location with respect to the `.git` directory (i.e. ensuring that the + encountered `.git` directory is not treated as belonging to a foreign + nested repository). + 2) The user manually sets the `git_dir` while the working directory is + outside the repository. The test checks that files inside the + repository can be added to the index. + ' + +. ./test-lib.sh + +test_expect_success '1a: setup--config worktree' ' + mkdir test1 && + ( + cd test1 && + test_create_repo repo && + git --git-dir="repo/.git" config core.worktree "$(pwd)" && + + mkdir -p outside-tracked outside-untracked && + mkdir -p repo/inside-tracked repo/inside-untracked && + >file-tracked && + >file-untracked && + >outside-tracked/file && + >outside-untracked/file && + >repo/file-tracked && + >repo/file-untracked && + >repo/inside-tracked/file && + >repo/inside-untracked/file && + + cat >expect-tracked-unsorted <<-EOF && + ../file-tracked + ../outside-tracked/file + file-tracked + inside-tracked/file + EOF + + cat >expect-untracked-unsorted <<-EOF && + ../file-untracked + ../outside-untracked/file + file-untracked + inside-untracked/file + EOF + + cat >expect-all-dir-unsorted <<-EOF && + ../file-untracked + ../file-tracked + ../outside-untracked/ + ../outside-tracked/ + ./ + EOF + + cat expect-tracked-unsorted expect-untracked-unsorted >expect-all-unsorted && + + cat >.gitignore <<-EOF + .gitignore + actual-* + expect-* + EOF + ) +' + +test_expect_success '1b: pre-add all' ' + ( + cd test1 && + local parent_dir="$(pwd)" && + git -C repo ls-files -o --exclude-standard "$parent_dir" >actual-all-unsorted && + sort actual-all-unsorted >actual-all && + sort expect-all-unsorted >expect-all && + test_cmp expect-all actual-all + ) +' + +test_expect_success '1c: pre-add dir all' ' + ( + cd test1 && + local parent_dir="$(pwd)" && + git -C repo ls-files -o --directory --exclude-standard "$parent_dir" >actual-all-dir-unsorted && + sort actual-all-dir-unsorted >actual-all && + sort expect-all-dir-unsorted >expect-all && + test_cmp expect-all actual-all + ) +' + +test_expect_success '1d: post-add tracked' ' + ( + cd test1 && + local parent_dir="$(pwd)" && + ( + cd repo && + git add file-tracked && + git add inside-tracked && + git add ../outside-tracked && + git add "$parent_dir/file-tracked" && + git ls-files "$parent_dir" >../actual-tracked-unsorted + ) && + sort actual-tracked-unsorted >actual-tracked && + sort expect-tracked-unsorted >expect-tracked && + test_cmp expect-tracked actual-tracked + ) +' + +test_expect_success '1e: post-add untracked' ' + ( + cd test1 && + local parent_dir="$(pwd)" && + git -C repo ls-files -o --exclude-standard "$parent_dir" >actual-untracked-unsorted && + sort actual-untracked-unsorted >actual-untracked && + sort expect-untracked-unsorted >expect-untracked && + test_cmp expect-untracked actual-untracked + ) +' + +test_expect_success '2a: setup--set git-dir' ' + mkdir test2 && + ( + cd test2 && + test_create_repo repo && + # create two foreign repositories that should remain untracked + test_create_repo repo-outside && + test_create_repo repo/repo-inside && + + mkdir -p repo/inside-tracked repo/inside-untracked && + >repo/file-tracked && + >repo/file-untracked && + >repo/inside-tracked/file && + >repo/inside-untracked/file && + >repo-outside/file && + >repo/repo-inside/file && + + cat >expect-tracked-unsorted <<-EOF && + repo/file-tracked + repo/inside-tracked/file + EOF + + cat >expect-untracked-unsorted <<-EOF && + repo/file-untracked + repo/inside-untracked/file + repo/repo-inside/ + repo-outside/ + EOF + + cat >expect-all-dir-unsorted <<-EOF && + repo/ + repo-outside/ + EOF + + cat expect-tracked-unsorted expect-untracked-unsorted >expect-all-unsorted && + + cat >.gitignore <<-EOF + .gitignore + actual-* + expect-* + EOF + ) +' + +test_expect_success '2b: pre-add all' ' + ( + cd test2 && + git --git-dir=repo/.git ls-files -o --exclude-standard >actual-all-unsorted && + sort actual-all-unsorted >actual-all && + sort expect-all-unsorted >expect-all && + test_cmp expect-all actual-all + ) +' + +test_expect_success '2c: pre-add dir all' ' + ( + cd test2 && + git --git-dir=repo/.git ls-files -o --directory --exclude-standard >actual-all-dir-unsorted && + sort actual-all-dir-unsorted >actual-all && + sort expect-all-dir-unsorted >expect-all && + test_cmp expect-all actual-all + ) +' + +test_expect_success '2d: post-add tracked' ' + ( + cd test2 && + git --git-dir=repo/.git add repo/file-tracked && + git --git-dir=repo/.git add repo/inside-tracked && + git --git-dir=repo/.git ls-files >actual-tracked-unsorted && + sort actual-tracked-unsorted >actual-tracked && + sort expect-tracked-unsorted >expect-tracked && + test_cmp expect-tracked actual-tracked + ) +' + +test_expect_success '2e: post-add untracked' ' + ( + cd test2 && + git --git-dir=repo/.git ls-files -o --exclude-standard >actual-untracked-unsorted && + sort actual-untracked-unsorted >actual-untracked && + sort expect-untracked-unsorted >expect-untracked && + test_cmp expect-untracked actual-untracked + ) +' + +test_expect_success '3a: setup--add repo dir' ' + mkdir test3 && + ( + cd test3 && + test_create_repo repo && + + mkdir -p repo/inside-tracked repo/inside-ignored && + >repo/file-tracked && + >repo/file-ignored && + >repo/inside-tracked/file && + >repo/inside-ignored/file && + + cat >.gitignore <<-EOF && + .gitignore + actual-* + expect-* + *ignored + EOF + + cat >expect-tracked-unsorted <<-EOF && + repo/file-tracked + repo/inside-tracked/file + EOF + + cat >expect-ignored-unsorted <<-EOF + repo/file-ignored + repo/inside-ignored/ + .gitignore + actual-ignored-unsorted + expect-ignored-unsorted + expect-tracked-unsorted + EOF + ) +' + +test_expect_success '3b: ignored' ' + ( + cd test3 && + git --git-dir=repo/.git ls-files -io --directory --exclude-standard >actual-ignored-unsorted && + sort actual-ignored-unsorted >actual-ignored && + sort expect-ignored-unsorted >expect-ignored && + test_cmp expect-ignored actual-ignored + ) +' + +test_expect_success '3c: add repo' ' + ( + cd test3 && + git --git-dir=repo/.git add repo && + git --git-dir=repo/.git ls-files >actual-tracked-unsorted && + sort actual-tracked-unsorted >actual-tracked && + sort expect-tracked-unsorted >expect-tracked && + test_cmp expect-tracked actual-tracked + ) +' + +test_done -- 2.36.0 ^ permalink raw reply related [flat|nested] 31+ messages in thread
* [PATCH v4 2/2] dir: minor refactoring / clean-up [not found] ` <20220616234433.225-1-gg.oss@outlook.com> 2022-06-16 23:44 ` [PATCH v4 1/2] dir: traverse into repository Goss Geppert @ 2022-06-16 23:44 ` Goss Geppert 1 sibling, 0 replies; 31+ messages in thread From: Goss Geppert @ 2022-06-16 23:44 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, christian w, Elijah Newren, Derrick Stolee From: Goss Geppert <ggossdev@gmail.com> Narrow the scope of the `nested_repo` variable and conditional return statement to the block where the variable is set. Signed-off-by: Goss Geppert <ggossdev@gmail.com> --- dir.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-) diff --git a/dir.c b/dir.c index c4f41247e0..d7cfb08e44 100644 --- a/dir.c +++ b/dir.c @@ -1861,7 +1861,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, */ enum path_treatment state; int matches_how = 0; - int nested_repo = 0, check_only, stop_early; + int check_only, stop_early; int old_ignored_nr, old_untracked_nr; /* The "len-1" is to strip the final '/' */ enum exist_status status = directory_exists_in_index(istate, dirname, len-1); @@ -1901,6 +1901,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, * manually configured by the user; see t2205 testcases 1-3 for * examples where this matters */ + int nested_repo; struct strbuf sb = STRBUF_INIT; strbuf_addstr(&sb, dirname); nested_repo = is_nonbare_repository_dir(&sb); @@ -1916,12 +1917,13 @@ static enum path_treatment treat_directory(struct dir_struct *dir, free(real_dirname); } strbuf_release(&sb); - } - if (nested_repo) { - if ((dir->flags & DIR_SKIP_NESTED_GIT) || - (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) - return path_none; - return excluded ? path_excluded : path_untracked; + + if (nested_repo) { + if ((dir->flags & DIR_SKIP_NESTED_GIT) || + (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) + return path_none; + return excluded ? path_excluded : path_untracked; + } } if (!(dir->flags & DIR_SHOW_OTHER_DIRECTORIES)) { -- 2.36.0 ^ permalink raw reply related [flat|nested] 31+ messages in thread
end of thread, other threads:[~2022-06-22 5:01 UTC | newest] Thread overview: 31+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2022-05-05 20:32 [RFC PATCH 0/1] dir: consider worktree config in path recursion Goss Geppert 2022-05-05 20:32 ` [RFC PATCH 1/1] " Goss Geppert 2022-05-07 3:26 ` Elijah Newren 2022-05-07 17:59 ` oss dev 2022-05-06 17:02 ` [RFC PATCH 0/1] " Junio C Hamano 2022-05-06 20:00 ` oss dev 2022-05-10 17:15 ` [PATCH v2 0/2] " Goss Geppert 2022-05-10 17:15 ` [PATCH v2 1/2] " Goss Geppert 2022-05-11 16:37 ` Junio C Hamano 2022-05-20 19:45 ` oss dev 2022-05-24 14:29 ` Elijah Newren 2022-05-24 19:45 ` Junio C Hamano 2022-05-25 3:46 ` Elijah Newren 2022-05-11 23:07 ` Junio C Hamano 2022-05-20 20:01 ` oss dev 2022-05-23 19:23 ` Derrick Stolee 2022-05-30 18:48 ` oss dev 2022-05-10 17:15 ` [PATCH v2 2/2] dir: minor refactoring / clean-up Goss Geppert 2022-05-11 16:51 ` Junio C Hamano 2022-05-20 20:03 ` oss dev 2022-05-20 19:28 ` [PATCH v3 0/3] dir: traverse into repository Goss Geppert 2022-05-20 19:28 ` [PATCH v3 1/3] " Goss Geppert 2022-05-20 19:28 ` [PATCH v3 2/3] dir: cache git_dir's realpath Goss Geppert 2022-05-24 14:32 ` Elijah Newren 2022-05-20 19:28 ` [PATCH v3 3/3] dir: minor refactoring / clean-up Goss Geppert 2022-06-16 23:19 ` [PATCH v4 0/2] dir: traverse into repository Goss Geppert 2022-06-22 4:57 ` Elijah Newren [not found] ` <20220616231956.154-1-gg.oss@outlook.com> 2022-06-16 23:19 ` [PATCH v4 1/2] " Goss Geppert 2022-06-16 23:44 ` [PATCH v4 0/2] dir: traverse into repository (resending) Goss Geppert [not found] ` <20220616234433.225-1-gg.oss@outlook.com> 2022-06-16 23:44 ` [PATCH v4 1/2] dir: traverse into repository Goss Geppert 2022-06-16 23:44 ` [PATCH v4 2/2] dir: minor refactoring / clean-up Goss Geppert
Code repositories for project(s) associated with this public inbox https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).