From: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> To: git@vger.kernel.org Cc: Eric Sunshine <sunshine@sunshineco.com>, Elijah Newren <newren@gmail.com>, Derrick Stolee <stolee@gmail.com>, Jeff King <peff@peff.net>, Philip Oakley <philipoakley@iee.email>, Jeff Hostetler <jeffhost@microsoft.com>, Josh Steadmon <steadmon@google.com>, Elijah Newren <newren@gmail.com>, Elijah Newren <newren@gmail.com> Subject: [PATCH v3 6/8] dir: avoid unnecessary traversal into ignored directory Date: Sat, 08 May 2021 19:59:02 +0000 [thread overview] Message-ID: <66ffc7f02d08f3f07cb3cb2605b113a630f1e127.1620503945.git.gitgitgadget@gmail.com> (raw) In-Reply-To: <pull.1020.v3.git.git.1620503945.gitgitgadget@gmail.com> From: Elijah Newren <newren@gmail.com> The show_other_directories case in treat_directory() tried to handle both excludes and untracked files with the same logic, and mishandled both the excludes and the untracked files in the process, in different ways. Split that logic apart, and then focus on the logic for the excludes; a subsequent commit will address the logic for untracked files. For show_other_directories, an excluded directory means that every path underneath that directory will also be excluded. Given that the calling code requested to just show directories when everything under a directory had the same state (that's what the "DIR_SHOW_OTHER_DIRECTORIES" flag means), we generally do not need to traverse into such directories and can just immediately mark them as ignored (i.e. as path_excluded). The only reason we cannot just immediately return path_excluded is the DIR_HIDE_EMPTY_DIRECTORIES flag and the possibility that the ignored directory is an empty directory. The code previously treated DIR_SHOW_IGNORED_TOO in most cases as an exception as well, which was wrong. It can sometimes reduce the number of cases where we need to recurse (namely if DIR_SHOW_IGNORED_TOO_MODE_MATCHING is also set), but should not be able to increase the number of cases where we need to recurse. Fix the logic accordingly. Some sidenotes about possible confusion with dir.c: * "ignored" often refers to an untracked ignore", i.e. a file which is not tracked which matches one of the ignore/exclusion rules. But you can also have a "tracked ignore", a tracked file that happens to match one of the ignore/exclusion rules and which dir.c has to worry about since "git ls-files -c -i" is supposed to list them. * The dir code often uses "ignored" and "excluded" interchangeably, which you need to keep in mind while reading the code. Sadly, though, it can get very confusing since ignore rules can have exclusions, as in the last of the following .gitignore rules: .gitignore *~ *.log !settings.log In the last entry above, (pathspec->items[3].magic & PATHSPEC_EXCLUDE) will be true due the the '!' negating the rule. Someone might refer to this as "excluded". That means the file 'settings.log' will not match, and thus not be ignored. So we won't return path_excluded for it. So it's an exclude rule that prevents the file from being an exclude. The non-excluded rules are the ones that result in files being excludes. Great fun, eh? Sometimes it feels like dir.c needs its own glossary with its many definitions, including the multiply-defined terms. Reported-by: Jason Gore <Jason.Gore@microsoft.com> Signed-off-by: Elijah Newren <newren@gmail.com> --- dir.c | 44 +++++++++++++++++++++++++++++--------------- t/t7300-clean.sh | 2 +- 2 files changed, 30 insertions(+), 16 deletions(-) diff --git a/dir.c b/dir.c index dfb174227b36..3f2cfef2c2bb 100644 --- a/dir.c +++ b/dir.c @@ -1844,6 +1844,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, } /* This is the "show_other_directories" case */ + assert(dir->flags & DIR_SHOW_OTHER_DIRECTORIES); /* * If we have a pathspec which could match something _below_ this @@ -1854,27 +1855,40 @@ static enum path_treatment treat_directory(struct dir_struct *dir, if (matches_how == MATCHED_RECURSIVELY_LEADING_PATHSPEC) return path_recurse; + /* Special cases for where this directory is excluded/ignored */ + if (excluded) { + /* + * In the show_other_directories case, if we're not + * hiding empty directories, there is no need to + * recurse into an ignored directory. + */ + if (!(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES)) + return path_excluded; + + /* + * Even if we are hiding empty directories, we can still avoid + * recursing into ignored directories for DIR_SHOW_IGNORED_TOO + * if DIR_SHOW_IGNORED_TOO_MODE_MATCHING is also set. + */ + if ((dir->flags & DIR_SHOW_IGNORED_TOO) && + (dir->flags & DIR_SHOW_IGNORED_TOO_MODE_MATCHING)) + return path_excluded; + } + /* - * Other than the path_recurse case immediately above, we only need - * to recurse into untracked/ignored directories if either of the - * following bits is set: + * Other than the path_recurse case above, we only need to + * recurse into untracked directories if either of the following + * bits is set: * - DIR_SHOW_IGNORED_TOO (because then we need to determine if * there are ignored entries below) * - DIR_HIDE_EMPTY_DIRECTORIES (because we have to determine if * the directory is empty) */ - if (!(dir->flags & (DIR_SHOW_IGNORED_TOO | DIR_HIDE_EMPTY_DIRECTORIES))) - return excluded ? path_excluded : path_untracked; - - /* - * ...and even if DIR_SHOW_IGNORED_TOO is set, we can still avoid - * recursing into ignored directories if the path is excluded and - * DIR_SHOW_IGNORED_TOO_MODE_MATCHING is also set. - */ - if (excluded && - (dir->flags & DIR_SHOW_IGNORED_TOO) && - (dir->flags & DIR_SHOW_IGNORED_TOO_MODE_MATCHING)) - return path_excluded; + if (!excluded && + !(dir->flags & (DIR_SHOW_IGNORED_TOO | + DIR_HIDE_EMPTY_DIRECTORIES))) { + return path_untracked; + } /* * Even if we don't want to know all the paths under an untracked or diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh index 74d395838708..a1d695ee9fe9 100755 --- a/t/t7300-clean.sh +++ b/t/t7300-clean.sh @@ -746,7 +746,7 @@ test_expect_success 'clean untracked paths by pathspec' ' test_must_be_empty actual ' -test_expect_failure 'avoid traversing into ignored directories' ' +test_expect_success 'avoid traversing into ignored directories' ' test_when_finished rm -f output error trace.* && test_create_repo avoid-traversing-deep-hierarchy && ( -- gitgitgadget
next prev parent reply other threads:[~2021-05-08 19:59 UTC|newest] Thread overview: 90+ messages / expand[flat|nested] mbox.gz Atom feed top 2021-05-07 4:04 [PATCH 0/5] Directory traversal fixes Elijah Newren via GitGitGadget 2021-05-07 4:04 ` [PATCH 1/5] t7300: add testcase showing unnecessary traversal into ignored directory Elijah Newren via GitGitGadget 2021-05-07 4:27 ` Eric Sunshine 2021-05-07 5:00 ` Elijah Newren 2021-05-07 5:31 ` Eric Sunshine 2021-05-07 5:42 ` Elijah Newren 2021-05-07 5:56 ` Eric Sunshine 2021-05-07 23:05 ` Jeff King 2021-05-07 23:15 ` Eric Sunshine 2021-05-08 0:04 ` Elijah Newren 2021-05-08 0:10 ` Eric Sunshine 2021-05-08 17:20 ` Elijah Newren 2021-05-08 11:13 ` Philip Oakley 2021-05-08 17:20 ` Elijah Newren 2021-05-07 4:04 ` [PATCH 2/5] t3001, t7300: add testcase showcasing missed directory traversal Elijah Newren via GitGitGadget 2021-05-07 4:04 ` [PATCH 3/5] dir: avoid unnecessary traversal into ignored directory Elijah Newren via GitGitGadget 2021-05-07 4:04 ` [PATCH 4/5] dir: traverse into untracked directories if they may have ignored subfiles Elijah Newren via GitGitGadget 2021-05-07 4:05 ` [PATCH 5/5] [RFC] ls-files: error out on -i unless -o or -c are specified Elijah Newren via GitGitGadget 2021-05-07 16:22 ` [PATCH 6/5] dir: update stale description of treat_directory() Derrick Stolee 2021-05-07 17:57 ` Elijah Newren 2021-05-07 16:27 ` [PATCH 0/5] Directory traversal fixes Derrick Stolee 2021-05-08 0:08 ` [PATCH v2 0/8] " Elijah Newren via GitGitGadget 2021-05-08 0:08 ` [PATCH v2 1/8] t7300: add testcase showing unnecessary traversal into ignored directory Elijah Newren via GitGitGadget 2021-05-08 10:13 ` Junio C Hamano 2021-05-08 17:34 ` Elijah Newren 2021-05-08 10:19 ` Junio C Hamano 2021-05-08 17:41 ` Elijah Newren 2021-05-08 0:08 ` [PATCH v2 2/8] t3001, t7300: add testcase showcasing missed directory traversal Elijah Newren via GitGitGadget 2021-05-08 0:08 ` [PATCH v2 3/8] dir: avoid unnecessary traversal into ignored directory Elijah Newren via GitGitGadget 2021-05-08 0:08 ` [PATCH v2 4/8] dir: traverse into untracked directories if they may have ignored subfiles Elijah Newren via GitGitGadget 2021-05-08 0:08 ` [PATCH v2 5/8] [RFC] ls-files: error out on -i unless -o or -c are specified Elijah Newren via GitGitGadget 2021-05-08 0:08 ` [PATCH v2 6/8] dir: update stale description of treat_directory() Derrick Stolee via GitGitGadget 2021-05-08 0:08 ` [PATCH v2 7/8] [RFC] dir: convert trace calls to trace2 equivalents Elijah Newren via GitGitGadget 2021-05-08 0:08 ` [PATCH v2 8/8] [RFC] dir: reported number of visited directories and paths with trace2 Elijah Newren via GitGitGadget 2021-05-08 19:58 ` [PATCH v3 0/8] Directory traversal fixes Elijah Newren via GitGitGadget 2021-05-08 19:58 ` [PATCH v3 1/8] [RFC] dir: convert trace calls to trace2 equivalents Elijah Newren via GitGitGadget 2021-05-10 4:49 ` Junio C Hamano 2021-05-11 17:23 ` Elijah Newren 2021-05-11 16:17 ` Jeff Hostetler 2021-05-11 17:29 ` Elijah Newren 2021-05-08 19:58 ` [PATCH v3 2/8] [RFC] dir: report number of visited directories and paths with trace2 Elijah Newren via GitGitGadget 2021-05-10 5:00 ` Junio C Hamano 2021-05-08 19:58 ` [PATCH v3 3/8] [RFC] ls-files: error out on -i unless -o or -c are specified Elijah Newren via GitGitGadget 2021-05-10 5:09 ` Junio C Hamano 2021-05-11 17:40 ` Elijah Newren 2021-05-11 22:32 ` Junio C Hamano 2021-05-08 19:59 ` [PATCH v3 4/8] t7300: add testcase showing unnecessary traversal into ignored directory Elijah Newren via GitGitGadget 2021-05-10 5:28 ` Junio C Hamano 2021-05-11 17:45 ` Elijah Newren 2021-05-11 22:43 ` Junio C Hamano 2021-05-12 2:07 ` Elijah Newren 2021-05-12 3:17 ` Junio C Hamano 2021-05-08 19:59 ` [PATCH v3 5/8] t3001, t7300: add testcase showcasing missed directory traversal Elijah Newren via GitGitGadget 2021-05-08 19:59 ` Elijah Newren via GitGitGadget [this message] 2021-05-10 5:48 ` [PATCH v3 6/8] dir: avoid unnecessary traversal into ignored directory Junio C Hamano 2021-05-11 17:57 ` Elijah Newren 2021-05-08 19:59 ` [PATCH v3 7/8] dir: traverse into untracked directories if they may have ignored subfiles Elijah Newren via GitGitGadget 2021-05-08 19:59 ` [PATCH v3 8/8] dir: update stale description of treat_directory() Derrick Stolee via GitGitGadget 2021-05-11 18:34 ` [PATCH v4 0/8] Directory traversal fixes Elijah Newren via GitGitGadget 2021-05-11 18:34 ` [PATCH v4 1/8] dir: convert trace calls to trace2 equivalents Elijah Newren via GitGitGadget 2021-05-11 19:06 ` Jeff Hostetler 2021-05-11 20:12 ` Elijah Newren 2021-05-11 23:12 ` Jeff Hostetler 2021-05-12 0:44 ` Elijah Newren 2021-05-12 12:26 ` Jeff Hostetler 2021-05-12 15:24 ` Elijah Newren 2021-05-11 18:34 ` [PATCH v4 2/8] dir: report number of visited directories and paths with trace2 Elijah Newren via GitGitGadget 2021-05-11 18:34 ` [PATCH v4 3/8] ls-files: error out on -i unless -o or -c are specified Elijah Newren via GitGitGadget 2021-05-11 18:34 ` [PATCH v4 4/8] t7300: add testcase showing unnecessary traversal into ignored directory Elijah Newren via GitGitGadget 2021-05-11 18:34 ` [PATCH v4 5/8] t3001, t7300: add testcase showcasing missed directory traversal Elijah Newren via GitGitGadget 2021-05-11 18:34 ` [PATCH v4 6/8] dir: avoid unnecessary traversal into ignored directory Elijah Newren via GitGitGadget 2021-05-11 18:34 ` [PATCH v4 7/8] dir: traverse into untracked directories if they may have ignored subfiles Elijah Newren via GitGitGadget 2021-05-11 18:34 ` [PATCH v4 8/8] dir: update stale description of treat_directory() Derrick Stolee via GitGitGadget 2021-05-12 17:28 ` [PATCH v5 0/9] Directory traversal fixes Elijah Newren via GitGitGadget 2021-05-12 17:28 ` [PATCH v5 1/9] dir: convert trace calls to trace2 equivalents Elijah Newren via GitGitGadget 2021-05-12 17:28 ` [PATCH v5 2/9] dir: report number of visited directories and paths with trace2 Elijah Newren via GitGitGadget 2021-05-12 17:28 ` [PATCH v5 3/9] ls-files: error out on -i unless -o or -c are specified Elijah Newren via GitGitGadget 2021-05-12 17:28 ` [PATCH v5 4/9] t7300: add testcase showing unnecessary traversal into ignored directory Elijah Newren via GitGitGadget 2021-05-12 17:28 ` [PATCH v5 5/9] t3001, t7300: add testcase showcasing missed directory traversal Elijah Newren via GitGitGadget 2021-05-12 17:28 ` [PATCH v5 6/9] dir: avoid unnecessary traversal into ignored directory Elijah Newren via GitGitGadget 2021-05-12 17:28 ` [PATCH v5 7/9] dir: traverse into untracked directories if they may have ignored subfiles Elijah Newren via GitGitGadget 2021-05-12 17:28 ` [PATCH v5 8/9] dir: update stale description of treat_directory() Derrick Stolee via GitGitGadget 2021-05-17 17:20 ` Derrick Stolee 2021-05-17 19:44 ` Junio C Hamano 2021-05-18 3:32 ` Elijah Newren 2021-05-19 1:44 ` Junio C Hamano 2021-05-12 17:28 ` [PATCH v5 9/9] dir: introduce readdir_skip_dot_and_dotdot() helper Elijah Newren via GitGitGadget 2021-05-17 17:22 ` Derrick Stolee 2021-05-18 3:34 ` Elijah Newren 2021-05-17 17:23 ` [PATCH v5 0/9] Directory traversal fixes Derrick Stolee
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style List information: http://vger.kernel.org/majordomo-info.html * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=66ffc7f02d08f3f07cb3cb2605b113a630f1e127.1620503945.git.gitgitgadget@gmail.com \ --to=gitgitgadget@gmail.com \ --cc=git@vger.kernel.org \ --cc=jeffhost@microsoft.com \ --cc=newren@gmail.com \ --cc=peff@peff.net \ --cc=philipoakley@iee.email \ --cc=steadmon@google.com \ --cc=stolee@gmail.com \ --cc=sunshine@sunshineco.com \ --subject='Re: [PATCH v3 6/8] dir: avoid unnecessary traversal into ignored directory' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Code repositories for project(s) associated with this inbox: https://80x24.org/mirrors/git.git This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).