git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Elijah Newren <newren@gmail.com>
To: git@vger.kernel.org
Cc: sxlijin@gmail.com, peff@peff.net, Elijah Newren <newren@gmail.com>
Subject: [RFC PATCH 6/7] dir: If our pathspec might match files under a dir, recurse into it
Date: Thu,  5 Apr 2018 10:34:45 -0700	[thread overview]
Message-ID: <20180405173446.32372-7-newren@gmail.com> (raw)
In-Reply-To: <20180405173446.32372-1-newren@gmail.com>

For git clean, if a directory is entirely untracked and the user did not
specify -d (corresponding to DIR_SHOW_IGNORED_TOO), then we usually do
not want to remove that directory and thus do not recurse into it.
However, if the user manually specified specific (or even globbed) paths
somewhere under that directory to remove, then we need to recurse into
the directory to make sure we remove the relevant paths under that
directory as the user requested.

Note that this does not mean that the recursed-into directory will be
added to dir->entries for later removal; as of a few commits earlier in
this series, there is another more strict match check that is run after
returning from a recursed-into directory before deciding to add it to the
list of entries.  Therefore, this will only result in files underneath
the given directory which match one of the pathspecs being added to the
entries list.

Two particular considerations for this patch:

  * If we want to only recurse into a directory when it is specifically
    matched rather than matched-via-glob (e.g. '*.c'), then we could do
    so via making the final non-zero return in match_pathspec_item be
    MATCHED_RECURSIVELY instead of MATCHED_RECURSIVELY_LEADING_PATHSPEC.
    (See final patch of this RFC series for details; note that the
    relative order of MATCHED_RECURSIVELY_LEADING_PATHSPEC and
    MATCHED_RECURSIVELY are important for such a change.))

  * There is a growing amount of logic in read_directory_recursive() for
    deciding whether to recurse into a subdirectory.  However, there is
    a comment immediately preceding this logic that says to recurse if
    instructed by treat_path().   It may be better for the logic in
    read_directory_recursive() to be moved to treat_path() (or another
    function it calls, such as treat_directory()), but I did not feel
    strongly about this and just left the logic where it was while
    adding to it.  Do others have strong opinions on this?

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.c            | 10 ++++++----
 dir.h            |  5 +++--
 t/t7300-clean.sh |  4 ++--
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/dir.c b/dir.c
index b0bca179fd..f55e24f149 100644
--- a/dir.c
+++ b/dir.c
@@ -386,7 +386,7 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
 		if ((namelen < matchlen) &&
 		    (match[namelen-1] == '/') &&
 		    !ps_strncmp(item, match, name, namelen))
-			return MATCHED_RECURSIVELY;
+			return MATCHED_RECURSIVELY_LEADING_PATHSPEC;
 
 		/* name" doesn't match up to the first wild character */
 		if (item->nowildcard_len < item->len &&
@@ -403,7 +403,7 @@ static int match_pathspec_item(const struct pathspec_item *item, int prefix,
 		 * The submodules themselves will be able to perform more
 		 * accurate matching to determine if the pathspec matches.
 		 */
-		return MATCHED_RECURSIVELY;
+		return MATCHED_RECURSIVELY_LEADING_PATHSPEC;
 	}
 
 	return 0;
@@ -1961,8 +1961,10 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		/* recurse into subdir if instructed by treat_path */
 		if ((state == path_recurse) ||
 			((state == path_untracked) &&
-			 (dir->flags & DIR_SHOW_IGNORED_TOO) &&
-			 (get_dtype(cdir.de, istate, path.buf, path.len) == DT_DIR))) {
+			 (get_dtype(cdir.de, istate, path.buf, path.len) == DT_DIR) &&
+			 ((dir->flags & DIR_SHOW_IGNORED_TOO) ||
+			  do_match_pathspec(pathspec, path.buf, path.len,
+					    baselen, NULL, DO_MATCH_LEADING_PATHSPEC) == MATCHED_RECURSIVELY_LEADING_PATHSPEC))) {
 			struct untracked_cache_dir *ud;
 			ud = lookup_untracked(dir->untracked, untracked,
 					      path.buf + baselen,
diff --git a/dir.h b/dir.h
index b0758b82a2..0573f99ae0 100644
--- a/dir.h
+++ b/dir.h
@@ -210,8 +210,9 @@ extern int count_slashes(const char *s);
  * when populating the seen[] array.
  */
 #define MATCHED_RECURSIVELY 1
-#define MATCHED_FNMATCH 2
-#define MATCHED_EXACTLY 3
+#define MATCHED_RECURSIVELY_LEADING_PATHSPEC 2
+#define MATCHED_FNMATCH 3
+#define MATCHED_EXACTLY 4
 extern int simple_length(const char *match);
 extern int no_wildcard(const char *string);
 extern char *common_prefix(const struct pathspec *pathspec);
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 31f2d0d8ba..889b3401e4 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -669,7 +669,7 @@ test_expect_success 'git clean -d skips untracked dirs containing ignored files'
 	test_path_is_missing foo/b/bb
 '
 
-test_expect_failure 'git clean handles being told what to clean' '
+test_expect_success 'git clean handles being told what to clean' '
 	mkdir -p d1 d2 &&
 	touch d1/ut d2/ut &&
 	git clean -f */ut &&
@@ -685,7 +685,7 @@ test_expect_success 'git clean handles being told what to clean, with -d' '
 	test_path_is_missing d2/ut
 '
 
-test_expect_failure 'git clean handles being told a glob to clean' '
+test_expect_success 'git clean handles being told a glob to clean' '
 	mkdir -p d1 d2 &&
 	touch d1/ut d2/ut &&
 	git clean -f "*ut" &&
-- 
2.17.0.7.g0b50f94d69


  parent reply	other threads:[~2018-04-05 17:34 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-04-05 17:34 [RFC PATCH 0/7] Fix `git clean` with pathspecs Elijah Newren
2018-04-05 17:34 ` [RFC PATCH 1/7] dir.c: Fix typo in comment Elijah Newren
2018-04-05 17:34 ` [RFC PATCH 2/7] dir.c: fix off-by-one error in match_pathspec_item Elijah Newren
2018-04-05 17:49   ` Jeff King
2018-04-05 18:36     ` Elijah Newren
2018-04-05 19:04       ` Jeff King
2018-04-05 20:06         ` Elijah Newren
2018-04-06 17:53           ` Jeff King
2018-04-05 17:34 ` [RFC PATCH 3/7] t7300: Add some testcases showing failure to clean specified pathspecs Elijah Newren
2018-04-05 17:34 ` [RFC PATCH 4/7] dir: Directories should be checked for matching pathspecs too Elijah Newren
2018-04-05 18:58   ` Jeff King
2018-04-05 19:15     ` Elijah Newren
2018-04-05 19:31       ` Jeff King
2018-04-09  2:07         ` Junio C Hamano
2018-04-05 17:34 ` [RFC PATCH 5/7] dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2018-04-05 17:34 ` Elijah Newren [this message]
2018-04-05 17:34 ` [RFC PATCH 7/7] If we do not want globs to recurse into subdirs without -d Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180405173446.32372-7-newren@gmail.com \
    --to=newren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    --cc=sxlijin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).