git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
From: Elijah Newren <newren@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Rafael Ascensão" <rafa.almas@gmail.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>,
	"Samuel Lijin" <sxlijin@gmail.com>,
	"Elijah Newren" <newren@gmail.com>
Subject: [PATCH v4 06/12] dir: if our pathspec might match files under a dir, recurse into it
Date: Tue, 17 Sep 2019 09:34:58 -0700
Message-ID: <20190917163504.14566-7-newren@gmail.com> (raw)
In-Reply-To: <20190917163504.14566-1-newren@gmail.com>

For git clean, if a directory is entirely untracked and the user did not
specify -d (corresponding to DIR_SHOW_IGNORED_TOO), then we usually do
not want to remove that directory and thus do not recurse into it.
However, if the user manually specified specific (or even globbed) paths
somewhere under that directory to remove, then we need to recurse into
the directory to make sure we remove the relevant paths under that
directory as the user requested.

Note that this does not mean that the recursed-into directory will be
added to dir->entries for later removal; as of a few commits earlier in
this series, there is another more strict match check that is run after
returning from a recursed-into directory before deciding to add it to the
list of entries.  Therefore, this will only result in files underneath
the given directory which match one of the pathspecs being added to the
entries list.

Two notes of potential interest to future readers:

  * If we wanted to only recurse into a directory when it is specifically
    matched rather than matched-via-glob (e.g. '*.c'), then we could do
    so via making the final non-zero return in match_pathspec_item be
    MATCHED_RECURSIVELY instead of MATCHED_RECURSIVELY_LEADING_PATHSPEC.
    (Note that the relative order of MATCHED_RECURSIVELY_LEADING_PATHSPEC
    and MATCHED_RECURSIVELY are important for such a change.)  I was
    leaving open that possibility while writing an RFC asking for the
    behavior we want, but even though we don't want it, that knowledge
    might help you understand the code flow better.

  * There is a growing amount of logic in read_directory_recursive() for
    deciding whether to recurse into a subdirectory.  However, there is a
    comment immediately preceding this logic that says to recurse if
    instructed by treat_path().   It may be better for the logic in
    read_directory_recursive() to ultimately be moved to treat_path() (or
    another function it calls, such as treat_directory()), but I have
    left that for someone else to tackle in the future.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 dir.c            | 10 ++++++----
 dir.h            |  5 +++--
 t/t7300-clean.sh |  4 ++--
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/dir.c b/dir.c
index b4d656192e..47c0a99cb5 100644
--- a/dir.c
+++ b/dir.c
@@ -360,7 +360,7 @@ static int match_pathspec_item(const struct index_state *istate,
 		if ((namelen < matchlen) &&
 		    (match[namelen-offset] == '/') &&
 		    !ps_strncmp(item, match, name, namelen))
-			return MATCHED_RECURSIVELY;
+			return MATCHED_RECURSIVELY_LEADING_PATHSPEC;
 
 		/* name" doesn't match up to the first wild character */
 		if (item->nowildcard_len < item->len &&
@@ -377,7 +377,7 @@ static int match_pathspec_item(const struct index_state *istate,
 		 * The submodules themselves will be able to perform more
 		 * accurate matching to determine if the pathspec matches.
 		 */
-		return MATCHED_RECURSIVELY;
+		return MATCHED_RECURSIVELY_LEADING_PATHSPEC;
 	}
 
 	return 0;
@@ -1939,8 +1939,10 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		/* recurse into subdir if instructed by treat_path */
 		if ((state == path_recurse) ||
 			((state == path_untracked) &&
-			 (dir->flags & DIR_SHOW_IGNORED_TOO) &&
-			 (get_dtype(cdir.de, istate, path.buf, path.len) == DT_DIR))) {
+			 (get_dtype(cdir.de, istate, path.buf, path.len) == DT_DIR) &&
+			 ((dir->flags & DIR_SHOW_IGNORED_TOO) ||
+			  do_match_pathspec(istate, pathspec, path.buf, path.len,
+					    baselen, NULL, DO_MATCH_LEADING_PATHSPEC) == MATCHED_RECURSIVELY_LEADING_PATHSPEC))) {
 			struct untracked_cache_dir *ud;
 			ud = lookup_untracked(dir->untracked, untracked,
 					      path.buf + baselen,
diff --git a/dir.h b/dir.h
index 680079bbe3..46c238ab49 100644
--- a/dir.h
+++ b/dir.h
@@ -211,8 +211,9 @@ int count_slashes(const char *s);
  * when populating the seen[] array.
  */
 #define MATCHED_RECURSIVELY 1
-#define MATCHED_FNMATCH 2
-#define MATCHED_EXACTLY 3
+#define MATCHED_RECURSIVELY_LEADING_PATHSPEC 2
+#define MATCHED_FNMATCH 3
+#define MATCHED_EXACTLY 4
 int simple_length(const char *match);
 int no_wildcard(const char *string);
 char *common_prefix(const struct pathspec *pathspec);
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 12617158db..d83aeb7dc2 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -691,7 +691,7 @@ test_expect_failure 'git clean -d skips nested repo containing ignored files' '
 	test_path_is_file nested-repo-with-ignored-file/file
 '
 
-test_expect_failure 'git clean handles being told what to clean' '
+test_expect_success 'git clean handles being told what to clean' '
 	mkdir -p d1 d2 &&
 	touch d1/ut d2/ut &&
 	git clean -f */ut &&
@@ -707,7 +707,7 @@ test_expect_success 'git clean handles being told what to clean, with -d' '
 	test_path_is_missing d2/ut
 '
 
-test_expect_failure 'git clean works if a glob is passed without -d' '
+test_expect_success 'git clean works if a glob is passed without -d' '
 	mkdir -p d1 d2 &&
 	touch d1/ut d2/ut &&
 	git clean -f "*ut" &&
-- 
2.22.1.17.g6e632477f7.dirty


  parent reply index

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-25 18:59 [PATCH] t7300-clean: demonstrate deleting nested repo with an ignored file breakage SZEDER Gábor
2019-08-25 20:34 ` SZEDER Gábor
2019-08-25 22:32 ` Philip Oakley
2019-08-26  7:48   ` SZEDER Gábor
2019-09-05 15:47 ` [RFC PATCH v2 00/12] Fix some git clean issues Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 01/12] t7300: Add some testcases showing failure to clean specified pathspecs Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 02/12] dir: fix typo in comment Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 03/12] dir: fix off-by-one error in match_pathspec_item Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 04/12] dir: Directories should be checked for matching pathspecs too Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 05/12] dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 06/12] dir: If our pathspec might match files under a dir, recurse into it Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 07/12] dir: add commentary explaining match_pathspec_item's return value Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 08/12] git-clean.txt: do not claim we will delete files with -n/--dry-run Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 09/12] clean: disambiguate the definition of -d Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 10/12] clean: avoid removing untracked files in a nested git repository Elijah Newren
2019-09-05 21:20     ` SZEDER Gábor
2019-09-05 15:47   ` [RFC PATCH v2 11/12] clean: rewrap overly long line Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 12/12] clean: fix theoretical path corruption Elijah Newren
2019-09-05 19:27     ` SZEDER Gábor
2019-09-07  0:34       ` Elijah Newren
2019-09-05 19:01   ` [RFC PATCH v2 00/12] Fix some git clean issues SZEDER Gábor
2019-09-07  0:33     ` Elijah Newren
2019-09-12 22:12   ` [PATCH v3 " Elijah Newren
2019-09-12 22:12     ` [PATCH v3 01/12] t7300: add testcases showing failure to clean specified pathspecs Elijah Newren
2019-09-13 18:54       ` Junio C Hamano
2019-09-13 19:10         ` Elijah Newren
2019-09-13 20:29           ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 02/12] dir: fix typo in comment Elijah Newren
2019-09-12 22:12     ` [PATCH v3 03/12] dir: fix off-by-one error in match_pathspec_item Elijah Newren
2019-09-13 19:05       ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 04/12] dir: also check directories for matching pathspecs Elijah Newren
2019-09-12 22:12     ` [PATCH v3 05/12] dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2019-09-12 22:12     ` [PATCH v3 06/12] dir: if our pathspec might match files under a dir, recurse into it Elijah Newren
2019-09-13 19:45       ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 07/12] dir: add commentary explaining match_pathspec_item's return value Elijah Newren
2019-09-13 20:04       ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 08/12] git-clean.txt: do not claim we will delete files with -n/--dry-run Elijah Newren
2019-09-12 22:12     ` [PATCH v3 09/12] clean: disambiguate the definition of -d Elijah Newren
2019-09-12 22:12     ` [PATCH v3 10/12] clean: avoid removing untracked files in a nested git repository Elijah Newren
2019-09-12 22:12     ` [PATCH v3 11/12] clean: rewrap overly long line Elijah Newren
2019-09-12 22:12     ` [PATCH v3 12/12] clean: fix theoretical path corruption Elijah Newren
2019-09-17 16:34     ` [PATCH v4 00/12] Fix some git clean issues Elijah Newren
2019-09-17 16:34       ` [PATCH v4 01/12] t7300: add testcases showing failure to clean specified pathspecs Elijah Newren
2019-09-17 16:34       ` [PATCH v4 02/12] dir: fix typo in comment Elijah Newren
2019-09-17 16:34       ` [PATCH v4 03/12] dir: fix off-by-one error in match_pathspec_item Elijah Newren
2019-09-17 16:34       ` [PATCH v4 04/12] dir: also check directories for matching pathspecs Elijah Newren
2019-09-25 20:39         ` [BUG] git is segfaulting, was " Denton Liu
2019-09-25 21:28           ` Elijah Newren
2019-09-25 21:55             ` Denton Liu
2019-09-26 20:35               ` Denton Liu
2019-09-27  0:12                 ` Elijah Newren
2019-09-27  1:09           ` SZEDER Gábor
2019-09-27  2:17             ` SZEDER Gábor
2019-09-27 17:10               ` Denton Liu
2019-09-30 19:11                 ` [PATCH] dir: special case check for the possibility that pathspec is NULL Elijah Newren
2019-09-30 22:31                   ` Denton Liu
2019-10-01  7:01                     ` Elijah Newren
2019-10-01 18:30                   ` [PATCH v2] " Elijah Newren
2019-10-01 18:40                     ` Denton Liu
2019-10-01 18:54                       ` Elijah Newren
2019-10-01 18:55                       ` [PATCH v3] " Elijah Newren
2019-10-01 19:35                         ` Denton Liu
2019-10-01 19:39                           ` Elijah Newren
2019-10-02 15:51                             ` Elijah Newren
2019-10-07 18:04                         ` SZEDER Gábor
2019-09-17 16:34       ` [PATCH v4 05/12] dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2019-09-17 16:34       ` Elijah Newren [this message]
2019-09-17 16:34       ` [PATCH v4 07/12] dir: add commentary explaining match_pathspec_item's return value Elijah Newren
2019-09-17 16:35       ` [PATCH v4 08/12] git-clean.txt: do not claim we will delete files with -n/--dry-run Elijah Newren
2019-09-17 16:35       ` [PATCH v4 09/12] clean: disambiguate the definition of -d Elijah Newren
2019-09-17 16:35       ` [PATCH v4 10/12] clean: avoid removing untracked files in a nested git repository Elijah Newren
2019-09-17 16:35       ` [PATCH v4 11/12] clean: rewrap overly long line Elijah Newren
2019-09-17 16:35       ` [PATCH v4 12/12] clean: fix theoretical path corruption Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190917163504.14566-7-newren@gmail.com \
    --to=newren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=rafa.almas@gmail.com \
    --cc=sxlijin@gmail.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Example config snippet for mirrors

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git