git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / Atom feed
From: Elijah Newren <newren@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Rafael Ascensão" <rafa.almas@gmail.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>,
	"Samuel Lijin" <sxlijin@gmail.com>,
	"Elijah Newren" <newren@gmail.com>
Subject: [PATCH v3 09/12] clean: disambiguate the definition of -d
Date: Thu, 12 Sep 2019 15:12:37 -0700
Message-ID: <20190912221240.18057-10-newren@gmail.com> (raw)
In-Reply-To: <20190912221240.18057-1-newren@gmail.com>

The -d flag pre-dated git-clean's ability to have paths specified.  As
such, the default for git-clean was to only remove untracked files in
the current directory, and -d existed to allow it to recurse into
subdirectories.

The interaction of paths and the -d option appears to not have been
carefully considered, as evidenced by numerous bugs and a dearth of
tests covering such pairings in the testsuite.  The definition turns out
to be important, so let's look at some of the various ways one could
interpret the -d option:

  A) Without -d, only look in subdirectories which contain tracked
     files under them; with -d, also look in subdirectories which
     are untracked for files to clean.

  B) Without specified paths from the user for us to delete, we need to
     have some kind of default, so...without -d, only look in
     subdirectories which contain tracked files under them; with -d,
     also look in subdirectories which are untracked for files to clean.

The important distinction here is that choice B says that the presence
or absence of '-d' is irrelevant if paths are specified.  The logic
behind option B is that if a user explicitly asked us to clean a
specified pathspec, then we should clean anything that matches that
pathspec.  Some examples may clarify.  Should

   git clean -f untracked_dir/file

remove untracked_dir/file or not?  It seems crazy not to, but a strict
reading of option A says it shouldn't be removed.  How about

   git clean -f untracked_dir/file1 tracked_dir/file2

or

   git clean -f untracked_dir_1/file1 untracked_dir_2/file2

?  Should it remove either or both of these files?  Should it require
multiple runs to remove both the files listed?  (If this sounds like a
crazy question to even ask, see the commit message of "t7300: Add some
testcases showing failure to clean specified pathspecs" added earlier in
this patch series.)  What if -ffd were used instead of -f -- should that
allow these to be removed?  Should it take multiple invocations with
-ffd?  What if a glob (such as '*tracked*') were used instead of
spelling out the directory names?  What if the filenames involved globs,
such as

   git clean -f '*.o'

or

   git clean -f '*/*.o'

?

The current documentation actually suggests a definition that is
slightly different than choice A, and the implementation prior to this
series provided something radically different than either choices A or
B. (The implementation, though, was clearly just buggy).  There may be
other choices as well.  However, for almost any given choice of
definition for -d that I can think of, some of the examples above will
appear buggy to the user.  The only case that doesn't have negative
surprises is choice B: treat a user-specified path as a request to clean
all untracked files which match that path specification, including
recursing into any untracked directories.

Change the documentation and basic implementation to use this
definition.

There were two regression tests that indirectly depended on the current
implementation, but neither was about subdirectory handling.  These two
tests were introduced in commit 5b7570cfb41c ("git-clean: add tests for
relative path", 2008-03-07) which was solely created to add coverage for
the changes in commit fb328947c8e ("git-clean: correct printing relative
path", 2008-03-07).  Both tests specified a directory that happened to
have an untracked subdirectory, but both were only checking that the
resulting printout of a file that was removed was shown with a relative
path.  Update these tests appropriately.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 Documentation/git-clean.txt | 10 ++++++----
 builtin/clean.c             |  8 ++++++++
 t/t7300-clean.sh            |  2 ++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
index e84ffc9396..3ab749b921 100644
--- a/Documentation/git-clean.txt
+++ b/Documentation/git-clean.txt
@@ -26,10 +26,12 @@ are affected.
 OPTIONS
 -------
 -d::
-	Remove untracked directories in addition to untracked files.
-	If an untracked directory is managed by a different Git
-	repository, it is not removed by default.  Use -f option twice
-	if you really want to remove such a directory.
+	Normally, when no <path> is specified, git clean will not
+	recurse into untracked directories to avoid removing too much.
+	Specify -d to have it recurse into such directories as well.
+	If any paths are specified, -d is irrelevant; all untracked
+	files matching the specified paths (with exceptions for nested
+	git directories mentioned under `--force`) will be removed.
 
 -f::
 --force::
diff --git a/builtin/clean.c b/builtin/clean.c
index d5579da716..68d70e41c0 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -949,6 +949,14 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 
 	dir.flags |= DIR_SHOW_OTHER_DIRECTORIES;
 
+	if (argc) {
+		/*
+		 * Remaining args implies pathspecs specified, and we should
+		 * recurse within those.
+		 */
+		remove_directories = 1;
+	}
+
 	if (remove_directories)
 		dir.flags |= DIR_SHOW_IGNORED_TOO | DIR_KEEP_UNTRACKED_CONTENTS;
 
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index d83aeb7dc2..530dfdab34 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -117,6 +117,7 @@ test_expect_success C_LOCALE_OUTPUT 'git clean with relative prefix' '
 	would_clean=$(
 		cd docs &&
 		git clean -n ../src |
+		grep part3 |
 		sed -n -e "s|^Would remove ||p"
 	) &&
 	verbose test "$would_clean" = ../src/part3.c
@@ -129,6 +130,7 @@ test_expect_success C_LOCALE_OUTPUT 'git clean with absolute path' '
 	would_clean=$(
 		cd docs &&
 		git clean -n "$(pwd)/../src" |
+		grep part3 |
 		sed -n -e "s|^Would remove ||p"
 	) &&
 	verbose test "$would_clean" = ../src/part3.c
-- 
2.23.0.173.gad11b3a635.dirty


  parent reply index

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-25 18:59 [PATCH] t7300-clean: demonstrate deleting nested repo with an ignored file breakage SZEDER Gábor
2019-08-25 20:34 ` SZEDER Gábor
2019-08-25 22:32 ` Philip Oakley
2019-08-26  7:48   ` SZEDER Gábor
2019-09-05 15:47 ` [RFC PATCH v2 00/12] Fix some git clean issues Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 01/12] t7300: Add some testcases showing failure to clean specified pathspecs Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 02/12] dir: fix typo in comment Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 03/12] dir: fix off-by-one error in match_pathspec_item Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 04/12] dir: Directories should be checked for matching pathspecs too Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 05/12] dir: Make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 06/12] dir: If our pathspec might match files under a dir, recurse into it Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 07/12] dir: add commentary explaining match_pathspec_item's return value Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 08/12] git-clean.txt: do not claim we will delete files with -n/--dry-run Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 09/12] clean: disambiguate the definition of -d Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 10/12] clean: avoid removing untracked files in a nested git repository Elijah Newren
2019-09-05 21:20     ` SZEDER Gábor
2019-09-05 15:47   ` [RFC PATCH v2 11/12] clean: rewrap overly long line Elijah Newren
2019-09-05 15:47   ` [RFC PATCH v2 12/12] clean: fix theoretical path corruption Elijah Newren
2019-09-05 19:27     ` SZEDER Gábor
2019-09-07  0:34       ` Elijah Newren
2019-09-05 19:01   ` [RFC PATCH v2 00/12] Fix some git clean issues SZEDER Gábor
2019-09-07  0:33     ` Elijah Newren
2019-09-12 22:12   ` [PATCH v3 " Elijah Newren
2019-09-12 22:12     ` [PATCH v3 01/12] t7300: add testcases showing failure to clean specified pathspecs Elijah Newren
2019-09-13 18:54       ` Junio C Hamano
2019-09-13 19:10         ` Elijah Newren
2019-09-13 20:29           ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 02/12] dir: fix typo in comment Elijah Newren
2019-09-12 22:12     ` [PATCH v3 03/12] dir: fix off-by-one error in match_pathspec_item Elijah Newren
2019-09-13 19:05       ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 04/12] dir: also check directories for matching pathspecs Elijah Newren
2019-09-12 22:12     ` [PATCH v3 05/12] dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2019-09-12 22:12     ` [PATCH v3 06/12] dir: if our pathspec might match files under a dir, recurse into it Elijah Newren
2019-09-13 19:45       ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 07/12] dir: add commentary explaining match_pathspec_item's return value Elijah Newren
2019-09-13 20:04       ` Junio C Hamano
2019-09-12 22:12     ` [PATCH v3 08/12] git-clean.txt: do not claim we will delete files with -n/--dry-run Elijah Newren
2019-09-12 22:12     ` Elijah Newren [this message]
2019-09-12 22:12     ` [PATCH v3 10/12] clean: avoid removing untracked files in a nested git repository Elijah Newren
2019-09-12 22:12     ` [PATCH v3 11/12] clean: rewrap overly long line Elijah Newren
2019-09-12 22:12     ` [PATCH v3 12/12] clean: fix theoretical path corruption Elijah Newren
2019-09-17 16:34     ` [PATCH v4 00/12] Fix some git clean issues Elijah Newren
2019-09-17 16:34       ` [PATCH v4 01/12] t7300: add testcases showing failure to clean specified pathspecs Elijah Newren
2019-09-17 16:34       ` [PATCH v4 02/12] dir: fix typo in comment Elijah Newren
2019-09-17 16:34       ` [PATCH v4 03/12] dir: fix off-by-one error in match_pathspec_item Elijah Newren
2019-09-17 16:34       ` [PATCH v4 04/12] dir: also check directories for matching pathspecs Elijah Newren
2019-09-25 20:39         ` [BUG] git is segfaulting, was " Denton Liu
2019-09-25 21:28           ` Elijah Newren
2019-09-25 21:55             ` Denton Liu
2019-09-26 20:35               ` Denton Liu
2019-09-27  0:12                 ` Elijah Newren
2019-09-27  1:09           ` SZEDER Gábor
2019-09-27  2:17             ` SZEDER Gábor
2019-09-27 17:10               ` Denton Liu
2019-09-30 19:11                 ` [PATCH] dir: special case check for the possibility that pathspec is NULL Elijah Newren
2019-09-30 22:31                   ` Denton Liu
2019-10-01  7:01                     ` Elijah Newren
2019-10-01 18:30                   ` [PATCH v2] " Elijah Newren
2019-10-01 18:40                     ` Denton Liu
2019-10-01 18:54                       ` Elijah Newren
2019-10-01 18:55                       ` [PATCH v3] " Elijah Newren
2019-10-01 19:35                         ` Denton Liu
2019-10-01 19:39                           ` Elijah Newren
2019-10-02 15:51                             ` Elijah Newren
2019-10-07 18:04                         ` SZEDER Gábor
2019-09-17 16:34       ` [PATCH v4 05/12] dir: make the DO_MATCH_SUBMODULE code reusable for a non-submodule case Elijah Newren
2019-09-17 16:34       ` [PATCH v4 06/12] dir: if our pathspec might match files under a dir, recurse into it Elijah Newren
2019-09-17 16:34       ` [PATCH v4 07/12] dir: add commentary explaining match_pathspec_item's return value Elijah Newren
2019-09-17 16:35       ` [PATCH v4 08/12] git-clean.txt: do not claim we will delete files with -n/--dry-run Elijah Newren
2019-09-17 16:35       ` [PATCH v4 09/12] clean: disambiguate the definition of -d Elijah Newren
2019-09-17 16:35       ` [PATCH v4 10/12] clean: avoid removing untracked files in a nested git repository Elijah Newren
2019-09-17 16:35       ` [PATCH v4 11/12] clean: rewrap overly long line Elijah Newren
2019-09-17 16:35       ` [PATCH v4 12/12] clean: fix theoretical path corruption Elijah Newren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190912221240.18057-10-newren@gmail.com \
    --to=newren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=rafa.almas@gmail.com \
    --cc=sxlijin@gmail.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Example config snippet for mirrors

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git