git@vger.kernel.org mailing list mirror (one of many)
* Accidentally deleted directory, bug in git clean -d?
@ 2014-03-10 10:31 Robin Pedersen
  2014-03-10 17:20 ` [PATCH] clean: respect pathspecs with "-d" Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Robin Pedersen @ 2014-03-10 10:31 UTC
  To: git

I accidentally deleted a directory using git clean. I would think
this is a bug, but I'm not sure. I was using 1.8.1, but upgraded to
1.9.0 just to see if it was still reproducible, and it was.

Here's a minimal way to reproduce:

$ git init
$ mkdir foo foobar
$ git clean -df foobar
Removing foo/
Removing foobar/
$ ls
$

I expected only "foobar" to be deleted, but "foo" was also deleted.

The same thing happens in the opposite case:

$ git init
$ mkdir foo foobar
$ git clean -df foo
Removing foo/
Removing foobar/
$ ls
$

However, it only happens when there is a common prefix in the names:

$ git init
$ mkdir foo bar
$ git clean -df foo
Removing foo/
$ ls
bar
$

In this case, "bar" was not deleted.

-- 
Best regards,

Robin Pedersen
Software Engineer

SnapTV AS
Jordmor Magdalenes vei 17
N-9519 Kviby.
Norway

robinp@snap.tv
http://www.snap.tv

* [PATCH] clean: respect pathspecs with "-d"
  2014-03-10 10:31 Accidentally deleted directory, bug in git clean -d? Robin Pedersen
@ 2014-03-10 17:20 ` Jeff King
  2014-03-10 17:22   ` Jeff King
  2014-03-10 17:24   ` [PATCH] clean: simplify dir/not-dir logic Jeff King
  0 siblings, 2 replies; 6+ messages in thread
From: Jeff King @ 2014-03-10 17:20 UTC
  To: Robin Pedersen; +Cc: git

git-clean uses read_directory to fill in a `struct dir` with
potential hits. However, read_directory does not actually
check against our pathspec. It uses a simplified version
that may turn up false positives. As a result, we need to
check that any hits match our pathspec. We do so reliably
for non-directories. For directories, if "-d" is not given
we check that the pathspec matched exactly (i.e., we are
even stricter, and require an explicit "git clean foo" to
clean "foo/"). But if "-d" is given, rather than relaxing
the exact match to allow a recursive match, we do not check
the pathspec at all.

This regression was introduced in 113f10f (Make git-clean a
builtin, 2007-11-11).

Signed-off-by: Jeff King <peff@peff.net>
---
On Mon, Mar 10, 2014 at 11:31:37AM +0100, Robin Pedersen wrote:

> I accidentally deleted a directory using git clean. I would think
> this is a bug, but I'm not sure. I was using 1.8.1, but upgraded to
> 1.9.0 just to see if it was still reproducible, and it was.

Definitely a bug, and it dates back quite a while.  Thanks for a very
clear bug report.
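
For readers following along, here is a minimal sketch in plain shell of
the distinction at play. It assumes the simplified check behaves like a
raw string-prefix comparison in either direction, which is consistent
with both reproductions in the report but is not taken from git's
source. loose_match and strict_match are hypothetical helpers, not git
functions.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not git's implementation.

# loose_match PATH PATHSPEC
# Assumed model of the simplified check: succeeds when either string
# is a raw prefix of the other, ignoring path-component boundaries.
loose_match() {
	case "$1" in "$2"*) return 0 ;; esac
	case "$2" in "$1"*) return 0 ;; esac
	return 1
}

# strict_match PATH PATHSPEC
# Succeeds only when PATH is PATHSPEC itself or lies underneath it,
# i.e. the match ends on a "/" boundary.
strict_match() {
	case "$1" in "$2"|"$2"/*) return 0 ;; esac
	return 1
}

# Compare both checks for the pathspec "foobar" from the report.
for dir in foo foobar bar
do
	if loose_match "$dir" foobar; then loose=yes; else loose=no; fi
	if strict_match "$dir" foobar; then strict=yes; else strict=no; fi
	printf '%s: loose=%s strict=%s\n' "$dir" "$loose" "$strict"
done
```

Under that assumption, "foo" passes the loose check for the pathspec
"foobar" but fails the strict one, which is why every candidate still
needs a real pathspec check before it is queued for removal.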

-- >8 --
 builtin/clean.c  | 5 +++--
 t/t7300-clean.sh | 8 ++++++++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index 114d7bf..31c1488 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -947,14 +947,15 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		if (pathspec.nr)
 			matches = dir_path_match(ent, &pathspec, 0, NULL);
 
+		if (pathspec.nr && !matches)
+			continue;
+
 		if (S_ISDIR(st.st_mode)) {
 			if (remove_directories || (matches == MATCHED_EXACTLY)) {
 				rel = relative_path(ent->name, prefix, &buf);
 				string_list_append(&del_list, rel);
 			}
 		} else {
-			if (pathspec.nr && !matches)
-				continue;
 			rel = relative_path(ent->name, prefix, &buf);
 			string_list_append(&del_list, rel);
 		}
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 710be90..0c602de 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -511,4 +511,12 @@ test_expect_success SANITY 'git clean -d with an unreadable empty directory' '
 	! test -d foo
 '
 
+test_expect_success 'git clean -d respects pathspecs' '
+	mkdir foo &&
+	mkdir foobar &&
+	git clean -df foobar &&
+	test_path_is_dir foo &&
+	test_path_is_missing foobar
+'
+
 test_done
-- 
1.9.0.403.g7a2f4b0

* Re: [PATCH] clean: respect pathspecs with "-d"
  2014-03-10 17:20 ` [PATCH] clean: respect pathspecs with "-d" Jeff King
@ 2014-03-10 17:22   ` Jeff King
  2014-03-10 20:02     ` Simon Ruderich
  2014-03-10 17:24   ` [PATCH] clean: simplify dir/not-dir logic Jeff King
  1 sibling, 1 reply; 6+ messages in thread
From: Jeff King @ 2014-03-10 17:22 UTC
  To: Robin Pedersen; +Cc: git

On Mon, Mar 10, 2014 at 01:20:02PM -0400, Jeff King wrote:

> On Mon, Mar 10, 2014 at 11:31:37AM +0100, Robin Pedersen wrote:
> 
> > I accidentally deleted a directory using git clean. I would think
> > this is a bug, but I'm not sure. I was using 1.8.1, but upgraded to
> > 1.9.0 just to see if it was still reproducible, and it was.
> 
> Definitely a bug, and it dates back quite a while.  Thanks for a very
> clear bug report.
> 
> -- >8 --

Whoops, accidentally included a scissors line here that will break
people using "git am --scissors" to pick up the patch. Here it is
correctly formatted.

-- >8 --
Subject: clean: respect pathspecs with "-d"

git-clean uses read_directory to fill in a `struct dir` with
potential hits. However, read_directory does not actually
check against our pathspec. It uses a simplified version
that may turn up false positives. As a result, we need to
check that any hits match our pathspec. We do so reliably
for non-directories. For directories, if "-d" is not given
we check that the pathspec matched exactly (i.e., we are
even stricter, and require an explicit "git clean foo" to
clean "foo/"). But if "-d" is given, rather than relaxing
the exact match to allow a recursive match, we do not check
the pathspec at all.

This regression was introduced in 113f10f (Make git-clean a
builtin, 2007-11-11).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/clean.c  | 5 +++--
 t/t7300-clean.sh | 8 ++++++++
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index 114d7bf..31c1488 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -947,14 +947,15 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		if (pathspec.nr)
 			matches = dir_path_match(ent, &pathspec, 0, NULL);
 
+		if (pathspec.nr && !matches)
+			continue;
+
 		if (S_ISDIR(st.st_mode)) {
 			if (remove_directories || (matches == MATCHED_EXACTLY)) {
 				rel = relative_path(ent->name, prefix, &buf);
 				string_list_append(&del_list, rel);
 			}
 		} else {
-			if (pathspec.nr && !matches)
-				continue;
 			rel = relative_path(ent->name, prefix, &buf);
 			string_list_append(&del_list, rel);
 		}
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 710be90..0c602de 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -511,4 +511,12 @@ test_expect_success SANITY 'git clean -d with an unreadable empty directory' '
 	! test -d foo
 '
 
+test_expect_success 'git clean -d respects pathspecs' '
+	mkdir foo &&
+	mkdir foobar &&
+	git clean -df foobar &&
+	test_path_is_dir foo &&
+	test_path_is_missing foobar
+'
+
 test_done
-- 
1.9.0.403.g7a2f4b0

* [PATCH] clean: simplify dir/not-dir logic
  2014-03-10 17:20 ` [PATCH] clean: respect pathspecs with "-d" Jeff King
  2014-03-10 17:22   ` Jeff King
@ 2014-03-10 17:24   ` Jeff King
  1 sibling, 0 replies; 6+ messages in thread
From: Jeff King @ 2014-03-10 17:24 UTC
  To: Robin Pedersen; +Cc: git

On Mon, Mar 10, 2014 at 01:20:02PM -0400, Jeff King wrote:

> git-clean uses read_directory to fill in a `struct dir` with
> potential hits. However, read_directory does not actually
> check against our pathspec. It uses a simplified version
> that may turn up false positives. As a result, we need to
> check that any hits match our pathspec. We do so reliably
> for non-directories. For directories, if "-d" is not given
> we check that the pathspec matched exactly (i.e., we are
> even stricter, and require an explicit "git clean foo" to
> clean "foo/"). But if "-d" is given, rather than relaxing
> the exact match to allow a recursive match, we do not check
> the pathspec at all.
> 
> This regression was introduced in 113f10f (Make git-clean a
> builtin, 2007-11-11).

The code has been cleaned up quite a bit from that original version, and
it was pretty easy to see the discrepancy between the two code paths.
However, if the code were structured like the cleanup patch below, I
think it would have been even easier.

This comes on top of my other patch. So the bug is already fixed, but I
think the end result is more readable.
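
As a sanity check on the restructuring, the sketch below models the two
shapes of the directory condition as shell functions and compares them
over every combination of inputs. The function names and the 0/1
argument encoding are made up for illustration; only the boolean logic
mirrors the C code.

```shell
#!/bin/sh
# Arguments for both helpers (hypothetical encoding, 1 = true):
#   $1 = entry is a directory
#   $2 = remove_directories ("-d" was given)
#   $3 = matches == MATCHED_EXACTLY
# Exit status 0 means "append the entry to the delete list".

# Nested if/else shape, as in the code before the cleanup.
old_keeps() {
	if [ "$1" = 1 ]; then
		if [ "$2" = 1 ] || [ "$3" = 1 ]; then
			return 0
		fi
		return 1
	fi
	return 0
}

# Early-exit shape: skip only directories that are neither covered
# by "-d" nor matched exactly; everything else falls through.
new_keeps() {
	if [ "$1" = 1 ] && [ "$2" != 1 ] && [ "$3" != 1 ]; then
		return 1
	fi
	return 0
}

# Walk all eight input combinations and compare the two results.
check_equivalence() {
	for is_dir in 0 1; do
		for rm_dirs in 0 1; do
			for exact in 0 1; do
				if old_keeps "$is_dir" "$rm_dirs" "$exact"; then old=keep; else old=skip; fi
				if new_keeps "$is_dir" "$rm_dirs" "$exact"; then new=keep; else new=skip; fi
				[ "$old" = "$new" ] || return 1
			done
		done
	done
	return 0
}

if check_equivalence; then
	echo "old and new conditions agree on all 8 cases"
else
	echo "mismatch between old and new conditions"
fi
```

The only input that is skipped in either version is a directory with
neither "-d" nor an exact match, which is what the early "continue"
spells out directly.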

-- >8 --
When we get a list of paths from read_directory, we further
prune it to create the final list of items to remove. The
code paths for directories and non-directories repeat the
same "add to list" code.

This patch restructures the code so that we don't repeat
ourselves. Also, by following an "if (condition) continue"
pattern like the pathspec check above, it makes it more
obvious that the conditional is about excluding directories
under certain circumstances.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/clean.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index 31c1488..cf76b1f 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -950,15 +950,12 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		if (pathspec.nr && !matches)
 			continue;
 
-		if (S_ISDIR(st.st_mode)) {
-			if (remove_directories || (matches == MATCHED_EXACTLY)) {
-				rel = relative_path(ent->name, prefix, &buf);
-				string_list_append(&del_list, rel);
-			}
-		} else {
-			rel = relative_path(ent->name, prefix, &buf);
-			string_list_append(&del_list, rel);
-		}
+		if (S_ISDIR(st.st_mode) && !remove_directories &&
+		    matches != MATCHED_EXACTLY)
+			continue;
+
+		rel = relative_path(ent->name, prefix, &buf);
+		string_list_append(&del_list, rel);
 	}
 
 	if (interactive && del_list.nr > 0)
-- 
1.9.0.403.g7a2f4b0

* Re: [PATCH] clean: respect pathspecs with "-d"
  2014-03-10 17:22   ` Jeff King
@ 2014-03-10 20:02     ` Simon Ruderich
  2014-03-10 20:37       ` Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Simon Ruderich @ 2014-03-10 20:02 UTC
  To: Jeff King; +Cc: Robin Pedersen, git

On Mon, Mar 10, 2014 at 01:22:15PM -0400, Jeff King wrote:
> +test_expect_success 'git clean -d respects pathspecs' '
> +	mkdir foo &&
> +	mkdir foobar &&
> +	git clean -df foobar &&
> +	test_path_is_dir foo &&
> +	test_path_is_missing foobar
> +'
> +
>  test_done

I think we should also test removing foo, which was also in the
original report, to make sure we don't match prefixes, e.g.:

test_expect_success 'git clean -d respects pathspecs' '
	mkdir foo &&
	mkdir foobar &&
	git clean -df foo &&
	test_path_is_missing foo &&
	test_path_is_dir foobar
'

Regards
Simon
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9

* Re: [PATCH] clean: respect pathspecs with "-d"
  2014-03-10 20:02     ` Simon Ruderich
@ 2014-03-10 20:37       ` Jeff King
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff King @ 2014-03-10 20:37 UTC
  To: Simon Ruderich; +Cc: Robin Pedersen, git

On Mon, Mar 10, 2014 at 09:02:35PM +0100, Simon Ruderich wrote:

> On Mon, Mar 10, 2014 at 01:22:15PM -0400, Jeff King wrote:
> > +test_expect_success 'git clean -d respects pathspecs' '
> > +	mkdir foo &&
> > +	mkdir foobar &&
> > +	git clean -df foobar &&
> > +	test_path_is_dir foo &&
> > +	test_path_is_missing foobar
> > +'
> > +
> >  test_done
> 
> I think we should also test removing foo, which was also in the
> original report, to make sure we don't match prefixes, e.g.:
> 
> test_expect_success 'git clean -d respects pathspecs' '
> 	mkdir foo &&
> 	mkdir foobar &&
> 	git clean -df foo &&
> 	test_path_is_missing foo &&
> 	test_path_is_dir foobar
> '

Yeah, it probably makes sense to test both ways (though the root cause
and fix are the same). Those mkdirs need to be "mkdir -p", though.
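
A small sketch of why the plain mkdir is fragile, presumably because
"foo" can be left behind by an earlier test in the same script (the
first of the two new tests keeps it on purpose). The temporary
directory and variable names are illustrative only.

```shell
#!/bin/sh
# Work in a scratch directory so nothing outside it is touched.
tmp=$(mktemp -d) || exit 1

# Simulate a directory left over from an earlier test.
mkdir "$tmp/foo"

# Plain mkdir refuses to create a directory that already exists,
# which would break an &&-chained test body at this step.
if mkdir "$tmp/foo" 2>/dev/null; then plain=ok; else plain=failed; fi

# mkdir -p treats an existing directory as success.
if mkdir -p "$tmp/foo"; then with_p=ok; else with_p=failed; fi

printf 'mkdir: %s, mkdir -p: %s\n' "$plain" "$with_p"

rm -rf "$tmp"
```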

Here's an updated patch with your suggestion:

-- >8 --
Subject: clean: respect pathspecs with "-d"

git-clean uses read_directory to fill in a `struct dir` with
potential hits. However, read_directory does not actually
check against our pathspec. It uses a simplified version
that may turn up false positives. As a result, we need to
check that any hits match our pathspec. We do so reliably
for non-directories. For directories, if "-d" is not given
we check that the pathspec matched exactly (i.e., we are
even stricter, and require an explicit "git clean foo" to
clean "foo/"). But if "-d" is given, rather than relaxing
the exact match to allow a recursive match, we do not check
the pathspec at all.

This regression was introduced in 113f10f (Make git-clean a
builtin, 2007-11-11).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/clean.c  |  5 +++--
 t/t7300-clean.sh | 16 ++++++++++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index 114d7bf..31c1488 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -947,14 +947,15 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		if (pathspec.nr)
 			matches = dir_path_match(ent, &pathspec, 0, NULL);
 
+		if (pathspec.nr && !matches)
+			continue;
+
 		if (S_ISDIR(st.st_mode)) {
 			if (remove_directories || (matches == MATCHED_EXACTLY)) {
 				rel = relative_path(ent->name, prefix, &buf);
 				string_list_append(&del_list, rel);
 			}
 		} else {
-			if (pathspec.nr && !matches)
-				continue;
 			rel = relative_path(ent->name, prefix, &buf);
 			string_list_append(&del_list, rel);
 		}
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 710be90..74de814 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -511,4 +511,20 @@ test_expect_success SANITY 'git clean -d with an unreadable empty directory' '
 	! test -d foo
 '
 
+test_expect_success 'git clean -d respects pathspecs (dir is prefix of pathspec)' '
+	mkdir -p foo &&
+	mkdir -p foobar &&
+	git clean -df foobar &&
+	test_path_is_dir foo &&
+	test_path_is_missing foobar
+'
+
+test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)' '
+	mkdir -p foo &&
+	mkdir -p foobar &&
+	git clean -df foo &&
+	test_path_is_missing foo &&
+	test_path_is_dir foobar
+'
+
 test_done
-- 
1.9.0.403.g7a2f4b0
