git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Bug Report: .gitignore behavior is not matching in git clean and git status
@ 2017-04-28 22:56 Chris Johnson
  2017-05-01  1:41 ` Junio C Hamano
  0 siblings, 1 reply; 89+ messages in thread
From: Chris Johnson @ 2017-04-28 22:56 UTC (permalink / raw)
  To: git

I am a mailing list noob so I’m sorry if this is the wrong format or
the wrong please.

Here’s the setup for the bug (I will call it a bug but I half expect
somebody to tell me I’m an idiot):

git init
echo "/A/B/" > .gitignore
git add .gitignore && git commit -m 'Add ignore'
mkdir -p A/B
touch A/B/C
git status
git clean -dn

(my client may have screwed up the quotes, please bear with me)

Now, this may just be a case of me misunderstanding the behavior of
.gitignore, and that’s fine, but to me, git clean -dn should roughly
reflect the same behavior as git status as far as ignored files go. As
it is, git status does NOT report A/B/C as an untracked file, but git
clean will remove the entire A dir. It seems to me that one or the
other’s behavior ought to be tweaked. Which one is correct I am not
sure.

This happens on 2.5.1 as well as 2.9.2

Chris

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: Bug Report: .gitignore behavior is not matching in git clean and git status
  2017-04-28 22:56 Bug Report: .gitignore behavior is not matching in git clean and git status Chris Johnson
@ 2017-05-01  1:41 ` Junio C Hamano
  2017-05-01  1:56   ` Chris Johnson
  0 siblings, 1 reply; 89+ messages in thread
From: Junio C Hamano @ 2017-05-01  1:41 UTC (permalink / raw)
  To: Chris Johnson; +Cc: git

Chris Johnson <chrisjohnson0@gmail.com> writes:

> I am a mailing list noob so I’m sorry if this is the wrong format or
> the wrong please.
>
> Here’s the setup for the bug (I will call it a bug but I half expect
> somebody to tell me I’m an idiot):
>
> git init
> echo "/A/B/" > .gitignore

You tell Git that anything in A/B/ are uninteresting.

> git add .gitignore && git commit -m 'Add ignore'
> mkdir -p A/B
> touch A/B/C

And create an uninteresting cruft.

> git status

And Git does not bug you about it.

> git clean -dn

This incorrectly reports "Would remove A/" and if you gave 'f'
instead of 'n', it does remove A/, A/B, and A/B/C.

Despite that "git clean --help" says 'only files unknown to Git are
removed' (with an undefined term 'unknown to Git').  What it wants
the term mean can be guessed by seeing 'if the -x option is
specified, ignored files are also removed'---so 'unknown to Git'
does not include what you told .gitignore that they are
uninteresting.  IOW, Git knows they are not interesting.

It looks like a bug in "git clean -d" to me.  Do you see the same
issue if you use "git clean" without "-d"?  IOW, does it offer to
remove A/B/C?

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: Bug Report: .gitignore behavior is not matching in git clean and git status
  2017-05-01  1:41 ` Junio C Hamano
@ 2017-05-01  1:56   ` Chris Johnson
  2017-05-01  3:15     ` Junio C Hamano
  2017-05-01 13:51     ` Samuel Lijin
  0 siblings, 2 replies; 89+ messages in thread
From: Chris Johnson @ 2017-05-01  1:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Good assessment/understanding of the issue. git clean -n  does not
report anything as being targeted for removal, and git clean -f
matches that behavior. I agree with it probably being related
specifically to the -d flag.

As another experiment I modified .gitignore to ignore /A/B/C instead
of /A/B/ and the same result occurs (-n reports nothing, -dn reports
removing A/)

Lastly, I changed .gitignore to just be /A/, and in doing so, clean
-dn stops reporting that it will remove A/. I’m not exactly sure if
this last one is surprising or not.

Also, and sorry for the noise, but I did a reply-all here, but will a
reply automatically include the rest of the list? Or was reply-all the
right move?

On Sun, Apr 30, 2017 at 9:41 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Chris Johnson <chrisjohnson0@gmail.com> writes:
>
>> I am a mailing list noob so I’m sorry if this is the wrong format or
>> the wrong please.
>>
>> Here’s the setup for the bug (I will call it a bug but I half expect
>> somebody to tell me I’m an idiot):
>>
>> git init
>> echo "/A/B/" > .gitignore
>
> You tell Git that anything in A/B/ are uninteresting.
>
>> git add .gitignore && git commit -m 'Add ignore'
>> mkdir -p A/B
>> touch A/B/C
>
> And create an uninteresting cruft.
>
>> git status
>
> And Git does not bug you about it.
>
>> git clean -dn
>
> This incorrectly reports "Would remove A/" and if you gave 'f'
> instead of 'n', it does remove A/, A/B, and A/B/C.
>
> Despite that "git clean --help" says 'only files unknown to Git are
> removed' (with an undefined term 'unknown to Git').  What it wants
> the term mean can be guessed by seeing 'if the -x option is
> specified, ignored files are also removed'---so 'unknown to Git'
> does not include what you told .gitignore that they are
> uninteresting.  IOW, Git knows they are not interesting.
>
> It looks like a bug in "git clean -d" to me.  Do you see the same
> issue if you use "git clean" without "-d"?  IOW, does it offer to
> remove A/B/C?

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: Bug Report: .gitignore behavior is not matching in git clean and git status
  2017-05-01  1:56   ` Chris Johnson
@ 2017-05-01  3:15     ` Junio C Hamano
  2017-05-01 13:51     ` Samuel Lijin
  1 sibling, 0 replies; 89+ messages in thread
From: Junio C Hamano @ 2017-05-01  3:15 UTC (permalink / raw)
  To: Chris Johnson; +Cc: git

Chris Johnson <chrisjohnson0@gmail.com> writes:

> Also, and sorry for the noise, but I did a reply-all here, but will a
> reply automatically include the rest of the list? Or was reply-all the
> right move?

The convention around here is to do reply-all (in other words, make
sure that Cc: line has git@vger.kernel.org and others involved in
the discussion and mentioned on To: or Cc: line of the message you
are responding to).  Your message (i.e. the one I am responding to)
has correct addresses on To:/Cc: lines AFAICS.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: Bug Report: .gitignore behavior is not matching in git clean and git status
  2017-05-01  1:56   ` Chris Johnson
  2017-05-01  3:15     ` Junio C Hamano
@ 2017-05-01 13:51     ` Samuel Lijin
  2017-05-01 15:34       ` Samuel Lijin
  1 sibling, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-01 13:51 UTC (permalink / raw)
  To: Chris Johnson; +Cc: Junio C Hamano, git@vger.kernel.org

On Sun, Apr 30, 2017 at 8:56 PM, Chris Johnson <chrisjohnson0@gmail.com> wrote:
> Good assessment/understanding of the issue. git clean -n  does not
> report anything as being targeted for removal, and git clean -f
> matches that behavior. I agree with it probably being related
> specifically to the -d flag.
>
> As another experiment I modified .gitignore to ignore /A/B/C instead
> of /A/B/ and the same result occurs (-n reports nothing, -dn reports
> removing A/)
>
> Lastly, I changed .gitignore to just be /A/, and in doing so, clean
> -dn stops reporting that it will remove A/. I’m not exactly sure if
> this last one is surprising or not.

It doesn't seem so to me, since a trailing slash in .gitignore is an
explicit directive to ignore the entire directory, so when pruning the
tree of files/dirs to ignore, it drops everything in A/ before even
getting to A/B/C. The issue is that ignoring A/B/C shouldn't leave A/
to be cleaned.

> Also, and sorry for the noise, but I did a reply-all here, but will a
> reply automatically include the rest of the list? Or was reply-all the
> right move?
>
> On Sun, Apr 30, 2017 at 9:41 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Chris Johnson <chrisjohnson0@gmail.com> writes:
>>
>>> I am a mailing list noob so I’m sorry if this is the wrong format or
>>> the wrong please.
>>>
>>> Here’s the setup for the bug (I will call it a bug but I half expect
>>> somebody to tell me I’m an idiot):
>>>
>>> git init
>>> echo "/A/B/" > .gitignore
>>
>> You tell Git that anything in A/B/ are uninteresting.
>>
>>> git add .gitignore && git commit -m 'Add ignore'
>>> mkdir -p A/B
>>> touch A/B/C
>>
>> And create an uninteresting cruft.
>>
>>> git status
>>
>> And Git does not bug you about it.
>>
>>> git clean -dn
>>
>> This incorrectly reports "Would remove A/" and if you gave 'f'
>> instead of 'n', it does remove A/, A/B, and A/B/C.
>>
>> Despite that "git clean --help" says 'only files unknown to Git are
>> removed' (with an undefined term 'unknown to Git').  What it wants
>> the term mean can be guessed by seeing 'if the -x option is
>> specified, ignored files are also removed'---so 'unknown to Git'
>> does not include what you told .gitignore that they are
>> uninteresting.  IOW, Git knows they are not interesting.
>>
>> It looks like a bug in "git clean -d" to me.

I may be wrong (I'm not very familiar with the codebase), but I don't
think it's a bug in specifically git clean -d.

Throwing gdb at it, when builtin/clean.c:cmd_clean() calls
dir.c:fill_directory(), A/ gets appended to dir->entries, regardless
of whether or not git clean is called with or without -d. The
difference is that if git clean is called without -d, the loop that
strips out directories strips out the A/ entry.

When dir.c:fill_directory() is invoked through git status, A/ does
not, however, get appended to dir->entries. As best as I can tell,
this seems to be because git status sets the
DIR_HIDE_EMPTY_DIRECTORIES flag, whereas git clean does not (which
makes sense), but the fact that DIR_HIDE_EMPTY_DIRECTORIES is
responsible for not adding A/ to dir->entries seems to be the issue.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: Bug Report: .gitignore behavior is not matching in git clean and git status
  2017-05-01 13:51     ` Samuel Lijin
@ 2017-05-01 15:34       ` Samuel Lijin
  2017-05-02  0:26         ` Junio C Hamano
  0 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-01 15:34 UTC (permalink / raw)
  To: Chris Johnson; +Cc: Junio C Hamano, git@vger.kernel.org

After some more digging (and familiarizing myself with the
behind-the-scenes logic) the issue is that dir.c has this implicit
assumption that a directory which contains only untracked and ignored
files should itself be considered untracked. While that works fine for
use cases where we're asking if a directory should be added to the git
database, that decidedly does not make sense when we're asking if a
directory can be removed from the working tree.

I'm not sure where to proceed from here. I see two ways forward: one,
builtin/clean.c can collect ignored files when it calls
dir.c:fill_directory(), and then clean -d can prune out directories
that contain ignored files; two, path_treatment can learn about
untracked directories which contain excluded (ignored) files.

On Mon, May 1, 2017 at 8:51 AM, Samuel Lijin <sxlijin@gmail.com> wrote:
> On Sun, Apr 30, 2017 at 8:56 PM, Chris Johnson <chrisjohnson0@gmail.com> wrote:
>> Good assessment/understanding of the issue. git clean -n  does not
>> report anything as being targeted for removal, and git clean -f
>> matches that behavior. I agree with it probably being related
>> specifically to the -d flag.
>>
>> As another experiment I modified .gitignore to ignore /A/B/C instead
>> of /A/B/ and the same result occurs (-n reports nothing, -dn reports
>> removing A/)
>>
>> Lastly, I changed .gitignore to just be /A/, and in doing so, clean
>> -dn stops reporting that it will remove A/. I’m not exactly sure if
>> this last one is surprising or not.
>
> It doesn't seem so to me, since a trailing slash in .gitignore is an
> explicit directive to ignore the entire directory, so when pruning the
> tree of files/dirs to ignore, it drops everything in A/ before even
> getting to A/B/C. The issue is that ignoring A/B/C shouldn't leave A/
> to be cleaned.
>
>> Also, and sorry for the noise, but I did a reply-all here, but will a
>> reply automatically include the rest of the list? Or was reply-all the
>> right move?
>>
>> On Sun, Apr 30, 2017 at 9:41 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>> Chris Johnson <chrisjohnson0@gmail.com> writes:
>>>
>>>> I am a mailing list noob so I’m sorry if this is the wrong format or
>>>> the wrong please.
>>>>
>>>> Here’s the setup for the bug (I will call it a bug but I half expect
>>>> somebody to tell me I’m an idiot):
>>>>
>>>> git init
>>>> echo "/A/B/" > .gitignore
>>>
>>> You tell Git that anything in A/B/ are uninteresting.
>>>
>>>> git add .gitignore && git commit -m 'Add ignore'
>>>> mkdir -p A/B
>>>> touch A/B/C
>>>
>>> And create an uninteresting cruft.
>>>
>>>> git status
>>>
>>> And Git does not bug you about it.
>>>
>>>> git clean -dn
>>>
>>> This incorrectly reports "Would remove A/" and if you gave 'f'
>>> instead of 'n', it does remove A/, A/B, and A/B/C.
>>>
>>> Despite that "git clean --help" says 'only files unknown to Git are
>>> removed' (with an undefined term 'unknown to Git').  What it wants
>>> the term mean can be guessed by seeing 'if the -x option is
>>> specified, ignored files are also removed'---so 'unknown to Git'
>>> does not include what you told .gitignore that they are
>>> uninteresting.  IOW, Git knows they are not interesting.
>>>
>>> It looks like a bug in "git clean -d" to me.
>
> I may be wrong (I'm not very familiar with the codebase), but I don't
> think it's a bug in specifically git clean -d.
>
> Throwing gdb at it, when builtin/clean.c:cmd_clean() calls
> dir.c:fill_directory(), A/ gets appended to dir->entries, regardless
> of whether or not git clean is called with or without -d. The
> difference is that if git clean is called without -d, the loop that
> strips out directories strips out the A/ entry.
>
> When dir.c:fill_directory() is invoked through git status, A/ does
> not, however, get appended to dir->entries. As best as I can tell,
> this seems to be because git status sets the
> DIR_HIDE_EMPTY_DIRECTORIES flag, whereas git clean does not (which
> makes sense), but the fact that DIR_HIDE_EMPTY_DIRECTORIES is
> responsible for not adding A/ to dir->entries seems to be the issue.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: Bug Report: .gitignore behavior is not matching in git clean and git status
  2017-05-01 15:34       ` Samuel Lijin
@ 2017-05-02  0:26         ` Junio C Hamano
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
  0 siblings, 1 reply; 89+ messages in thread
From: Junio C Hamano @ 2017-05-02  0:26 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: Chris Johnson, git@vger.kernel.org

Samuel Lijin <sxlijin@gmail.com> writes:

> After some more digging (and familiarizing myself with the
> behind-the-scenes logic) the issue is that dir.c has this implicit
> assumption that a directory which contains only untracked and ignored
> files should itself be considered untracked. While that works fine for
> use cases where we're asking if a directory should be added to the git
> database, that decidedly does not make sense when we're asking if a
> directory can be removed from the working tree.

Thanks for digging.

> I'm not sure where to proceed from here. I see two ways forward: one,
> builtin/clean.c can collect ignored files when it calls
> dir.c:fill_directory(), and then clean -d can prune out directories
> that contain ignored files; two, path_treatment can learn about
> untracked directories which contain excluded (ignored) files.

My gut feeling is that the former approach would be of lesser
impact.  Directory A/ can be removed "clean" without "-x" when there
is nothing tracked in there in the index and there is no ignored
paths.  Having zero untracked files there or one or more untracked
file there do not matter---they are all subject to removal by
"clean".

^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files
  2017-05-02  0:26         ` Junio C Hamano
@ 2017-05-03  3:29           ` Samuel Lijin
  2017-05-03  3:29             ` [PATCH 1/7] t7300: skip untracked dirs containing " Samuel Lijin
                               ` (16 more replies)
  0 siblings, 17 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-03  3:29 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

This patch series fixes the bug where git clean -d culls directories containing
only untracked and ignored files, by first teaching read_directory() and
read_directory_recursive() to search "untracked" directories (read: directories
*treated* as untracked because they only contain untracked and ignored files)
for ignored contents, and then teaching cmd_clean() to skip untracked
directories containing ignored files.

This does, however, introduce a breaking change in the behavior of git status:
when invoked with --ignored, git status will now return ignored files in an
untracked directory, whereas previously it would not.

First patches to the actual C code that I'm sending out! :D Looking forward to
feedback - the changes I made in read_directory_recursive() and read_directory()
feel a bit hacky, but I'm not sure how to get around that.

Samuel Lijin (7):
  t7300: skip untracked dirs containing ignored files
  dir: recurse into untracked dirs for ignored files
  dir: add method to check if a dir_entry lexically contains another
  dir: hide untracked contents of untracked dirs
  dir: change linkage of cmp_name() and check_contains()
  builtin/clean: teach clean -d to skip dirs containing ignored files
  t7061: check for ignored file in untracked dir

 builtin/clean.c            | 24 ++++++++++++++++--
 dir.c                      | 61 ++++++++++++++++++++++++++++++++++++++++++++--
 dir.h                      |  3 +++
 t/t7061-wtstatus-ignore.sh |  1 +
 t/t7300-clean.sh           | 10 ++++++++
 5 files changed, 95 insertions(+), 4 deletions(-)

-- 
2.12.2


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH 1/7] t7300: skip untracked dirs containing ignored files
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
@ 2017-05-03  3:29             ` Samuel Lijin
  2017-05-03 18:19               ` Stefan Beller
  2017-05-03  3:29             ` [PATCH 2/7] dir: recurse into untracked dirs for " Samuel Lijin
                               ` (15 subsequent siblings)
  16 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-03  3:29 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

If git sees a directory which contains only untracked and ignored
files, clean -d should not remove that directory.
---
 t/t7300-clean.sh | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index b89fd2a6a..948a455e8 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -653,4 +653,14 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
 	test_path_is_dir foobar
 '
 
+test_expect_success 'git clean -d skips untracked dirs containing ignored files' '
+	echo /foo/bar >.gitignore &&
+	rm -rf foo &&
+	mkdir -p foo &&
+	touch foo/bar &&
+	git clean -df &&
+	test_path_is_file foo/bar &&
+	test_path_is_dir foo
+'
+
 test_done
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 2/7] dir: recurse into untracked dirs for ignored files
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
  2017-05-03  3:29             ` [PATCH 1/7] t7300: skip untracked dirs containing " Samuel Lijin
@ 2017-05-03  3:29             ` Samuel Lijin
  2017-05-03  3:29             ` [PATCH 3/7] dir: add method to check if a dir_entry lexically contains another Samuel Lijin
                               ` (14 subsequent siblings)
  16 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-03  3:29 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

We consider directories containing only untracked and ignored files to
be themselves untracked, which in the usual case means we don't have to
search these directories. This is problematic when we want to collect
ignored files with DIR_SHOW_IGNORED_TOO, though, so we teach
read_directory_recursive() to recurse into untracked directories to find
the ignored files they contain when DIR_SHOW_IGNORED_TOO is set. This
has the side effect of also collecting all untracked files in untracked
directories as well.
---
 dir.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index f451bfa48..6bd0350e9 100644
--- a/dir.c
+++ b/dir.c
@@ -1784,7 +1784,12 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 			dir_state = state;
 
 		/* recurse into subdir if instructed by treat_path */
-		if (state == path_recurse) {
+		if ((state == path_recurse) ||
+				((get_dtype(cdir.de, path.buf, path.len) == DT_DIR) &&
+				 (state == path_untracked) &&
+				 (dir->flags & DIR_SHOW_IGNORED_TOO))
+				)
+		{
 			struct untracked_cache_dir *ud;
 			ud = lookup_untracked(dir->untracked, untracked,
 					      path.buf + baselen,
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 3/7] dir: add method to check if a dir_entry lexically contains another
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
  2017-05-03  3:29             ` [PATCH 1/7] t7300: skip untracked dirs containing " Samuel Lijin
  2017-05-03  3:29             ` [PATCH 2/7] dir: recurse into untracked dirs for " Samuel Lijin
@ 2017-05-03  3:29             ` Samuel Lijin
  2017-05-03 18:09               ` Stefan Beller
  2017-05-03  3:29             ` [PATCH 4/7] dir: hide untracked contents of untracked dirs Samuel Lijin
                               ` (13 subsequent siblings)
  16 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-03  3:29 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Introduce a method that allows us to check if one dir_entry corresponds
to a path which contains the path corresponding to another dir_entry.
---
 dir.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/dir.c b/dir.c
index 6bd0350e9..25cb9eadf 100644
--- a/dir.c
+++ b/dir.c
@@ -1852,6 +1852,14 @@ static int cmp_name(const void *p1, const void *p2)
 	return name_compare(e1->name, e1->len, e2->name, e2->len);
 }
 
+// check if *out lexically contains *in
+static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
+{
+	return (out->len < in->len) &&
+			(out->name[out->len - 1] == '/') &&
+			!memcmp(out->name, in->name, out->len);
+}
+
 static int treat_leading_path(struct dir_struct *dir,
 			      const char *path, int len,
 			      const struct pathspec *pathspec)
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 4/7] dir: hide untracked contents of untracked dirs
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (2 preceding siblings ...)
  2017-05-03  3:29             ` [PATCH 3/7] dir: add method to check if a dir_entry lexically contains another Samuel Lijin
@ 2017-05-03  3:29             ` Samuel Lijin
  2017-05-03  3:29             ` [PATCH 5/7] dir: change linkage of cmp_name() and check_contains() Samuel Lijin
                               ` (12 subsequent siblings)
  16 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-03  3:29 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

When we taught read_directory_recursive() to recurse into untracked
directories in search of ignored files given DIR_SHOW_IGNORED_TOO, that
had the side effect of teaching it to collect the untracked contents of
untracked directories. It does not make sense to return these, so we
teach read_directory() to strip dir->entries of any such untracked
contents.
---
 dir.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/dir.c b/dir.c
index 25cb9eadf..f0ddb4608 100644
--- a/dir.c
+++ b/dir.c
@@ -2075,6 +2075,50 @@ int read_directory(struct dir_struct *dir, const char *path,
 		read_directory_recursive(dir, path, len, untracked, 0, pathspec);
 	QSORT(dir->entries, dir->nr, cmp_name);
 	QSORT(dir->ignored, dir->ignored_nr, cmp_name);
+
+	// if collecting ignored files, never consider a directory containing
+	// ignored files to be untracked
+	if (dir->flags & DIR_SHOW_IGNORED_TOO) {
+		int i, j, nr_removed = 0;
+
+		// remove from dir->entries untracked contents of untracked dirs
+		for (i = 0; i < dir->nr; i++) {
+			if (!dir->entries[i])
+				continue;
+
+			for (j = i + 1; j < dir->nr; j++) {
+				if (!dir->entries[j])
+					continue;
+				if (check_contains(dir->entries[i], dir->entries[j])) {
+					nr_removed++;
+					free(dir->entries[j]);
+					dir->entries[j] = NULL;
+				}
+				else {
+					break;
+				}
+			}
+		}
+
+		// strip dir->entries of NULLs
+		if (nr_removed) {
+			for (i = 0;;) {
+				while (i < dir->nr && dir->entries[i])
+					i++;
+				if (i == dir->nr)
+					break;
+				j = i;
+				while (j < dir->nr && !dir->entries[j])
+					j++;
+				if (j == dir->nr)
+					break;
+				dir->entries[i] = dir->entries[j];
+				dir->entries[j] = NULL;
+			}
+			dir->nr -= nr_removed;
+		}
+	}
+
 	if (dir->untracked) {
 		static struct trace_key trace_untracked_stats = TRACE_KEY_INIT(UNTRACKED_STATS);
 		trace_printf_key(&trace_untracked_stats,
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 5/7] dir: change linkage of cmp_name() and check_contains()
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (3 preceding siblings ...)
  2017-05-03  3:29             ` [PATCH 4/7] dir: hide untracked contents of untracked dirs Samuel Lijin
@ 2017-05-03  3:29             ` Samuel Lijin
  2017-05-03  3:29             ` [PATCH 6/7] builtin/clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
                               ` (11 subsequent siblings)
  16 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-03  3:29 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

---
 dir.c | 4 ++--
 dir.h | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/dir.c b/dir.c
index f0ddb4608..91103b561 100644
--- a/dir.c
+++ b/dir.c
@@ -1844,7 +1844,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 	return dir_state;
 }
 
-static int cmp_name(const void *p1, const void *p2)
+int cmp_name(const void *p1, const void *p2)
 {
 	const struct dir_entry *e1 = *(const struct dir_entry **)p1;
 	const struct dir_entry *e2 = *(const struct dir_entry **)p2;
@@ -1853,7 +1853,7 @@ static int cmp_name(const void *p1, const void *p2)
 }
 
 // check if *out lexically contains *in
-static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
+int check_contains(const struct dir_entry *out, const struct dir_entry *in)
 {
 	return (out->len < in->len) &&
 			(out->name[out->len - 1] == '/') &&
diff --git a/dir.h b/dir.h
index bf23a470a..1ddd8b611 100644
--- a/dir.h
+++ b/dir.h
@@ -326,6 +326,9 @@ static inline int dir_path_match(const struct dir_entry *ent,
 			      has_trailing_dir);
 }
 
+int cmp_name(const void *p1, const void *p2);
+int check_contains(const struct dir_entry *out, const struct dir_entry *in);
+
 void untracked_cache_invalidate_path(struct index_state *, const char *);
 void untracked_cache_remove_from_index(struct index_state *, const char *);
 void untracked_cache_add_to_index(struct index_state *, const char *);
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 6/7] builtin/clean: teach clean -d to skip dirs containing ignored files
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (4 preceding siblings ...)
  2017-05-03  3:29             ` [PATCH 5/7] dir: change linkage of cmp_name() and check_contains() Samuel Lijin
@ 2017-05-03  3:29             ` Samuel Lijin
  2017-05-03  3:29             ` [PATCH 7/7] t7061: check for ignored file in untracked dir Samuel Lijin
                               ` (10 subsequent siblings)
  16 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-03  3:29 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

There is an implicit assumption that a directory containing only
untracked and ignored files should itself be considered untracked. This
makes sense in use cases where we're asking if a directory should be
added to the git database, but not when we're asking if a directory can
be safely removed from the working tree; as a result, clean -d would
assume that an "untracked" directory containing ignored files could be
deleted.

To get around this, we teach clean -d to collect ignored files and skip
over so-called "untracked" directories if they contain any ignored
files.
---
 builtin/clean.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index d861f836a..368e19427 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -859,7 +859,7 @@ static void interactive_main_loop(void)
 
 int cmd_clean(int argc, const char **argv, const char *prefix)
 {
-	int i, res;
+	int i, j, res;
 	int dry_run = 0, remove_directories = 0, quiet = 0, ignored = 0;
 	int ignored_only = 0, config_set = 0, errors = 0, gone = 1;
 	int rm_flags = REMOVE_DIR_KEEP_NESTED_GIT;
@@ -911,6 +911,9 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 				  " refusing to clean"));
 	}
 
+	if (remove_directories)
+		dir.flags |= DIR_SHOW_IGNORED_TOO;
+
 	if (force > 1)
 		rm_flags = 0;
 
@@ -932,7 +935,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 
 	fill_directory(&dir, &pathspec);
 
-	for (i = 0; i < dir.nr; i++) {
+	for (j = i = 0; i < dir.nr; i++) {
 		struct dir_entry *ent = dir.entries[i];
 		int matches = 0;
 		struct stat st;
@@ -954,10 +957,27 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		    matches != MATCHED_EXACTLY)
 			continue;
 
+		// skip any dir.entries which contains a dir.ignored
+		for (; j < dir.ignored_nr; j++) {
+			if (cmp_name(&dir.entries[i],
+						&dir.ignored[j]) < 0)
+				break;
+		}
+		if ((j < dir.ignored_nr) &&
+				check_contains(dir.entries[i], dir.ignored[j])) {
+			continue;
+		}
+
 		rel = relative_path(ent->name, prefix, &buf);
 		string_list_append(&del_list, rel);
 	}
 
+	for (i = 0; i < dir.nr; i++)
+		free(dir.entries[i]);
+
+	for (i = 0; i < dir.ignored_nr; i++)
+		free(dir.ignored[i]);
+
 	if (interactive && del_list.nr > 0)
 		interactive_main_loop();
 
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH 7/7] t7061: check for ignored file in untracked dir
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (5 preceding siblings ...)
  2017-05-03  3:29             ` [PATCH 6/7] builtin/clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
@ 2017-05-03  3:29             ` Samuel Lijin
  2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (9 subsequent siblings)
  16 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-03  3:29 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

---
 t/t7061-wtstatus-ignore.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index cdc0747bf..fc6013ba3 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -9,6 +9,7 @@ cat >expected <<\EOF
 ?? actual
 ?? expected
 ?? untracked/
+!! untracked/ignored
 EOF
 
 test_expect_success 'status untracked directory with --ignored' '
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH 3/7] dir: add method to check if a dir_entry lexically contains another
  2017-05-03  3:29             ` [PATCH 3/7] dir: add method to check if a dir_entry lexically contains another Samuel Lijin
@ 2017-05-03 18:09               ` Stefan Beller
  0 siblings, 0 replies; 89+ messages in thread
From: Stefan Beller @ 2017-05-03 18:09 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git@vger.kernel.org

On Tue, May 2, 2017 at 8:29 PM, Samuel Lijin <sxlijin@gmail.com> wrote:
> Introduce a method that allows us to check if one dir_entry corresponds
> to a path which contains the path corresponding to another dir_entry.
> ---
>  dir.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/dir.c b/dir.c
> index 6bd0350e9..25cb9eadf 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1852,6 +1852,14 @@ static int cmp_name(const void *p1, const void *p2)
>         return name_compare(e1->name, e1->len, e2->name, e2->len);
>  }
>
> +// check if *out lexically contains *in

Thanks for adding a comment to describe what the function ought to do.

However our Coding style prefers

/* comments this way */

/*
 * or in case of multi-
 * line comments,
 * this way.
 */

I think one of the ancient compilers just dislikes // as comment style.

> +static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
> +{
> +       return (out->len < in->len) &&
> +                       (out->name[out->len - 1] == '/') &&
> +                       !memcmp(out->name, in->name, out->len);
> +}
> +
>  static int treat_leading_path(struct dir_struct *dir,
>                               const char *path, int len,
>                               const struct pathspec *pathspec)
> --
> 2.12.2
>

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 1/7] t7300: skip untracked dirs containing ignored files
  2017-05-03  3:29             ` [PATCH 1/7] t7300: skip untracked dirs containing " Samuel Lijin
@ 2017-05-03 18:19               ` Stefan Beller
  2017-05-03 18:26                 ` Samuel Lijin
  0 siblings, 1 reply; 89+ messages in thread
From: Stefan Beller @ 2017-05-03 18:19 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git@vger.kernel.org

On Tue, May 2, 2017 at 8:29 PM, Samuel Lijin <sxlijin@gmail.com> wrote:
> If git sees a directory which contains only untracked and ignored
> files, clean -d should not remove that directory.

Yes that states a fact; it is not clear why we want to have this test here
and now. (Is it testing for a recently fixed regression?)
Are you just introducing the test to demonstrate it keeps working later on?
Do you plan on changing this behavior in a later patch?)

> ---
>  t/t7300-clean.sh | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
> index b89fd2a6a..948a455e8 100755
> --- a/t/t7300-clean.sh
> +++ b/t/t7300-clean.sh
> @@ -653,4 +653,14 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
>         test_path_is_dir foobar
>  '
>
> +test_expect_success 'git clean -d skips untracked dirs containing ignored files' '
> +       echo /foo/bar >.gitignore &&
> +       rm -rf foo &&
> +       mkdir -p foo &&
> +       touch foo/bar &&
> +       git clean -df &&
> +       test_path_is_file foo/bar &&
> +       test_path_is_dir foo
> +'

The test makes sense, though I am wondering if we can integrate
this test into another test e.g. "ok 15 - git clean -d".

Is the -f flag needed?

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 1/7] t7300: skip untracked dirs containing ignored files
  2017-05-03 18:19               ` Stefan Beller
@ 2017-05-03 18:26                 ` Samuel Lijin
  2017-05-03 18:34                   ` Stefan Beller
  0 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-03 18:26 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org

On Wed, May 3, 2017 at 1:19 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, May 2, 2017 at 8:29 PM, Samuel Lijin <sxlijin@gmail.com> wrote:
>> If git sees a directory which contains only untracked and ignored
>> files, clean -d should not remove that directory.
>
> Yes that states a fact; it is not clear why we want to have this test here
> and now. (Is it testing for a recently fixed regression?)

It was recently discovered that this is not true of clean -d (and I'm
not sure if it ever was, to be honest). See
http://public-inbox.org/git/xmqqshkof6jd.fsf@gitster.mtv.corp.google.com/T/#mf541c06250724bb000461d210b4ed157e415a596

> Are you just introducing the test to demonstrate it keeps working later on?
> Do you plan on changing this behavior in a later patch?)

The idea was to introduce the broken test and then have the rest of
the patch series fix it; Junio pointed out to me (off-list, since he
was responding from his phone, I think) that I got the convention
backwards, so I'll be changing this in the next version.

>> ---
>>  t/t7300-clean.sh | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
>> index b89fd2a6a..948a455e8 100755
>> --- a/t/t7300-clean.sh
>> +++ b/t/t7300-clean.sh
>> @@ -653,4 +653,14 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
>>         test_path_is_dir foobar
>>  '
>>
>> +test_expect_success 'git clean -d skips untracked dirs containing ignored files' '
>> +       echo /foo/bar >.gitignore &&
>> +       rm -rf foo &&
>> +       mkdir -p foo &&
>> +       touch foo/bar &&
>> +       git clean -df &&
>> +       test_path_is_file foo/bar &&
>> +       test_path_is_dir foo
>> +'
>
> The test makes sense, though I am wondering if we can integrate
> this test into another test e.g. "ok 15 - git clean -d".
>
> Is the -f flag needed?

Will look into both.

> Thanks,
> Stefan

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH 1/7] t7300: skip untracked dirs containing ignored files
  2017-05-03 18:26                 ` Samuel Lijin
@ 2017-05-03 18:34                   ` Stefan Beller
  0 siblings, 0 replies; 89+ messages in thread
From: Stefan Beller @ 2017-05-03 18:34 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git@vger.kernel.org

On Wed, May 3, 2017 at 11:26 AM, Samuel Lijin <sxlijin@gmail.com> wrote:
> On Wed, May 3, 2017 at 1:19 PM, Stefan Beller <sbeller@google.com> wrote:
>> On Tue, May 2, 2017 at 8:29 PM, Samuel Lijin <sxlijin@gmail.com> wrote:
>>> If git sees a directory which contains only untracked and ignored
>>> files, clean -d should not remove that directory.
>>
>> Yes that states a fact; it is not clear why we want to have this test here
>> and now. (Is it testing for a recently fixed regression?)
>
> It was recently discovered that this is not true of clean -d (and I'm
> not sure if it ever was, to be honest). See
> http://public-inbox.org/git/xmqqshkof6jd.fsf@gitster.mtv.corp.google.com/T/#mf541c06250724bb000461d210b4ed157e415a596
>
>> Are you just introducing the test to demonstrate it keeps working later on?
>> Do you plan on changing this behavior in a later patch?)
>
> The idea was to introduce the broken test and then have the rest of
> the patch series fix it;

This is valuable information for the commit message.
Also to keep git-bisect happy we want to have all tests passing on all commits,
which this does not. However to further signal your intent, you can mark it
as test_expect_failure. This is marking a test as "if Git was not buggy, this
is a test that would pass", so it actually marks success when it
fails, for example
when running t7300:

...
ok 31 - should not clean submodules
ok 32 - should avoid cleaning possible submodules
ok 33 - nested (empty) git should be kept
ok 34 - nested bare repositories should be cleaned
not ok 35 - nested (empty) bare repositories should be cleaned even
when in .git # TODO known breakage
not ok 36 - nested (non-empty) bare repositories should be cleaned
even when in .git # TODO known breakage
ok 37 - giving path in nested git work tree will remove it
ok 38 - giving path to nested .git will not remove it
ok 39 - giving path to nested .git/ will remove contents
ok 40 - force removal of nested git work tree
...

Similarly if you were to mark that test as expecting failure, we'd see:

no ok  - 45 git clean -d skips untracked dirs containing ignored files
# TODO known breakage

And then in a later patch, when you actually fix the bug, you would just flip it
to test_expect_success.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (6 preceding siblings ...)
  2017-05-03  3:29             ` [PATCH 7/7] t7061: check for ignored file in untracked dir Samuel Lijin
@ 2017-05-05 10:46             ` Samuel Lijin
  2017-05-08  4:26               ` Junio C Hamano
                                 ` (9 more replies)
  2017-05-05 10:46             ` [PATCH v2 1/9] t7300: skip untracked dirs containing ignored files Samuel Lijin
                               ` (8 subsequent siblings)
  16 siblings, 10 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-05 10:46 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Addresses the issues raised by Stefan and Junio (thanks for your feedback) about not using C99-style comments and keeping tests working on every commit to prevent breaking git bisect. (About the latter one: is it necessary to prevent compiler warnings, in addition to compiler errors? Because if so I should probably squash some of the commits together.)

Note that this introduces a breaking change in the behavior of git status: when invoked with --ignored, git status will now return ignored files in an untracked directory, whereas previously it would not.

It's possible that there are standard practices that I might have missed, so if there is anything along those lines, I'd appreciate you letting me know. (As an aside, about the git bisect thing: is there a script somewhere that people use to test patch series before sending them out?)

Samuel Lijin (9):
  t7300: skip untracked dirs containing ignored files
  t7061: expect failure where expected behavior will change
  dir: recurse into untracked dirs for ignored files
  dir: add method to check if a dir_entry lexically contains another
  dir: hide untracked contents of untracked dirs
  dir: change linkage of cmp_name() and check_contains()
  builtin/clean: teach clean -d to skip dirs containing ignored files
  t7300: clean -d now skips untracked dirs containing ignored files
  t7061: expect ignored files in untracked dirs

 builtin/clean.c            | 24 ++++++++++++++++--
 dir.c                      | 61 ++++++++++++++++++++++++++++++++++++++++++++--
 dir.h                      |  3 +++
 t/t7061-wtstatus-ignore.sh |  1 +
 t/t7300-clean.sh           | 10 ++++++++
 5 files changed, 95 insertions(+), 4 deletions(-)

-- 
2.12.2


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v2 1/9] t7300: skip untracked dirs containing ignored files
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (7 preceding siblings ...)
  2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
@ 2017-05-05 10:46             ` Samuel Lijin
  2017-05-07 19:12               ` Torsten Bögershausen
  2017-05-05 10:46             ` [PATCH v2 2/9] t7061: expect failure where expected behavior will change Samuel Lijin
                               ` (7 subsequent siblings)
  16 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-05 10:46 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

If git sees a directory which contains only untracked and ignored
files, clean -d should not remove that directory. It was recently
discovered that this is *not* true of git clean -d, and it's possible
that this has never worked correctly; this test and its accompanying
patch series aims to fix that.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7300-clean.sh | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index b89fd2a6a..252c75b40 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -653,4 +653,14 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
 	test_path_is_dir foobar
 '
 
+test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
+	echo /foo/bar >.gitignore &&
+	rm -rf foo &&
+	mkdir -p foo &&
+	touch foo/bar &&
+	git clean -df &&
+	test_path_is_file foo/bar &&
+	test_path_is_dir foo
+'
+
 test_done
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v2 2/9] t7061: expect failure where expected behavior will change
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (8 preceding siblings ...)
  2017-05-05 10:46             ` [PATCH v2 1/9] t7300: skip untracked dirs containing ignored files Samuel Lijin
@ 2017-05-05 10:46             ` Samuel Lijin
  2017-05-08  4:34               ` Junio C Hamano
  2017-05-05 10:46             ` [PATCH v2 3/9] dir: recurse into untracked dirs for ignored files Samuel Lijin
                               ` (6 subsequent siblings)
  16 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-05 10:46 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

This changes tests for `status --ignored` from test_expect_success to
test_expect_failure in preparation for a change in its expected behavior
(namely, that ignored files in untracked dirs will be reported).

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7061-wtstatus-ignore.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index cdc0747bf..dc3be92a2 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -11,7 +11,7 @@ cat >expected <<\EOF
 ?? untracked/
 EOF
 
-test_expect_success 'status untracked directory with --ignored' '
+test_expect_failure 'status untracked directory with --ignored' '
 	echo "ignored" >.gitignore &&
 	mkdir untracked &&
 	: >untracked/ignored &&
@@ -20,7 +20,7 @@ test_expect_success 'status untracked directory with --ignored' '
 	test_cmp expected actual
 '
 
-test_expect_success 'same with gitignore starting with BOM' '
+test_expect_failure 'same with gitignore starting with BOM' '
 	printf "\357\273\277ignored\n" >.gitignore &&
 	mkdir -p untracked &&
 	: >untracked/ignored &&
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v2 3/9] dir: recurse into untracked dirs for ignored files
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (9 preceding siblings ...)
  2017-05-05 10:46             ` [PATCH v2 2/9] t7061: expect failure where expected behavior will change Samuel Lijin
@ 2017-05-05 10:46             ` Samuel Lijin
  2017-05-05 10:46             ` [PATCH v2 4/9] dir: add method to check if a dir_entry lexically contains another Samuel Lijin
                               ` (5 subsequent siblings)
  16 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-05 10:46 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

We consider directories containing only untracked and ignored files to
be themselves untracked, which in the usual case means we don't have to
search these directories. This is problematic when we want to collect
ignored files with DIR_SHOW_IGNORED_TOO, though, so we teach
read_directory_recursive() to recurse into untracked directories to find
the ignored files they contain when DIR_SHOW_IGNORED_TOO is set. This
has the side effect of also collecting all untracked files in untracked
directories as well.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 dir.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index f451bfa48..6bd0350e9 100644
--- a/dir.c
+++ b/dir.c
@@ -1784,7 +1784,12 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 			dir_state = state;
 
 		/* recurse into subdir if instructed by treat_path */
-		if (state == path_recurse) {
+		if ((state == path_recurse) ||
+				((get_dtype(cdir.de, path.buf, path.len) == DT_DIR) &&
+				 (state == path_untracked) &&
+				 (dir->flags & DIR_SHOW_IGNORED_TOO))
+				)
+		{
 			struct untracked_cache_dir *ud;
 			ud = lookup_untracked(dir->untracked, untracked,
 					      path.buf + baselen,
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v2 4/9] dir: add method to check if a dir_entry lexically contains another
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (10 preceding siblings ...)
  2017-05-05 10:46             ` [PATCH v2 3/9] dir: recurse into untracked dirs for ignored files Samuel Lijin
@ 2017-05-05 10:46             ` Samuel Lijin
  2017-05-05 10:46             ` [PATCH v2 5/9] dir: hide untracked contents of untracked dirs Samuel Lijin
                               ` (4 subsequent siblings)
  16 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-05 10:46 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Introduce a method that allows us to check if one dir_entry corresponds
to a path which contains the path corresponding to another dir_entry.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 dir.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/dir.c b/dir.c
index 6bd0350e9..4739087f4 100644
--- a/dir.c
+++ b/dir.c
@@ -1852,6 +1852,14 @@ static int cmp_name(const void *p1, const void *p2)
 	return name_compare(e1->name, e1->len, e2->name, e2->len);
 }
 
+/* check if *out lexically contains *in */
+static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
+{
+	return (out->len < in->len) &&
+			(out->name[out->len - 1] == '/') &&
+			!memcmp(out->name, in->name, out->len);
+}
+
 static int treat_leading_path(struct dir_struct *dir,
 			      const char *path, int len,
 			      const struct pathspec *pathspec)
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v2 5/9] dir: hide untracked contents of untracked dirs
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (11 preceding siblings ...)
  2017-05-05 10:46             ` [PATCH v2 4/9] dir: add method to check if a dir_entry lexically contains another Samuel Lijin
@ 2017-05-05 10:46             ` Samuel Lijin
  2017-05-05 10:46             ` [PATCH v2 6/9] dir: change linkage of cmp_name() and check_contains() Samuel Lijin
                               ` (3 subsequent siblings)
  16 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-05 10:46 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

When we taught read_directory_recursive() to recurse into untracked
directories in search of ignored files given DIR_SHOW_IGNORED_TOO, that
had the side effect of teaching it to collect the untracked contents of
untracked directories. It does not make sense to return these, so we
teach read_directory() to strip dir->entries of any such untracked
contents.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 dir.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/dir.c b/dir.c
index 4739087f4..fd445ee9e 100644
--- a/dir.c
+++ b/dir.c
@@ -2075,6 +2075,50 @@ int read_directory(struct dir_struct *dir, const char *path,
 		read_directory_recursive(dir, path, len, untracked, 0, pathspec);
 	QSORT(dir->entries, dir->nr, cmp_name);
 	QSORT(dir->ignored, dir->ignored_nr, cmp_name);
+
+	// if collecting ignored files, never consider a directory containing
+	// ignored files to be untracked
+	if (dir->flags & DIR_SHOW_IGNORED_TOO) {
+		int i, j, nr_removed = 0;
+
+		// remove from dir->entries untracked contents of untracked dirs
+		for (i = 0; i < dir->nr; i++) {
+			if (!dir->entries[i])
+				continue;
+
+			for (j = i + 1; j < dir->nr; j++) {
+				if (!dir->entries[j])
+					continue;
+				if (check_contains(dir->entries[i], dir->entries[j])) {
+					nr_removed++;
+					free(dir->entries[j]);
+					dir->entries[j] = NULL;
+				}
+				else {
+					break;
+				}
+			}
+		}
+
+		// strip dir->entries of NULLs
+		if (nr_removed) {
+			for (i = 0;;) {
+				while (i < dir->nr && dir->entries[i])
+					i++;
+				if (i == dir->nr)
+					break;
+				j = i;
+				while (j < dir->nr && !dir->entries[j])
+					j++;
+				if (j == dir->nr)
+					break;
+				dir->entries[i] = dir->entries[j];
+				dir->entries[j] = NULL;
+			}
+			dir->nr -= nr_removed;
+		}
+	}
+
 	if (dir->untracked) {
 		static struct trace_key trace_untracked_stats = TRACE_KEY_INIT(UNTRACKED_STATS);
 		trace_printf_key(&trace_untracked_stats,
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v2 6/9] dir: change linkage of cmp_name() and check_contains()
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (12 preceding siblings ...)
  2017-05-05 10:46             ` [PATCH v2 5/9] dir: hide untracked contents of untracked dirs Samuel Lijin
@ 2017-05-05 10:46             ` Samuel Lijin
  2017-05-08  4:37               ` Junio C Hamano
  2017-05-05 10:46             ` [PATCH v2 7/9] builtin/clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
                               ` (2 subsequent siblings)
  16 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-05 10:46 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 dir.c | 4 ++--
 dir.h | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/dir.c b/dir.c
index fd445ee9e..6a71683df 100644
--- a/dir.c
+++ b/dir.c
@@ -1844,7 +1844,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 	return dir_state;
 }
 
-static int cmp_name(const void *p1, const void *p2)
+int cmp_name(const void *p1, const void *p2)
 {
 	const struct dir_entry *e1 = *(const struct dir_entry **)p1;
 	const struct dir_entry *e2 = *(const struct dir_entry **)p2;
@@ -1853,7 +1853,7 @@ static int cmp_name(const void *p1, const void *p2)
 }
 
 /* check if *out lexically contains *in */
-static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
+int check_contains(const struct dir_entry *out, const struct dir_entry *in)
 {
 	return (out->len < in->len) &&
 			(out->name[out->len - 1] == '/') &&
diff --git a/dir.h b/dir.h
index bf23a470a..1ddd8b611 100644
--- a/dir.h
+++ b/dir.h
@@ -326,6 +326,9 @@ static inline int dir_path_match(const struct dir_entry *ent,
 			      has_trailing_dir);
 }
 
+int cmp_name(const void *p1, const void *p2);
+int check_contains(const struct dir_entry *out, const struct dir_entry *in);
+
 void untracked_cache_invalidate_path(struct index_state *, const char *);
 void untracked_cache_remove_from_index(struct index_state *, const char *);
 void untracked_cache_add_to_index(struct index_state *, const char *);
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v2 7/9] builtin/clean: teach clean -d to skip dirs containing ignored files
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (13 preceding siblings ...)
  2017-05-05 10:46             ` [PATCH v2 6/9] dir: change linkage of cmp_name() and check_contains() Samuel Lijin
@ 2017-05-05 10:46             ` Samuel Lijin
  2017-05-05 10:46             ` [PATCH v2 8/9] t7300: clean -d now skips untracked " Samuel Lijin
  2017-05-05 10:46             ` [PATCH v2 9/9] t7061: expect ignored files in untracked dirs Samuel Lijin
  16 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-05 10:46 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

There is an implicit assumption that a directory containing only
untracked and ignored files should itself be considered untracked. This
makes sense in use cases where we're asking if a directory should be
added to the git database, but not when we're asking if a directory can
be safely removed from the working tree; as a result, clean -d would
assume that an "untracked" directory containing ignored files could be
deleted.

To get around this, we teach clean -d to collect ignored files and skip
over so-called "untracked" directories if they contain any ignored
files.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 builtin/clean.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index d861f836a..368e19427 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -859,7 +859,7 @@ static void interactive_main_loop(void)
 
 int cmd_clean(int argc, const char **argv, const char *prefix)
 {
-	int i, res;
+	int i, j, res;
 	int dry_run = 0, remove_directories = 0, quiet = 0, ignored = 0;
 	int ignored_only = 0, config_set = 0, errors = 0, gone = 1;
 	int rm_flags = REMOVE_DIR_KEEP_NESTED_GIT;
@@ -911,6 +911,9 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 				  " refusing to clean"));
 	}
 
+	if (remove_directories)
+		dir.flags |= DIR_SHOW_IGNORED_TOO;
+
 	if (force > 1)
 		rm_flags = 0;
 
@@ -932,7 +935,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 
 	fill_directory(&dir, &pathspec);
 
-	for (i = 0; i < dir.nr; i++) {
+	for (j = i = 0; i < dir.nr; i++) {
 		struct dir_entry *ent = dir.entries[i];
 		int matches = 0;
 		struct stat st;
@@ -954,10 +957,27 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		    matches != MATCHED_EXACTLY)
 			continue;
 
+		// skip any dir.entries which contains a dir.ignored
+		for (; j < dir.ignored_nr; j++) {
+			if (cmp_name(&dir.entries[i],
+						&dir.ignored[j]) < 0)
+				break;
+		}
+		if ((j < dir.ignored_nr) &&
+				check_contains(dir.entries[i], dir.ignored[j])) {
+			continue;
+		}
+
 		rel = relative_path(ent->name, prefix, &buf);
 		string_list_append(&del_list, rel);
 	}
 
+	for (i = 0; i < dir.nr; i++)
+		free(dir.entries[i]);
+
+	for (i = 0; i < dir.ignored_nr; i++)
+		free(dir.ignored[i]);
+
 	if (interactive && del_list.nr > 0)
 		interactive_main_loop();
 
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v2 8/9] t7300: clean -d now skips untracked dirs containing ignored files
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (14 preceding siblings ...)
  2017-05-05 10:46             ` [PATCH v2 7/9] builtin/clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
@ 2017-05-05 10:46             ` Samuel Lijin
  2017-05-05 10:46             ` [PATCH v2 9/9] t7061: expect ignored files in untracked dirs Samuel Lijin
  16 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-05 10:46 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

This was previously broken (and likely never worked); this concludes the
patch series fixing the behavior of clean -d.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7300-clean.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 252c75b40..948a455e8 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -653,7 +653,7 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
 	test_path_is_dir foobar
 '
 
-test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
+test_expect_success 'git clean -d skips untracked dirs containing ignored files' '
 	echo /foo/bar >.gitignore &&
 	rm -rf foo &&
 	mkdir -p foo &&
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v2 9/9] t7061: expect ignored files in untracked dirs
  2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                               ` (15 preceding siblings ...)
  2017-05-05 10:46             ` [PATCH v2 8/9] t7300: clean -d now skips untracked " Samuel Lijin
@ 2017-05-05 10:46             ` Samuel Lijin
  2017-05-08  6:35               ` Junio C Hamano
  16 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-05 10:46 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

We now expect `status --ignored` to list ignored files even if they are
in an untracked directory.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7061-wtstatus-ignore.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index dc3be92a2..fc6013ba3 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -9,9 +9,10 @@ cat >expected <<\EOF
 ?? actual
 ?? expected
 ?? untracked/
+!! untracked/ignored
 EOF
 
-test_expect_failure 'status untracked directory with --ignored' '
+test_expect_success 'status untracked directory with --ignored' '
 	echo "ignored" >.gitignore &&
 	mkdir untracked &&
 	: >untracked/ignored &&
@@ -20,7 +21,7 @@ test_expect_failure 'status untracked directory with --ignored' '
 	test_cmp expected actual
 '
 
-test_expect_failure 'same with gitignore starting with BOM' '
+test_expect_success 'same with gitignore starting with BOM' '
 	printf "\357\273\277ignored\n" >.gitignore &&
 	mkdir -p untracked &&
 	: >untracked/ignored &&
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 1/9] t7300: skip untracked dirs containing ignored files
  2017-05-05 10:46             ` [PATCH v2 1/9] t7300: skip untracked dirs containing ignored files Samuel Lijin
@ 2017-05-07 19:12               ` Torsten Bögershausen
  2017-05-08 21:24                 ` Samuel Lijin
  0 siblings, 1 reply; 89+ messages in thread
From: Torsten Bögershausen @ 2017-05-07 19:12 UTC (permalink / raw)
  To: Samuel Lijin, git

On 2017-05-05 12:46, Samuel Lijin wrote:
> If git sees a directory which contains only untracked and ignored
> files, clean -d should not remove that directory. It was recently
> discovered that this is *not* true of git clean -d, and it's possible
> that this has never worked correctly; this test and its accompanying
> patch series aims to fix that.
> 
> Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
> ---
>  t/t7300-clean.sh | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
> index b89fd2a6a..252c75b40 100755
> --- a/t/t7300-clean.sh
> +++ b/t/t7300-clean.sh
> @@ -653,4 +653,14 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
>  	test_path_is_dir foobar
>  '
>  
> +test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
> +	echo /foo/bar >.gitignore &&
> +	rm -rf foo &&
> +	mkdir -p foo &&
Minor remark:
Do we need the -p here, or can it be dropped?

> +	touch foo/bar &&
> +	git clean -df &&
> +	test_path_is_file foo/bar &&
> +	test_path_is_dir foo
> +'
> +
>  test_done
> 


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files
  2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
@ 2017-05-08  4:26               ` Junio C Hamano
  2017-05-08 21:37                 ` Samuel Lijin
  2017-05-16  7:34               ` [PATCH v3 0/8] Fix clean -d and status --ignored Samuel Lijin
                                 ` (8 subsequent siblings)
  9 siblings, 1 reply; 89+ messages in thread
From: Junio C Hamano @ 2017-05-08  4:26 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git

Samuel Lijin <sxlijin@gmail.com> writes:

> Addresses the issues raised by Stefan and Junio (thanks for your
> feedback) about not using C99-style comments and keeping tests
> working on every commit to prevent breaking git bisect. (About the
> latter one: is it necessary to prevent compiler warnings, in
> addition to compiler errors? Because if so I should probably
> squash some of the commits together.)

Some of us build with -Werror, so yes.  If by "squashing" you mean
"instead of piling a fix on top of a broken patch, I need to do
things right from the beginning", then yes, please do so, not just
for compiler warnings but for all forms of changes.

> Note that this introduces a breaking change in the behavior of git
> status: when invoked with --ignored, git status will now return
> ignored files in an untracked directory, whereas previously it
> would not.

What do you mean by a "breaking change"?  Is it just "a new bug"?
Or "the current behaviour is logically broken, but people and
scripts might have relied on that odd behaviour and fixing it this
late in the game would break their expectations"? 

> It's possible that there are standard practices that I might have
> missed, so if there is anything along those lines, I'd appreciate
> you letting me know. (As an aside, about the git bisect thing: is
> there a script somewhere that people use to test patch series
> before sending them out?)

I hear that people use variations of

    git rebase -x "make test"

on their topic.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 2/9] t7061: expect failure where expected behavior will change
  2017-05-05 10:46             ` [PATCH v2 2/9] t7061: expect failure where expected behavior will change Samuel Lijin
@ 2017-05-08  4:34               ` Junio C Hamano
  0 siblings, 0 replies; 89+ messages in thread
From: Junio C Hamano @ 2017-05-08  4:34 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git

Samuel Lijin <sxlijin@gmail.com> writes:

> This changes tests for `status --ignored` from test_expect_success to
> test_expect_failure in preparation for a change in its expected behavior
> (namely, that ignored files in untracked dirs will be reported).
>
> Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
> ---

This is an odd way to do this.  If we stop applying your patches at
this step, these tests will still see output from "status --ignored"
that is expected by them, so there is no expect_failure here.

If we decide that the current output from "status --ignored" is
WRONG, and your series to fix "clean -d" FIXES "status --ignored" as
a side effect, then having a step to describe a desired behaviour in
the new world order in the test like this patch does makes sense,
but if that is what is going on, then not just flipping "success" to
"failure", the patch would be changing the expected output as well,
i.e. by adding the ignored files in untracked directories in the
expected output.  Obviously the code at this point after applying
only patches 1 & 2 will not produce such an output, so marking the
test that expects output based on the new world order as "expect
failure" would make sense.  Then your future commit that FIXES
"status --ignored" output would flip _failure to _success.

It is unclear to me if the new behaviour of "status --ignored" is a
bugfix, or a new bug, though.


>  t/t7061-wtstatus-ignore.sh | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
> index cdc0747bf..dc3be92a2 100755
> --- a/t/t7061-wtstatus-ignore.sh
> +++ b/t/t7061-wtstatus-ignore.sh
> @@ -11,7 +11,7 @@ cat >expected <<\EOF
>  ?? untracked/
>  EOF
>  
> -test_expect_success 'status untracked directory with --ignored' '
> +test_expect_failure 'status untracked directory with --ignored' '
>  	echo "ignored" >.gitignore &&
>  	mkdir untracked &&
>  	: >untracked/ignored &&
> @@ -20,7 +20,7 @@ test_expect_success 'status untracked directory with --ignored' '
>  	test_cmp expected actual
>  '
>  
> -test_expect_success 'same with gitignore starting with BOM' '
> +test_expect_failure 'same with gitignore starting with BOM' '
>  	printf "\357\273\277ignored\n" >.gitignore &&
>  	mkdir -p untracked &&
>  	: >untracked/ignored &&

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 6/9] dir: change linkage of cmp_name() and check_contains()
  2017-05-05 10:46             ` [PATCH v2 6/9] dir: change linkage of cmp_name() and check_contains() Samuel Lijin
@ 2017-05-08  4:37               ` Junio C Hamano
  0 siblings, 0 replies; 89+ messages in thread
From: Junio C Hamano @ 2017-05-08  4:37 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git

Samuel Lijin <sxlijin@gmail.com> writes:

> diff --git a/dir.h b/dir.h
> index bf23a470a..1ddd8b611 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -326,6 +326,9 @@ static inline int dir_path_match(const struct dir_entry *ent,
>  			      has_trailing_dir);
>  }
>  
> +int cmp_name(const void *p1, const void *p2);
> +int check_contains(const struct dir_entry *out, const struct dir_entry *in);
> +

Both of these would have been perfectly sensible names when they
were private to dir.c, but are way too generic to live in the global
namespace.  When a person who works on Git internal hears a name
"check_contains()", I am sure that the first guess of what the
function does is to see if a commit is a descendant of another
commit.


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 9/9] t7061: expect ignored files in untracked dirs
  2017-05-05 10:46             ` [PATCH v2 9/9] t7061: expect ignored files in untracked dirs Samuel Lijin
@ 2017-05-08  6:35               ` Junio C Hamano
  0 siblings, 0 replies; 89+ messages in thread
From: Junio C Hamano @ 2017-05-08  6:35 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git, Antoine Pelisse

Samuel Lijin <sxlijin@gmail.com> writes:

> We now expect `status --ignored` to list ignored files even if they are
> in an untracked directory.
>
> Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
> ---
>  t/t7061-wtstatus-ignore.sh | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
> index dc3be92a2..fc6013ba3 100755
> --- a/t/t7061-wtstatus-ignore.sh
> +++ b/t/t7061-wtstatus-ignore.sh
> @@ -9,9 +9,10 @@ cat >expected <<\EOF
>  ?? actual
>  ?? expected
>  ?? untracked/
> +!! untracked/ignored
>  EOF

In my comment on an earlier step (2/9, I think), I said it is
unclear if the change in behaviour to "status --ignored" is a new
bug you are introducing, or is a fix that happens as a side-effect.

I may be misunderstanding what the test that uses this expected
output is trying to see, but to me, it seems that:

        --ignored::
                Show ignored files as well.

is clear enough that "status --ignored", regardless of the
"--unracked" settings, should show the ignored file even when the
directory it happens to be in does not have any file that is
registerd in the index, and the expected output updated by this
patch looks to me the one that we _should_ be expecting.  IOW, the
change in behaviour looks like a bugfix to me.  And if that is
indeed the case, the above change should be in the earlier patch
that flips "expect_success" to "expect_failure".  The "expected"
file is prepared to expect the "correct" output before the code is
updated to produce one (i.e. the test update declares that the
current behaviour is broken), and then with changes in a later step
(i.e. somewhere before 7/9) the code starts to produce the "correct"
output at which point in that same patch you flip expect_failure into
expect_success.

The log message of eb8c5b87 ("git-status: Test --ignored behavior",
2012-12-30) says otherwise, though.  I am undecided, if I agree with
the design decision described by the first two bullet points:

    commit eb8c5b872ef144add4ac89f85bcddc974ac7114d
    Author: Antoine Pelisse <apelisse@gmail.com>
    Date:   Sun Dec 30 15:39:01 2012 +0100

        git-status: Test --ignored behavior

        Test all possible use-cases of git-status "--ignored" with the
        "--untracked-files" option with values "normal" and "all":

         - An untracked directory is listed as untracked if it has a mix of
           untracked and ignored files in it.  With -uall, ignored/untracked
           files are listed as ignored/untracked.

         - An untracked directory with only ignored files is listed as
           ignored.  With -uall, all files in the directory are listed.

         - An ignored directory is listed as ignored. With -uall, all files
           in the directory are listed as ignored.

         - An ignored and committed directory is listed as ignored if it has
           untracked files.  With -uall, all untracked files in the
           directory are listed as ignored.

        Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
        Signed-off-by: Junio C Hamano <gitster@pobox.com>

     t/t7061-wtstatus-ignore.sh | 146 +++++++++++++++++++++++++++++++++++++++++++++
     1 file changed, 146 insertions(+)

What do others think?

By the way, I wonder what the performance impact of this change
would be.  Do we end up needing to scan more parts of the
filesystem, when we already know that a directory does not have any
tracked paths?

> -test_expect_failure 'status untracked directory with --ignored' '
> +test_expect_success 'status untracked directory with --ignored' '
>  	echo "ignored" >.gitignore &&
>  	mkdir untracked &&
>  	: >untracked/ignored &&
> @@ -20,7 +21,7 @@ test_expect_failure 'status untracked directory with --ignored' '
>  	test_cmp expected actual
>  '
>  
> -test_expect_failure 'same with gitignore starting with BOM' '
> +test_expect_success 'same with gitignore starting with BOM' '
>  	printf "\357\273\277ignored\n" >.gitignore &&
>  	mkdir -p untracked &&
>  	: >untracked/ignored &&

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 1/9] t7300: skip untracked dirs containing ignored files
  2017-05-07 19:12               ` Torsten Bögershausen
@ 2017-05-08 21:24                 ` Samuel Lijin
  0 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-08 21:24 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git@vger.kernel.org

On Sun, May 7, 2017 at 2:12 PM, Torsten Bögershausen <tboegi@web.de> wrote:
> On 2017-05-05 12:46, Samuel Lijin wrote:
>> If git sees a directory which contains only untracked and ignored
>> files, clean -d should not remove that directory. It was recently
>> discovered that this is *not* true of git clean -d, and it's possible
>> that this has never worked correctly; this test and its accompanying
>> patch series aims to fix that.
>>
>> Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
>> ---
>>  t/t7300-clean.sh | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
>> index b89fd2a6a..252c75b40 100755
>> --- a/t/t7300-clean.sh
>> +++ b/t/t7300-clean.sh
>> @@ -653,4 +653,14 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
>>       test_path_is_dir foobar
>>  '
>>
>> +test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
>> +     echo /foo/bar >.gitignore &&
>> +     rm -rf foo &&
>> +     mkdir -p foo &&
> Minor remark:
> Do we need the -p here, or can it be dropped?

Yeah, the -p isn't needed here - will change when I reroll.

>> +     touch foo/bar &&
>> +     git clean -df &&
>> +     test_path_is_file foo/bar &&
>> +     test_path_is_dir foo
>> +'
>> +
>>  test_done
>>
>

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files
  2017-05-08  4:26               ` Junio C Hamano
@ 2017-05-08 21:37                 ` Samuel Lijin
  2017-05-09  0:31                   ` Junio C Hamano
  0 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-08 21:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org

On Sun, May 7, 2017 at 11:26 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Samuel Lijin <sxlijin@gmail.com> writes:
>
>> Addresses the issues raised by Stefan and Junio (thanks for your
>> feedback) about not using C99-style comments and keeping tests
>> working on every commit to prevent breaking git bisect. (About the
>> latter one: is it necessary to prevent compiler warnings, in
>> addition to compiler errors? Because if so I should probably
>> squash some of the commits together.)
>
> Some of us build with -Werror, so yes.  If by "squashing" you mean
> "instead of piling a fix on top of a broken patch, I need to do
> things right from the beginning", then yes, please do so, not just
> for compiler warnings but for all forms of changes.

Got it - will keep this in mind when I reroll the patch series.

>> Note that this introduces a breaking change in the behavior of git
>> status: when invoked with --ignored, git status will now return
>> ignored files in an untracked directory, whereas previously it
>> would not.
>
> What do you mean by a "breaking change"?  Is it just "a new bug"?
> Or "the current behaviour is logically broken, but people and
> scripts might have relied on that odd behaviour and fixing it this
> late in the game would break their expectations"?

The latter, as I believe you noticed in your reply about patch 9/9.

What happens right now is that because (1) directories containing only
untracked and ignored files are considered "untracked" and (2)
read_directory_recursive() skips over untracked directories, even with
DIR_SHOW_IGNORED_TOO set. As a result, `status --ignored` never lists
ignored files if they're in an "untracked" directory (and this is the
currently defined behavior per t7061).

By teaching read_directory_recursive() to recurse into untracked
directories in search of ignored files when DIR_SHOW_IGNORED_TOO is
set, though, `status --ignored` now learns to report the existence of
these ignored files, whereas previously it did not.

>> It's possible that there are standard practices that I might have
>> missed, so if there is anything along those lines, I'd appreciate
>> you letting me know. (As an aside, about the git bisect thing: is
>> there a script somewhere that people use to test patch series
>> before sending them out?)
>
> I hear that people use variations of
>
>     git rebase -x "make test"
>
> on their topic.

Aha - thanks.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files
  2017-05-08 21:37                 ` Samuel Lijin
@ 2017-05-09  0:31                   ` Junio C Hamano
  0 siblings, 0 replies; 89+ messages in thread
From: Junio C Hamano @ 2017-05-09  0:31 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git@vger.kernel.org

Samuel Lijin <sxlijin@gmail.com> writes:

> What happens right now is that because (1) directories containing only
> untracked and ignored files are considered "untracked" and (2)
> read_directory_recursive() skips over untracked directories, even with
> DIR_SHOW_IGNORED_TOO set. As a result, `status --ignored` never lists
> ignored files if they're in an "untracked" directory (and this is the
> currently defined behavior per t7061).
>
> By teaching read_directory_recursive() to recurse into untracked
> directories in search of ignored files when DIR_SHOW_IGNORED_TOO is
> set, though, `status --ignored` now learns to report the existence of
> these ignored files, whereas previously it did not.

OK, if you are revisiting the design decision made by eb8c5b87
("git-status: Test --ignored behavior", 2012-12-30), we should
clearly document it in the log message of the commit that does so in
a way similar to how eb8c5b87 described the behaviour it desired.

Thanks.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v3 0/8] Fix clean -d and status --ignored
  2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
  2017-05-08  4:26               ` Junio C Hamano
@ 2017-05-16  7:34               ` Samuel Lijin
  2017-05-18  8:21                 ` [PATCH v4 0/6] " Samuel Lijin
                                   ` (6 more replies)
  2017-05-16  7:34               ` [PATCH v3 1/8] t7300: clean -d should skip dirs with " Samuel Lijin
                                 ` (7 subsequent siblings)
  9 siblings, 7 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-16  7:34 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Restructures the patch series so that everything works properly.

I also noticed while redoing it that I didn't properly fix the clean -d bug
before; while it wouldn't remove a directory with only untracked and ignored
files, it would miss the untracked contents of such a directory. That's now been
fixed.

Samuel Lijin (8):
  t7300: clean -d should skip dirs with ignored files
  t7061: status --ignored should search untracked dirs
  dir: recurse into untracked dirs for ignored files
  dir: hide untracked contents of untracked dirs
  dir: expose cmp_name() and check_contains()
  clean: teach clean -d to skip dirs containing ignored files
  t7300: clean -d now skips untracked dirs containing ignored files
  t7061: status --ignored now searches untracked dirs

 builtin/clean.c            | 32 ++++++++++++++++++++--
 dir.c                      | 67 +++++++++++++++++++++++++++++++++++++++++++---
 dir.h                      |  6 ++++-
 t/t7061-wtstatus-ignore.sh |  1 +
 t/t7300-clean.sh           | 11 ++++++++
 5 files changed, 110 insertions(+), 7 deletions(-)

-- 
2.12.2


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v3 1/8] t7300: clean -d should skip dirs with ignored files
  2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
  2017-05-08  4:26               ` Junio C Hamano
  2017-05-16  7:34               ` [PATCH v3 0/8] Fix clean -d and status --ignored Samuel Lijin
@ 2017-05-16  7:34               ` Samuel Lijin
  2017-05-16  7:34               ` [PATCH v3 2/8] t7061: status --ignored should search untracked dirs Samuel Lijin
                                 ` (6 subsequent siblings)
  9 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-16  7:34 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

If git sees a directory which contains only untracked and ignored
files, clean -d should not remove that directory. It was recently
discovered that this is *not* true of git clean -d, and it's possible
that this has never worked correctly; this test and its accompanying
patch series aims to fix that.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7300-clean.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index b89fd2a6a..c14c98e56 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -653,4 +653,15 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
 	test_path_is_dir foobar
 '
 
+test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
+	echo /foo/bar >.gitignore &&
+	rm -rf foo &&
+	mkdir foo &&
+	touch foo/bar foo/baz &&
+	git clean -df &&
+	test_path_is_dir foo &&
+	test_path_is_file foo/bar &&
+	test_path_is_missing foo/baz
+'
+
 test_done
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v3 2/8] t7061: status --ignored should search untracked dirs
  2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                                 ` (2 preceding siblings ...)
  2017-05-16  7:34               ` [PATCH v3 1/8] t7300: clean -d should skip dirs with " Samuel Lijin
@ 2017-05-16  7:34               ` Samuel Lijin
  2017-05-16  7:34               ` [PATCH v3 3/8] dir: recurse into untracked dirs for ignored files Samuel Lijin
                                 ` (5 subsequent siblings)
  9 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-16  7:34 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Per eb8c5b87, `status --ignored` by design does not list ignored files
if they are in a directory which contains only ignored and untracked
files (which is itself considered to be untracked) without `-uall`. This
does not make sense for `--ignored`, which claims to "Show ignored files
as well."

Thus we revisit eb8c5b87 and decide that for such directories, `status
--ignored` will list the directory as untracked *and* list all ignored
files within said directory even without `-uall`.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7061-wtstatus-ignore.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index cdc0747bf..15e7592b6 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -9,9 +9,10 @@ cat >expected <<\EOF
 ?? actual
 ?? expected
 ?? untracked/
+!! untracked/ignored
 EOF
 
-test_expect_success 'status untracked directory with --ignored' '
+test_expect_failure 'status untracked directory with --ignored' '
 	echo "ignored" >.gitignore &&
 	mkdir untracked &&
 	: >untracked/ignored &&
@@ -20,7 +21,7 @@ test_expect_success 'status untracked directory with --ignored' '
 	test_cmp expected actual
 '
 
-test_expect_success 'same with gitignore starting with BOM' '
+test_expect_failure 'same with gitignore starting with BOM' '
 	printf "\357\273\277ignored\n" >.gitignore &&
 	mkdir -p untracked &&
 	: >untracked/ignored &&
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v3 3/8] dir: recurse into untracked dirs for ignored files
  2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                                 ` (3 preceding siblings ...)
  2017-05-16  7:34               ` [PATCH v3 2/8] t7061: status --ignored should search untracked dirs Samuel Lijin
@ 2017-05-16  7:34               ` Samuel Lijin
  2017-05-17  6:23                 ` Junio C Hamano
  2017-05-16  7:34               ` [PATCH v3 4/8] dir: hide untracked contents of untracked dirs Samuel Lijin
                                 ` (4 subsequent siblings)
  9 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-16  7:34 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

We consider directories containing only untracked and ignored files to
be themselves untracked, which in the usual case means we don't have to
search these directories. This is problematic when we want to collect
ignored files with DIR_SHOW_IGNORED_TOO, though, so we teach
read_directory_recursive() to recurse into untracked directories to find
the ignored files they contain when DIR_SHOW_IGNORED_TOO is set. This
has the side effect of also collecting all untracked files in untracked
directories as well.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 dir.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index f451bfa48..6bd0350e9 100644
--- a/dir.c
+++ b/dir.c
@@ -1784,7 +1784,12 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 			dir_state = state;
 
 		/* recurse into subdir if instructed by treat_path */
-		if (state == path_recurse) {
+		if ((state == path_recurse) ||
+				((get_dtype(cdir.de, path.buf, path.len) == DT_DIR) &&
+				 (state == path_untracked) &&
+				 (dir->flags & DIR_SHOW_IGNORED_TOO))
+				)
+		{
 			struct untracked_cache_dir *ud;
 			ud = lookup_untracked(dir->untracked, untracked,
 					      path.buf + baselen,
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v3 4/8] dir: hide untracked contents of untracked dirs
  2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                                 ` (4 preceding siblings ...)
  2017-05-16  7:34               ` [PATCH v3 3/8] dir: recurse into untracked dirs for ignored files Samuel Lijin
@ 2017-05-16  7:34               ` Samuel Lijin
  2017-05-17  6:47                 ` Junio C Hamano
  2017-05-16  7:34               ` [PATCH v3 5/8] dir: expose cmp_name() and check_contains() Samuel Lijin
                                 ` (3 subsequent siblings)
  9 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-16  7:34 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

When we taught read_directory_recursive() to recurse into untracked
directories in search of ignored files given DIR_SHOW_IGNORED_TOO, that
had the side effect of teaching it to collect the untracked contents of
untracked directories. It doesn't always make sense to return these,
though (we do need them for `clean -d`), so we introduce a flag
(DIR_KEEP_UNTRACKED_CONTENTS) to control whether or not read_directory()
strips dir->entries of the untracked contents of untracked dirs.

We also introduce check_contains() to check if one dir_entry corresponds
to a path which contains the path corresponding to another dir_entry.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 dir.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 dir.h |  3 ++-
 2 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 6bd0350e9..214a148ee 100644
--- a/dir.c
+++ b/dir.c
@@ -1852,6 +1852,14 @@ static int cmp_name(const void *p1, const void *p2)
 	return name_compare(e1->name, e1->len, e2->name, e2->len);
 }
 
+/* check if *out lexically contains *in */
+static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
+{
+	return (out->len < in->len) &&
+			(out->name[out->len - 1] == '/') &&
+			!memcmp(out->name, in->name, out->len);
+}
+
 static int treat_leading_path(struct dir_struct *dir,
 			      const char *path, int len,
 			      const struct pathspec *pathspec)
@@ -2067,6 +2075,52 @@ int read_directory(struct dir_struct *dir, const char *path,
 		read_directory_recursive(dir, path, len, untracked, 0, pathspec);
 	QSORT(dir->entries, dir->nr, cmp_name);
 	QSORT(dir->ignored, dir->ignored_nr, cmp_name);
+
+	// if DIR_SHOW_IGNORED_TOO, read_directory_recursive() will also pick
+	// up untracked contents of untracked dirs; by default we discard these,
+	// but given DIR_KEEP_UNTRACKED_CONTENTS we do not
+	if ((dir->flags & DIR_SHOW_IGNORED_TOO)
+		     && !(dir->flags & DIR_KEEP_UNTRACKED_CONTENTS)) {
+		int i, j, nr_removed = 0;
+
+		// remove from dir->entries untracked contents of untracked dirs
+		for (i = 0; i < dir->nr; i++) {
+			if (!dir->entries[i])
+				continue;
+
+			for (j = i + 1; j < dir->nr; j++) {
+				if (!dir->entries[j])
+					continue;
+				if (check_contains(dir->entries[i], dir->entries[j])) {
+					nr_removed++;
+					free(dir->entries[j]);
+					dir->entries[j] = NULL;
+				}
+				else {
+					break;
+				}
+			}
+		}
+
+		// strip dir->entries of NULLs
+		if (nr_removed) {
+			for (i = 0;;) {
+				while (i < dir->nr && dir->entries[i])
+					i++;
+				if (i == dir->nr)
+					break;
+				j = i;
+				while (j < dir->nr && !dir->entries[j])
+					j++;
+				if (j == dir->nr)
+					break;
+				dir->entries[i] = dir->entries[j];
+				dir->entries[j] = NULL;
+			}
+			dir->nr -= nr_removed;
+		}
+	}
+
 	if (dir->untracked) {
 		static struct trace_key trace_untracked_stats = TRACE_KEY_INIT(UNTRACKED_STATS);
 		trace_printf_key(&trace_untracked_stats,
diff --git a/dir.h b/dir.h
index bf23a470a..650e54bdf 100644
--- a/dir.h
+++ b/dir.h
@@ -151,7 +151,8 @@ struct dir_struct {
 		DIR_NO_GITLINKS = 1<<3,
 		DIR_COLLECT_IGNORED = 1<<4,
 		DIR_SHOW_IGNORED_TOO = 1<<5,
-		DIR_COLLECT_KILLED_ONLY = 1<<6
+		DIR_COLLECT_KILLED_ONLY = 1<<6,
+		DIR_KEEP_UNTRACKED_CONTENTS = 1<<7
 	} flags;
 	struct dir_entry **entries;
 	struct dir_entry **ignored;
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v3 5/8] dir: expose cmp_name() and check_contains()
  2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                                 ` (5 preceding siblings ...)
  2017-05-16  7:34               ` [PATCH v3 4/8] dir: hide untracked contents of untracked dirs Samuel Lijin
@ 2017-05-16  7:34               ` Samuel Lijin
  2017-05-16  7:34               ` [PATCH v3 6/8] clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
                                 ` (2 subsequent siblings)
  9 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-16  7:34 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

We want to use cmp_name() and check_contains() (which both compare
`struct dir_entry`s, the former in terms of the sort order, the latter
in terms of whether one lexically contains another) outside of dir.c,
so we have to (1) change their linkage and (2) rename them as
appropriate for the global namespace. The second is achieved by
renaming cmp_name() to cmp_dir_entry() and check_contains() to
check_dir_entry_contains().

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 dir.c | 10 +++++-----
 dir.h |  3 +++
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/dir.c b/dir.c
index 214a148ee..1bf2d20e7 100644
--- a/dir.c
+++ b/dir.c
@@ -1844,7 +1844,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 	return dir_state;
 }
 
-static int cmp_name(const void *p1, const void *p2)
+int cmp_dir_entry(const void *p1, const void *p2)
 {
 	const struct dir_entry *e1 = *(const struct dir_entry **)p1;
 	const struct dir_entry *e2 = *(const struct dir_entry **)p2;
@@ -1853,7 +1853,7 @@ static int cmp_name(const void *p1, const void *p2)
 }
 
 /* check if *out lexically contains *in */
-static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
+int check_dir_entry_contains(const struct dir_entry *out, const struct dir_entry *in)
 {
 	return (out->len < in->len) &&
 			(out->name[out->len - 1] == '/') &&
@@ -2073,8 +2073,8 @@ int read_directory(struct dir_struct *dir, const char *path,
 		dir->untracked = NULL;
 	if (!len || treat_leading_path(dir, path, len, pathspec))
 		read_directory_recursive(dir, path, len, untracked, 0, pathspec);
-	QSORT(dir->entries, dir->nr, cmp_name);
-	QSORT(dir->ignored, dir->ignored_nr, cmp_name);
+	QSORT(dir->entries, dir->nr, cmp_dir_entry);
+	QSORT(dir->ignored, dir->ignored_nr, cmp_dir_entry);
 
 	// if DIR_SHOW_IGNORED_TOO, read_directory_recursive() will also pick
 	// up untracked contents of untracked dirs; by default we discard these,
@@ -2091,7 +2091,7 @@ int read_directory(struct dir_struct *dir, const char *path,
 			for (j = i + 1; j < dir->nr; j++) {
 				if (!dir->entries[j])
 					continue;
-				if (check_contains(dir->entries[i], dir->entries[j])) {
+				if (check_dir_entry_contains(dir->entries[i], dir->entries[j])) {
 					nr_removed++;
 					free(dir->entries[j]);
 					dir->entries[j] = NULL;
diff --git a/dir.h b/dir.h
index 650e54bdf..edb5fda58 100644
--- a/dir.h
+++ b/dir.h
@@ -327,6 +327,9 @@ static inline int dir_path_match(const struct dir_entry *ent,
 			      has_trailing_dir);
 }
 
+int cmp_dir_entry(const void *p1, const void *p2);
+int check_dir_entry_contains(const struct dir_entry *out, const struct dir_entry *in);
+
 void untracked_cache_invalidate_path(struct index_state *, const char *);
 void untracked_cache_remove_from_index(struct index_state *, const char *);
 void untracked_cache_add_to_index(struct index_state *, const char *);
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v3 6/8] clean: teach clean -d to skip dirs containing ignored files
  2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                                 ` (6 preceding siblings ...)
  2017-05-16  7:34               ` [PATCH v3 5/8] dir: expose cmp_name() and check_contains() Samuel Lijin
@ 2017-05-16  7:34               ` Samuel Lijin
  2017-05-18  2:34                 ` Junio C Hamano
  2017-05-16  7:34               ` [PATCH v3 7/8] t7300: clean -d now skips untracked " Samuel Lijin
  2017-05-16  7:34               ` [PATCH v3 8/8] t7061: status --ignored now searches untracked dirs Samuel Lijin
  9 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-16  7:34 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

There is an implicit assumption that a directory containing only
untracked and ignored files should itself be considered untracked. This
makes sense in use cases where we're asking if a directory should be
added to the git database, but not when we're asking if a directory can
be safely removed from the working tree; as a result, clean -d would
assume that an "untracked" directory containing ignored files could be
deleted.

To get around this, we teach clean -d to collect ignored files and skip
over so-called "untracked" directories if they contain any ignored
files (while still removing the untracked contents of such dirs).

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 builtin/clean.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index d861f836a..25f3efce5 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -859,7 +859,7 @@ static void interactive_main_loop(void)
 
 int cmd_clean(int argc, const char **argv, const char *prefix)
 {
-	int i, res;
+	int i, j, k, res;
 	int dry_run = 0, remove_directories = 0, quiet = 0, ignored = 0;
 	int ignored_only = 0, config_set = 0, errors = 0, gone = 1;
 	int rm_flags = REMOVE_DIR_KEEP_NESTED_GIT;
@@ -911,6 +911,9 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 				  " refusing to clean"));
 	}
 
+	if (remove_directories)
+		dir.flags |= DIR_SHOW_IGNORED_TOO | DIR_KEEP_UNTRACKED_CONTENTS;
+
 	if (force > 1)
 		rm_flags = 0;
 
@@ -932,7 +935,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 
 	fill_directory(&dir, &pathspec);
 
-	for (i = 0; i < dir.nr; i++) {
+	for (k = i = 0; i < dir.nr; i++) {
 		struct dir_entry *ent = dir.entries[i];
 		int matches = 0;
 		struct stat st;
@@ -954,10 +957,35 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		    matches != MATCHED_EXACTLY)
 			continue;
 
+		// skip any dir.entries which contains a dir.ignored
+		for (; k < dir.ignored_nr; k++) {
+			if (cmp_dir_entry(&dir.entries[i],
+						&dir.ignored[k]) < 0)
+				break;
+		}
+		if ((k < dir.ignored_nr) &&
+				check_dir_entry_contains(dir.entries[i], dir.ignored[k])) {
+			continue;
+		}
+
+		// current entry does not contain any ignored files
 		rel = relative_path(ent->name, prefix, &buf);
 		string_list_append(&del_list, rel);
+
+		// skip untracked contents of an untracked dir
+		for (j = i + 1;
+			 j < dir.nr &&
+			     check_dir_entry_contains(dir.entries[i], dir.entries[j]);
+			 j++);
+		i = j - 1;
 	}
 
+	for (i = 0; i < dir.nr; i++)
+		free(dir.entries[i]);
+
+	for (i = 0; i < dir.ignored_nr; i++)
+		free(dir.ignored[i]);
+
 	if (interactive && del_list.nr > 0)
 		interactive_main_loop();
 
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v3 7/8] t7300: clean -d now skips untracked dirs containing ignored files
  2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                                 ` (7 preceding siblings ...)
  2017-05-16  7:34               ` [PATCH v3 6/8] clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
@ 2017-05-16  7:34               ` Samuel Lijin
  2017-05-18  2:38                 ` Junio C Hamano
  2017-05-16  7:34               ` [PATCH v3 8/8] t7061: status --ignored now searches untracked dirs Samuel Lijin
  9 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-16  7:34 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7300-clean.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index c14c98e56..b65d4719c 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -653,7 +653,7 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
 	test_path_is_dir foobar
 '
 
-test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
+test_expect_success 'git clean -d skips untracked dirs containing ignored files' '
 	echo /foo/bar >.gitignore &&
 	rm -rf foo &&
 	mkdir foo &&
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v3 8/8] t7061: status --ignored now searches untracked dirs
  2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
                                 ` (8 preceding siblings ...)
  2017-05-16  7:34               ` [PATCH v3 7/8] t7300: clean -d now skips untracked " Samuel Lijin
@ 2017-05-16  7:34               ` Samuel Lijin
  2017-05-18  2:46                 ` Junio C Hamano
  9 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-16  7:34 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7061-wtstatus-ignore.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index 15e7592b6..fc6013ba3 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -12,7 +12,7 @@ cat >expected <<\EOF
 !! untracked/ignored
 EOF
 
-test_expect_failure 'status untracked directory with --ignored' '
+test_expect_success 'status untracked directory with --ignored' '
 	echo "ignored" >.gitignore &&
 	mkdir untracked &&
 	: >untracked/ignored &&
@@ -21,7 +21,7 @@ test_expect_failure 'status untracked directory with --ignored' '
 	test_cmp expected actual
 '
 
-test_expect_failure 'same with gitignore starting with BOM' '
+test_expect_success 'same with gitignore starting with BOM' '
 	printf "\357\273\277ignored\n" >.gitignore &&
 	mkdir -p untracked &&
 	: >untracked/ignored &&
-- 
2.12.2


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH v3 3/8] dir: recurse into untracked dirs for ignored files
  2017-05-16  7:34               ` [PATCH v3 3/8] dir: recurse into untracked dirs for ignored files Samuel Lijin
@ 2017-05-17  6:23                 ` Junio C Hamano
  2017-05-17  7:02                   ` Samuel Lijin
  0 siblings, 1 reply; 89+ messages in thread
From: Junio C Hamano @ 2017-05-17  6:23 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git

Samuel Lijin <sxlijin@gmail.com> writes:

> We consider directories containing only untracked and ignored files to
> be themselves untracked, which in the usual case means we don't have to
> search these directories. This is problematic when we want to collect
> ignored files with DIR_SHOW_IGNORED_TOO, though, so we teach
> read_directory_recursive() to recurse into untracked directories to find
> the ignored files they contain when DIR_SHOW_IGNORED_TOO is set. This
> has the side effect of also collecting all untracked files in untracked
> directories as well.
>
> Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
> ---
>  dir.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/dir.c b/dir.c
> index f451bfa48..6bd0350e9 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1784,7 +1784,12 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
>  			dir_state = state;
>  
>  		/* recurse into subdir if instructed by treat_path */
> -		if (state == path_recurse) {
> +		if ((state == path_recurse) ||
> +				((get_dtype(cdir.de, path.buf, path.len) == DT_DIR) &&

I see a conditional in treat_path() that is prepared to deal with a
NULL in cdir.de; do we know cdir.de is always non-NULL at this point
in the code, or is get_dtype() prepared to see NULL as its first
parameter?

	... goes and looks ...

Yes, it falls back to the usual lstat() dance in such a case, so
we'd be OK here.

> +				 (state == path_untracked) &&
> +				 (dir->flags & DIR_SHOW_IGNORED_TOO))

If we are told to SHOW_IGNORED_TOO, we'd recurse into an untracked
thing if it is a directory.  No other behaviour change.

Isn't get_dtype() potentially slower than other two checks if it
triggers?  I am wondering if these three conditions in &&-chain
should be reordered to call get_dtype() the last, i.e.

                if ((state == path_recurse) ||
                    ((dir->flags & DIR_SHOW_IGNORED_TOO)) &&
                     (state == path_untracked) &&
                     (get_dtype(cdir.de, path.buf, path.len) == DT_DIR)) {

> +				)
> +		{

It may be just a style, but these new lines are indented overly too
deep.  We don't use a lone "{" on a line to open a block, either.

>  			struct untracked_cache_dir *ud;
>  			ud = lookup_untracked(dir->untracked, untracked,
>  					      path.buf + baselen,

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v3 4/8] dir: hide untracked contents of untracked dirs
  2017-05-16  7:34               ` [PATCH v3 4/8] dir: hide untracked contents of untracked dirs Samuel Lijin
@ 2017-05-17  6:47                 ` Junio C Hamano
  2017-05-17  7:32                   ` Samuel Lijin
  0 siblings, 1 reply; 89+ messages in thread
From: Junio C Hamano @ 2017-05-17  6:47 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git

Samuel Lijin <sxlijin@gmail.com> writes:

> When we taught read_directory_recursive() to recurse into untracked
> directories in search of ignored files given DIR_SHOW_IGNORED_TOO, that
> had the side effect of teaching it to collect the untracked contents of
> untracked directories. It doesn't always make sense to return these,
> though (we do need them for `clean -d`), so we introduce a flag
> (DIR_KEEP_UNTRACKED_CONTENTS) to control whether or not read_directory()
> strips dir->entries of the untracked contents of untracked dirs.
>
> We also introduce check_contains() to check if one dir_entry corresponds
> to a path which contains the path corresponding to another dir_entry.
>
> Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
> ---
>  dir.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  dir.h |  3 ++-
>  2 files changed, 56 insertions(+), 1 deletion(-)
>
> diff --git a/dir.c b/dir.c
> index 6bd0350e9..214a148ee 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1852,6 +1852,14 @@ static int cmp_name(const void *p1, const void *p2)
>  	return name_compare(e1->name, e1->len, e2->name, e2->len);
>  }
>  
> +/* check if *out lexically contains *in */
> +static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
> +{
> +	return (out->len < in->len) &&
> +			(out->name[out->len - 1] == '/') &&
> +			!memcmp(out->name, in->name, out->len);
> +}

OK, treat_one_path() and treat_pah_fast() both ensure that a path to
a directory is terminated with '/' before calling dir_add_name() and
dir_add_ignored(), so we know a dir_entry "out" that is a directory
must end with '/'.  Good.

The second and third line being overly indented is a bit
distracting, though.

>  static int treat_leading_path(struct dir_struct *dir,
>  			      const char *path, int len,
>  			      const struct pathspec *pathspec)
> @@ -2067,6 +2075,52 @@ int read_directory(struct dir_struct *dir, const char *path,
>  		read_directory_recursive(dir, path, len, untracked, 0, pathspec);
>  	QSORT(dir->entries, dir->nr, cmp_name);
>  	QSORT(dir->ignored, dir->ignored_nr, cmp_name);
> +
> +	// if DIR_SHOW_IGNORED_TOO, read_directory_recursive() will also pick
> +	// up untracked contents of untracked dirs; by default we discard these,
> +	// but given DIR_KEEP_UNTRACKED_CONTENTS we do not

	/*
	 * Our multi-line comments are formatted like this 
	 * example.  No C++/C99 // comments, outside of
	 * borrowed code and platform specific compat/ code,
	 * please.
	 */

> +	if ((dir->flags & DIR_SHOW_IGNORED_TOO)
> +		     && !(dir->flags & DIR_KEEP_UNTRACKED_CONTENTS)) {

Both having && at the end and && at the beginning are valid C, but
please stick to one style in a single file.

> +		int i, j, nr_removed = 0;
> +
> +		// remove from dir->entries untracked contents of untracked dirs

	/* And our single-liner comments look like this */

> +		for (i = 0; i < dir->nr; i++) {
> +			if (!dir->entries[i])
> +				continue;
> +
> +			for (j = i + 1; j < dir->nr; j++) {
> +				if (!dir->entries[j])
> +					continue;
> +				if (check_contains(dir->entries[i], dir->entries[j])) {
> +					nr_removed++;
> +					free(dir->entries[j]);
> +					dir->entries[j] = NULL;
> +				}
> +				else {
> +					break;
> +				}
> +			}
> +		}

This loop is O(n^2).  I wonder if we can do better, especially we
know dir->entries[] is sorted already.

Well, because it is sorted, if A/, A/B, and A/B/C are all untracked,
the first round that scans for A/ will nuke both A/B and A/B/C, so
we won't have to scan looking for entries inside A/B, which is a bit
of consolation ;-)

> +			for (i = 0;;) {
> +				while (i < dir->nr && dir->entries[i])
> +					i++;
> +				if (i == dir->nr)
> +					break;
> +				j = i;
> +				while (j < dir->nr && !dir->entries[j])
> +					j++;
> +				if (j == dir->nr)
> +					break;
> +				dir->entries[i] = dir->entries[j];
> +				dir->entries[j] = NULL;
> +			}
> +			dir->nr -= nr_removed;

This looks like an overly complicated way to scan an array and skip
NULLs.  Are you doing an equivalent of this loop, or am I missing
something subtle?

	for (src = dst = 0; src < nr; src++)
		if (array[src])
			array[dst++] = src;
	nr = dst;

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v3 3/8] dir: recurse into untracked dirs for ignored files
  2017-05-17  6:23                 ` Junio C Hamano
@ 2017-05-17  7:02                   ` Samuel Lijin
  0 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-17  7:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org

On Wed, May 17, 2017 at 2:23 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Samuel Lijin <sxlijin@gmail.com> writes:
>
>> We consider directories containing only untracked and ignored files to
>> be themselves untracked, which in the usual case means we don't have to
>> search these directories. This is problematic when we want to collect
>> ignored files with DIR_SHOW_IGNORED_TOO, though, so we teach
>> read_directory_recursive() to recurse into untracked directories to find
>> the ignored files they contain when DIR_SHOW_IGNORED_TOO is set. This
>> has the side effect of also collecting all untracked files in untracked
>> directories as well.
>>
>> Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
>> ---
>>  dir.c | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/dir.c b/dir.c
>> index f451bfa48..6bd0350e9 100644
>> --- a/dir.c
>> +++ b/dir.c
>> @@ -1784,7 +1784,12 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
>>                       dir_state = state;
>>
>>               /* recurse into subdir if instructed by treat_path */
>> -             if (state == path_recurse) {
>> +             if ((state == path_recurse) ||
>> +                             ((get_dtype(cdir.de, path.buf, path.len) == DT_DIR) &&
>
> I see a conditional in treat_path() that is prepared to deal with a
> NULL in cdir.de; do we know cdir.de is always non-NULL at this point
> in the code, or is get_dtype() prepared to see NULL as its first
> parameter?
>
>         ... goes and looks ...
>
> Yes, it falls back to the usual lstat() dance in such a case, so
> we'd be OK here.
>
>> +                              (state == path_untracked) &&
>> +                              (dir->flags & DIR_SHOW_IGNORED_TOO))
>
> If we are told to SHOW_IGNORED_TOO, we'd recurse into an untracked
> thing if it is a directory.  No other behaviour change.
>
> Isn't get_dtype() potentially slower than other two checks if it
> triggers?  I am wondering if these three conditions in &&-chain
> should be reordered to call get_dtype() the last, i.e.
>
>                 if ((state == path_recurse) ||
>                     ((dir->flags & DIR_SHOW_IGNORED_TOO)) &&
>                      (state == path_untracked) &&
>                      (get_dtype(cdir.de, path.buf, path.len) == DT_DIR)) {

Ah, I didn't realize that. I figured that get_dtype() was just a
helper to invoke the appropriate macros, but if it's actually mildly
expensive I'll take your reorder.

>> +                             )
>> +             {
>
> It may be just a style, but these new lines are indented overly too
> deep.  We don't use a lone "{" on a line to open a block, either.
>
>>                       struct untracked_cache_dir *ud;
>>                       ud = lookup_untracked(dir->untracked, untracked,
>>                                             path.buf + baselen,

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v3 4/8] dir: hide untracked contents of untracked dirs
  2017-05-17  6:47                 ` Junio C Hamano
@ 2017-05-17  7:32                   ` Samuel Lijin
  2017-05-17 10:14                     ` Junio C Hamano
  0 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-17  7:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org

On Wed, May 17, 2017 at 2:47 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Samuel Lijin <sxlijin@gmail.com> writes:
>
>> When we taught read_directory_recursive() to recurse into untracked
>> directories in search of ignored files given DIR_SHOW_IGNORED_TOO, that
>> had the side effect of teaching it to collect the untracked contents of
>> untracked directories. It doesn't always make sense to return these,
>> though (we do need them for `clean -d`), so we introduce a flag
>> (DIR_KEEP_UNTRACKED_CONTENTS) to control whether or not read_directory()
>> strips dir->entries of the untracked contents of untracked dirs.
>>
>> We also introduce check_contains() to check if one dir_entry corresponds
>> to a path which contains the path corresponding to another dir_entry.
>>
>> Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
>> ---
>>  dir.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  dir.h |  3 ++-
>>  2 files changed, 56 insertions(+), 1 deletion(-)
>>
>> diff --git a/dir.c b/dir.c
>> index 6bd0350e9..214a148ee 100644
>> --- a/dir.c
>> +++ b/dir.c
>> @@ -1852,6 +1852,14 @@ static int cmp_name(const void *p1, const void *p2)
>>       return name_compare(e1->name, e1->len, e2->name, e2->len);
>>  }
>>
>> +/* check if *out lexically contains *in */
>> +static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
>> +{
>> +     return (out->len < in->len) &&
>> +                     (out->name[out->len - 1] == '/') &&
>> +                     !memcmp(out->name, in->name, out->len);
>> +}
>
> OK, treat_one_path() and treat_pah_fast() both ensure that a path to
> a directory is terminated with '/' before calling dir_add_name() and
> dir_add_ignored(), so we know a dir_entry "out" that is a directory
> must end with '/'.  Good.
>
> The second and third line being overly indented is a bit
> distracting, though.
>
>>  static int treat_leading_path(struct dir_struct *dir,
>>                             const char *path, int len,
>>                             const struct pathspec *pathspec)
>> @@ -2067,6 +2075,52 @@ int read_directory(struct dir_struct *dir, const char *path,
>>               read_directory_recursive(dir, path, len, untracked, 0, pathspec);
>>       QSORT(dir->entries, dir->nr, cmp_name);
>>       QSORT(dir->ignored, dir->ignored_nr, cmp_name);
>> +
>> +     // if DIR_SHOW_IGNORED_TOO, read_directory_recursive() will also pick
>> +     // up untracked contents of untracked dirs; by default we discard these,
>> +     // but given DIR_KEEP_UNTRACKED_CONTENTS we do not
>
>         /*
>          * Our multi-line comments are formatted like this
>          * example.  No C++/C99 // comments, outside of
>          * borrowed code and platform specific compat/ code,
>          * please.
>          */

Gahhhh, I keep forgetting about this, sorry. (There has to be a way to
tell my compiler to catch this, right? It's pretty embarrassing to get
called out for this twice...)

>> +     if ((dir->flags & DIR_SHOW_IGNORED_TOO)
>> +                  && !(dir->flags & DIR_KEEP_UNTRACKED_CONTENTS)) {
>
> Both having && at the end and && at the beginning are valid C, but
> please stick to one style in a single file.

Got it.

>> +             int i, j, nr_removed = 0;
>> +
>> +             // remove from dir->entries untracked contents of untracked dirs
>
>         /* And our single-liner comments look like this */
>
>> +             for (i = 0; i < dir->nr; i++) {
>> +                     if (!dir->entries[i])
>> +                             continue;
>> +
>> +                     for (j = i + 1; j < dir->nr; j++) {
>> +                             if (!dir->entries[j])
>> +                                     continue;
>> +                             if (check_contains(dir->entries[i], dir->entries[j])) {
>> +                                     nr_removed++;
>> +                                     free(dir->entries[j]);
>> +                                     dir->entries[j] = NULL;
>> +                             }
>> +                             else {
>> +                                     break;
>> +                             }
>> +                     }
>> +             }
>
> This loop is O(n^2).  I wonder if we can do better, especially we
> know dir->entries[] is sorted already.

Now that I think about it, dropping an `i = j - 1` into the inner loop
right before the break should work:

+                             else {
+                                     i = j - 1;
+                                     break;
+                             }

> Well, because it is sorted, if A/, A/B, and A/B/C are all untracked,
> the first round that scans for A/ will nuke both A/B and A/B/C, so
> we won't have to scan looking for entries inside A/B, which is a bit
> of consolation ;-)
>
>> +                     for (i = 0;;) {
>> +                             while (i < dir->nr && dir->entries[i])
>> +                                     i++;
>> +                             if (i == dir->nr)
>> +                                     break;
>> +                             j = i;
>> +                             while (j < dir->nr && !dir->entries[j])
>> +                                     j++;
>> +                             if (j == dir->nr)
>> +                                     break;
>> +                             dir->entries[i] = dir->entries[j];
>> +                             dir->entries[j] = NULL;
>> +                     }
>> +                     dir->nr -= nr_removed;
>
> This looks like an overly complicated way to scan an array and skip
> NULLs.  Are you doing an equivalent of this loop, or am I missing
> something subtle?
>
>         for (src = dst = 0; src < nr; src++)
>                 if (array[src])
>                         array[dst++] = src;
>         nr = dst;

Nope, that's pretty much it. Just me overthinking the problem.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v3 4/8] dir: hide untracked contents of untracked dirs
  2017-05-17  7:32                   ` Samuel Lijin
@ 2017-05-17 10:14                     ` Junio C Hamano
  0 siblings, 0 replies; 89+ messages in thread
From: Junio C Hamano @ 2017-05-17 10:14 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git@vger.kernel.org

Samuel Lijin <sxlijin@gmail.com> writes:

>>         /* And our single-liner comments look like this */
>>
>>> +             for (i = 0; i < dir->nr; i++) {
>>> +                     if (!dir->entries[i])
>>> +                             continue;
>>> +
>>> +                     for (j = i + 1; j < dir->nr; j++) {
>>> +                             if (!dir->entries[j])
>>> +                                     continue;
>>> +                             if (check_contains(dir->entries[i], dir->entries[j])) {
>>> +                                     nr_removed++;
>>> +                                     free(dir->entries[j]);
>>> +                                     dir->entries[j] = NULL;
>>> +                             }
>>> +                             else {
>>> +                                     break;
>>> +                             }
>>> +                     }
>>> +             }
>>
>> This loop is O(n^2).  I wonder if we can do better, especially we
>> know dir->entries[] is sorted already.
>
> Now that I think about it, dropping an `i = j - 1` into the inner loop
> right before the break should work:

Actually, I think you should be able to do this in a single loop and
write in such a way that it is clearly O(N), perhaps like:

	for (src = dst = 0; src < nr; src++) {
		if (!dst ||
		    !is_a_directory(array[dst - 1]) ||
		    !contains(array[dst - 1], array[src])) {
			array[dst++] = array[src];
		} else {
	                /* 
	                 * Otherwise, src is inside (dst-1), so 
			 * we just can discard it.
			 */
			free(array[src]);
			array[src] = NULL; /* not needed */
		}
	}
	nr = dst;

That is, dst points at the next place to save the final elements of
the array, and dst-1 is the last one that is known to be the final
set.  src points at the element we are inspecting.

If dst-1 does not exist (i.e. we are looking at the first element),
if dst-1 is not a directory, or if dst-1 does not contain the src,
then we need to keep the src in the final set. Otherwise we discard.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v3 6/8] clean: teach clean -d to skip dirs containing ignored files
  2017-05-16  7:34               ` [PATCH v3 6/8] clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
@ 2017-05-18  2:34                 ` Junio C Hamano
  2017-05-18  6:32                   ` Samuel Lijin
  0 siblings, 1 reply; 89+ messages in thread
From: Junio C Hamano @ 2017-05-18  2:34 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git

Samuel Lijin <sxlijin@gmail.com> writes:

> @@ -932,7 +935,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
>  
>  	fill_directory(&dir, &pathspec);
>  
> -	for (i = 0; i < dir.nr; i++) {
> +	for (k = i = 0; i < dir.nr; i++) {
>  		struct dir_entry *ent = dir.entries[i];
>  		int matches = 0;
>  		struct stat st;
> @@ -954,10 +957,35 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
>  		    matches != MATCHED_EXACTLY)
>  			continue;
>  
> +		// skip any dir.entries which contains a dir.ignored
> +		for (; k < dir.ignored_nr; k++) {
> +			if (cmp_dir_entry(&dir.entries[i],
> +						&dir.ignored[k]) < 0)
> +				break;

It is a bit unfortunate that we do not use the short-hand "ent" we
established for dir.entries[i] at the beginning of this loop here.

> +		}
> +		if ((k < dir.ignored_nr) &&
> +				check_dir_entry_contains(dir.entries[i], dir.ignored[k])) {
> +			continue;
> +		}

The above logic is not needed when dir.entries[i] is a directory,
right?

> +
> +		// current entry does not contain any ignored files

... or the entry may not have been a directory in the first place.

>  		rel = relative_path(ent->name, prefix, &buf);
>  		string_list_append(&del_list, rel);
> +
> +		// skip untracked contents of an untracked dir
> +		for (j = i + 1;
> +			 j < dir.nr &&
> +			     check_dir_entry_contains(dir.entries[i], dir.entries[j]);
> +			 j++);
> +		i = j - 1;
>  	}
>
> +	for (i = 0; i < dir.nr; i++)
> +		free(dir.entries[i]);
> +
> +	for (i = 0; i < dir.ignored_nr; i++)
> +		free(dir.ignored[i]);
> +

The above logic may not be incorrect per-se, but I have this
suspicion that the resulting code may become cleaner and easier to
understand if we did it in a new separate loop we call immediately
after fill_directory() returns.  Or it could even be a call to a
helper that "post processes" the "dir" thing that was polupated by
fill_directory() that removes elements in the entries[] array that
contains any element in ignored[] array.

>  	if (interactive && del_list.nr > 0)
>  		interactive_main_loop();

Thanks.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v3 7/8] t7300: clean -d now skips untracked dirs containing ignored files
  2017-05-16  7:34               ` [PATCH v3 7/8] t7300: clean -d now skips untracked " Samuel Lijin
@ 2017-05-18  2:38                 ` Junio C Hamano
  0 siblings, 0 replies; 89+ messages in thread
From: Junio C Hamano @ 2017-05-18  2:38 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git

This and 8/8 needs to be part of the code change that fixes them.
Otherwise, running tests after applying only 1-6/8 would give you:

    Test Summary Report
    -------------------
    t7061-wtstatus-ignore.sh (Wstat: 0 Tests: 21 Failed: 0)
      TODO passed:   1-2
    t7300-clean.sh          (Wstat: 0 Tests: 45 Failed: 0)
      TODO passed:   45
    Files=2, Tests=66,...

or

    $ sh t7061-wtstatus-ignore.sh
    ok 1 - status untracke... # TODO known breakage vanished
    ...
    # 2 known breakage(s) vanished; please update test(s)
    # passed all remaining 19 test(s)
    1..21

with the "please update" painted in red.

    

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v3 8/8] t7061: status --ignored now searches untracked dirs
  2017-05-16  7:34               ` [PATCH v3 8/8] t7061: status --ignored now searches untracked dirs Samuel Lijin
@ 2017-05-18  2:46                 ` Junio C Hamano
  0 siblings, 0 replies; 89+ messages in thread
From: Junio C Hamano @ 2017-05-18  2:46 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git

Samuel Lijin <sxlijin@gmail.com> writes:

> Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
> ---

This was fixed at "hide untracked" patch, so squash these changes in
to that commit and add something like

    This fixes known breakages in t7061 documented earlier in the series.

at the end of the log message of that one.  The same for 7/8, which
was fixed in "teach clean -d to skip dirs" patch.

Thanks.


>  t/t7061-wtstatus-ignore.sh | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
> index 15e7592b6..fc6013ba3 100755
> --- a/t/t7061-wtstatus-ignore.sh
> +++ b/t/t7061-wtstatus-ignore.sh
> @@ -12,7 +12,7 @@ cat >expected <<\EOF
>  !! untracked/ignored
>  EOF
>  
> -test_expect_failure 'status untracked directory with --ignored' '
> +test_expect_success 'status untracked directory with --ignored' '
>  	echo "ignored" >.gitignore &&
>  	mkdir untracked &&
>  	: >untracked/ignored &&
> @@ -21,7 +21,7 @@ test_expect_failure 'status untracked directory with --ignored' '
>  	test_cmp expected actual
>  '
>  
> -test_expect_failure 'same with gitignore starting with BOM' '
> +test_expect_success 'same with gitignore starting with BOM' '
>  	printf "\357\273\277ignored\n" >.gitignore &&
>  	mkdir -p untracked &&
>  	: >untracked/ignored &&

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v3 6/8] clean: teach clean -d to skip dirs containing ignored files
  2017-05-18  2:34                 ` Junio C Hamano
@ 2017-05-18  6:32                   ` Samuel Lijin
  0 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-18  6:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org

On Wed, May 17, 2017 at 10:34 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Samuel Lijin <sxlijin@gmail.com> writes:
>
>> @@ -932,7 +935,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
>>
>>       fill_directory(&dir, &pathspec);
>>
>> -     for (i = 0; i < dir.nr; i++) {
>> +     for (k = i = 0; i < dir.nr; i++) {
>>               struct dir_entry *ent = dir.entries[i];
>>               int matches = 0;
>>               struct stat st;
>> @@ -954,10 +957,35 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
>>                   matches != MATCHED_EXACTLY)
>>                       continue;
>>
>> +             // skip any dir.entries which contains a dir.ignored
>> +             for (; k < dir.ignored_nr; k++) {
>> +                     if (cmp_dir_entry(&dir.entries[i],
>> +                                             &dir.ignored[k]) < 0)
>> +                             break;
>
> It is a bit unfortunate that we do not use the short-hand "ent" we
> established for dir.entries[i] at the beginning of this loop here.
>
>> +             }
>> +             if ((k < dir.ignored_nr) &&
>> +                             check_dir_entry_contains(dir.entries[i], dir.ignored[k])) {
>> +                     continue;
>> +             }
>
> The above logic is not needed when dir.entries[i] is a directory,
> right?

Au contraire - this logic only matters when dir.entries[i] is a directory.

>> +
>> +             // current entry does not contain any ignored files
>
> ... or the entry may not have been a directory in the first place.

If it's not a directory, it can't contain ignored files ;)

>>               rel = relative_path(ent->name, prefix, &buf);
>>               string_list_append(&del_list, rel);
>> +
>> +             // skip untracked contents of an untracked dir
>> +             for (j = i + 1;
>> +                      j < dir.nr &&
>> +                          check_dir_entry_contains(dir.entries[i], dir.entries[j]);
>> +                      j++);
>> +             i = j - 1;
>>       }
>>
>> +     for (i = 0; i < dir.nr; i++)
>> +             free(dir.entries[i]);
>> +
>> +     for (i = 0; i < dir.ignored_nr; i++)
>> +             free(dir.ignored[i]);
>> +
>
> The above logic may not be incorrect per-se, but I have this
> suspicion that the resulting code may become cleaner and easier to
> understand if we did it in a new separate loop we call immediately
> after fill_directory() returns.  Or it could even be a call to a
> helper that "post processes" the "dir" thing that was polupated by
> fill_directory() that removes elements in the entries[] array that
> contains any element in ignored[] array.

Will try it out.

>>       if (interactive && del_list.nr > 0)
>>               interactive_main_loop();
>
> Thanks.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v4 0/6] Fix clean -d and status --ignored
  2017-05-16  7:34               ` [PATCH v3 0/8] Fix clean -d and status --ignored Samuel Lijin
@ 2017-05-18  8:21                 ` Samuel Lijin
  2017-05-23  9:18                   ` [PATCH v5 " Samuel Lijin
                                     ` (6 more replies)
  2017-05-18  8:21                 ` [PATCH v4 1/6] t7300: clean -d should skip dirs with ignored files Samuel Lijin
                                   ` (5 subsequent siblings)
  6 siblings, 7 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-18  8:21 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Incorporates all of Junio's feedback on v4:

* squashes test expect_failure -> expect_success flips into the commits that
  fix them
* the O(n^2) pruning loop added for DIR_KEEP_UNTRACKED_CONTENTS is now O(n)
* the logic in cmd_clean() that keeps the contents of an untracked dir from the
  del_list is now hoisted into a separate loop

Also includes the following:

* expands the test I add in t7300 to check clean -d, for better coverage of the
  pruning logic in cmd_clean() (the logic that tells clean -d to not remove a/b
  and a/c if it's already removing a/)
* documents DIR_KEEP_UNTRACKED_CONTENTS in the corresponding technical API doc

Samuel Lijin (6):
  t7300: clean -d should skip dirs with ignored files
  t7061: status --ignored should search untracked dirs
  dir: recurse into untracked dirs for ignored files
  dir: hide untracked contents of untracked dirs
  dir: expose cmp_name() and check_contains()
  clean: teach clean -d to skip dirs containing ignored files

 Documentation/technical/api-directory-listing.txt |  6 ++++
 builtin/clean.c                                   | 38 +++++++++++++++++++-
 dir.c                                             | 42 ++++++++++++++++++++---
 dir.h                                             |  6 +++-
 t/t7061-wtstatus-ignore.sh                        |  1 +
 t/t7300-clean.sh                                  | 16 +++++++++
 6 files changed, 103 insertions(+), 6 deletions(-)

-- 
2.13.0


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v4 1/6] t7300: clean -d should skip dirs with ignored files
  2017-05-16  7:34               ` [PATCH v3 0/8] Fix clean -d and status --ignored Samuel Lijin
  2017-05-18  8:21                 ` [PATCH v4 0/6] " Samuel Lijin
@ 2017-05-18  8:21                 ` Samuel Lijin
  2017-05-18  8:21                 ` [PATCH v4 2/6] t7061: status --ignored should search untracked dirs Samuel Lijin
                                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-18  8:21 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

If git sees a directory which contains only untracked and ignored
files, clean -d should not remove that directory. It was recently
discovered that this is *not* true of git clean -d, and it's possible
that this has never worked correctly; this test and its accompanying
patch series aims to fix that.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7300-clean.sh | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index b89fd2a6a..3a2d709c2 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -653,4 +653,20 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
 	test_path_is_dir foobar
 '
 
+test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
+	echo /foo/bar >.gitignore &&
+	echo ignoreme >>.gitignore &&
+	rm -rf foo &&
+	mkdir -p foo/a/aa/aaa foo/b/bb/bbb &&
+	touch foo/bar foo/baz foo/a/aa/ignoreme foo/b/ignoreme foo/b/bb/1 foo/b/bb/2 &&
+	git clean -df &&
+	test_path_is_dir foo &&
+	test_path_is_file foo/bar &&
+	test_path_is_missing foo/baz &&
+	test_path_is_file foo/a/aa/ignoreme &&
+	test_path_is_missing foo/a/aa/aaa &&
+	test_path_is_file foo/b/ignoreme &&
+	test_path_is_missing foo/b/bb
+'
+
 test_done
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v4 2/6] t7061: status --ignored should search untracked dirs
  2017-05-16  7:34               ` [PATCH v3 0/8] Fix clean -d and status --ignored Samuel Lijin
  2017-05-18  8:21                 ` [PATCH v4 0/6] " Samuel Lijin
  2017-05-18  8:21                 ` [PATCH v4 1/6] t7300: clean -d should skip dirs with ignored files Samuel Lijin
@ 2017-05-18  8:21                 ` Samuel Lijin
  2017-05-18  8:21                 ` [PATCH v4 3/6] dir: recurse into untracked dirs for ignored files Samuel Lijin
                                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-18  8:21 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Per eb8c5b87, `status --ignored` by design does not list ignored files
if they are in a directory which contains only ignored and untracked
files (which is itself considered to be untracked) without `-uall`. This
does not make sense for `--ignored`, which claims to "Show ignored files
as well."

Thus we revisit eb8c5b87 and decide that for such directories, `status
--ignored` will list the directory as untracked *and* list all ignored
files within said directory even without `-uall`.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7061-wtstatus-ignore.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index cdc0747bf..15e7592b6 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -9,9 +9,10 @@ cat >expected <<\EOF
 ?? actual
 ?? expected
 ?? untracked/
+!! untracked/ignored
 EOF
 
-test_expect_success 'status untracked directory with --ignored' '
+test_expect_failure 'status untracked directory with --ignored' '
 	echo "ignored" >.gitignore &&
 	mkdir untracked &&
 	: >untracked/ignored &&
@@ -20,7 +21,7 @@ test_expect_success 'status untracked directory with --ignored' '
 	test_cmp expected actual
 '
 
-test_expect_success 'same with gitignore starting with BOM' '
+test_expect_failure 'same with gitignore starting with BOM' '
 	printf "\357\273\277ignored\n" >.gitignore &&
 	mkdir -p untracked &&
 	: >untracked/ignored &&
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v4 3/6] dir: recurse into untracked dirs for ignored files
  2017-05-16  7:34               ` [PATCH v3 0/8] Fix clean -d and status --ignored Samuel Lijin
                                   ` (2 preceding siblings ...)
  2017-05-18  8:21                 ` [PATCH v4 2/6] t7061: status --ignored should search untracked dirs Samuel Lijin
@ 2017-05-18  8:21                 ` Samuel Lijin
  2017-05-18  8:21                 ` [PATCH v4 4/6] dir: hide untracked contents of untracked dirs Samuel Lijin
                                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-18  8:21 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

We consider directories containing only untracked and ignored files to
be themselves untracked, which in the usual case means we don't have to
search these directories. This is problematic when we want to collect
ignored files with DIR_SHOW_IGNORED_TOO, though, so we teach
read_directory_recursive() to recurse into untracked directories to find
the ignored files they contain when DIR_SHOW_IGNORED_TOO is set. This
has the side effect of also collecting all untracked files in untracked
directories as well.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 dir.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index f451bfa48..68cf6e47c 100644
--- a/dir.c
+++ b/dir.c
@@ -1784,7 +1784,10 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 			dir_state = state;
 
 		/* recurse into subdir if instructed by treat_path */
-		if (state == path_recurse) {
+		if ((state == path_recurse) ||
+			((state == path_untracked) &&
+			 (dir->flags & DIR_SHOW_IGNORED_TOO) &&
+			 (get_dtype(cdir.de, path.buf, path.len) == DT_DIR))) {
 			struct untracked_cache_dir *ud;
 			ud = lookup_untracked(dir->untracked, untracked,
 					      path.buf + baselen,
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v4 4/6] dir: hide untracked contents of untracked dirs
  2017-05-16  7:34               ` [PATCH v3 0/8] Fix clean -d and status --ignored Samuel Lijin
                                   ` (3 preceding siblings ...)
  2017-05-18  8:21                 ` [PATCH v4 3/6] dir: recurse into untracked dirs for ignored files Samuel Lijin
@ 2017-05-18  8:21                 ` Samuel Lijin
  2017-05-22  3:13                   ` Junio C Hamano
  2017-05-18  8:21                 ` [PATCH v4 5/6] dir: expose cmp_name() and check_contains() Samuel Lijin
  2017-05-18  8:21                 ` [PATCH v4 6/6] clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
  6 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-18  8:21 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

When we taught read_directory_recursive() to recurse into untracked
directories in search of ignored files given DIR_SHOW_IGNORED_TOO, that
had the side effect of teaching it to collect the untracked contents of
untracked directories. It doesn't always make sense to return these,
though (we do need them for `clean -d`), so we introduce a flag
(DIR_KEEP_UNTRACKED_CONTENTS) to control whether or not read_directory()
strips dir->entries of the untracked contents of untracked dirs.

We also introduce check_contains() to check if one dir_entry corresponds
to a path which contains the path corresponding to another dir_entry.

This also fixes known breakages in t7061, since status --ignored now
searches untracked directories for ignored files.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 Documentation/technical/api-directory-listing.txt |  6 +++++
 dir.c                                             | 30 +++++++++++++++++++++++
 dir.h                                             |  3 ++-
 t/t7061-wtstatus-ignore.sh                        |  4 +--
 4 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index 7f8e78d91..6c77b4920 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -33,6 +33,12 @@ The notable options are:
 	Similar to `DIR_SHOW_IGNORED`, but return ignored files in `ignored[]`
 	in addition to untracked files in `entries[]`.
 
+`DIR_KEEP_UNTRACKED_CONTENTS`:::
+
+	Only has meaning if `DIR_SHOW_IGNORED_TOO` is also set; if this is set, the
+	untracked contents of untracked directories are also returned in
+	`entries[]`.
+
 `DIR_COLLECT_IGNORED`:::
 
 	Special mode for git-add. Return ignored files in `ignored[]` and
diff --git a/dir.c b/dir.c
index 68cf6e47c..4b0a99323 100644
--- a/dir.c
+++ b/dir.c
@@ -1850,6 +1850,14 @@ static int cmp_name(const void *p1, const void *p2)
 	return name_compare(e1->name, e1->len, e2->name, e2->len);
 }
 
+/* check if *out lexically strictly contains *in */
+static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
+{
+	return (out->len < in->len) &&
+		(out->name[out->len - 1] == '/') &&
+		!memcmp(out->name, in->name, out->len);
+}
+
 static int treat_leading_path(struct dir_struct *dir,
 			      const char *path, int len,
 			      const struct pathspec *pathspec)
@@ -2065,6 +2073,28 @@ int read_directory(struct dir_struct *dir, const char *path,
 		read_directory_recursive(dir, path, len, untracked, 0, pathspec);
 	QSORT(dir->entries, dir->nr, cmp_name);
 	QSORT(dir->ignored, dir->ignored_nr, cmp_name);
+
+	/* if DIR_SHOW_IGNORED_TOO, read_directory_recursive() will also pick
+	 * up untracked contents of untracked dirs; by default we discard these,
+	 * but given DIR_KEEP_UNTRACKED_CONTENTS we do not
+	 */
+	if ((dir->flags & DIR_SHOW_IGNORED_TOO) &&
+		     !(dir->flags & DIR_KEEP_UNTRACKED_CONTENTS)) {
+		int i, j;
+
+		/* remove from dir->entries untracked contents of untracked dirs */
+		for (i = j = 0; j < dir->nr; j++) {
+			if (i && check_contains(dir->entries[i - 1], dir->entries[j])) {
+				free(dir->entries[j]);
+				dir->entries[j] = NULL;
+			} else {
+				dir->entries[i++] = dir->entries[j];
+			}
+		}
+
+		dir->nr = i;
+	}
+
 	if (dir->untracked) {
 		static struct trace_key trace_untracked_stats = TRACE_KEY_INIT(UNTRACKED_STATS);
 		trace_printf_key(&trace_untracked_stats,
diff --git a/dir.h b/dir.h
index bf23a470a..650e54bdf 100644
--- a/dir.h
+++ b/dir.h
@@ -151,7 +151,8 @@ struct dir_struct {
 		DIR_NO_GITLINKS = 1<<3,
 		DIR_COLLECT_IGNORED = 1<<4,
 		DIR_SHOW_IGNORED_TOO = 1<<5,
-		DIR_COLLECT_KILLED_ONLY = 1<<6
+		DIR_COLLECT_KILLED_ONLY = 1<<6,
+		DIR_KEEP_UNTRACKED_CONTENTS = 1<<7
 	} flags;
 	struct dir_entry **entries;
 	struct dir_entry **ignored;
diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index 15e7592b6..fc6013ba3 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -12,7 +12,7 @@ cat >expected <<\EOF
 !! untracked/ignored
 EOF
 
-test_expect_failure 'status untracked directory with --ignored' '
+test_expect_success 'status untracked directory with --ignored' '
 	echo "ignored" >.gitignore &&
 	mkdir untracked &&
 	: >untracked/ignored &&
@@ -21,7 +21,7 @@ test_expect_failure 'status untracked directory with --ignored' '
 	test_cmp expected actual
 '
 
-test_expect_failure 'same with gitignore starting with BOM' '
+test_expect_success 'same with gitignore starting with BOM' '
 	printf "\357\273\277ignored\n" >.gitignore &&
 	mkdir -p untracked &&
 	: >untracked/ignored &&
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v4 5/6] dir: expose cmp_name() and check_contains()
  2017-05-16  7:34               ` [PATCH v3 0/8] Fix clean -d and status --ignored Samuel Lijin
                                   ` (4 preceding siblings ...)
  2017-05-18  8:21                 ` [PATCH v4 4/6] dir: hide untracked contents of untracked dirs Samuel Lijin
@ 2017-05-18  8:21                 ` Samuel Lijin
  2017-05-22  0:38                   ` Junio C Hamano
  2017-05-18  8:21                 ` [PATCH v4 6/6] clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
  6 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-18  8:21 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

We want to use cmp_name() and check_contains() (which both compare
`struct dir_entry`s, the former in terms of the sort order, the latter
in terms of whether one lexically contains another) outside of dir.c,
so we have to (1) change their linkage and (2) rename them as
appropriate for the global namespace. The second is achieved by
renaming cmp_name() to cmp_dir_entry() and check_contains() to
check_dir_entry_contains().

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 dir.c | 11 ++++++-----
 dir.h |  3 +++
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/dir.c b/dir.c
index 4b0a99323..2c520f17a 100644
--- a/dir.c
+++ b/dir.c
@@ -1842,7 +1842,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 	return dir_state;
 }
 
-static int cmp_name(const void *p1, const void *p2)
+int cmp_dir_entry(const void *p1, const void *p2)
 {
 	const struct dir_entry *e1 = *(const struct dir_entry **)p1;
 	const struct dir_entry *e2 = *(const struct dir_entry **)p2;
@@ -1851,7 +1851,7 @@ static int cmp_name(const void *p1, const void *p2)
 }
 
 /* check if *out lexically strictly contains *in */
-static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
+int check_dir_entry_contains(const struct dir_entry *out, const struct dir_entry *in)
 {
 	return (out->len < in->len) &&
 		(out->name[out->len - 1] == '/') &&
@@ -2071,8 +2071,8 @@ int read_directory(struct dir_struct *dir, const char *path,
 		dir->untracked = NULL;
 	if (!len || treat_leading_path(dir, path, len, pathspec))
 		read_directory_recursive(dir, path, len, untracked, 0, pathspec);
-	QSORT(dir->entries, dir->nr, cmp_name);
-	QSORT(dir->ignored, dir->ignored_nr, cmp_name);
+	QSORT(dir->entries, dir->nr, cmp_dir_entry);
+	QSORT(dir->ignored, dir->ignored_nr, cmp_dir_entry);
 
 	/* if DIR_SHOW_IGNORED_TOO, read_directory_recursive() will also pick
 	 * up untracked contents of untracked dirs; by default we discard these,
@@ -2084,7 +2084,8 @@ int read_directory(struct dir_struct *dir, const char *path,
 
 		/* remove from dir->entries untracked contents of untracked dirs */
 		for (i = j = 0; j < dir->nr; j++) {
-			if (i && check_contains(dir->entries[i - 1], dir->entries[j])) {
+			if (i &&
+			    check_dir_entry_contains(dir->entries[i - 1], dir->entries[j])) {
 				free(dir->entries[j]);
 				dir->entries[j] = NULL;
 			} else {
diff --git a/dir.h b/dir.h
index 650e54bdf..edb5fda58 100644
--- a/dir.h
+++ b/dir.h
@@ -327,6 +327,9 @@ static inline int dir_path_match(const struct dir_entry *ent,
 			      has_trailing_dir);
 }
 
+int cmp_dir_entry(const void *p1, const void *p2);
+int check_dir_entry_contains(const struct dir_entry *out, const struct dir_entry *in);
+
 void untracked_cache_invalidate_path(struct index_state *, const char *);
 void untracked_cache_remove_from_index(struct index_state *, const char *);
 void untracked_cache_add_to_index(struct index_state *, const char *);
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v4 6/6] clean: teach clean -d to skip dirs containing ignored files
  2017-05-16  7:34               ` [PATCH v3 0/8] Fix clean -d and status --ignored Samuel Lijin
                                   ` (5 preceding siblings ...)
  2017-05-18  8:21                 ` [PATCH v4 5/6] dir: expose cmp_name() and check_contains() Samuel Lijin
@ 2017-05-18  8:21                 ` Samuel Lijin
  2017-05-22  4:48                   ` Junio C Hamano
  6 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-18  8:21 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

There is an implicit assumption that a directory containing only
untracked and ignored files should itself be considered untracked. This
makes sense in use cases where we're asking if a directory should be
added to the git database, but not when we're asking if a directory can
be safely removed from the working tree; as a result, clean -d would
assume that an "untracked" directory containing ignored files could be
deleted.

To get around this, we teach clean -d to collect ignored files and skip
over so-called "untracked" directories if they contain any ignored
files (while still removing the untracked contents of such dirs).

This also fixes the known breakage in t7300 since clean -d now skips
untracked directories containing ignored files.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 builtin/clean.c  | 38 +++++++++++++++++++++++++++++++++++++-
 t/t7300-clean.sh |  2 +-
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index d861f836a..0d490d76e 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -859,7 +859,7 @@ static void interactive_main_loop(void)
 
 int cmd_clean(int argc, const char **argv, const char *prefix)
 {
-	int i, res;
+	int i, j, res;
 	int dry_run = 0, remove_directories = 0, quiet = 0, ignored = 0;
 	int ignored_only = 0, config_set = 0, errors = 0, gone = 1;
 	int rm_flags = REMOVE_DIR_KEEP_NESTED_GIT;
@@ -911,6 +911,9 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 				  " refusing to clean"));
 	}
 
+	if (remove_directories)
+		dir.flags |= DIR_SHOW_IGNORED_TOO | DIR_KEEP_UNTRACKED_CONTENTS;
+
 	if (force > 1)
 		rm_flags = 0;
 
@@ -932,12 +935,39 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 
 	fill_directory(&dir, &pathspec);
 
+	for (j = i = 0; i < dir.nr;) {
+		for (;
+		     j < dir.ignored_nr &&
+		       0 <= cmp_dir_entry(&dir.entries[i], &dir.ignored[j]);
+		     j++);
+
+		if ((j < dir.ignored_nr) &&
+				check_dir_entry_contains(dir.entries[i], dir.ignored[j])) {
+			/* skip any dir.entries which contains a dir.ignored */
+			free(dir.entries[i]);
+			dir.entries[i++] = NULL;
+		} else {
+			/* prune the contents of a dir.entries which will be removed */
+			struct dir_entry *ent = dir.entries[i++];
+			for (;
+			     i < dir.nr &&
+			       check_dir_entry_contains(ent, dir.entries[i]);
+			     i++) {
+				free(dir.entries[i]);
+				dir.entries[i] = NULL;
+			}
+		}
+	}
+
 	for (i = 0; i < dir.nr; i++) {
 		struct dir_entry *ent = dir.entries[i];
 		int matches = 0;
 		struct stat st;
 		const char *rel;
 
+		if (!ent)
+			continue;
+
 		if (!cache_name_is_other(ent->name, ent->len))
 			continue;
 
@@ -958,6 +988,12 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		string_list_append(&del_list, rel);
 	}
 
+	for (i = 0; i < dir.nr; i++)
+		free(dir.entries[i]);
+
+	for (i = 0; i < dir.ignored_nr; i++)
+		free(dir.ignored[i]);
+
 	if (interactive && del_list.nr > 0)
 		interactive_main_loop();
 
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 3a2d709c2..7b36954d6 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -653,7 +653,7 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
 	test_path_is_dir foobar
 '
 
-test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
+test_expect_success 'git clean -d skips untracked dirs containing ignored files' '
 	echo /foo/bar >.gitignore &&
 	echo ignoreme >>.gitignore &&
 	rm -rf foo &&
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH v4 5/6] dir: expose cmp_name() and check_contains()
  2017-05-18  8:21                 ` [PATCH v4 5/6] dir: expose cmp_name() and check_contains() Samuel Lijin
@ 2017-05-22  0:38                   ` Junio C Hamano
  0 siblings, 0 replies; 89+ messages in thread
From: Junio C Hamano @ 2017-05-22  0:38 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git

Samuel Lijin <sxlijin@gmail.com> writes:

> We want to use cmp_name() and check_contains() (which both compare
> `struct dir_entry`s, the former in terms of the sort order, the latter
> in terms of whether one lexically contains another) outside of dir.c,
> so we have to (1) change their linkage and (2) rename them as
> appropriate for the global namespace. The second is achieved by
> renaming cmp_name() to cmp_dir_entry() and check_contains() to
> check_dir_entry_contains().
>
> Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
> ---
>  dir.c | 11 ++++++-----
>  dir.h |  3 +++
>  2 files changed, 9 insertions(+), 5 deletions(-)

Up to this point in the series all looked sensible.  I haven't
looked the last one carefully to form an opinion yet.

Thanks.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v4 4/6] dir: hide untracked contents of untracked dirs
  2017-05-18  8:21                 ` [PATCH v4 4/6] dir: hide untracked contents of untracked dirs Samuel Lijin
@ 2017-05-22  3:13                   ` Junio C Hamano
  0 siblings, 0 replies; 89+ messages in thread
From: Junio C Hamano @ 2017-05-22  3:13 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git

Samuel Lijin <sxlijin@gmail.com> writes:

> +
> +	/* if DIR_SHOW_IGNORED_TOO, read_directory_recursive() will also pick
> +	 * up untracked contents of untracked dirs; by default we discard these,
> +	 * but given DIR_KEEP_UNTRACKED_CONTENTS we do not
> +	 */

No need to resend only to fix this, as I'll fix it up while queuing,
but for the next time, 

        /*
         * Our multi-line comments have their opening slash-aster
         * and closing aster-slash on their own lines, like this.
         */


^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v4 6/6] clean: teach clean -d to skip dirs containing ignored files
  2017-05-18  8:21                 ` [PATCH v4 6/6] clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
@ 2017-05-22  4:48                   ` Junio C Hamano
  2017-05-22  5:58                     ` Samuel Lijin
  0 siblings, 1 reply; 89+ messages in thread
From: Junio C Hamano @ 2017-05-22  4:48 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git

Samuel Lijin <sxlijin@gmail.com> writes:

> +	for (j = i = 0; i < dir.nr;) {
> +		for (;
> +		     j < dir.ignored_nr &&
> +		       0 <= cmp_dir_entry(&dir.entries[i], &dir.ignored[j]);
> +		     j++);
> +
> +		if ((j < dir.ignored_nr) &&
> +				check_dir_entry_contains(dir.entries[i], dir.ignored[j])) {
> +			/* skip any dir.entries which contains a dir.ignored */
> +			free(dir.entries[i]);
> +			dir.entries[i++] = NULL;
> +		} else {
> +			/* prune the contents of a dir.entries which will be removed */
> +			struct dir_entry *ent = dir.entries[i++];
> +			for (;
> +			     i < dir.nr &&
> +			       check_dir_entry_contains(ent, dir.entries[i]);
> +			     i++) {
> +				free(dir.entries[i]);
> +				dir.entries[i] = NULL;
> +			}
> +		}
> +	}

The second loop in the else clause is a bit tricky, and the comment
"which will be removed" is not all that helpful to explain why the
loop is there.

But I think the code is correct.  Here is how I understood it.

    While looking at dir.entries[i], the code noticed that nothing
    in that directory is ignored.  But entries in dir.entries[] that
    come later may be contained in dir.entries[i] and we just want
    to show the top-level untracked one (e.g. "a/" and "a/b/" were
    in entries[], there is nothing in "a/", so naturally there is
    nothing in "a/b/", but we do not want to bother showing
    both---showing "a/" alone saying "the entire a/ is untracked" is
    what we want).

We may want to have a test to ensure "a/b/" is indeed omitted in
such a situation from the output, though.

By the way, instead of putting NULL, it may be easier to follow if
you used two pointers, src and dst, into dir.entries[], just like
you did in your latest version of [PATCH 4/6].  That way, you do not
have to change anything in the later loop that walks over elements
in the dir.entries[] array.  It would also help the logic easier to
follow if the above loop were its own helper function.

Putting them all together, here is what I came up with that can be
squashed into your patch.  I am undecided myself if this is easier
to follow than your version, but it seems to pass your test ;-)

Thanks.

 builtin/clean.c | 70 ++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 42 insertions(+), 28 deletions(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index dd3308a447..c8712e7ac8 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -851,9 +851,49 @@ static void interactive_main_loop(void)
 	}
 }
 
+static void simplify_untracked(struct dir_struct *dir)
+{
+	int src, dst, ign;
+
+	for (src = dst = ign = 0; src < dir->nr; src++) {
+		/*
+		 * Skip entries in ignored[] that cannot be inside
+		 * entries[src]
+		 */
+		while (ign < dir->ignored_nr &&
+		       0 <= cmp_dir_entry(&dir->entries[src], &dir->ignored[ign]))
+			ign++;
+
+		if (dir->ignored_nr <= ign ||
+		    !check_dir_entry_contains(dir->entries[src], dir->ignored[ign])) {
+			/*
+			 * entries[src] does not contain an ignored
+			 * path -- we need to keep it.  But we do not
+			 * want to show entries[] that are contained
+			 * in entries[src].
+			 */
+			struct dir_entry *ent = dir->entries[src++];
+			dir->entries[dst++] = ent;
+			while (src < dir->nr &&
+			       check_dir_entry_contains(ent, dir->entries[src])) {
+				free(dir->entries[src++]);
+			}
+			/* compensate for the outer loop's loop control */
+			src--;
+		} else {
+			/*
+			 * entries[src] contains an ignored path --
+			 * drop it.
+			 */
+			free(dir->entries[src]);
+		}
+	}
+	dir->nr = dst;
+}
+
 int cmd_clean(int argc, const char **argv, const char *prefix)
 {
-	int i, j, res;
+	int i, res;
 	int dry_run = 0, remove_directories = 0, quiet = 0, ignored = 0;
 	int ignored_only = 0, config_set = 0, errors = 0, gone = 1;
 	int rm_flags = REMOVE_DIR_KEEP_NESTED_GIT;
@@ -928,30 +968,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		       prefix, argv);
 
 	fill_directory(&dir, &pathspec);
-
-	for (j = i = 0; i < dir.nr;) {
-		for (;
-		     j < dir.ignored_nr &&
-		       0 <= cmp_dir_entry(&dir.entries[i], &dir.ignored[j]);
-		     j++);
-
-		if ((j < dir.ignored_nr) &&
-				check_dir_entry_contains(dir.entries[i], dir.ignored[j])) {
-			/* skip any dir.entries which contains a dir.ignored */
-			free(dir.entries[i]);
-			dir.entries[i++] = NULL;
-		} else {
-			/* prune the contents of a dir.entries which will be removed */
-			struct dir_entry *ent = dir.entries[i++];
-			for (;
-			     i < dir.nr &&
-			       check_dir_entry_contains(ent, dir.entries[i]);
-			     i++) {
-				free(dir.entries[i]);
-				dir.entries[i] = NULL;
-			}
-		}
-	}
+	simplify_untracked(&dir);
 
 	for (i = 0; i < dir.nr; i++) {
 		struct dir_entry *ent = dir.entries[i];
@@ -959,9 +976,6 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		struct stat st;
 		const char *rel;
 
-		if (!ent)
-			continue;
-
 		if (!cache_name_is_other(ent->name, ent->len))
 			continue;
 

^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH v4 6/6] clean: teach clean -d to skip dirs containing ignored files
  2017-05-22  4:48                   ` Junio C Hamano
@ 2017-05-22  5:58                     ` Samuel Lijin
  2017-05-22  6:17                       ` Junio C Hamano
  0 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-22  5:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org

On Mon, May 22, 2017 at 12:48 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Samuel Lijin <sxlijin@gmail.com> writes:
>
>> +     for (j = i = 0; i < dir.nr;) {
>> +             for (;
>> +                  j < dir.ignored_nr &&
>> +                    0 <= cmp_dir_entry(&dir.entries[i], &dir.ignored[j]);
>> +                  j++);
>> +
>> +             if ((j < dir.ignored_nr) &&
>> +                             check_dir_entry_contains(dir.entries[i], dir.ignored[j])) {
>> +                     /* skip any dir.entries which contains a dir.ignored */
>> +                     free(dir.entries[i]);
>> +                     dir.entries[i++] = NULL;
>> +             } else {
>> +                     /* prune the contents of a dir.entries which will be removed */
>> +                     struct dir_entry *ent = dir.entries[i++];
>> +                     for (;
>> +                          i < dir.nr &&
>> +                            check_dir_entry_contains(ent, dir.entries[i]);
>> +                          i++) {
>> +                             free(dir.entries[i]);
>> +                             dir.entries[i] = NULL;
>> +                     }
>> +             }
>> +     }
>
> The second loop in the else clause is a bit tricky, and the comment
> "which will be removed" is not all that helpful to explain why the
> loop is there.
>
> But I think the code is correct.  Here is how I understood it.
>
>     While looking at dir.entries[i], the code noticed that nothing
>     in that directory is ignored.  But entries in dir.entries[] that
>     come later may be contained in dir.entries[i] and we just want
>     to show the top-level untracked one (e.g. "a/" and "a/b/" were
>     in entries[], there is nothing in "a/", so naturally there is
>     nothing in "a/b/", but we do not want to bother showing
>     both---showing "a/" alone saying "the entire a/ is untracked" is
>     what we want).

Yep, that's the gist of it.

More specifically: the contents of untracked dirs have to be picked up
so that clean -d can distinguish between purely untracked dirs and
untracked dirs which also contain ignored files. In the case of a
purely untracked dir, clean -d can remove the dir itself, and should
just skip over all the dir's contents; in the case of a mixed
untracked dir, clean -d should not remove the dir itself, just the
untracked contents.

> We may want to have a test to ensure "a/b/" is indeed omitted in
> such a situation from the output, though.

Is there a reason to ensure specifically a/b/ is tested, or are the
current tests, which do cover the a/b (i.e. where a/b is a file, not
where a/b/ is a dir) case, insufficient for some reason?

> By the way, instead of putting NULL, it may be easier to follow if
> you used two pointers, src and dst, into dir.entries[], just like
> you did in your latest version of [PATCH 4/6].  That way, you do not
> have to change anything in the later loop that walks over elements
> in the dir.entries[] array.  It would also help the logic easier to
> follow if the above loop were its own helper function.

Agreed on the helper function. On the src-dst thing: I considered it,
but I figured another O(n) set of array moves was unnecessary. I guess
this is one of those cases where premature optimization doesn't make
sense?

> Putting them all together, here is what I came up with that can be
> squashed into your patch.  I am undecided myself if this is easier
> to follow than your version, but it seems to pass your test ;-)
>
> Thanks.
>
>  builtin/clean.c | 70 ++++++++++++++++++++++++++++++++++-----------------------
>  1 file changed, 42 insertions(+), 28 deletions(-)
>
> diff --git a/builtin/clean.c b/builtin/clean.c
> index dd3308a447..c8712e7ac8 100644
> --- a/builtin/clean.c
> +++ b/builtin/clean.c
> @@ -851,9 +851,49 @@ static void interactive_main_loop(void)
>         }
>  }
>
> +static void simplify_untracked(struct dir_struct *dir)
> +{
> +       int src, dst, ign;
> +
> +       for (src = dst = ign = 0; src < dir->nr; src++) {
> +               /*
> +                * Skip entries in ignored[] that cannot be inside
> +                * entries[src]
> +                */
> +               while (ign < dir->ignored_nr &&
> +                      0 <= cmp_dir_entry(&dir->entries[src], &dir->ignored[ign]))
> +                       ign++;
> +
> +               if (dir->ignored_nr <= ign ||
> +                   !check_dir_entry_contains(dir->entries[src], dir->ignored[ign])) {
> +                       /*
> +                        * entries[src] does not contain an ignored
> +                        * path -- we need to keep it.  But we do not
> +                        * want to show entries[] that are contained
> +                        * in entries[src].
> +                        */
> +                       struct dir_entry *ent = dir->entries[src++];
> +                       dir->entries[dst++] = ent;
> +                       while (src < dir->nr &&
> +                              check_dir_entry_contains(ent, dir->entries[src])) {
> +                               free(dir->entries[src++]);
> +                       }
> +                       /* compensate for the outer loop's loop control */
> +                       src--;
> +               } else {
> +                       /*
> +                        * entries[src] contains an ignored path --
> +                        * drop it.
> +                        */
> +                       free(dir->entries[src]);
> +               }
> +       }
> +       dir->nr = dst;
> +}
> +
>  int cmd_clean(int argc, const char **argv, const char *prefix)
>  {
> -       int i, j, res;
> +       int i, res;
>         int dry_run = 0, remove_directories = 0, quiet = 0, ignored = 0;
>         int ignored_only = 0, config_set = 0, errors = 0, gone = 1;
>         int rm_flags = REMOVE_DIR_KEEP_NESTED_GIT;
> @@ -928,30 +968,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
>                        prefix, argv);
>
>         fill_directory(&dir, &pathspec);
> -
> -       for (j = i = 0; i < dir.nr;) {
> -               for (;
> -                    j < dir.ignored_nr &&
> -                      0 <= cmp_dir_entry(&dir.entries[i], &dir.ignored[j]);
> -                    j++);
> -
> -               if ((j < dir.ignored_nr) &&
> -                               check_dir_entry_contains(dir.entries[i], dir.ignored[j])) {
> -                       /* skip any dir.entries which contains a dir.ignored */
> -                       free(dir.entries[i]);
> -                       dir.entries[i++] = NULL;
> -               } else {
> -                       /* prune the contents of a dir.entries which will be removed */
> -                       struct dir_entry *ent = dir.entries[i++];
> -                       for (;
> -                            i < dir.nr &&
> -                              check_dir_entry_contains(ent, dir.entries[i]);
> -                            i++) {
> -                               free(dir.entries[i]);
> -                               dir.entries[i] = NULL;
> -                       }
> -               }
> -       }
> +       simplify_untracked(&dir);
>
>         for (i = 0; i < dir.nr; i++) {
>                 struct dir_entry *ent = dir.entries[i];
> @@ -959,9 +976,6 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
>                 struct stat st;
>                 const char *rel;
>
> -               if (!ent)
> -                       continue;
> -
>                 if (!cache_name_is_other(ent->name, ent->len))
>                         continue;
>

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v4 6/6] clean: teach clean -d to skip dirs containing ignored files
  2017-05-22  5:58                     ` Samuel Lijin
@ 2017-05-22  6:17                       ` Junio C Hamano
  2017-05-23  2:43                         ` Samuel Lijin
  0 siblings, 1 reply; 89+ messages in thread
From: Junio C Hamano @ 2017-05-22  6:17 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git@vger.kernel.org

Samuel Lijin <sxlijin@gmail.com> writes:

>> By the way, instead of putting NULL, it may be easier to follow if
>> you used two pointers, src and dst, into dir.entries[], just like
>> you did in your latest version of [PATCH 4/6].  That way, you do not
>> have to change anything in the later loop that walks over elements
>> in the dir.entries[] array.  It would also help the logic easier to
>> follow if the above loop were its own helper function.
>
> Agreed on the helper function. On the src-dst thing: I considered it,
> but I figured another O(n) set of array moves was unnecessary. I guess
> this is one of those cases where premature optimization doesn't make
> sense?

I actually did not mean to give the variables more descriptive names
and preserve the original 'main loop' (namely, not adding the "skip
if NULL" which would never happen in normal case where "-d" is not
used without "-x") as "optimization", whether it is premature or
not.  My suggestions were purely from "wouldn't the resulting code
easier to follow and understand, leading to fewer bugs in the
future?" point of view.

As I said, I am undecided if the result is easier to follow than
your version ;-)

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v4 6/6] clean: teach clean -d to skip dirs containing ignored files
  2017-05-22  6:17                       ` Junio C Hamano
@ 2017-05-23  2:43                         ` Samuel Lijin
  2017-05-23  7:50                           ` Junio C Hamano
  0 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23  2:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org

On Mon, May 22, 2017 at 2:17 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Samuel Lijin <sxlijin@gmail.com> writes:
>
>>> By the way, instead of putting NULL, it may be easier to follow if
>>> you used two pointers, src and dst, into dir.entries[], just like
>>> you did in your latest version of [PATCH 4/6].  That way, you do not
>>> have to change anything in the later loop that walks over elements
>>> in the dir.entries[] array.  It would also help the logic easier to
>>> follow if the above loop were its own helper function.
>>
>> Agreed on the helper function. On the src-dst thing: I considered it,
>> but I figured another O(n) set of array moves was unnecessary. I guess
>> this is one of those cases where premature optimization doesn't make
>> sense?
>
> I actually did not mean to give the variables more descriptive names
> and preserve the original 'main loop' (namely, not adding the "skip
> if NULL" which would never happen in normal case where "-d" is not
> used without "-x") as "optimization", whether it is premature or
> not.  My suggestions were purely from "wouldn't the resulting code
> easier to follow and understand, leading to fewer bugs in the
> future?" point of view.
>
> As I said, I am undecided if the result is easier to follow than
> your version ;-)

I think I'll defer to your patch: I do agree that your version is
easier to follow and understand. Should I reroll just this patch and
its commit message, or would you prefer to handle that in the queuing
yourself?

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v4 6/6] clean: teach clean -d to skip dirs containing ignored files
  2017-05-23  2:43                         ` Samuel Lijin
@ 2017-05-23  7:50                           ` Junio C Hamano
  0 siblings, 0 replies; 89+ messages in thread
From: Junio C Hamano @ 2017-05-23  7:50 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git@vger.kernel.org

Samuel Lijin <sxlijin@gmail.com> writes:

>> As I said, I am undecided if the result is easier to follow than
>> your version ;-)
>
> I think I'll defer to your patch: I do agree that your version is
> easier to follow and understand. Should I reroll just this patch and
> its commit message, or would you prefer to handle that in the queuing
> yourself?

I was going thru the entries in "What's cooking" draft and was
wondering what the final outcome would be.  A v5 that I can just
apply without thinking would probably be easier for me (I can tend
to other topics while waiting your final version).

Thanks.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v5 0/6] Fix clean -d and status --ignored
  2017-05-18  8:21                 ` [PATCH v4 0/6] " Samuel Lijin
@ 2017-05-23  9:18                   ` Samuel Lijin
  2017-05-23 10:09                     ` [PATCH v6 " Samuel Lijin
                                       ` (6 more replies)
  2017-05-23  9:18                   ` [PATCH v5 1/6] t7300: clean -d should skip dirs with ignored files Samuel Lijin
                                     ` (5 subsequent siblings)
  6 siblings, 7 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23  9:18 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Incorporates latest round of feedback from Junio about how to best structure
the changes to cmd_clean() for maintainability.

Samuel Lijin (6):
  t7300: clean -d should skip dirs with ignored files
  t7061: status --ignored should search untracked dirs
  dir: recurse into untracked dirs for ignored files
  dir: hide untracked contents of untracked dirs
  dir: expose cmp_name() and check_contains()
  clean: teach clean -d to preserve ignored paths

 Documentation/technical/api-directory-listing.txt |  6 ++++
 builtin/clean.c                                   | 31 ++++++++++++++++
 dir.c                                             | 43 ++++++++++++++++++++---
 dir.h                                             |  6 +++-
 t/t7061-wtstatus-ignore.sh                        |  1 +
 t/t7300-clean.sh                                  | 16 +++++++++
 6 files changed, 98 insertions(+), 5 deletions(-)

-- 
2.13.0


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v5 1/6] t7300: clean -d should skip dirs with ignored files
  2017-05-18  8:21                 ` [PATCH v4 0/6] " Samuel Lijin
  2017-05-23  9:18                   ` [PATCH v5 " Samuel Lijin
@ 2017-05-23  9:18                   ` Samuel Lijin
  2017-05-23  9:18                   ` [PATCH v5 2/6] t7061: status --ignored should search untracked dirs Samuel Lijin
                                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23  9:18 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

If git sees a directory which contains only untracked and ignored
files, clean -d should not remove that directory. It was recently
discovered that this is *not* true of git clean -d, and it's possible
that this has never worked correctly; this test and its accompanying
patch series aims to fix that.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7300-clean.sh | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index b89fd2a6a..3a2d709c2 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -653,4 +653,20 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
 	test_path_is_dir foobar
 '
 
+test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
+	echo /foo/bar >.gitignore &&
+	echo ignoreme >>.gitignore &&
+	rm -rf foo &&
+	mkdir -p foo/a/aa/aaa foo/b/bb/bbb &&
+	touch foo/bar foo/baz foo/a/aa/ignoreme foo/b/ignoreme foo/b/bb/1 foo/b/bb/2 &&
+	git clean -df &&
+	test_path_is_dir foo &&
+	test_path_is_file foo/bar &&
+	test_path_is_missing foo/baz &&
+	test_path_is_file foo/a/aa/ignoreme &&
+	test_path_is_missing foo/a/aa/aaa &&
+	test_path_is_file foo/b/ignoreme &&
+	test_path_is_missing foo/b/bb
+'
+
 test_done
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v5 2/6] t7061: status --ignored should search untracked dirs
  2017-05-18  8:21                 ` [PATCH v4 0/6] " Samuel Lijin
  2017-05-23  9:18                   ` [PATCH v5 " Samuel Lijin
  2017-05-23  9:18                   ` [PATCH v5 1/6] t7300: clean -d should skip dirs with ignored files Samuel Lijin
@ 2017-05-23  9:18                   ` Samuel Lijin
  2017-05-23  9:18                   ` [PATCH v5 3/6] dir: recurse into untracked dirs for ignored files Samuel Lijin
                                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23  9:18 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Per eb8c5b87, `status --ignored` by design does not list ignored files
if they are in a directory which contains only ignored and untracked
files (which is itself considered to be untracked) without `-uall`. This
does not make sense for `--ignored`, which claims to "Show ignored files
as well."

Thus we revisit eb8c5b87 and decide that for such directories, `status
--ignored` will list the directory as untracked *and* list all ignored
files within said directory even without `-uall`.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7061-wtstatus-ignore.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index cdc0747bf..15e7592b6 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -9,9 +9,10 @@ cat >expected <<\EOF
 ?? actual
 ?? expected
 ?? untracked/
+!! untracked/ignored
 EOF
 
-test_expect_success 'status untracked directory with --ignored' '
+test_expect_failure 'status untracked directory with --ignored' '
 	echo "ignored" >.gitignore &&
 	mkdir untracked &&
 	: >untracked/ignored &&
@@ -20,7 +21,7 @@ test_expect_success 'status untracked directory with --ignored' '
 	test_cmp expected actual
 '
 
-test_expect_success 'same with gitignore starting with BOM' '
+test_expect_failure 'same with gitignore starting with BOM' '
 	printf "\357\273\277ignored\n" >.gitignore &&
 	mkdir -p untracked &&
 	: >untracked/ignored &&
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v5 3/6] dir: recurse into untracked dirs for ignored files
  2017-05-18  8:21                 ` [PATCH v4 0/6] " Samuel Lijin
                                     ` (2 preceding siblings ...)
  2017-05-23  9:18                   ` [PATCH v5 2/6] t7061: status --ignored should search untracked dirs Samuel Lijin
@ 2017-05-23  9:18                   ` Samuel Lijin
  2017-05-23  9:18                   ` [PATCH v5 4/6] dir: hide untracked contents of untracked dirs Samuel Lijin
                                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23  9:18 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

We consider directories containing only untracked and ignored files to
be themselves untracked, which in the usual case means we don't have to
search these directories. This is problematic when we want to collect
ignored files with DIR_SHOW_IGNORED_TOO, though, so we teach
read_directory_recursive() to recurse into untracked directories to find
the ignored files they contain when DIR_SHOW_IGNORED_TOO is set. This
has the side effect of also collecting all untracked files in untracked
directories as well.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 dir.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index f451bfa48..68cf6e47c 100644
--- a/dir.c
+++ b/dir.c
@@ -1784,7 +1784,10 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 			dir_state = state;
 
 		/* recurse into subdir if instructed by treat_path */
-		if (state == path_recurse) {
+		if ((state == path_recurse) ||
+			((state == path_untracked) &&
+			 (dir->flags & DIR_SHOW_IGNORED_TOO) &&
+			 (get_dtype(cdir.de, path.buf, path.len) == DT_DIR))) {
 			struct untracked_cache_dir *ud;
 			ud = lookup_untracked(dir->untracked, untracked,
 					      path.buf + baselen,
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v5 4/6] dir: hide untracked contents of untracked dirs
  2017-05-18  8:21                 ` [PATCH v4 0/6] " Samuel Lijin
                                     ` (3 preceding siblings ...)
  2017-05-23  9:18                   ` [PATCH v5 3/6] dir: recurse into untracked dirs for ignored files Samuel Lijin
@ 2017-05-23  9:18                   ` Samuel Lijin
  2017-05-23  9:18                   ` [PATCH v5 5/6] dir: expose cmp_name() and check_contains() Samuel Lijin
  2017-05-23  9:18                   ` [PATCH v5 6/6] clean: teach clean -d to preserve ignored paths Samuel Lijin
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23  9:18 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

When we taught read_directory_recursive() to recurse into untracked
directories in search of ignored files given DIR_SHOW_IGNORED_TOO, that
had the side effect of teaching it to collect the untracked contents of
untracked directories. It doesn't always make sense to return these,
though (we do need them for `clean -d`), so we introduce a flag
(DIR_KEEP_UNTRACKED_CONTENTS) to control whether or not read_directory()
strips dir->entries of the untracked contents of untracked dirs.

We also introduce check_contains() to check if one dir_entry corresponds
to a path which contains the path corresponding to another dir_entry.

This also fixes known breakages in t7061, since status --ignored now
searches untracked directories for ignored files.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 Documentation/technical/api-directory-listing.txt |  6 +++++
 dir.c                                             | 31 +++++++++++++++++++++++
 dir.h                                             |  3 ++-
 t/t7061-wtstatus-ignore.sh                        |  4 +--
 4 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index 7f8e78d91..6c77b4920 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -33,6 +33,12 @@ The notable options are:
 	Similar to `DIR_SHOW_IGNORED`, but return ignored files in `ignored[]`
 	in addition to untracked files in `entries[]`.
 
+`DIR_KEEP_UNTRACKED_CONTENTS`:::
+
+	Only has meaning if `DIR_SHOW_IGNORED_TOO` is also set; if this is set, the
+	untracked contents of untracked directories are also returned in
+	`entries[]`.
+
 `DIR_COLLECT_IGNORED`:::
 
 	Special mode for git-add. Return ignored files in `ignored[]` and
diff --git a/dir.c b/dir.c
index 68cf6e47c..ba5eadeda 100644
--- a/dir.c
+++ b/dir.c
@@ -1850,6 +1850,14 @@ static int cmp_name(const void *p1, const void *p2)
 	return name_compare(e1->name, e1->len, e2->name, e2->len);
 }
 
+/* check if *out lexically strictly contains *in */
+static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
+{
+	return (out->len < in->len) &&
+		(out->name[out->len - 1] == '/') &&
+		!memcmp(out->name, in->name, out->len);
+}
+
 static int treat_leading_path(struct dir_struct *dir,
 			      const char *path, int len,
 			      const struct pathspec *pathspec)
@@ -2065,6 +2073,29 @@ int read_directory(struct dir_struct *dir, const char *path,
 		read_directory_recursive(dir, path, len, untracked, 0, pathspec);
 	QSORT(dir->entries, dir->nr, cmp_name);
 	QSORT(dir->ignored, dir->ignored_nr, cmp_name);
+
+	/* 
+	 * if DIR_SHOW_IGNORED_TOO, read_directory_recursive() will also pick
+	 * up untracked contents of untracked dirs; by default we discard these,
+	 * but given DIR_KEEP_UNTRACKED_CONTENTS we do not
+	 */
+	if ((dir->flags & DIR_SHOW_IGNORED_TOO) &&
+		     !(dir->flags & DIR_KEEP_UNTRACKED_CONTENTS)) {
+		int i, j;
+
+		/* remove from dir->entries untracked contents of untracked dirs */
+		for (i = j = 0; j < dir->nr; j++) {
+			if (i && check_contains(dir->entries[i - 1], dir->entries[j])) {
+				free(dir->entries[j]);
+				dir->entries[j] = NULL;
+			} else {
+				dir->entries[i++] = dir->entries[j];
+			}
+		}
+
+		dir->nr = i;
+	}
+
 	if (dir->untracked) {
 		static struct trace_key trace_untracked_stats = TRACE_KEY_INIT(UNTRACKED_STATS);
 		trace_printf_key(&trace_untracked_stats,
diff --git a/dir.h b/dir.h
index bf23a470a..650e54bdf 100644
--- a/dir.h
+++ b/dir.h
@@ -151,7 +151,8 @@ struct dir_struct {
 		DIR_NO_GITLINKS = 1<<3,
 		DIR_COLLECT_IGNORED = 1<<4,
 		DIR_SHOW_IGNORED_TOO = 1<<5,
-		DIR_COLLECT_KILLED_ONLY = 1<<6
+		DIR_COLLECT_KILLED_ONLY = 1<<6,
+		DIR_KEEP_UNTRACKED_CONTENTS = 1<<7
 	} flags;
 	struct dir_entry **entries;
 	struct dir_entry **ignored;
diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index 15e7592b6..fc6013ba3 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -12,7 +12,7 @@ cat >expected <<\EOF
 !! untracked/ignored
 EOF
 
-test_expect_failure 'status untracked directory with --ignored' '
+test_expect_success 'status untracked directory with --ignored' '
 	echo "ignored" >.gitignore &&
 	mkdir untracked &&
 	: >untracked/ignored &&
@@ -21,7 +21,7 @@ test_expect_failure 'status untracked directory with --ignored' '
 	test_cmp expected actual
 '
 
-test_expect_failure 'same with gitignore starting with BOM' '
+test_expect_success 'same with gitignore starting with BOM' '
 	printf "\357\273\277ignored\n" >.gitignore &&
 	mkdir -p untracked &&
 	: >untracked/ignored &&
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v5 5/6] dir: expose cmp_name() and check_contains()
  2017-05-18  8:21                 ` [PATCH v4 0/6] " Samuel Lijin
                                     ` (4 preceding siblings ...)
  2017-05-23  9:18                   ` [PATCH v5 4/6] dir: hide untracked contents of untracked dirs Samuel Lijin
@ 2017-05-23  9:18                   ` Samuel Lijin
  2017-05-23  9:18                   ` [PATCH v5 6/6] clean: teach clean -d to preserve ignored paths Samuel Lijin
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23  9:18 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

We want to use cmp_name() and check_contains() (which both compare
`struct dir_entry`s, the former in terms of the sort order, the latter
in terms of whether one lexically contains another) outside of dir.c,
so we have to (1) change their linkage and (2) rename them as
appropriate for the global namespace. The second is achieved by
renaming cmp_name() to cmp_dir_entry() and check_contains() to
check_dir_entry_contains().

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 dir.c | 11 ++++++-----
 dir.h |  3 +++
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/dir.c b/dir.c
index ba5eadeda..aae6d00b4 100644
--- a/dir.c
+++ b/dir.c
@@ -1842,7 +1842,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 	return dir_state;
 }
 
-static int cmp_name(const void *p1, const void *p2)
+int cmp_dir_entry(const void *p1, const void *p2)
 {
 	const struct dir_entry *e1 = *(const struct dir_entry **)p1;
 	const struct dir_entry *e2 = *(const struct dir_entry **)p2;
@@ -1851,7 +1851,7 @@ static int cmp_name(const void *p1, const void *p2)
 }
 
 /* check if *out lexically strictly contains *in */
-static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
+int check_dir_entry_contains(const struct dir_entry *out, const struct dir_entry *in)
 {
 	return (out->len < in->len) &&
 		(out->name[out->len - 1] == '/') &&
@@ -2071,8 +2071,8 @@ int read_directory(struct dir_struct *dir, const char *path,
 		dir->untracked = NULL;
 	if (!len || treat_leading_path(dir, path, len, pathspec))
 		read_directory_recursive(dir, path, len, untracked, 0, pathspec);
-	QSORT(dir->entries, dir->nr, cmp_name);
-	QSORT(dir->ignored, dir->ignored_nr, cmp_name);
+	QSORT(dir->entries, dir->nr, cmp_dir_entry);
+	QSORT(dir->ignored, dir->ignored_nr, cmp_dir_entry);
 
 	/* 
 	 * if DIR_SHOW_IGNORED_TOO, read_directory_recursive() will also pick
@@ -2085,7 +2085,8 @@ int read_directory(struct dir_struct *dir, const char *path,
 
 		/* remove from dir->entries untracked contents of untracked dirs */
 		for (i = j = 0; j < dir->nr; j++) {
-			if (i && check_contains(dir->entries[i - 1], dir->entries[j])) {
+			if (i &&
+			    check_dir_entry_contains(dir->entries[i - 1], dir->entries[j])) {
 				free(dir->entries[j]);
 				dir->entries[j] = NULL;
 			} else {
diff --git a/dir.h b/dir.h
index 650e54bdf..edb5fda58 100644
--- a/dir.h
+++ b/dir.h
@@ -327,6 +327,9 @@ static inline int dir_path_match(const struct dir_entry *ent,
 			      has_trailing_dir);
 }
 
+int cmp_dir_entry(const void *p1, const void *p2);
+int check_dir_entry_contains(const struct dir_entry *out, const struct dir_entry *in);
+
 void untracked_cache_invalidate_path(struct index_state *, const char *);
 void untracked_cache_remove_from_index(struct index_state *, const char *);
 void untracked_cache_add_to_index(struct index_state *, const char *);
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v5 6/6] clean: teach clean -d to preserve ignored paths
  2017-05-18  8:21                 ` [PATCH v4 0/6] " Samuel Lijin
                                     ` (5 preceding siblings ...)
  2017-05-23  9:18                   ` [PATCH v5 5/6] dir: expose cmp_name() and check_contains() Samuel Lijin
@ 2017-05-23  9:18                   ` Samuel Lijin
  2017-05-23 12:52                     ` Junio C Hamano
  6 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23  9:18 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

There is an implicit assumption that a directory containing only
untracked and ignored paths should itself be considered untracked. This
makes sense in use cases where we're asking if a directory should be
added to the git database, but not when we're asking if a directory can
be safely removed from the working tree; as a result, clean -d would
assume that an "untracked" directory containing ignored paths could be
deleted, even though doing so would also remove the ignored paths.

To get around this, we teach clean -d to collect ignored paths and skip
an untracked directory if it contained an ignored path, instead just
removing the untracked contents thereof. To achieve this, cmd_clean()
has to collect all untracked contents of untracked directories, in
addition to all ignored paths, to determine which untracked dirs must be
skipped (because they contain ignored paths) and which ones should *not*
be skipped.

For this purpose, correct_untracked_entries() is introduced to prune a
given dir_struct of untracked entries containing ignored paths and those
untracked entries encompassed by the untracked entries which are not
pruned away.

This also fixes the known breakage in t7300, since clean -d now skips
untracked directories containing ignored paths.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 builtin/clean.c  | 31 +++++++++++++++++++++++++++++++
 t/t7300-clean.sh |  2 +-
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index d861f836a..45dbdcd18 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -857,6 +857,36 @@ static void interactive_main_loop(void)
 	}
 }
 
+static void correct_untracked_entries(struct dir_struct *dir)
+{
+	int src, dst, ign;
+
+	for (src = dst = ign = 0; src < dir->nr; src++) {
+		/* skip paths in ignored[] that cannot be inside entries[src] */
+		while (ign < dir->ignored_nr &&
+		       0 <= cmp_dir_entry(&dir->entries[src], &dir->ignored[ign]))
+			ign++;
+
+		if (ign < dir->ignored_nr &&
+		    check_dir_entry_contains(dir->entries[src], dir->ignored[ign])) {
+			/* entries[src] contains an ignored path, so we drop it */
+			free(dir->entries[src]);
+		} else {
+			/* entries[src] does not contain an ignored path, so we keep it */
+			dir->entries[dst++] = dir->entries[src++];
+
+			/* then discard paths in entries[] contained inside entries[src] */
+			while (src < dir->nr &&
+			       check_dir_entry_contains(ent, dir->entries[src]))
+				free(dir->entries[src++]);
+
+			/* compensate for the outer loop's loop control */
+			src--;
+		}
+	}
+	dir->nr = dst;
+}
+
 int cmd_clean(int argc, const char **argv, const char *prefix)
 {
 	int i, res;
@@ -931,6 +961,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		       prefix, argv);
 
 	fill_directory(&dir, &pathspec);
+	correct_untracked_entries(&dir);
 
 	for (i = 0; i < dir.nr; i++) {
 		struct dir_entry *ent = dir.entries[i];
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 3a2d709c2..7b36954d6 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -653,7 +653,7 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
 	test_path_is_dir foobar
 '
 
-test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
+test_expect_success 'git clean -d skips untracked dirs containing ignored files' '
 	echo /foo/bar >.gitignore &&
 	echo ignoreme >>.gitignore &&
 	rm -rf foo &&
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v6 0/6] Fix clean -d and status --ignored
  2017-05-23  9:18                   ` [PATCH v5 " Samuel Lijin
@ 2017-05-23 10:09                     ` Samuel Lijin
  2017-05-23 10:09                     ` [PATCH v6 1/6] t7300: clean -d should skip dirs with ignored files Samuel Lijin
                                       ` (5 subsequent siblings)
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23 10:09 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Messed up on 6/6 in v5, forgot to include changes from earlier versions (karma
for not running tests before I send-email'd the patch series).

Samuel Lijin (6):
  t7300: clean -d should skip dirs with ignored files
  t7061: status --ignored should search untracked dirs
  dir: recurse into untracked dirs for ignored files
  dir: hide untracked contents of untracked dirs
  dir: expose cmp_name() and check_contains()
  clean: teach clean -d to preserve ignored paths

 Documentation/technical/api-directory-listing.txt |  6 ++++
 builtin/clean.c                                   | 42 ++++++++++++++++++++++
 dir.c                                             | 43 ++++++++++++++++++++---
 dir.h                                             |  6 +++-
 t/t7061-wtstatus-ignore.sh                        |  1 +
 t/t7300-clean.sh                                  | 16 +++++++++
 6 files changed, 109 insertions(+), 5 deletions(-)

-- 
2.13.0


^ permalink raw reply	[flat|nested] 89+ messages in thread

* [PATCH v6 1/6] t7300: clean -d should skip dirs with ignored files
  2017-05-23  9:18                   ` [PATCH v5 " Samuel Lijin
  2017-05-23 10:09                     ` [PATCH v6 " Samuel Lijin
@ 2017-05-23 10:09                     ` Samuel Lijin
  2017-05-23 10:09                     ` [PATCH v6 2/6] t7061: status --ignored should search untracked dirs Samuel Lijin
                                       ` (4 subsequent siblings)
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23 10:09 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

If git sees a directory which contains only untracked and ignored
files, clean -d should not remove that directory. It was recently
discovered that this is *not* true of git clean -d, and it's possible
that this has never worked correctly; this test and its accompanying
patch series aims to fix that.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7300-clean.sh | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index b89fd2a6a..3a2d709c2 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -653,4 +653,20 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
 	test_path_is_dir foobar
 '
 
+test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
+	echo /foo/bar >.gitignore &&
+	echo ignoreme >>.gitignore &&
+	rm -rf foo &&
+	mkdir -p foo/a/aa/aaa foo/b/bb/bbb &&
+	touch foo/bar foo/baz foo/a/aa/ignoreme foo/b/ignoreme foo/b/bb/1 foo/b/bb/2 &&
+	git clean -df &&
+	test_path_is_dir foo &&
+	test_path_is_file foo/bar &&
+	test_path_is_missing foo/baz &&
+	test_path_is_file foo/a/aa/ignoreme &&
+	test_path_is_missing foo/a/aa/aaa &&
+	test_path_is_file foo/b/ignoreme &&
+	test_path_is_missing foo/b/bb
+'
+
 test_done
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v6 2/6] t7061: status --ignored should search untracked dirs
  2017-05-23  9:18                   ` [PATCH v5 " Samuel Lijin
  2017-05-23 10:09                     ` [PATCH v6 " Samuel Lijin
  2017-05-23 10:09                     ` [PATCH v6 1/6] t7300: clean -d should skip dirs with ignored files Samuel Lijin
@ 2017-05-23 10:09                     ` Samuel Lijin
  2017-05-23 10:09                     ` [PATCH v6 3/6] dir: recurse into untracked dirs for ignored files Samuel Lijin
                                       ` (3 subsequent siblings)
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23 10:09 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

Per eb8c5b87, `status --ignored` by design does not list ignored files
if they are in a directory which contains only ignored and untracked
files (which is itself considered to be untracked) without `-uall`. This
does not make sense for `--ignored`, which claims to "Show ignored files
as well."

Thus we revisit eb8c5b87 and decide that for such directories, `status
--ignored` will list the directory as untracked *and* list all ignored
files within said directory even without `-uall`.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 t/t7061-wtstatus-ignore.sh | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index cdc0747bf..15e7592b6 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -9,9 +9,10 @@ cat >expected <<\EOF
 ?? actual
 ?? expected
 ?? untracked/
+!! untracked/ignored
 EOF
 
-test_expect_success 'status untracked directory with --ignored' '
+test_expect_failure 'status untracked directory with --ignored' '
 	echo "ignored" >.gitignore &&
 	mkdir untracked &&
 	: >untracked/ignored &&
@@ -20,7 +21,7 @@ test_expect_success 'status untracked directory with --ignored' '
 	test_cmp expected actual
 '
 
-test_expect_success 'same with gitignore starting with BOM' '
+test_expect_failure 'same with gitignore starting with BOM' '
 	printf "\357\273\277ignored\n" >.gitignore &&
 	mkdir -p untracked &&
 	: >untracked/ignored &&
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v6 3/6] dir: recurse into untracked dirs for ignored files
  2017-05-23  9:18                   ` [PATCH v5 " Samuel Lijin
                                       ` (2 preceding siblings ...)
  2017-05-23 10:09                     ` [PATCH v6 2/6] t7061: status --ignored should search untracked dirs Samuel Lijin
@ 2017-05-23 10:09                     ` Samuel Lijin
  2017-05-23 10:09                     ` [PATCH v6 4/6] dir: hide untracked contents of untracked dirs Samuel Lijin
                                       ` (2 subsequent siblings)
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23 10:09 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

We consider directories containing only untracked and ignored files to
be themselves untracked, which in the usual case means we don't have to
search these directories. This is problematic when we want to collect
ignored files with DIR_SHOW_IGNORED_TOO, though, so we teach
read_directory_recursive() to recurse into untracked directories to find
the ignored files they contain when DIR_SHOW_IGNORED_TOO is set. This
has the side effect of also collecting all untracked files in untracked
directories as well.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 dir.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index f451bfa48..68cf6e47c 100644
--- a/dir.c
+++ b/dir.c
@@ -1784,7 +1784,10 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 			dir_state = state;
 
 		/* recurse into subdir if instructed by treat_path */
-		if (state == path_recurse) {
+		if ((state == path_recurse) ||
+			((state == path_untracked) &&
+			 (dir->flags & DIR_SHOW_IGNORED_TOO) &&
+			 (get_dtype(cdir.de, path.buf, path.len) == DT_DIR))) {
 			struct untracked_cache_dir *ud;
 			ud = lookup_untracked(dir->untracked, untracked,
 					      path.buf + baselen,
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v6 4/6] dir: hide untracked contents of untracked dirs
  2017-05-23  9:18                   ` [PATCH v5 " Samuel Lijin
                                       ` (3 preceding siblings ...)
  2017-05-23 10:09                     ` [PATCH v6 3/6] dir: recurse into untracked dirs for ignored files Samuel Lijin
@ 2017-05-23 10:09                     ` Samuel Lijin
  2017-05-23 10:09                     ` [PATCH v6 5/6] dir: expose cmp_name() and check_contains() Samuel Lijin
  2017-05-23 10:09                     ` [PATCH v6 6/6] clean: teach clean -d to preserve ignored paths Samuel Lijin
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23 10:09 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

When we taught read_directory_recursive() to recurse into untracked
directories in search of ignored files given DIR_SHOW_IGNORED_TOO, that
had the side effect of teaching it to collect the untracked contents of
untracked directories. It doesn't always make sense to return these,
though (we do need them for `clean -d`), so we introduce a flag
(DIR_KEEP_UNTRACKED_CONTENTS) to control whether or not read_directory()
strips dir->entries of the untracked contents of untracked dirs.

We also introduce check_contains() to check if one dir_entry corresponds
to a path which contains the path corresponding to another dir_entry.

This also fixes known breakages in t7061, since status --ignored now
searches untracked directories for ignored files.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 Documentation/technical/api-directory-listing.txt |  6 +++++
 dir.c                                             | 31 +++++++++++++++++++++++
 dir.h                                             |  3 ++-
 t/t7061-wtstatus-ignore.sh                        |  4 +--
 4 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index 7f8e78d91..6c77b4920 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -33,6 +33,12 @@ The notable options are:
 	Similar to `DIR_SHOW_IGNORED`, but return ignored files in `ignored[]`
 	in addition to untracked files in `entries[]`.
 
+`DIR_KEEP_UNTRACKED_CONTENTS`:::
+
+	Only has meaning if `DIR_SHOW_IGNORED_TOO` is also set; if this is set, the
+	untracked contents of untracked directories are also returned in
+	`entries[]`.
+
 `DIR_COLLECT_IGNORED`:::
 
 	Special mode for git-add. Return ignored files in `ignored[]` and
diff --git a/dir.c b/dir.c
index 68cf6e47c..ba5eadeda 100644
--- a/dir.c
+++ b/dir.c
@@ -1850,6 +1850,14 @@ static int cmp_name(const void *p1, const void *p2)
 	return name_compare(e1->name, e1->len, e2->name, e2->len);
 }
 
+/* check if *out lexically strictly contains *in */
+static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
+{
+	return (out->len < in->len) &&
+		(out->name[out->len - 1] == '/') &&
+		!memcmp(out->name, in->name, out->len);
+}
+
 static int treat_leading_path(struct dir_struct *dir,
 			      const char *path, int len,
 			      const struct pathspec *pathspec)
@@ -2065,6 +2073,29 @@ int read_directory(struct dir_struct *dir, const char *path,
 		read_directory_recursive(dir, path, len, untracked, 0, pathspec);
 	QSORT(dir->entries, dir->nr, cmp_name);
 	QSORT(dir->ignored, dir->ignored_nr, cmp_name);
+
+	/* 
+	 * if DIR_SHOW_IGNORED_TOO, read_directory_recursive() will also pick
+	 * up untracked contents of untracked dirs; by default we discard these,
+	 * but given DIR_KEEP_UNTRACKED_CONTENTS we do not
+	 */
+	if ((dir->flags & DIR_SHOW_IGNORED_TOO) &&
+		     !(dir->flags & DIR_KEEP_UNTRACKED_CONTENTS)) {
+		int i, j;
+
+		/* remove from dir->entries untracked contents of untracked dirs */
+		for (i = j = 0; j < dir->nr; j++) {
+			if (i && check_contains(dir->entries[i - 1], dir->entries[j])) {
+				free(dir->entries[j]);
+				dir->entries[j] = NULL;
+			} else {
+				dir->entries[i++] = dir->entries[j];
+			}
+		}
+
+		dir->nr = i;
+	}
+
 	if (dir->untracked) {
 		static struct trace_key trace_untracked_stats = TRACE_KEY_INIT(UNTRACKED_STATS);
 		trace_printf_key(&trace_untracked_stats,
diff --git a/dir.h b/dir.h
index bf23a470a..650e54bdf 100644
--- a/dir.h
+++ b/dir.h
@@ -151,7 +151,8 @@ struct dir_struct {
 		DIR_NO_GITLINKS = 1<<3,
 		DIR_COLLECT_IGNORED = 1<<4,
 		DIR_SHOW_IGNORED_TOO = 1<<5,
-		DIR_COLLECT_KILLED_ONLY = 1<<6
+		DIR_COLLECT_KILLED_ONLY = 1<<6,
+		DIR_KEEP_UNTRACKED_CONTENTS = 1<<7
 	} flags;
 	struct dir_entry **entries;
 	struct dir_entry **ignored;
diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index 15e7592b6..fc6013ba3 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -12,7 +12,7 @@ cat >expected <<\EOF
 !! untracked/ignored
 EOF
 
-test_expect_failure 'status untracked directory with --ignored' '
+test_expect_success 'status untracked directory with --ignored' '
 	echo "ignored" >.gitignore &&
 	mkdir untracked &&
 	: >untracked/ignored &&
@@ -21,7 +21,7 @@ test_expect_failure 'status untracked directory with --ignored' '
 	test_cmp expected actual
 '
 
-test_expect_failure 'same with gitignore starting with BOM' '
+test_expect_success 'same with gitignore starting with BOM' '
 	printf "\357\273\277ignored\n" >.gitignore &&
 	mkdir -p untracked &&
 	: >untracked/ignored &&
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v6 5/6] dir: expose cmp_name() and check_contains()
  2017-05-23  9:18                   ` [PATCH v5 " Samuel Lijin
                                       ` (4 preceding siblings ...)
  2017-05-23 10:09                     ` [PATCH v6 4/6] dir: hide untracked contents of untracked dirs Samuel Lijin
@ 2017-05-23 10:09                     ` Samuel Lijin
  2017-05-23 10:09                     ` [PATCH v6 6/6] clean: teach clean -d to preserve ignored paths Samuel Lijin
  6 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23 10:09 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

We want to use cmp_name() and check_contains() (which both compare
`struct dir_entry`s, the former in terms of the sort order, the latter
in terms of whether one lexically contains another) outside of dir.c,
so we have to (1) change their linkage and (2) rename them as
appropriate for the global namespace. The second is achieved by
renaming cmp_name() to cmp_dir_entry() and check_contains() to
check_dir_entry_contains().

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 dir.c | 11 ++++++-----
 dir.h |  3 +++
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/dir.c b/dir.c
index ba5eadeda..aae6d00b4 100644
--- a/dir.c
+++ b/dir.c
@@ -1842,7 +1842,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 	return dir_state;
 }
 
-static int cmp_name(const void *p1, const void *p2)
+int cmp_dir_entry(const void *p1, const void *p2)
 {
 	const struct dir_entry *e1 = *(const struct dir_entry **)p1;
 	const struct dir_entry *e2 = *(const struct dir_entry **)p2;
@@ -1851,7 +1851,7 @@ static int cmp_name(const void *p1, const void *p2)
 }
 
 /* check if *out lexically strictly contains *in */
-static int check_contains(const struct dir_entry *out, const struct dir_entry *in)
+int check_dir_entry_contains(const struct dir_entry *out, const struct dir_entry *in)
 {
 	return (out->len < in->len) &&
 		(out->name[out->len - 1] == '/') &&
@@ -2071,8 +2071,8 @@ int read_directory(struct dir_struct *dir, const char *path,
 		dir->untracked = NULL;
 	if (!len || treat_leading_path(dir, path, len, pathspec))
 		read_directory_recursive(dir, path, len, untracked, 0, pathspec);
-	QSORT(dir->entries, dir->nr, cmp_name);
-	QSORT(dir->ignored, dir->ignored_nr, cmp_name);
+	QSORT(dir->entries, dir->nr, cmp_dir_entry);
+	QSORT(dir->ignored, dir->ignored_nr, cmp_dir_entry);
 
 	/* 
 	 * if DIR_SHOW_IGNORED_TOO, read_directory_recursive() will also pick
@@ -2085,7 +2085,8 @@ int read_directory(struct dir_struct *dir, const char *path,
 
 		/* remove from dir->entries untracked contents of untracked dirs */
 		for (i = j = 0; j < dir->nr; j++) {
-			if (i && check_contains(dir->entries[i - 1], dir->entries[j])) {
+			if (i &&
+			    check_dir_entry_contains(dir->entries[i - 1], dir->entries[j])) {
 				free(dir->entries[j]);
 				dir->entries[j] = NULL;
 			} else {
diff --git a/dir.h b/dir.h
index 650e54bdf..edb5fda58 100644
--- a/dir.h
+++ b/dir.h
@@ -327,6 +327,9 @@ static inline int dir_path_match(const struct dir_entry *ent,
 			      has_trailing_dir);
 }
 
+int cmp_dir_entry(const void *p1, const void *p2);
+int check_dir_entry_contains(const struct dir_entry *out, const struct dir_entry *in);
+
 void untracked_cache_invalidate_path(struct index_state *, const char *);
 void untracked_cache_remove_from_index(struct index_state *, const char *);
 void untracked_cache_add_to_index(struct index_state *, const char *);
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* [PATCH v6 6/6] clean: teach clean -d to preserve ignored paths
  2017-05-23  9:18                   ` [PATCH v5 " Samuel Lijin
                                       ` (5 preceding siblings ...)
  2017-05-23 10:09                     ` [PATCH v6 5/6] dir: expose cmp_name() and check_contains() Samuel Lijin
@ 2017-05-23 10:09                     ` Samuel Lijin
  2017-05-23 22:35                       ` Junio C Hamano
  2017-05-24  4:14                       ` Torsten Bögershausen
  6 siblings, 2 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23 10:09 UTC (permalink / raw)
  To: git; +Cc: Samuel Lijin

There is an implicit assumption that a directory containing only
untracked and ignored paths should itself be considered untracked. This
makes sense in use cases where we're asking if a directory should be
added to the git database, but not when we're asking if a directory can
be safely removed from the working tree; as a result, clean -d would
assume that an "untracked" directory containing ignored paths could be
deleted, even though doing so would also remove the ignored paths.

To get around this, we teach clean -d to collect ignored paths and skip
an untracked directory if it contained an ignored path, instead just
removing the untracked contents thereof. To achieve this, cmd_clean()
has to collect all untracked contents of untracked directories, in
addition to all ignored paths, to determine which untracked dirs must be
skipped (because they contain ignored paths) and which ones should *not*
be skipped.

For this purpose, correct_untracked_entries() is introduced to prune a
given dir_struct of untracked entries containing ignored paths and those
untracked entries encompassed by the untracked entries which are not
pruned away.

A memory leak is also fixed in cmd_clean().

This also fixes the known breakage in t7300, since clean -d now skips
untracked directories containing ignored paths.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
---
 builtin/clean.c  | 42 ++++++++++++++++++++++++++++++++++++++++++
 t/t7300-clean.sh |  2 +-
 2 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/builtin/clean.c b/builtin/clean.c
index d861f836a..937eb17b6 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -857,6 +857,38 @@ static void interactive_main_loop(void)
 	}
 }
 
+static void correct_untracked_entries(struct dir_struct *dir)
+{
+	int src, dst, ign;
+
+	for (src = dst = ign = 0; src < dir->nr; src++) {
+		/* skip paths in ignored[] that cannot be inside entries[src] */
+		while (ign < dir->ignored_nr &&
+		       0 <= cmp_dir_entry(&dir->entries[src], &dir->ignored[ign]))
+			ign++;
+
+		if (ign < dir->ignored_nr &&
+		    check_dir_entry_contains(dir->entries[src], dir->ignored[ign])) {
+			/* entries[src] contains an ignored path, so we drop it */
+			free(dir->entries[src]);
+		} else {
+			struct dir_entry *ent = dir->entries[src++];
+
+			/* entries[src] does not contain an ignored path, so we keep it */
+			dir->entries[dst++] = ent;
+
+			/* then discard paths in entries[] contained inside entries[src] */
+			while (src < dir->nr &&
+			       check_dir_entry_contains(ent, dir->entries[src]))
+				free(dir->entries[src++]);
+
+			/* compensate for the outer loop's loop control */
+			src--;
+		}
+	}
+	dir->nr = dst;
+}
+
 int cmd_clean(int argc, const char **argv, const char *prefix)
 {
 	int i, res;
@@ -916,6 +948,9 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 
 	dir.flags |= DIR_SHOW_OTHER_DIRECTORIES;
 
+	if (remove_directories)
+		dir.flags |= DIR_SHOW_IGNORED_TOO | DIR_KEEP_UNTRACKED_CONTENTS;
+
 	if (read_cache() < 0)
 		die(_("index file corrupt"));
 
@@ -931,6 +966,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		       prefix, argv);
 
 	fill_directory(&dir, &pathspec);
+	correct_untracked_entries(&dir);
 
 	for (i = 0; i < dir.nr; i++) {
 		struct dir_entry *ent = dir.entries[i];
@@ -958,6 +994,12 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 		string_list_append(&del_list, rel);
 	}
 
+	for (i = 0; i < dir.nr; i++)
+		free(dir.entries[i]);
+
+	for (i = 0; i < dir.ignored_nr; i++)
+		free(dir.ignored[i]);
+
 	if (interactive && del_list.nr > 0)
 		interactive_main_loop();
 
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index 3a2d709c2..7b36954d6 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -653,7 +653,7 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
 	test_path_is_dir foobar
 '
 
-test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
+test_expect_success 'git clean -d skips untracked dirs containing ignored files' '
 	echo /foo/bar >.gitignore &&
 	echo ignoreme >>.gitignore &&
 	rm -rf foo &&
-- 
2.13.0


^ permalink raw reply related	[flat|nested] 89+ messages in thread

* Re: [PATCH v5 6/6] clean: teach clean -d to preserve ignored paths
  2017-05-23  9:18                   ` [PATCH v5 6/6] clean: teach clean -d to preserve ignored paths Samuel Lijin
@ 2017-05-23 12:52                     ` Junio C Hamano
  2017-05-23 19:16                       ` Samuel Lijin
  0 siblings, 1 reply; 89+ messages in thread
From: Junio C Hamano @ 2017-05-23 12:52 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git

Samuel Lijin <sxlijin@gmail.com> writes:

> @@ -931,6 +961,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
>  		       prefix, argv);
>  
>  	fill_directory(&dir, &pathspec);
> +	correct_untracked_entries(&dir);
>  
>  	for (i = 0; i < dir.nr; i++) {
>  		struct dir_entry *ent = dir.entries[i];

You used to set SHOW_IGNORED_TOO and KEEP_UNTRACKED_CONTENTS in
dir.flags early in the function, and then free dir.entries[] and
dir.ignored[] after we are done.  They are gone in this version.

Intended?

> diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
> index 3a2d709c2..7b36954d6 100755
> --- a/t/t7300-clean.sh
> +++ b/t/t7300-clean.sh
> @@ -653,7 +653,7 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
>  	test_path_is_dir foobar
>  '
>  
> -test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
> +test_expect_success 'git clean -d skips untracked dirs containing ignored files' '
>  	echo /foo/bar >.gitignore &&
>  	echo ignoreme >>.gitignore &&
>  	rm -rf foo &&

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v5 6/6] clean: teach clean -d to preserve ignored paths
  2017-05-23 12:52                     ` Junio C Hamano
@ 2017-05-23 19:16                       ` Samuel Lijin
  0 siblings, 0 replies; 89+ messages in thread
From: Samuel Lijin @ 2017-05-23 19:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org

On Tue, May 23, 2017 at 8:52 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Samuel Lijin <sxlijin@gmail.com> writes:
>
>> @@ -931,6 +961,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
>>                      prefix, argv);
>>
>>       fill_directory(&dir, &pathspec);
>> +     correct_untracked_entries(&dir);
>>
>>       for (i = 0; i < dir.nr; i++) {
>>               struct dir_entry *ent = dir.entries[i];
>
> You used to set SHOW_IGNORED_TOO and KEEP_UNTRACKED_CONTENTS in
> dir.flags early in the function, and then free dir.entries[] and
> dir.ignored[] after we are done.  They are gone in this version.
>
> Intended?

Embarrassingly, no. Rerolling.

>> diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
>> index 3a2d709c2..7b36954d6 100755
>> --- a/t/t7300-clean.sh
>> +++ b/t/t7300-clean.sh
>> @@ -653,7 +653,7 @@ test_expect_success 'git clean -d respects pathspecs (pathspec is prefix of dir)
>>       test_path_is_dir foobar
>>  '
>>
>> -test_expect_failure 'git clean -d skips untracked dirs containing ignored files' '
>> +test_expect_success 'git clean -d skips untracked dirs containing ignored files' '
>>       echo /foo/bar >.gitignore &&
>>       echo ignoreme >>.gitignore &&
>>       rm -rf foo &&

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v6 6/6] clean: teach clean -d to preserve ignored paths
  2017-05-23 10:09                     ` [PATCH v6 6/6] clean: teach clean -d to preserve ignored paths Samuel Lijin
@ 2017-05-23 22:35                       ` Junio C Hamano
  2017-05-24  4:14                       ` Torsten Bögershausen
  1 sibling, 0 replies; 89+ messages in thread
From: Junio C Hamano @ 2017-05-23 22:35 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git

Samuel Lijin <sxlijin@gmail.com> writes:

> There is an implicit assumption that a directory containing only
> untracked and ignored paths should itself be considered untracked. This
> makes sense in use cases where we're asking if a directory should be
> added to the git database, but not when we're asking if a directory can
> be safely removed from the working tree; as a result, clean -d would
> assume that an "untracked" directory containing ignored paths could be
> deleted, even though doing so would also remove the ignored paths.
>
> To get around this, we teach clean -d to collect ignored paths and skip
> an untracked directory if it contained an ignored path, instead just
> removing the untracked contents thereof. To achieve this, cmd_clean()
> has to collect all untracked contents of untracked directories, in
> addition to all ignored paths, to determine which untracked dirs must be
> skipped (because they contain ignored paths) and which ones should *not*
> be skipped.
>
> For this purpose, correct_untracked_entries() is introduced to prune a
> given dir_struct of untracked entries containing ignored paths and those
> untracked entries encompassed by the untracked entries which are not
> pruned away.
>
> A memory leak is also fixed in cmd_clean().
>
> This also fixes the known breakage in t7300, since clean -d now skips
> untracked directories containing ignored paths.

Nicely explained.  Will replace the previous 6/6 and my squash on
top that were queued on 'pu'.

Thanks.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v6 6/6] clean: teach clean -d to preserve ignored paths
  2017-05-23 10:09                     ` [PATCH v6 6/6] clean: teach clean -d to preserve ignored paths Samuel Lijin
  2017-05-23 22:35                       ` Junio C Hamano
@ 2017-05-24  4:14                       ` Torsten Bögershausen
  2017-05-25 15:36                         ` Samuel Lijin
  1 sibling, 1 reply; 89+ messages in thread
From: Torsten Bögershausen @ 2017-05-24  4:14 UTC (permalink / raw)
  To: Samuel Lijin, git


> diff --git a/builtin/clean.c b/builtin/clean.c
> index d861f836a..937eb17b6 100644
> --- a/builtin/clean.c
> +++ b/builtin/clean.c
> @@ -857,6 +857,38 @@ static void interactive_main_loop(void)
>   	}
>   }
>   
> +static void correct_untracked_entries(struct dir_struct *dir)
> +{
> +	int src, dst, ign;
> +
> +	for (src = dst = ign = 0; src < dir->nr; src++) {
> +		/* skip paths in ignored[] that cannot be inside entries[src] */
> +		while (ign < dir->ignored_nr &&
> +		       0 <= cmp_dir_entry(&dir->entries[src], &dir->ignored[ign]))
> +			ign++;
> +
> +		if (ign < dir->ignored_nr &&
> +		    check_dir_entry_contains(dir->entries[src], dir->ignored[ign])) {
> +			/* entries[src] contains an ignored path, so we drop it */
> +			free(dir->entries[src]);
> +		} else {
> +			struct dir_entry *ent = dir->entries[src++];
> +
> +			/* entries[src] does not contain an ignored path, so we keep it */
> +			dir->entries[dst++] = ent;
> +
> +			/* then discard paths in entries[] contained inside entries[src] */
> +			while (src < dir->nr &&
> +			       check_dir_entry_contains(ent, dir->entries[src]))
> +				free(dir->entries[src++]);
> +
> +			/* compensate for the outer loop's loop control */
> +			src--;
> +		}
> +	}
> +	dir->nr = dst;
> +}
> +

Looking what the function does, would
drop_or_keep_untracked_entries()
make more sense ?



^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v6 6/6] clean: teach clean -d to preserve ignored paths
  2017-05-24  4:14                       ` Torsten Bögershausen
@ 2017-05-25 15:36                         ` Samuel Lijin
  2017-05-26  1:05                           ` Junio C Hamano
  0 siblings, 1 reply; 89+ messages in thread
From: Samuel Lijin @ 2017-05-25 15:36 UTC (permalink / raw)
  To: Torsten Bögershausen; +Cc: git@vger.kernel.org

On Wed, May 24, 2017 at 12:14 AM, Torsten Bögershausen <tboegi@web.de> wrote:
>
>> diff --git a/builtin/clean.c b/builtin/clean.c
>> index d861f836a..937eb17b6 100644
>> --- a/builtin/clean.c
>> +++ b/builtin/clean.c
>> @@ -857,6 +857,38 @@ static void interactive_main_loop(void)
>>         }
>>   }
>>   +static void correct_untracked_entries(struct dir_struct *dir)
>
> Looking what the function does, would
> drop_or_keep_untracked_entries()
> make more sense ?

To me, drop_or_keep_ implies that they're either all dropped or all
kept, nothing in between, which is why I went with correct_, to
indicate that the set of untracked entries in the dir_struct prior to
calling the method needs to be corrected.

^ permalink raw reply	[flat|nested] 89+ messages in thread

* Re: [PATCH v6 6/6] clean: teach clean -d to preserve ignored paths
  2017-05-25 15:36                         ` Samuel Lijin
@ 2017-05-26  1:05                           ` Junio C Hamano
  0 siblings, 0 replies; 89+ messages in thread
From: Junio C Hamano @ 2017-05-26  1:05 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: Torsten Bögershausen, git@vger.kernel.org

Samuel Lijin <sxlijin@gmail.com> writes:

> On Wed, May 24, 2017 at 12:14 AM, Torsten Bögershausen <tboegi@web.de> wrote:
>>
>>> diff --git a/builtin/clean.c b/builtin/clean.c
>>> index d861f836a..937eb17b6 100644
>>> --- a/builtin/clean.c
>>> +++ b/builtin/clean.c
>>> @@ -857,6 +857,38 @@ static void interactive_main_loop(void)
>>>         }
>>>   }
>>>   +static void correct_untracked_entries(struct dir_struct *dir)
>>
>> Looking what the function does, would
>> drop_or_keep_untracked_entries()
>> make more sense ?
>
> To me, drop_or_keep_ implies that they're either all dropped or all
> kept, nothing in between, which is why I went with correct_, to
> indicate that the set of untracked entries in the dir_struct prior to
> calling the method needs to be corrected.

Neither is a particularly good name, but if I have to pick, I'd say
we should keep yours.

drop-or-keep may indicate some are dropped while others are kept but
it does not say what the function is for.  correct is better in the
sense that the readers can guess that there is something wrong in
"dir" at the point and needs correcting by calling the helper, but
still does not convey what exactly is wrong.  How the wrong-ness is
corrected does not have to be explained in the name (i.e. I am
saying that drop-or-keep does not give us much useful information),
but I wish that the name of the helper hinted what kind of wrongness
is there to be corrected to the readers.




^ permalink raw reply	[flat|nested] 89+ messages in thread

end of thread, other threads:[~2017-05-26  1:05 UTC | newest]

Thread overview: 89+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-28 22:56 Bug Report: .gitignore behavior is not matching in git clean and git status Chris Johnson
2017-05-01  1:41 ` Junio C Hamano
2017-05-01  1:56   ` Chris Johnson
2017-05-01  3:15     ` Junio C Hamano
2017-05-01 13:51     ` Samuel Lijin
2017-05-01 15:34       ` Samuel Lijin
2017-05-02  0:26         ` Junio C Hamano
2017-05-03  3:29           ` [PATCH 0/7] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
2017-05-03  3:29             ` [PATCH 1/7] t7300: skip untracked dirs containing " Samuel Lijin
2017-05-03 18:19               ` Stefan Beller
2017-05-03 18:26                 ` Samuel Lijin
2017-05-03 18:34                   ` Stefan Beller
2017-05-03  3:29             ` [PATCH 2/7] dir: recurse into untracked dirs for " Samuel Lijin
2017-05-03  3:29             ` [PATCH 3/7] dir: add method to check if a dir_entry lexically contains another Samuel Lijin
2017-05-03 18:09               ` Stefan Beller
2017-05-03  3:29             ` [PATCH 4/7] dir: hide untracked contents of untracked dirs Samuel Lijin
2017-05-03  3:29             ` [PATCH 5/7] dir: change linkage of cmp_name() and check_contains() Samuel Lijin
2017-05-03  3:29             ` [PATCH 6/7] builtin/clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
2017-05-03  3:29             ` [PATCH 7/7] t7061: check for ignored file in untracked dir Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 0/9] Keep git clean -d from inadvertently removing ignored files Samuel Lijin
2017-05-08  4:26               ` Junio C Hamano
2017-05-08 21:37                 ` Samuel Lijin
2017-05-09  0:31                   ` Junio C Hamano
2017-05-16  7:34               ` [PATCH v3 0/8] Fix clean -d and status --ignored Samuel Lijin
2017-05-18  8:21                 ` [PATCH v4 0/6] " Samuel Lijin
2017-05-23  9:18                   ` [PATCH v5 " Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 " Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 1/6] t7300: clean -d should skip dirs with ignored files Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 2/6] t7061: status --ignored should search untracked dirs Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 3/6] dir: recurse into untracked dirs for ignored files Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 4/6] dir: hide untracked contents of untracked dirs Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 5/6] dir: expose cmp_name() and check_contains() Samuel Lijin
2017-05-23 10:09                     ` [PATCH v6 6/6] clean: teach clean -d to preserve ignored paths Samuel Lijin
2017-05-23 22:35                       ` Junio C Hamano
2017-05-24  4:14                       ` Torsten Bögershausen
2017-05-25 15:36                         ` Samuel Lijin
2017-05-26  1:05                           ` Junio C Hamano
2017-05-23  9:18                   ` [PATCH v5 1/6] t7300: clean -d should skip dirs with ignored files Samuel Lijin
2017-05-23  9:18                   ` [PATCH v5 2/6] t7061: status --ignored should search untracked dirs Samuel Lijin
2017-05-23  9:18                   ` [PATCH v5 3/6] dir: recurse into untracked dirs for ignored files Samuel Lijin
2017-05-23  9:18                   ` [PATCH v5 4/6] dir: hide untracked contents of untracked dirs Samuel Lijin
2017-05-23  9:18                   ` [PATCH v5 5/6] dir: expose cmp_name() and check_contains() Samuel Lijin
2017-05-23  9:18                   ` [PATCH v5 6/6] clean: teach clean -d to preserve ignored paths Samuel Lijin
2017-05-23 12:52                     ` Junio C Hamano
2017-05-23 19:16                       ` Samuel Lijin
2017-05-18  8:21                 ` [PATCH v4 1/6] t7300: clean -d should skip dirs with ignored files Samuel Lijin
2017-05-18  8:21                 ` [PATCH v4 2/6] t7061: status --ignored should search untracked dirs Samuel Lijin
2017-05-18  8:21                 ` [PATCH v4 3/6] dir: recurse into untracked dirs for ignored files Samuel Lijin
2017-05-18  8:21                 ` [PATCH v4 4/6] dir: hide untracked contents of untracked dirs Samuel Lijin
2017-05-22  3:13                   ` Junio C Hamano
2017-05-18  8:21                 ` [PATCH v4 5/6] dir: expose cmp_name() and check_contains() Samuel Lijin
2017-05-22  0:38                   ` Junio C Hamano
2017-05-18  8:21                 ` [PATCH v4 6/6] clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
2017-05-22  4:48                   ` Junio C Hamano
2017-05-22  5:58                     ` Samuel Lijin
2017-05-22  6:17                       ` Junio C Hamano
2017-05-23  2:43                         ` Samuel Lijin
2017-05-23  7:50                           ` Junio C Hamano
2017-05-16  7:34               ` [PATCH v3 1/8] t7300: clean -d should skip dirs with " Samuel Lijin
2017-05-16  7:34               ` [PATCH v3 2/8] t7061: status --ignored should search untracked dirs Samuel Lijin
2017-05-16  7:34               ` [PATCH v3 3/8] dir: recurse into untracked dirs for ignored files Samuel Lijin
2017-05-17  6:23                 ` Junio C Hamano
2017-05-17  7:02                   ` Samuel Lijin
2017-05-16  7:34               ` [PATCH v3 4/8] dir: hide untracked contents of untracked dirs Samuel Lijin
2017-05-17  6:47                 ` Junio C Hamano
2017-05-17  7:32                   ` Samuel Lijin
2017-05-17 10:14                     ` Junio C Hamano
2017-05-16  7:34               ` [PATCH v3 5/8] dir: expose cmp_name() and check_contains() Samuel Lijin
2017-05-16  7:34               ` [PATCH v3 6/8] clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
2017-05-18  2:34                 ` Junio C Hamano
2017-05-18  6:32                   ` Samuel Lijin
2017-05-16  7:34               ` [PATCH v3 7/8] t7300: clean -d now skips untracked " Samuel Lijin
2017-05-18  2:38                 ` Junio C Hamano
2017-05-16  7:34               ` [PATCH v3 8/8] t7061: status --ignored now searches untracked dirs Samuel Lijin
2017-05-18  2:46                 ` Junio C Hamano
2017-05-05 10:46             ` [PATCH v2 1/9] t7300: skip untracked dirs containing ignored files Samuel Lijin
2017-05-07 19:12               ` Torsten Bögershausen
2017-05-08 21:24                 ` Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 2/9] t7061: expect failure where expected behavior will change Samuel Lijin
2017-05-08  4:34               ` Junio C Hamano
2017-05-05 10:46             ` [PATCH v2 3/9] dir: recurse into untracked dirs for ignored files Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 4/9] dir: add method to check if a dir_entry lexically contains another Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 5/9] dir: hide untracked contents of untracked dirs Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 6/9] dir: change linkage of cmp_name() and check_contains() Samuel Lijin
2017-05-08  4:37               ` Junio C Hamano
2017-05-05 10:46             ` [PATCH v2 7/9] builtin/clean: teach clean -d to skip dirs containing ignored files Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 8/9] t7300: clean -d now skips untracked " Samuel Lijin
2017-05-05 10:46             ` [PATCH v2 9/9] t7061: expect ignored files in untracked dirs Samuel Lijin
2017-05-08  6:35               ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).