git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v6 0/5] [GSoC] remove_subtree(): reimplement using iterators
@ 2017-04-02  4:35 Daniel Ferreira
  2017-04-02  4:35 ` [PATCH v6 1/5] dir_iterator: add tests for dir_iterator API Daniel Ferreira
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: Daniel Ferreira @ 2017-04-02  4:35 UTC (permalink / raw)
  To: git; +Cc: gitster, sbeller, pclouds, mhagger, peff, Daniel Ferreira

This is the sixth version of a patch series that implements the GSoC
microproject of converting a recursive call to readdir() to use dir_iterator.

v1: https://public-inbox.org/git/CAGZ79kZwT-9mHTiOJ5CEjk2wDFkn6+NcogjX0=vjhsAh16ANYg@mail.gmail.com/T/#t
v2: https://public-inbox.org/git/CACsJy8Dxh-QPBBLfaFWPAWUsbA9GVXA7x+mXLjEvYKhk1zOpig@mail.gmail.com/T/#t
v3: https://public-inbox.org/git/CAGZ79kYtpmURSQWPumobA=e3JBFjKhWCdv_LPhKCd71ZRwMovA@mail.gmail.com/T/#t
v4: https://public-inbox.org/git/1490747533-89143-1-git-send-email-bnmvco@gmail.com/T/#e437a63e0c22c00c69b5d92977c9b438ed2b9fd3a
v5: https://public-inbox.org/git/1490844730-47634-1-git-send-email-bnmvco@gmail.com/T/#m2323f15e45de699f2e09364f40a62e17047cf453

Back in v5, Michael had a number of suggestions, all of which were applied
to this version (including a slightly modified version of his "biggish rewrite"
project to make dir_iterator's state machine simpler). The only suggestion that
did not make it into this series was that of not traversing into subdirectories,
since I believe it would be better off in another series that actually required
that feature (that is, I do not want a series to implement a feature it will
not need). The same goes for Junio's thought on a flag to list *only* directories
and no files on the v4 discussion.

Junio and Peff's comments about how to write to files in the tests were also
considered, and the tests were adjusted.

I chose to squash both the state machine refactor and the addition of the
new flags in a single commit. I do not know whether you will feel this is
the right choice but it seemed natural, since most of the state machine's
new logic would not even make sense without encompassing the new features.
I am, of course, open for feedback on this decision.

To Michael and Duy, thanks -- really -- for the encouraging comments! :)
I never regarded this microproject purely as a means to fulfill a GSoC
requirement, but as a way to get to learn more about Git, so I'm surely
not giving it up.

Once again, thanks for all the previous reviews,
Daniel.

Daniel Ferreira (5):
  dir_iterator: add tests for dir_iterator API
  remove_subtree(): test removing nested directories
  dir_iterator: add helpers to dir_iterator_advance
  dir_iterator: refactor state machine model
  remove_subtree(): reimplement using iterators

 Makefile                        |   1 +
 dir-iterator.c                  | 190 ++++++++++++++++++++++++++--------------
 dir-iterator.h                  |  28 ++++--
 entry.c                         |  38 +++-----
 refs/files-backend.c            |   2 +-
 t/helper/.gitignore             |   1 +
 t/helper/test-dir-iterator.c    |  32 +++++++
 t/t0065-dir-iterator.sh         | 109 +++++++++++++++++++++++
 t/t2000-checkout-cache-clash.sh |  11 +++
 9 files changed, 312 insertions(+), 100 deletions(-)
 create mode 100644 t/helper/test-dir-iterator.c
 create mode 100755 t/t0065-dir-iterator.sh

--
2.7.4 (Apple Git-66)


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v6 1/5] dir_iterator: add tests for dir_iterator API
  2017-04-02  4:35 [PATCH v6 0/5] [GSoC] remove_subtree(): reimplement using iterators Daniel Ferreira
@ 2017-04-02  4:35 ` Daniel Ferreira
  2017-04-02  4:35 ` [PATCH v6 2/5] remove_subtree(): test removing nested directories Daniel Ferreira
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Daniel Ferreira @ 2017-04-02  4:35 UTC (permalink / raw)
  To: git; +Cc: gitster, sbeller, pclouds, mhagger, peff, Daniel Ferreira

Create t/helper/test-dir-iterator.c, which prints relevant information
about a directory tree iterated over with dir_iterator.

Create t/t0065-dir-iterator.sh, which tests that dir_iterator does
iterate through a whole directory tree.

Signed-off-by: Daniel Ferreira <bnmvco@gmail.com>
---
 Makefile                     |  1 +
 t/helper/.gitignore          |  1 +
 t/helper/test-dir-iterator.c | 28 +++++++++++++++++++++++
 t/t0065-dir-iterator.sh      | 54 ++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 84 insertions(+)
 create mode 100644 t/helper/test-dir-iterator.c
 create mode 100755 t/t0065-dir-iterator.sh

diff --git a/Makefile b/Makefile
index a5a11e7..d0245f3 100644
--- a/Makefile
+++ b/Makefile
@@ -607,6 +607,7 @@ TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-config
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
+TEST_PROGRAMS_NEED_X += test-dir-iterator
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
 TEST_PROGRAMS_NEED_X += test-dump-split-index
 TEST_PROGRAMS_NEED_X += test-dump-untracked-cache
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index d6e8b36..9c9656d 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -3,6 +3,7 @@
 /test-config
 /test-date
 /test-delta
+/test-dir-iterator
 /test-dump-cache-tree
 /test-dump-split-index
 /test-dump-untracked-cache
diff --git a/t/helper/test-dir-iterator.c b/t/helper/test-dir-iterator.c
new file mode 100644
index 0000000..06f03fc
--- /dev/null
+++ b/t/helper/test-dir-iterator.c
@@ -0,0 +1,28 @@
+#include "git-compat-util.h"
+#include "strbuf.h"
+#include "iterator.h"
+#include "dir-iterator.h"
+
+int cmd_main(int argc, const char **argv) {
+	struct strbuf path = STRBUF_INIT;
+	struct dir_iterator *diter;
+
+	if (argc < 2) {
+		return 1;
+	}
+
+	strbuf_add(&path, argv[1], strlen(argv[1]));
+
+	diter = dir_iterator_begin(path.buf);
+
+	while (dir_iterator_advance(diter) == ITER_OK) {
+		if (S_ISDIR(diter->st.st_mode))
+			printf("[d] ");
+		else
+			printf("[f] ");
+
+		printf("(%s) %s\n", diter->relative_path, diter->path.buf);
+	}
+
+	return 0;
+}
diff --git a/t/t0065-dir-iterator.sh b/t/t0065-dir-iterator.sh
new file mode 100755
index 0000000..b857c07
--- /dev/null
+++ b/t/t0065-dir-iterator.sh
@@ -0,0 +1,54 @@
+#!/bin/sh
+
+test_description='Test directory iteration.'
+
+. ./test-lib.sh
+
+cat >expect-sorted-output <<-\EOF &&
+[d] (a) ./dir/a
+[d] (a/b) ./dir/a/b
+[d] (a/b/c) ./dir/a/b/c
+[d] (d) ./dir/d
+[d] (d/e) ./dir/d/e
+[d] (d/e/d) ./dir/d/e/d
+[f] (a/b/c/d) ./dir/a/b/c/d
+[f] (a/e) ./dir/a/e
+[f] (b) ./dir/b
+[f] (c) ./dir/c
+[f] (d/e/d/a) ./dir/d/e/d/a
+EOF
+
+test_expect_success 'dir-iterator should iterate through all files' '
+	mkdir -p dir &&
+	mkdir -p dir/a/b/c/ &&
+	>dir/b &&
+	>dir/c &&
+	mkdir -p dir/d/e/d/ &&
+	>dir/a/b/c/d &&
+	>dir/a/e &&
+	>dir/d/e/d/a &&
+
+	test-dir-iterator ./dir | sort >./actual-pre-order-sorted-output &&
+	rm -rf dir &&
+
+	test_cmp expect-sorted-output actual-pre-order-sorted-output
+'
+
+cat >expect-pre-order-output <<-\EOF &&
+[d] (a) ./dir/a
+[d] (a/b) ./dir/a/b
+[d] (a/b/c) ./dir/a/b/c
+[f] (a/b/c/d) ./dir/a/b/c/d
+EOF
+
+test_expect_success 'dir-iterator should list files in the correct order' '
+	mkdir -p dir/a/b/c/ &&
+	>dir/a/b/c/d &&
+
+	test-dir-iterator ./dir >actual-pre-order-output &&
+	rm -rf dir &&
+
+	test_cmp expect-pre-order-output actual-pre-order-output
+'
+
+test_done
--
2.7.4 (Apple Git-66)


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v6 2/5] remove_subtree(): test removing nested directories
  2017-04-02  4:35 [PATCH v6 0/5] [GSoC] remove_subtree(): reimplement using iterators Daniel Ferreira
  2017-04-02  4:35 ` [PATCH v6 1/5] dir_iterator: add tests for dir_iterator API Daniel Ferreira
@ 2017-04-02  4:35 ` Daniel Ferreira
  2017-04-02  4:35 ` [PATCH v6 3/5] dir_iterator: add helpers to dir_iterator_advance Daniel Ferreira
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: Daniel Ferreira @ 2017-04-02  4:35 UTC (permalink / raw)
  To: git; +Cc: gitster, sbeller, pclouds, mhagger, peff, Daniel Ferreira

Test removing a nested directory when an attempt is made to restore the
index to a state where it does not exist. A similar test could be found
previously in t/t2000-checkout-cache-clash.sh, but it did not check for
nested directories, which could allow a faulty implementation of
remove_subtree() pass the tests.

Signed-off-by: Daniel Ferreira <bnmvco@gmail.com>
---
 t/t2000-checkout-cache-clash.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/t2000-checkout-cache-clash.sh b/t/t2000-checkout-cache-clash.sh
index de3edb5..ac10ba3 100755
--- a/t/t2000-checkout-cache-clash.sh
+++ b/t/t2000-checkout-cache-clash.sh
@@ -57,4 +57,15 @@ test_expect_success SYMLINKS 'checkout-index -f twice with --prefix' '
 	git checkout-index -a -f --prefix=there/
 '

+test_expect_success 'git checkout-index -f should remove nested subtrees' '
+	echo content >path &&
+	git update-index --add path &&
+	rm path &&
+	mkdir -p path/with/nested/paths &&
+	echo content >path/file1 &&
+	echo content >path/with/nested/paths/file2 &&
+	git checkout-index -f -a &&
+	test ! -d path
+'
+
 test_done
--
2.7.4 (Apple Git-66)


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v6 3/5] dir_iterator: add helpers to dir_iterator_advance
  2017-04-02  4:35 [PATCH v6 0/5] [GSoC] remove_subtree(): reimplement using iterators Daniel Ferreira
  2017-04-02  4:35 ` [PATCH v6 1/5] dir_iterator: add tests for dir_iterator API Daniel Ferreira
  2017-04-02  4:35 ` [PATCH v6 2/5] remove_subtree(): test removing nested directories Daniel Ferreira
@ 2017-04-02  4:35 ` Daniel Ferreira
  2017-04-02  4:35 ` [PATCH v6 4/5] dir_iterator: refactor state machine model Daniel Ferreira
  2017-04-02  4:35 ` [PATCH v6 5/5] remove_subtree(): reimplement using iterators Daniel Ferreira
  4 siblings, 0 replies; 8+ messages in thread
From: Daniel Ferreira @ 2017-04-02  4:35 UTC (permalink / raw)
  To: git; +Cc: gitster, sbeller, pclouds, mhagger, peff, Daniel Ferreira

Create inline helpers to dir_iterator_advance(). Make
dir_iterator_advance()'s code more legible and allow some behavior to
be reusable.

Signed-off-by: Daniel Ferreira <bnmvco@gmail.com>
---
 dir-iterator.c | 65 +++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 42 insertions(+), 23 deletions(-)

diff --git a/dir-iterator.c b/dir-iterator.c
index 34182a9..ce8bf81 100644
--- a/dir-iterator.c
+++ b/dir-iterator.c
@@ -50,6 +50,43 @@ struct dir_iterator_int {
 	struct dir_iterator_level *levels;
 };

+static inline void push_dir_level(struct dir_iterator_int *iter, struct dir_iterator_level *level)
+{
+	level->dir_state = DIR_STATE_RECURSE;
+	ALLOC_GROW(iter->levels, iter->levels_nr + 1,
+		   iter->levels_alloc);
+	level = &iter->levels[iter->levels_nr++];
+	level->initialized = 0;
+}
+
+static inline int pop_dir_level(struct dir_iterator_int *iter)
+{
+	return --iter->levels_nr;
+}
+
+static inline int set_iterator_data(struct dir_iterator_int *iter, struct dir_iterator_level *level)
+{
+	if (lstat(iter->base.path.buf, &iter->base.st) < 0) {
+		if (errno != ENOENT)
+			warning("error reading path '%s': %s",
+				iter->base.path.buf,
+				strerror(errno));
+		return -1;
+	}
+
+	/*
+	 * We have to set these each time because
+	 * the path strbuf might have been realloc()ed.
+	 */
+	iter->base.relative_path =
+		iter->base.path.buf + iter->levels[0].prefix_len;
+	iter->base.basename =
+		iter->base.path.buf + level->prefix_len;
+	level->dir_state = DIR_STATE_ITER;
+
+	return 0;
+}
+
 int dir_iterator_advance(struct dir_iterator *dir_iterator)
 {
 	struct dir_iterator_int *iter =
@@ -84,11 +121,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 				 * over; now prepare to iterate into
 				 * it.
 				 */
-				level->dir_state = DIR_STATE_RECURSE;
-				ALLOC_GROW(iter->levels, iter->levels_nr + 1,
-					   iter->levels_alloc);
-				level = &iter->levels[iter->levels_nr++];
-				level->initialized = 0;
+				push_dir_level(iter, level);
 				continue;
 			} else {
 				/*
@@ -104,7 +137,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 			 * This level is exhausted (or wasn't opened
 			 * successfully); pop up a level.
 			 */
-			if (--iter->levels_nr == 0)
+			if (pop_dir_level(iter) == 0)
 				return dir_iterator_abort(dir_iterator);

 			continue;
@@ -129,7 +162,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 						iter->base.path.buf, strerror(errno));

 				level->dir = NULL;
-				if (--iter->levels_nr == 0)
+				if (pop_dir_level(iter) == 0)
 					return dir_iterator_abort(dir_iterator);
 				break;
 			}
@@ -138,23 +171,9 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 				continue;

 			strbuf_addstr(&iter->base.path, de->d_name);
-			if (lstat(iter->base.path.buf, &iter->base.st) < 0) {
-				if (errno != ENOENT)
-					warning("error reading path '%s': %s",
-						iter->base.path.buf,
-						strerror(errno));
-				continue;
-			}

-			/*
-			 * We have to set these each time because
-			 * the path strbuf might have been realloc()ed.
-			 */
-			iter->base.relative_path =
-				iter->base.path.buf + iter->levels[0].prefix_len;
-			iter->base.basename =
-				iter->base.path.buf + level->prefix_len;
-			level->dir_state = DIR_STATE_ITER;
+			if (set_iterator_data(iter, level))
+				continue;

 			return ITER_OK;
 		}
--
2.7.4 (Apple Git-66)


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v6 4/5] dir_iterator: refactor state machine model
  2017-04-02  4:35 [PATCH v6 0/5] [GSoC] remove_subtree(): reimplement using iterators Daniel Ferreira
                   ` (2 preceding siblings ...)
  2017-04-02  4:35 ` [PATCH v6 3/5] dir_iterator: add helpers to dir_iterator_advance Daniel Ferreira
@ 2017-04-02  4:35 ` Daniel Ferreira
  2017-04-02  4:46   ` Daniel Ferreira (theiostream)
  2017-04-02  4:35 ` [PATCH v6 5/5] remove_subtree(): reimplement using iterators Daniel Ferreira
  4 siblings, 1 reply; 8+ messages in thread
From: Daniel Ferreira @ 2017-04-02  4:35 UTC (permalink / raw)
  To: git; +Cc: gitster, sbeller, pclouds, mhagger, peff, Daniel Ferreira

Perform major refactor of dir_iterator_advance(). dir_iterator has
ceased to rely on a convoluted state machine mechanism of two loops and
two state variables (level.initialized and level.dir_state). This serves
to ease comprehension of the iterator mechanism and ease addition of new
features to the iterator.

Create an option for the dir_iterator API to iterate over subdirectories
only after having iterated through their contents. This feature was
predicted, although not implemented by 0fe5043 ("dir_iterator: new API
for iterating over a directory tree", 2016-06-18). This is useful for
recursively removing a directory and calling rmdir() on a directory only
after all of its contents have been wiped.

Add an option for the dir_iterator API to iterate over the root
directory (the one it was initialized with) as well.

Add the "flags" parameter to dir_iterator_create, allowing for the
aforementioned new features to be enabled. The new default behavior
(i.e. flags set to 0) does not iterate over directories. Flag
DIR_ITERATOR_PRE_ORDER_TRAVERSAL iterates over a directory before doing
so over its contents. DIR_ITERATOR_POST_ORDER_TRAVERSAL iterates over a
directory after doing so over its contents. DIR_ITERATOR_LIST_ROOT_DIR
iterates over the root directory. These flags do not conflict with each
other and may be used simultaneously.

Amend a call to dir_iterator_begin() in refs/files-backend.c to pass
the flags parameter introduced.

Improve t/t0065-dir-iterator.sh and t/helper/test-dir-iterator.c to
test "post-order" and "iterate-over-root" modes.

Signed-off-by: Daniel Ferreira <bnmvco@gmail.com>
---
 dir-iterator.c               | 149 +++++++++++++++++++++++++++----------------
 dir-iterator.h               |  28 ++++++--
 refs/files-backend.c         |   2 +-
 t/helper/test-dir-iterator.c |   6 +-
 t/t0065-dir-iterator.sh      |  61 +++++++++++++++++-
 5 files changed, 179 insertions(+), 67 deletions(-)

diff --git a/dir-iterator.c b/dir-iterator.c
index ce8bf81..4b23889 100644
--- a/dir-iterator.c
+++ b/dir-iterator.c
@@ -4,8 +4,6 @@
 #include "dir-iterator.h"
 
 struct dir_iterator_level {
-	int initialized;
-
 	DIR *dir;
 
 	/*
@@ -20,8 +18,11 @@ struct dir_iterator_level {
 	 * iteration and also iterated into):
 	 */
 	enum {
-		DIR_STATE_ITER,
-		DIR_STATE_RECURSE
+		DIR_STATE_PUSHED,
+		DIR_STATE_PRE_ITERATION,
+		DIR_STATE_ITERATING,
+		DIR_STATE_POST_ITERATION,
+		DIR_STATE_EXHAUSTED
 	} dir_state;
 };
 
@@ -48,15 +49,20 @@ struct dir_iterator_int {
 	 * that will be included in this iteration.
 	 */
 	struct dir_iterator_level *levels;
+
+	/* Holds the flags passed to dir_iterator_begin(). */
+	unsigned flags;
 };
 
 static inline void push_dir_level(struct dir_iterator_int *iter, struct dir_iterator_level *level)
 {
-	level->dir_state = DIR_STATE_RECURSE;
 	ALLOC_GROW(iter->levels, iter->levels_nr + 1,
 		   iter->levels_alloc);
+
+	/* Push a new level */
 	level = &iter->levels[iter->levels_nr++];
-	level->initialized = 0;
+	level->dir = NULL;
+	level->dir_state = DIR_STATE_PUSHED;
 }
 
 static inline int pop_dir_level(struct dir_iterator_int *iter)
@@ -75,18 +81,35 @@ static inline int set_iterator_data(struct dir_iterator_int *iter, struct dir_it
 	}
 
 	/*
-	 * We have to set these each time because
-	 * the path strbuf might have been realloc()ed.
+	 * Check if we are dealing with the root directory as an
+	 * item that's being iterated through.
 	 */
-	iter->base.relative_path =
-		iter->base.path.buf + iter->levels[0].prefix_len;
+	if (level->dir_state != DIR_STATE_ITERATING &&
+		iter->levels_nr == 1) {
+		iter->base.relative_path = ".";
+	}
+	else {
+		iter->base.relative_path =
+			iter->base.path.buf + iter->levels[0].prefix_len;
+	}
 	iter->base.basename =
 		iter->base.path.buf + level->prefix_len;
-	level->dir_state = DIR_STATE_ITER;
 
 	return 0;
 }
 
+/*
+ * This function uses a state machine with the following states:
+ * -> DIR_STATE_PUSHED: the directory has been pushed to the
+ * iterator traversal tree.
+ * -> DIR_STATE_PRE_ITERATION: the directory is *NOT* initialized. The
+ * dirpath has already been returned if pre-order traversal is set.
+ * -> DIR_STATE_ITERATING: the directory is initialized. We are traversing
+ * through it.
+ * -> DIR_STATE_POST_ITERATION: the directory has been iterated through.
+ * We are ready to close it.
+ * -> DIR_STATE_EXHAUSTED: the directory is closed and ready to be popped.
+ */
 int dir_iterator_advance(struct dir_iterator *dir_iterator)
 {
 	struct dir_iterator_int *iter =
@@ -97,7 +120,18 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 			&iter->levels[iter->levels_nr - 1];
 		struct dirent *de;
 
-		if (!level->initialized) {
+		if (level->dir_state == DIR_STATE_PUSHED) {
+			level->dir_state = DIR_STATE_PRE_ITERATION;
+
+			if (iter->flags & DIR_ITERATOR_PRE_ORDER_TRAVERSAL) {
+				/* We may not want the root directory to be iterated over */
+				if (iter->levels_nr != 1 ||
+					(iter->flags & DIR_ITERATOR_LIST_ROOT_DIR)) {
+					set_iterator_data(iter, level);
+					return ITER_OK;
+				}
+			}
+		} else if (level->dir_state == DIR_STATE_PRE_ITERATION) {
 			/*
 			 * Note: dir_iterator_begin() ensures that
 			 * path is not the empty string.
@@ -108,63 +142,32 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 
 			level->dir = opendir(iter->base.path.buf);
 			if (!level->dir && errno != ENOENT) {
+				/*
+				 * This level wasn't opened sucessfully; pretend we
+				 * iterated through it already.
+				 */
 				warning("error opening directory %s: %s",
 					iter->base.path.buf, strerror(errno));
-				/* Popping the level is handled below */
-			}
 
-			level->initialized = 1;
-		} else if (S_ISDIR(iter->base.st.st_mode)) {
-			if (level->dir_state == DIR_STATE_ITER) {
-				/*
-				 * The directory was just iterated
-				 * over; now prepare to iterate into
-				 * it.
-				 */
-				push_dir_level(iter, level);
+				level->dir_state = DIR_STATE_POST_ITERATION;
 				continue;
-			} else {
-				/*
-				 * The directory has already been
-				 * iterated over and iterated into;
-				 * we're done with it.
-				 */
 			}
-		}
 
-		if (!level->dir) {
-			/*
-			 * This level is exhausted (or wasn't opened
-			 * successfully); pop up a level.
-			 */
-			if (pop_dir_level(iter) == 0)
-				return dir_iterator_abort(dir_iterator);
-
-			continue;
-		}
-
-		/*
-		 * Loop until we find an entry that we can give back
-		 * to the caller:
-		 */
-		while (1) {
+			level->dir_state = DIR_STATE_ITERATING;
+		} else if (level->dir_state == DIR_STATE_ITERATING) {
 			strbuf_setlen(&iter->base.path, level->prefix_len);
 			errno = 0;
 			de = readdir(level->dir);
 
 			if (!de) {
-				/* This level is exhausted; pop up a level. */
+				/* In case of readdir() error */
 				if (errno) {
 					warning("error reading directory %s: %s",
 						iter->base.path.buf, strerror(errno));
-				} else if (closedir(level->dir))
-					warning("error closing directory %s: %s",
-						iter->base.path.buf, strerror(errno));
+				}
 
-				level->dir = NULL;
-				if (pop_dir_level(iter) == 0)
-					return dir_iterator_abort(dir_iterator);
-				break;
+				level->dir_state = DIR_STATE_POST_ITERATION;
+				continue;
 			}
 
 			if (is_dot_or_dotdot(de->d_name))
@@ -175,7 +178,38 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 			if (set_iterator_data(iter, level))
 				continue;
 
+			if (S_ISDIR(iter->base.st.st_mode)) {
+				push_dir_level(iter, level);
+				continue;
+			}
+
 			return ITER_OK;
+		} else if (level->dir_state == DIR_STATE_POST_ITERATION) {
+			if (level->dir != NULL && closedir(level->dir)) {
+				warning("error closing directory %s: %s",
+					iter->base.path.buf, strerror(errno));
+			}
+			level->dir_state = DIR_STATE_EXHAUSTED;
+
+			strbuf_setlen(&iter->base.path, level->prefix_len);
+			/*
+			 * Since we are iterating through the dirpath
+			 * after we have gone through it, we still need
+			 * to get rid of the trailing slash we appended.
+			 */
+			strbuf_strip_suffix(&iter->base.path, "/");
+
+			if (iter->flags & DIR_ITERATOR_POST_ORDER_TRAVERSAL) {
+				/* We may not want the root directory to be iterated over */
+				if (iter->levels_nr != 1 ||
+					(iter->flags & DIR_ITERATOR_LIST_ROOT_DIR)) {
+					set_iterator_data(iter, level);
+					return ITER_OK;
+				}
+			}
+		} else if (level->dir_state == DIR_STATE_EXHAUSTED) {
+			if (pop_dir_level(iter) == 0)
+				return dir_iterator_abort(dir_iterator);
 		}
 	}
 }
@@ -201,7 +235,7 @@ int dir_iterator_abort(struct dir_iterator *dir_iterator)
 	return ITER_DONE;
 }
 
-struct dir_iterator *dir_iterator_begin(const char *path)
+struct dir_iterator *dir_iterator_begin(const char *path, unsigned flags)
 {
 	struct dir_iterator_int *iter = xcalloc(1, sizeof(*iter));
 	struct dir_iterator *dir_iterator = &iter->base;
@@ -209,13 +243,16 @@ struct dir_iterator *dir_iterator_begin(const char *path)
 	if (!path || !*path)
 		die("BUG: empty path passed to dir_iterator_begin()");
 
+	iter->flags = flags;
+
 	strbuf_init(&iter->base.path, PATH_MAX);
 	strbuf_addstr(&iter->base.path, path);
 
 	ALLOC_GROW(iter->levels, 10, iter->levels_alloc);
 
 	iter->levels_nr = 1;
-	iter->levels[0].initialized = 0;
+	iter->levels[0].dir = NULL;
+	iter->levels[0].dir_state = DIR_STATE_PUSHED;
 
 	return dir_iterator;
 }
diff --git a/dir-iterator.h b/dir-iterator.h
index 27739e6..0e82f36 100644
--- a/dir-iterator.h
+++ b/dir-iterator.h
@@ -11,13 +11,12 @@
  * Every time dir_iterator_advance() is called, update the members of
  * the dir_iterator structure to reflect the next path in the
  * iteration. The order that paths are iterated over within a
- * directory is undefined, but directory paths are always iterated
- * over before the subdirectory contents.
+ * directory is undefined.
  *
  * A typical iteration looks like this:
  *
  *     int ok;
- *     struct iterator *iter = dir_iterator_begin(path);
+ *     struct iterator *iter = dir_iterator_begin(path, flags);
  *
  *     while ((ok = dir_iterator_advance(iter)) == ITER_OK) {
  *             if (want_to_stop_iteration()) {
@@ -38,6 +37,22 @@
  * dir_iterator_advance() again.
  */
 
+/*
+ * Possible flags for dir_iterator_begin().
+ *
+ * -> DIR_ITERATOR_PRE_ORDER_TRAVERSAL: the iterator shall return
+ * a dirpath it has found before iterating through that directory's
+ * contents.
+ * -> DIR_ITERATOR_POST_ORDER_TRAVERSAL: the iterator shall return
+ * a dirpath it has found after iterating through that directory's
+ * contents.
+ * -> DIR_ITERATOR_LIST_ROOT_DIR: the iterator shall return the dirpath
+ * of the root directory it is iterating through.
+ */
+#define DIR_ITERATOR_PRE_ORDER_TRAVERSAL (1 << 0)
+#define DIR_ITERATOR_POST_ORDER_TRAVERSAL (1 << 1)
+#define DIR_ITERATOR_LIST_ROOT_DIR (1 << 2)
+
 struct dir_iterator {
 	/* The current path: */
 	struct strbuf path;
@@ -57,15 +72,16 @@ struct dir_iterator {
 };
 
 /*
- * Start a directory iteration over path. Return a dir_iterator that
- * holds the internal state of the iteration.
+ * Start a directory iteration over path, with options specified in
+ * 'flags'. Return a dir_iterator that holds the internal state of
+ * the iteration.
  *
  * The iteration includes all paths under path, not including path
  * itself and not including "." or ".." entries.
  *
  * path is the starting directory. An internal copy will be made.
  */
-struct dir_iterator *dir_iterator_begin(const char *path);
+struct dir_iterator *dir_iterator_begin(const char *path, unsigned flags);
 
 /*
  * Advance the iterator to the first or next item and return ITER_OK.
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 50188e9..b4bba74 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -3346,7 +3346,7 @@ static struct ref_iterator *files_reflog_iterator_begin(struct ref_store *ref_st
 	files_downcast(ref_store, 0, "reflog_iterator_begin");
 
 	base_ref_iterator_init(ref_iterator, &files_reflog_iterator_vtable);
-	iter->dir_iterator = dir_iterator_begin(git_path("logs"));
+	iter->dir_iterator = dir_iterator_begin(git_path("logs"), 0);
 	return ref_iterator;
 }
 
diff --git a/t/helper/test-dir-iterator.c b/t/helper/test-dir-iterator.c
index 06f03fc..a1b8c78 100644
--- a/t/helper/test-dir-iterator.c
+++ b/t/helper/test-dir-iterator.c
@@ -6,6 +6,7 @@
 int cmd_main(int argc, const char **argv) {
 	struct strbuf path = STRBUF_INIT;
 	struct dir_iterator *diter;
+	unsigned flag = DIR_ITERATOR_PRE_ORDER_TRAVERSAL;
 
 	if (argc < 2) {
 		return 1;
@@ -13,7 +14,10 @@ int cmd_main(int argc, const char **argv) {
 
 	strbuf_add(&path, argv[1], strlen(argv[1]));
 
-	diter = dir_iterator_begin(path.buf);
+	if (argc == 3)
+		flag = atoi(argv[2]);
+
+	diter = dir_iterator_begin(path.buf, flag);
 
 	while (dir_iterator_advance(diter) == ITER_OK) {
 		if (S_ISDIR(diter->st.st_mode))
diff --git a/t/t0065-dir-iterator.sh b/t/t0065-dir-iterator.sh
index b857c07..ade3ee0 100755
--- a/t/t0065-dir-iterator.sh
+++ b/t/t0065-dir-iterator.sh
@@ -28,12 +28,28 @@ test_expect_success 'dir-iterator should iterate through all files' '
 	>dir/a/e &&
 	>dir/d/e/d/a &&
 
-	test-dir-iterator ./dir | sort >./actual-pre-order-sorted-output &&
+	test-dir-iterator ./dir 1 | sort >./actual-pre-order-sorted-output &&
 	rm -rf dir &&
 
 	test_cmp expect-sorted-output actual-pre-order-sorted-output
 '
 
+test_expect_success 'dir-iterator should iterate through all files on post-order mode' '
+	mkdir -p dir &&
+	mkdir -p dir/a/b/c/ &&
+	>dir/b &&
+	>dir/c &&
+	mkdir -p dir/d/e/d/ &&
+	>dir/a/b/c/d &&
+	>dir/a/e &&
+	>dir/d/e/d/a &&
+
+	test-dir-iterator ./dir 2 | sort >actual-post-order-sorted-output &&
+	rm -rf dir &&
+
+	test_cmp expect-sorted-output actual-post-order-sorted-output
+'
+
 cat >expect-pre-order-output <<-\EOF &&
 [d] (a) ./dir/a
 [d] (a/b) ./dir/a/b
@@ -41,14 +57,53 @@ cat >expect-pre-order-output <<-\EOF &&
 [f] (a/b/c/d) ./dir/a/b/c/d
 EOF
 
-test_expect_success 'dir-iterator should list files in the correct order' '
+test_expect_success 'dir-iterator should list files properly on pre-order mode' '
 	mkdir -p dir/a/b/c/ &&
 	>dir/a/b/c/d &&
 
-	test-dir-iterator ./dir >actual-pre-order-output &&
+	test-dir-iterator ./dir 1 >actual-pre-order-output &&
 	rm -rf dir &&
 
 	test_cmp expect-pre-order-output actual-pre-order-output
 '
 
+cat >expect-post-order-output <<-\EOF &&
+[f] (a/b/c/d) ./dir/a/b/c/d
+[d] (a/b/c) ./dir/a/b/c
+[d] (a/b) ./dir/a/b
+[d] (a) ./dir/a
+EOF
+
+test_expect_success 'dir-iterator should list files properly on post-order mode' '
+	mkdir -p dir/a/b/c/ &&
+	>dir/a/b/c/d &&
+
+	test-dir-iterator ./dir 2 >actual-post-order-output &&
+	rm -rf dir &&
+
+	test_cmp expect-post-order-output actual-post-order-output
+'
+
+cat >expect-pre-order-post-order-root-dir-output <<-\EOF &&
+[d] (.) ./dir
+[d] (a) ./dir/a
+[d] (a/b) ./dir/a/b
+[d] (a/b/c) ./dir/a/b/c
+[f] (a/b/c/d) ./dir/a/b/c/d
+[d] (a/b/c) ./dir/a/b/c
+[d] (a/b) ./dir/a/b
+[d] (a) ./dir/a
+[d] (.) ./dir
+EOF
+
+test_expect_success 'dir-iterator should list files properly on pre-order + post-order + root-dir mode' '
+	mkdir -p dir/a/b/c/ &&
+	>dir/a/b/c/d &&
+
+	test-dir-iterator ./dir 7 >actual-pre-order-post-order-root-dir-output &&
+	rm -rf dir &&
+
+	test_cmp expect-pre-order-post-order-root-dir-output actual-pre-order-post-order-root-dir-output
+'
+
 test_done
-- 
2.7.4 (Apple Git-66)


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v6 5/5] remove_subtree(): reimplement using iterators
  2017-04-02  4:35 [PATCH v6 0/5] [GSoC] remove_subtree(): reimplement using iterators Daniel Ferreira
                   ` (3 preceding siblings ...)
  2017-04-02  4:35 ` [PATCH v6 4/5] dir_iterator: refactor state machine model Daniel Ferreira
@ 2017-04-02  4:35 ` Daniel Ferreira
  4 siblings, 0 replies; 8+ messages in thread
From: Daniel Ferreira @ 2017-04-02  4:35 UTC (permalink / raw)
  To: git; +Cc: gitster, sbeller, pclouds, mhagger, peff, Daniel Ferreira

Use dir_iterator to traverse through remove_subtree()'s directory tree,
avoiding the need for recursive calls to readdir(). Simplify
remove_subtree()'s code.

A conversion similar in purpose was previously done at 46d092a
("for_each_reflog(): reimplement using iterators", 2016-05-21).

Signed-off-by: Daniel Ferreira <bnmvco@gmail.com>
---
 entry.c | 38 ++++++++++++--------------------------
 1 file changed, 12 insertions(+), 26 deletions(-)

diff --git a/entry.c b/entry.c
index c6eea24..309b9ad 100644
--- a/entry.c
+++ b/entry.c
@@ -2,6 +2,8 @@
 #include "blob.h"
 #include "dir.h"
 #include "streaming.h"
+#include "iterator.h"
+#include "dir-iterator.h"

 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -44,33 +46,17 @@ static void create_directories(const char *path, int path_len,
 	free(buf);
 }

-static void remove_subtree(struct strbuf *path)
+static void remove_subtree(const char *path)
 {
-	DIR *dir = opendir(path->buf);
-	struct dirent *de;
-	int origlen = path->len;
-
-	if (!dir)
-		die_errno("cannot opendir '%s'", path->buf);
-	while ((de = readdir(dir)) != NULL) {
-		struct stat st;
-
-		if (is_dot_or_dotdot(de->d_name))
-			continue;
-
-		strbuf_addch(path, '/');
-		strbuf_addstr(path, de->d_name);
-		if (lstat(path->buf, &st))
-			die_errno("cannot lstat '%s'", path->buf);
-		if (S_ISDIR(st.st_mode))
-			remove_subtree(path);
-		else if (unlink(path->buf))
-			die_errno("cannot unlink '%s'", path->buf);
-		strbuf_setlen(path, origlen);
+	struct dir_iterator *diter = dir_iterator_begin(path, DIR_ITERATOR_POST_ORDER_TRAVERSAL | DIR_ITERATOR_LIST_ROOT_DIR);
+
+	while (dir_iterator_advance(diter) == ITER_OK) {
+		if (S_ISDIR(diter->st.st_mode)) {
+			if (rmdir(diter->path.buf))
+				die_errno("cannot rmdir '%s'", diter->path.buf);
+		} else if (unlink(diter->path.buf))
+			die_errno("cannot unlink '%s'", diter->path.buf);
 	}
-	closedir(dir);
-	if (rmdir(path->buf))
-		die_errno("cannot rmdir '%s'", path->buf);
 }

 static int create_file(const char *path, unsigned int mode)
@@ -282,7 +268,7 @@ int checkout_entry(struct cache_entry *ce,
 				return 0;
 			if (!state->force)
 				return error("%s is a directory", path.buf);
-			remove_subtree(&path);
+			remove_subtree(path.buf);
 		} else if (unlink(path.buf))
 			return error_errno("unable to unlink old '%s'", path.buf);
 	} else if (state->not_new)
--
2.7.4 (Apple Git-66)


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v6 4/5] dir_iterator: refactor state machine model
  2017-04-02  4:35 ` [PATCH v6 4/5] dir_iterator: refactor state machine model Daniel Ferreira
@ 2017-04-02  4:46   ` Daniel Ferreira (theiostream)
  2017-04-02  5:35     ` Michael Haggerty
  0 siblings, 1 reply; 8+ messages in thread
From: Daniel Ferreira (theiostream) @ 2017-04-02  4:46 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Junio C Hamano, Stefan Beller, Duy Nguyen, Michael Haggerty,
	Jeff King, Daniel Ferreira

Gah, I just realized I failed to correct refs/files-backend.c's
behavior and kept 0 instead of DIR_ITERATOR_PRE_ORDER_TRAVERSAL as its
flag. I'll correct this on a v7, but I'll wait for the rest of your
reviews before sending that revision.

On Sun, Apr 2, 2017 at 1:35 AM, Daniel Ferreira <bnmvco@gmail.com> wrote:
> Perform major refactor of dir_iterator_advance(). dir_iterator has
> ceased to rely on a convoluted state machine mechanism of two loops and
> two state variables (level.initialized and level.dir_state). This serves
> to ease comprehension of the iterator mechanism and ease addition of new
> features to the iterator.
>
> Create an option for the dir_iterator API to iterate over subdirectories
> only after having iterated through their contents. This feature was
> predicted, although not implemented by 0fe5043 ("dir_iterator: new API
> for iterating over a directory tree", 2016-06-18). This is useful for
> recursively removing a directory and calling rmdir() on a directory only
> after all of its contents have been wiped.
>
> Add an option for the dir_iterator API to iterate over the root
> directory (the one it was initialized with) as well.
>
> Add the "flags" parameter to dir_iterator_create, allowing for the
> aforementioned new features to be enabled. The new default behavior
> (i.e. flags set to 0) does not iterate over directories. Flag
> DIR_ITERATOR_PRE_ORDER_TRAVERSAL iterates over a directory before doing
> so over its contents. DIR_ITERATOR_POST_ORDER_TRAVERSAL iterates over a
> directory after doing so over its contents. DIR_ITERATOR_LIST_ROOT_DIR
> iterates over the root directory. These flags do not conflict with each
> other and may be used simultaneously.
>
> Amend a call to dir_iterator_begin() in refs/files-backend.c to pass
> the flags parameter introduced.
>
> Improve t/t0065-dir-iterator.sh and t/helper/test-dir-iterator.c to
> test "post-order" and "iterate-over-root" modes.
>
> Signed-off-by: Daniel Ferreira <bnmvco@gmail.com>
> ---
>  dir-iterator.c               | 149 +++++++++++++++++++++++++++----------------
>  dir-iterator.h               |  28 ++++++--
>  refs/files-backend.c         |   2 +-
>  t/helper/test-dir-iterator.c |   6 +-
>  t/t0065-dir-iterator.sh      |  61 +++++++++++++++++-
>  5 files changed, 179 insertions(+), 67 deletions(-)
>
> diff --git a/dir-iterator.c b/dir-iterator.c
> index ce8bf81..4b23889 100644
> --- a/dir-iterator.c
> +++ b/dir-iterator.c
> @@ -4,8 +4,6 @@
>  #include "dir-iterator.h"
>
>  struct dir_iterator_level {
> -       int initialized;
> -
>         DIR *dir;
>
>         /*
> @@ -20,8 +18,11 @@ struct dir_iterator_level {
>          * iteration and also iterated into):
>          */
>         enum {
> -               DIR_STATE_ITER,
> -               DIR_STATE_RECURSE
> +               DIR_STATE_PUSHED,
> +               DIR_STATE_PRE_ITERATION,
> +               DIR_STATE_ITERATING,
> +               DIR_STATE_POST_ITERATION,
> +               DIR_STATE_EXHAUSTED
>         } dir_state;
>  };
>
> @@ -48,15 +49,20 @@ struct dir_iterator_int {
>          * that will be included in this iteration.
>          */
>         struct dir_iterator_level *levels;
> +
> +       /* Holds the flags passed to dir_iterator_begin(). */
> +       unsigned flags;
>  };
>
>  static inline void push_dir_level(struct dir_iterator_int *iter, struct dir_iterator_level *level)
>  {
> -       level->dir_state = DIR_STATE_RECURSE;
>         ALLOC_GROW(iter->levels, iter->levels_nr + 1,
>                    iter->levels_alloc);
> +
> +       /* Push a new level */
>         level = &iter->levels[iter->levels_nr++];
> -       level->initialized = 0;
> +       level->dir = NULL;
> +       level->dir_state = DIR_STATE_PUSHED;
>  }
>
>  static inline int pop_dir_level(struct dir_iterator_int *iter)
> @@ -75,18 +81,35 @@ static inline int set_iterator_data(struct dir_iterator_int *iter, struct dir_it
>         }
>
>         /*
> -        * We have to set these each time because
> -        * the path strbuf might have been realloc()ed.
> +        * Check if we are dealing with the root directory as an
> +        * item that's being iterated through.
>          */
> -       iter->base.relative_path =
> -               iter->base.path.buf + iter->levels[0].prefix_len;
> +       if (level->dir_state != DIR_STATE_ITERATING &&
> +               iter->levels_nr == 1) {
> +               iter->base.relative_path = ".";
> +       }
> +       else {
> +               iter->base.relative_path =
> +                       iter->base.path.buf + iter->levels[0].prefix_len;
> +       }
>         iter->base.basename =
>                 iter->base.path.buf + level->prefix_len;
> -       level->dir_state = DIR_STATE_ITER;
>
>         return 0;
>  }
>
> +/*
> + * This function uses a state machine with the following states:
> + * -> DIR_STATE_PUSHED: the directory has been pushed to the
> + * iterator traversal tree.
> + * -> DIR_STATE_PRE_ITERATION: the directory is *NOT* initialized. The
> + * dirpath has already been returned if pre-order traversal is set.
> + * -> DIR_STATE_ITERATING: the directory is initialized. We are traversing
> + * through it.
> + * -> DIR_STATE_POST_ITERATION: the directory has been iterated through.
> + * We are ready to close it.
> + * -> DIR_STATE_EXHAUSTED: the directory is closed and ready to be popped.
> + */
>  int dir_iterator_advance(struct dir_iterator *dir_iterator)
>  {
>         struct dir_iterator_int *iter =
> @@ -97,7 +120,18 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
>                         &iter->levels[iter->levels_nr - 1];
>                 struct dirent *de;
>
> -               if (!level->initialized) {
> +               if (level->dir_state == DIR_STATE_PUSHED) {
> +                       level->dir_state = DIR_STATE_PRE_ITERATION;
> +
> +                       if (iter->flags & DIR_ITERATOR_PRE_ORDER_TRAVERSAL) {
> +                               /* We may not want the root directory to be iterated over */
> +                               if (iter->levels_nr != 1 ||
> +                                       (iter->flags & DIR_ITERATOR_LIST_ROOT_DIR)) {
> +                                       set_iterator_data(iter, level);
> +                                       return ITER_OK;
> +                               }
> +                       }
> +               } else if (level->dir_state == DIR_STATE_PRE_ITERATION) {
>                         /*
>                          * Note: dir_iterator_begin() ensures that
>                          * path is not the empty string.
> @@ -108,63 +142,32 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
>
>                         level->dir = opendir(iter->base.path.buf);
>                         if (!level->dir && errno != ENOENT) {
> +                               /*
> +                                * This level wasn't opened sucessfully; pretend we
> +                                * iterated through it already.
> +                                */
>                                 warning("error opening directory %s: %s",
>                                         iter->base.path.buf, strerror(errno));
> -                               /* Popping the level is handled below */
> -                       }
>
> -                       level->initialized = 1;
> -               } else if (S_ISDIR(iter->base.st.st_mode)) {
> -                       if (level->dir_state == DIR_STATE_ITER) {
> -                               /*
> -                                * The directory was just iterated
> -                                * over; now prepare to iterate into
> -                                * it.
> -                                */
> -                               push_dir_level(iter, level);
> +                               level->dir_state = DIR_STATE_POST_ITERATION;
>                                 continue;
> -                       } else {
> -                               /*
> -                                * The directory has already been
> -                                * iterated over and iterated into;
> -                                * we're done with it.
> -                                */
>                         }
> -               }
>
> -               if (!level->dir) {
> -                       /*
> -                        * This level is exhausted (or wasn't opened
> -                        * successfully); pop up a level.
> -                        */
> -                       if (pop_dir_level(iter) == 0)
> -                               return dir_iterator_abort(dir_iterator);
> -
> -                       continue;
> -               }
> -
> -               /*
> -                * Loop until we find an entry that we can give back
> -                * to the caller:
> -                */
> -               while (1) {
> +                       level->dir_state = DIR_STATE_ITERATING;
> +               } else if (level->dir_state == DIR_STATE_ITERATING) {
>                         strbuf_setlen(&iter->base.path, level->prefix_len);
>                         errno = 0;
>                         de = readdir(level->dir);
>
>                         if (!de) {
> -                               /* This level is exhausted; pop up a level. */
> +                               /* In case of readdir() error */
>                                 if (errno) {
>                                         warning("error reading directory %s: %s",
>                                                 iter->base.path.buf, strerror(errno));
> -                               } else if (closedir(level->dir))
> -                                       warning("error closing directory %s: %s",
> -                                               iter->base.path.buf, strerror(errno));
> +                               }
>
> -                               level->dir = NULL;
> -                               if (pop_dir_level(iter) == 0)
> -                                       return dir_iterator_abort(dir_iterator);
> -                               break;
> +                               level->dir_state = DIR_STATE_POST_ITERATION;
> +                               continue;
>                         }
>
>                         if (is_dot_or_dotdot(de->d_name))
> @@ -175,7 +178,38 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
>                         if (set_iterator_data(iter, level))
>                                 continue;
>
> +                       if (S_ISDIR(iter->base.st.st_mode)) {
> +                               push_dir_level(iter, level);
> +                               continue;
> +                       }
> +
>                         return ITER_OK;
> +               } else if (level->dir_state == DIR_STATE_POST_ITERATION) {
> +                       if (level->dir != NULL && closedir(level->dir)) {
> +                               warning("error closing directory %s: %s",
> +                                       iter->base.path.buf, strerror(errno));
> +                       }
> +                       level->dir_state = DIR_STATE_EXHAUSTED;
> +
> +                       strbuf_setlen(&iter->base.path, level->prefix_len);
> +                       /*
> +                        * Since we are iterating through the dirpath
> +                        * after we have gone through it, we still need
> +                        * to get rid of the trailing slash we appended.
> +                        */
> +                       strbuf_strip_suffix(&iter->base.path, "/");
> +
> +                       if (iter->flags & DIR_ITERATOR_POST_ORDER_TRAVERSAL) {
> +                               /* We may not want the root directory to be iterated over */
> +                               if (iter->levels_nr != 1 ||
> +                                       (iter->flags & DIR_ITERATOR_LIST_ROOT_DIR)) {
> +                                       set_iterator_data(iter, level);
> +                                       return ITER_OK;
> +                               }
> +                       }
> +               } else if (level->dir_state == DIR_STATE_EXHAUSTED) {
> +                       if (pop_dir_level(iter) == 0)
> +                               return dir_iterator_abort(dir_iterator);
>                 }
>         }
>  }
> @@ -201,7 +235,7 @@ int dir_iterator_abort(struct dir_iterator *dir_iterator)
>         return ITER_DONE;
>  }
>
> -struct dir_iterator *dir_iterator_begin(const char *path)
> +struct dir_iterator *dir_iterator_begin(const char *path, unsigned flags)
>  {
>         struct dir_iterator_int *iter = xcalloc(1, sizeof(*iter));
>         struct dir_iterator *dir_iterator = &iter->base;
> @@ -209,13 +243,16 @@ struct dir_iterator *dir_iterator_begin(const char *path)
>         if (!path || !*path)
>                 die("BUG: empty path passed to dir_iterator_begin()");
>
> +       iter->flags = flags;
> +
>         strbuf_init(&iter->base.path, PATH_MAX);
>         strbuf_addstr(&iter->base.path, path);
>
>         ALLOC_GROW(iter->levels, 10, iter->levels_alloc);
>
>         iter->levels_nr = 1;
> -       iter->levels[0].initialized = 0;
> +       iter->levels[0].dir = NULL;
> +       iter->levels[0].dir_state = DIR_STATE_PUSHED;
>
>         return dir_iterator;
>  }
> diff --git a/dir-iterator.h b/dir-iterator.h
> index 27739e6..0e82f36 100644
> --- a/dir-iterator.h
> +++ b/dir-iterator.h
> @@ -11,13 +11,12 @@
>   * Every time dir_iterator_advance() is called, update the members of
>   * the dir_iterator structure to reflect the next path in the
>   * iteration. The order that paths are iterated over within a
> - * directory is undefined, but directory paths are always iterated
> - * over before the subdirectory contents.
> + * directory is undefined.
>   *
>   * A typical iteration looks like this:
>   *
>   *     int ok;
> - *     struct iterator *iter = dir_iterator_begin(path);
> + *     struct iterator *iter = dir_iterator_begin(path, flags);
>   *
>   *     while ((ok = dir_iterator_advance(iter)) == ITER_OK) {
>   *             if (want_to_stop_iteration()) {
> @@ -38,6 +37,22 @@
>   * dir_iterator_advance() again.
>   */
>
> +/*
> + * Possible flags for dir_iterator_begin().
> + *
> + * -> DIR_ITERATOR_PRE_ORDER_TRAVERSAL: the iterator shall return
> + * a dirpath it has found before iterating through that directory's
> + * contents.
> + * -> DIR_ITERATOR_POST_ORDER_TRAVERSAL: the iterator shall return
> + * a dirpath it has found after iterating through that directory's
> + * contents.
> + * -> DIR_ITERATOR_LIST_ROOT_DIR: the iterator shall return the dirpath
> + * of the root directory it is iterating through.
> + */
> +#define DIR_ITERATOR_PRE_ORDER_TRAVERSAL (1 << 0)
> +#define DIR_ITERATOR_POST_ORDER_TRAVERSAL (1 << 1)
> +#define DIR_ITERATOR_LIST_ROOT_DIR (1 << 2)
> +
>  struct dir_iterator {
>         /* The current path: */
>         struct strbuf path;
> @@ -57,15 +72,16 @@ struct dir_iterator {
>  };
>
>  /*
> - * Start a directory iteration over path. Return a dir_iterator that
> - * holds the internal state of the iteration.
> + * Start a directory iteration over path, with options specified in
> + * 'flags'. Return a dir_iterator that holds the internal state of
> + * the iteration.
>   *
>   * The iteration includes all paths under path, not including path
>   * itself and not including "." or ".." entries.
>   *
>   * path is the starting directory. An internal copy will be made.
>   */
> -struct dir_iterator *dir_iterator_begin(const char *path);
> +struct dir_iterator *dir_iterator_begin(const char *path, unsigned flags);
>
>  /*
>   * Advance the iterator to the first or next item and return ITER_OK.
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 50188e9..b4bba74 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -3346,7 +3346,7 @@ static struct ref_iterator *files_reflog_iterator_begin(struct ref_store *ref_st
>         files_downcast(ref_store, 0, "reflog_iterator_begin");
>
>         base_ref_iterator_init(ref_iterator, &files_reflog_iterator_vtable);
> -       iter->dir_iterator = dir_iterator_begin(git_path("logs"));
> +       iter->dir_iterator = dir_iterator_begin(git_path("logs"), 0);
>         return ref_iterator;
>  }
>
> diff --git a/t/helper/test-dir-iterator.c b/t/helper/test-dir-iterator.c
> index 06f03fc..a1b8c78 100644
> --- a/t/helper/test-dir-iterator.c
> +++ b/t/helper/test-dir-iterator.c
> @@ -6,6 +6,7 @@
>  int cmd_main(int argc, const char **argv) {
>         struct strbuf path = STRBUF_INIT;
>         struct dir_iterator *diter;
> +       unsigned flag = DIR_ITERATOR_PRE_ORDER_TRAVERSAL;
>
>         if (argc < 2) {
>                 return 1;
> @@ -13,7 +14,10 @@ int cmd_main(int argc, const char **argv) {
>
>         strbuf_add(&path, argv[1], strlen(argv[1]));
>
> -       diter = dir_iterator_begin(path.buf);
> +       if (argc == 3)
> +               flag = atoi(argv[2]);
> +
> +       diter = dir_iterator_begin(path.buf, flag);
>
>         while (dir_iterator_advance(diter) == ITER_OK) {
>                 if (S_ISDIR(diter->st.st_mode))
> diff --git a/t/t0065-dir-iterator.sh b/t/t0065-dir-iterator.sh
> index b857c07..ade3ee0 100755
> --- a/t/t0065-dir-iterator.sh
> +++ b/t/t0065-dir-iterator.sh
> @@ -28,12 +28,28 @@ test_expect_success 'dir-iterator should iterate through all files' '
>         >dir/a/e &&
>         >dir/d/e/d/a &&
>
> -       test-dir-iterator ./dir | sort >./actual-pre-order-sorted-output &&
> +       test-dir-iterator ./dir 1 | sort >./actual-pre-order-sorted-output &&
>         rm -rf dir &&
>
>         test_cmp expect-sorted-output actual-pre-order-sorted-output
>  '
>
> +test_expect_success 'dir-iterator should iterate through all files on post-order mode' '
> +       mkdir -p dir &&
> +       mkdir -p dir/a/b/c/ &&
> +       >dir/b &&
> +       >dir/c &&
> +       mkdir -p dir/d/e/d/ &&
> +       >dir/a/b/c/d &&
> +       >dir/a/e &&
> +       >dir/d/e/d/a &&
> +
> +       test-dir-iterator ./dir 2 | sort >actual-post-order-sorted-output &&
> +       rm -rf dir &&
> +
> +       test_cmp expect-sorted-output actual-post-order-sorted-output
> +'
> +
>  cat >expect-pre-order-output <<-\EOF &&
>  [d] (a) ./dir/a
>  [d] (a/b) ./dir/a/b
> @@ -41,14 +57,53 @@ cat >expect-pre-order-output <<-\EOF &&
>  [f] (a/b/c/d) ./dir/a/b/c/d
>  EOF
>
> -test_expect_success 'dir-iterator should list files in the correct order' '
> +test_expect_success 'dir-iterator should list files properly on pre-order mode' '
>         mkdir -p dir/a/b/c/ &&
>         >dir/a/b/c/d &&
>
> -       test-dir-iterator ./dir >actual-pre-order-output &&
> +       test-dir-iterator ./dir 1 >actual-pre-order-output &&
>         rm -rf dir &&
>
>         test_cmp expect-pre-order-output actual-pre-order-output
>  '
>
> +cat >expect-post-order-output <<-\EOF &&
> +[f] (a/b/c/d) ./dir/a/b/c/d
> +[d] (a/b/c) ./dir/a/b/c
> +[d] (a/b) ./dir/a/b
> +[d] (a) ./dir/a
> +EOF
> +
> +test_expect_success 'dir-iterator should list files properly on post-order mode' '
> +       mkdir -p dir/a/b/c/ &&
> +       >dir/a/b/c/d &&
> +
> +       test-dir-iterator ./dir 2 >actual-post-order-output &&
> +       rm -rf dir &&
> +
> +       test_cmp expect-post-order-output actual-post-order-output
> +'
> +
> +cat >expect-pre-order-post-order-root-dir-output <<-\EOF &&
> +[d] (.) ./dir
> +[d] (a) ./dir/a
> +[d] (a/b) ./dir/a/b
> +[d] (a/b/c) ./dir/a/b/c
> +[f] (a/b/c/d) ./dir/a/b/c/d
> +[d] (a/b/c) ./dir/a/b/c
> +[d] (a/b) ./dir/a/b
> +[d] (a) ./dir/a
> +[d] (.) ./dir
> +EOF
> +
> +test_expect_success 'dir-iterator should list files properly on pre-order + post-order + root-dir mode' '
> +       mkdir -p dir/a/b/c/ &&
> +       >dir/a/b/c/d &&
> +
> +       test-dir-iterator ./dir 7 >actual-pre-order-post-order-root-dir-output &&
> +       rm -rf dir &&
> +
> +       test_cmp expect-pre-order-post-order-root-dir-output actual-pre-order-post-order-root-dir-output
> +'
> +
>  test_done
> --
> 2.7.4 (Apple Git-66)
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v6 4/5] dir_iterator: refactor state machine model
  2017-04-02  4:46   ` Daniel Ferreira (theiostream)
@ 2017-04-02  5:35     ` Michael Haggerty
  0 siblings, 0 replies; 8+ messages in thread
From: Michael Haggerty @ 2017-04-02  5:35 UTC (permalink / raw)
  To: Daniel Ferreira (theiostream), Git Mailing List
  Cc: Junio C Hamano, Stefan Beller, Duy Nguyen, Jeff King

On 04/02/2017 06:46 AM, Daniel Ferreira (theiostream) wrote:
> Gah, I just realized I failed to correct refs/files-backend.c's
> behavior and kept 0 instead of DIR_ITERATOR_PRE_ORDER_TRAVERSAL as its
> flag. I'll correct this on a v7, but I'll wait for the rest of your
> reviews before sending that revision.

Even if I make that change, a lot of tests fail, both when I run the
tests against your commit 4/5 and when I run them against 5/5.

I tested by applying your patch series to the current upstream master
branch. When I apply the patches, there is one conflict (related to
#includes), in entry.c. Maybe the failure is because you are working
from a different branch point. What commit are you branching your series
off of?

You should definitely run the full test suite against each of your
commits before you submit. One convenient way to do so is with my tool
`git-test` [1], which runs the tests and keeps track of their results to
avoid testing the same commit over and over again:

    # One-time setup (adjust the "-j" values based on what's
    # fastest on your computer):
    git test add "make -j8 && make -j16 test"

    # Run tests against a range of commits:
    git test run master..mybranch

Even better is to run the tests in a separate linked worktree, so that
you can continue working in your main repository while the tests are
running. The `git-test` README explains how.

It also speeds things up (and gives better output) if you use `prove` to
run the tests, and run the tests on a ramdisk if possible. To do so, I
include the following lines in my `config.mak`:

    TMP := /var/tmp
    ROOT := $(TMP)/git-tests-$(shell git rev-parse --show-toplevel |
sha1sum | head -c10)
    DEFAULT_TEST_TARGET = prove
    GIT_TEST_OPTS = -q --root=$(ROOT) --tee
    GIT_PROVE_OPTS = --timer --jobs 16 --state=fresh,hot,slow,save

(On my system `/var/tmp` is a ramdisk.)

Finally, when I submit a patch series, I usually also push a copy to my
GitHub fork of the project and include a reference to that branch in my
cover letter. That makes it easier for casual readers to fetch the code
and/or look at it online, and also makes it 100% clear what branch point
I've chosen. This is by no means obligatory but I find it helpful.

There doesn't seem to be much point reviewing broken code, so I'll wait
for your feedback and/or fix.

Michael

[1] https://github.com/mhagger/git-test


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-04-02  5:36 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-02  4:35 [PATCH v6 0/5] [GSoC] remove_subtree(): reimplement using iterators Daniel Ferreira
2017-04-02  4:35 ` [PATCH v6 1/5] dir_iterator: add tests for dir_iterator API Daniel Ferreira
2017-04-02  4:35 ` [PATCH v6 2/5] remove_subtree(): test removing nested directories Daniel Ferreira
2017-04-02  4:35 ` [PATCH v6 3/5] dir_iterator: add helpers to dir_iterator_advance Daniel Ferreira
2017-04-02  4:35 ` [PATCH v6 4/5] dir_iterator: refactor state machine model Daniel Ferreira
2017-04-02  4:46   ` Daniel Ferreira (theiostream)
2017-04-02  5:35     ` Michael Haggerty
2017-04-02  4:35 ` [PATCH v6 5/5] remove_subtree(): reimplement using iterators Daniel Ferreira

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).