git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v2] [GSoC] remove_subtree(): reimplement using iterators
@ 2017-03-25 11:27 Daniel Ferreira
  2017-03-25 11:51 ` Duy Nguyen
  0 siblings, 1 reply; 4+ messages in thread
From: Daniel Ferreira @ 2017-03-25 11:27 UTC (permalink / raw)
  To: git; +Cc: gitster, pclouds, sbeller, Daniel Ferreira

Use dir_iterator to traverse through remove_subtree()'s directory tree,
avoiding the need for recursive calls to readdir(). Simplify
remove_subtree()'s code.

A conversion similar in purpose was previously done at 46d092a
("for_each_reflog(): reimplement using iterators", 2016-05-21).

Signed-off-by: Daniel Ferreira <bnmvco@gmail.com>
---

This is a second-version patch of the Google Summer of Code microproject for
refactoring recursive readdir() calls to use dir_iterator instead. v1 can be
found in:

https://public-inbox.org/git/CAGZ79kZwT-9mHTiOJ5CEjk2wDFkn6+NcogjX0=vjhsAh16ANYg@mail.gmail.com/T/#mae023e7a7d7626f00e0923833c4359f5af493730

Additionally, for debugging purposes I turned remove_subtree() into a no-op
and ran git tests. Some failures were at:

* t2000-checkout-cache-clash.sh
* t2003-checkout-cache-mkdir.sh

If you guys could check those files out and warn me if any additional tests
would be welcome, please let me know.

Thanks.

 entry.c | 28 +++++++---------------------
 1 file changed, 7 insertions(+), 21 deletions(-)

diff --git a/entry.c b/entry.c
index c6eea240b..3cb92592d 100644
--- a/entry.c
+++ b/entry.c
@@ -2,6 +2,8 @@
 #include "blob.h"
 #include "dir.h"
 #include "streaming.h"
+#include "iterator.h"
+#include "dir-iterator.h"

 static void create_directories(const char *path, int path_len,
 			       const struct checkout *state)
@@ -46,29 +48,13 @@ static void create_directories(const char *path, int path_len,

 static void remove_subtree(struct strbuf *path)
 {
-	DIR *dir = opendir(path->buf);
-	struct dirent *de;
-	int origlen = path->len;
-
-	if (!dir)
-		die_errno("cannot opendir '%s'", path->buf);
-	while ((de = readdir(dir)) != NULL) {
-		struct stat st;
-
-		if (is_dot_or_dotdot(de->d_name))
-			continue;
-
-		strbuf_addch(path, '/');
-		strbuf_addstr(path, de->d_name);
-		if (lstat(path->buf, &st))
-			die_errno("cannot lstat '%s'", path->buf);
-		if (S_ISDIR(st.st_mode))
-			remove_subtree(path);
-		else if (unlink(path->buf))
+	struct dir_iterator *diter = dir_iterator_begin(path->buf);
+
+	while (dir_iterator_advance(diter) == ITER_OK) {
+		if (unlink(diter->path.buf))
 			die_errno("cannot unlink '%s'", path->buf);
-		strbuf_setlen(path, origlen);
 	}
-	closedir(dir);
+
 	if (rmdir(path->buf))
 		die_errno("cannot rmdir '%s'", path->buf);
 }
--
2.12.1.433.g82305b74f.dirty


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] [GSoC] remove_subtree(): reimplement using iterators
  2017-03-25 11:27 [PATCH v2] [GSoC] remove_subtree(): reimplement using iterators Daniel Ferreira
@ 2017-03-25 11:51 ` Duy Nguyen
  2017-03-25 12:13   ` Daniel Ferreira (theiostream)
  0 siblings, 1 reply; 4+ messages in thread
From: Duy Nguyen @ 2017-03-25 11:51 UTC (permalink / raw)
  To: Daniel Ferreira; +Cc: Git Mailing List, Junio C Hamano, Stefan Beller

On Sat, Mar 25, 2017 at 6:27 PM, Daniel Ferreira <bnmvco@gmail.com> wrote:
> Use dir_iterator to traverse through remove_subtree()'s directory tree,
> avoiding the need for recursive calls to readdir(). Simplify
> remove_subtree()'s code.
>
> A conversion similar in purpose was previously done at 46d092a
> ("for_each_reflog(): reimplement using iterators", 2016-05-21).
>
> Signed-off-by: Daniel Ferreira <bnmvco@gmail.com>
> ---
>
> This is a second-version patch of the Google Summer of Code microproject for
> refactoring recursive readdir() calls to use dir_iterator instead. v1 can be
> found in:
>
> https://public-inbox.org/git/CAGZ79kZwT-9mHTiOJ5CEjk2wDFkn6+NcogjX0=vjhsAh16ANYg@mail.gmail.com/T/#mae023e7a7d7626f00e0923833c4359f5af493730
>
> Additionally, for debugging purposes I turned remove_subtree() into a no-op
> and ran git tests. Some failures were at:
>
> * t2000-checkout-cache-clash.sh
> * t2003-checkout-cache-mkdir.sh
>
> If you guys could check those files out and warn me if any additional tests
> would be welcome, please let me know.
>
> Thanks.
>
>  entry.c | 28 +++++++---------------------
>  1 file changed, 7 insertions(+), 21 deletions(-)
>
> diff --git a/entry.c b/entry.c
> index c6eea240b..3cb92592d 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -2,6 +2,8 @@
>  #include "blob.h"
>  #include "dir.h"
>  #include "streaming.h"
> +#include "iterator.h"
> +#include "dir-iterator.h"
>
>  static void create_directories(const char *path, int path_len,
>                                const struct checkout *state)
> @@ -46,29 +48,13 @@ static void create_directories(const char *path, int path_len,
>
>  static void remove_subtree(struct strbuf *path)
>  {
> -       DIR *dir = opendir(path->buf);
> -       struct dirent *de;
> -       int origlen = path->len;
> -
> -       if (!dir)
> -               die_errno("cannot opendir '%s'", path->buf);
> -       while ((de = readdir(dir)) != NULL) {
> -               struct stat st;
> -
> -               if (is_dot_or_dotdot(de->d_name))
> -                       continue;
> -
> -               strbuf_addch(path, '/');
> -               strbuf_addstr(path, de->d_name);
> -               if (lstat(path->buf, &st))
> -                       die_errno("cannot lstat '%s'", path->buf);
> -               if (S_ISDIR(st.st_mode))
> -                       remove_subtree(path);
> -               else if (unlink(path->buf))
> +       struct dir_iterator *diter = dir_iterator_begin(path->buf);
> +
> +       while (dir_iterator_advance(diter) == ITER_OK) {
> +               if (unlink(diter->path.buf))
>                         die_errno("cannot unlink '%s'", path->buf);
> -               strbuf_setlen(path, origlen);
>         }
> -       closedir(dir);
> +
>         if (rmdir(path->buf))
>                 die_errno("cannot rmdir '%s'", path->buf);

Even though it's very nice that lots of code is deleted. This is not
entirely correct, is it? Before this patch, rmdir() is called for
every recursive remove_subtree() call. After this patch, it's only
called once (and likely fails unless you have no subdirectories).

>  }
> --
> 2.12.1.433.g82305b74f.dirty
>



-- 
Duy

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] [GSoC] remove_subtree(): reimplement using iterators
  2017-03-25 11:51 ` Duy Nguyen
@ 2017-03-25 12:13   ` Daniel Ferreira (theiostream)
  2017-03-25 12:29     ` Duy Nguyen
  0 siblings, 1 reply; 4+ messages in thread
From: Daniel Ferreira (theiostream) @ 2017-03-25 12:13 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List, Junio C Hamano, Stefan Beller

You are correct, which shows that since all tests pass, we need to
come up with better cases for this function.

As for a solution, I believe that the best way to go for it is to
dir_iterator's implementation to have an "Option to iterate over
directory paths before vs. after their contents" (something predicted
in the commit that created it). If it iterates over directories after
all of its contents (currently it does so before) we just need to
check if the entry is a directory and if so, rmdir() it. Does that
make sense?

On Sat, Mar 25, 2017 at 8:51 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Sat, Mar 25, 2017 at 6:27 PM, Daniel Ferreira <bnmvco@gmail.com> wrote:
>> Use dir_iterator to traverse through remove_subtree()'s directory tree,
>> avoiding the need for recursive calls to readdir(). Simplify
>> remove_subtree()'s code.
>>
>> A conversion similar in purpose was previously done at 46d092a
>> ("for_each_reflog(): reimplement using iterators", 2016-05-21).
>>
>> Signed-off-by: Daniel Ferreira <bnmvco@gmail.com>
>> ---
>>
>> This is a second-version patch of the Google Summer of Code microproject for
>> refactoring recursive readdir() calls to use dir_iterator instead. v1 can be
>> found in:
>>
>> https://public-inbox.org/git/CAGZ79kZwT-9mHTiOJ5CEjk2wDFkn6+NcogjX0=vjhsAh16ANYg@mail.gmail.com/T/#mae023e7a7d7626f00e0923833c4359f5af493730
>>
>> Additionally, for debugging purposes I turned remove_subtree() into a no-op
>> and ran git tests. Some failures were at:
>>
>> * t2000-checkout-cache-clash.sh
>> * t2003-checkout-cache-mkdir.sh
>>
>> If you guys could check those files out and warn me if any additional tests
>> would be welcome, please let me know.
>>
>> Thanks.
>>
>>  entry.c | 28 +++++++---------------------
>>  1 file changed, 7 insertions(+), 21 deletions(-)
>>
>> diff --git a/entry.c b/entry.c
>> index c6eea240b..3cb92592d 100644
>> --- a/entry.c
>> +++ b/entry.c
>> @@ -2,6 +2,8 @@
>>  #include "blob.h"
>>  #include "dir.h"
>>  #include "streaming.h"
>> +#include "iterator.h"
>> +#include "dir-iterator.h"
>>
>>  static void create_directories(const char *path, int path_len,
>>                                const struct checkout *state)
>> @@ -46,29 +48,13 @@ static void create_directories(const char *path, int path_len,
>>
>>  static void remove_subtree(struct strbuf *path)
>>  {
>> -       DIR *dir = opendir(path->buf);
>> -       struct dirent *de;
>> -       int origlen = path->len;
>> -
>> -       if (!dir)
>> -               die_errno("cannot opendir '%s'", path->buf);
>> -       while ((de = readdir(dir)) != NULL) {
>> -               struct stat st;
>> -
>> -               if (is_dot_or_dotdot(de->d_name))
>> -                       continue;
>> -
>> -               strbuf_addch(path, '/');
>> -               strbuf_addstr(path, de->d_name);
>> -               if (lstat(path->buf, &st))
>> -                       die_errno("cannot lstat '%s'", path->buf);
>> -               if (S_ISDIR(st.st_mode))
>> -                       remove_subtree(path);
>> -               else if (unlink(path->buf))
>> +       struct dir_iterator *diter = dir_iterator_begin(path->buf);
>> +
>> +       while (dir_iterator_advance(diter) == ITER_OK) {
>> +               if (unlink(diter->path.buf))
>>                         die_errno("cannot unlink '%s'", path->buf);
>> -               strbuf_setlen(path, origlen);
>>         }
>> -       closedir(dir);
>> +
>>         if (rmdir(path->buf))
>>                 die_errno("cannot rmdir '%s'", path->buf);
>
> Even though it's very nice that lots of code is deleted. This is not
> entirely correct, is it? Before this patch, rmdir() is called for
> every recursive remove_subtree() call. After this patch, it's only
> called once (and likely fails unless you have no subdirectories).
>
>>  }
>> --
>> 2.12.1.433.g82305b74f.dirty
>>
>
>
>
> --
> Duy

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH v2] [GSoC] remove_subtree(): reimplement using iterators
  2017-03-25 12:13   ` Daniel Ferreira (theiostream)
@ 2017-03-25 12:29     ` Duy Nguyen
  0 siblings, 0 replies; 4+ messages in thread
From: Duy Nguyen @ 2017-03-25 12:29 UTC (permalink / raw)
  To: Daniel Ferreira (theiostream)
  Cc: Git Mailing List, Junio C Hamano, Stefan Beller

On Sat, Mar 25, 2017 at 7:13 PM, Daniel Ferreira (theiostream)
<bnmvco@gmail.com> wrote:
> You are correct, which shows that since all tests pass, we need to
> come up with better cases for this function.
>
> As for a solution, I believe that the best way to go for it is to
> dir_iterator's implementation to have an "Option to iterate over
> directory paths before vs. after their contents" (something predicted
> in the commit that created it). If it iterates over directories after
> all of its contents (currently it does so before) we just need to
> check if the entry is a directory and if so, rmdir() it. Does that
> make sense?

Yes. However since this is a microproject, intended to be done quickly
just to get you familiarize with the community and the process, I
suggest you look for another readdir() place that can be easily
converted to using dir-iterator first so you could go through the
whole thing. But of course I will not object you improving
dir-iterator and clean remove_subtree() up ;-)
-- 
Duy

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-03-25 12:29 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-25 11:27 [PATCH v2] [GSoC] remove_subtree(): reimplement using iterators Daniel Ferreira
2017-03-25 11:51 ` Duy Nguyen
2017-03-25 12:13   ` Daniel Ferreira (theiostream)
2017-03-25 12:29     ` Duy Nguyen

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).