git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Matheus Tavares <matheus.bernardino@usp.br>
To: git@vger.kernel.org
Cc: "Thomas Gummerer" <t.gummerer@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: [GSoC][PATCH 3/3] clone: use dir-iterator to avoid explicit dir traversal
Date: Sat, 23 Feb 2019 16:03:09 -0300	[thread overview]
Message-ID: <20190223190309.6728-4-matheus.bernardino@usp.br> (raw)
In-Reply-To: <20190223190309.6728-1-matheus.bernardino@usp.br>

Replace usage of opendir/readdir/closedir API to traverse directories
recursively, at copy_or_link_directory function, by the dir-iterator
API. This simplifies the code and avoid recursive calls to
copy_or_link_directory.

This process also brings some safe behaviour changes to
copy_or_link_directory:
 - It will no longer follows symbolic links. This is not a problem,
   since the function is only used to copy .git/objects directory, and
   symbolic links are not expected there.
 - Hidden directories won't be skipped anymore. In fact, it is odd that
   the function currently skip hidden directories but not hidden files.
   The reason for that could be unintentional: probably the intention
   was to skip '.' and '..' only, but it ended up accidentally skipping
   all directories starting with '.'. Again, it must not be a problem
   not to skip hidden dirs since hidden dirs/files are not expected at
   .git/objects.
 - Now, copy_or_link_directory will call die() in case of an error on
   openddir, readdir or lstat, inside dir_iterator_advance. That means
   it will abort in case of an error trying to fetch any iteration
   entry.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
Changes in v2:
 - Improved patch message
 - Removed a now unused variable
 - Put warning on stat error back
 - Added pedantic option to dir-iterator initialization
 - Modified copy_or_link_directory not to skip hidden paths

 builtin/clone.c | 47 ++++++++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 19 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 862d2ea69c..515dc91d63 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -23,6 +23,8 @@
 #include "transport.h"
 #include "strbuf.h"
 #include "dir.h"
+#include "dir-iterator.h"
+#include "iterator.h"
 #include "sigchain.h"
 #include "branch.h"
 #include "remote.h"
@@ -411,42 +413,45 @@ static void mkdir_if_missing(const char *pathname, mode_t mode)
 }
 
 static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
-				   const char *src_repo, int src_baselen)
+				   const char *src_repo)
 {
-	struct dirent *de;
-	struct stat buf;
 	int src_len, dest_len;
-	DIR *dir;
-
-	dir = opendir(src->buf);
-	if (!dir)
-		die_errno(_("failed to open '%s'"), src->buf);
+	struct dir_iterator *iter;
+	int iter_status;
+	struct stat st;
 
 	mkdir_if_missing(dest->buf, 0777);
 
+	iter = dir_iterator_begin(src->buf, 1);
+
 	strbuf_addch(src, '/');
 	src_len = src->len;
 	strbuf_addch(dest, '/');
 	dest_len = dest->len;
 
-	while ((de = readdir(dir)) != NULL) {
+	while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) {
 		strbuf_setlen(src, src_len);
-		strbuf_addstr(src, de->d_name);
+		strbuf_addstr(src, iter->relative_path);
 		strbuf_setlen(dest, dest_len);
-		strbuf_addstr(dest, de->d_name);
-		if (stat(src->buf, &buf)) {
+		strbuf_addstr(dest, iter->relative_path);
+
+		/*
+		 * dir_iterator_advance already calls lstat to populate iter->st
+		 * but, unlike stat, lstat does not checks for permissions on
+		 * the given path.
+		 */
+		if (stat(src->buf, &st)) {
 			warning (_("failed to stat %s\n"), src->buf);
 			continue;
 		}
-		if (S_ISDIR(buf.st_mode)) {
-			if (de->d_name[0] != '.')
-				copy_or_link_directory(src, dest,
-						       src_repo, src_baselen);
+
+		if (S_ISDIR(iter->st.st_mode)) {
+			mkdir_if_missing(dest->buf, 0777);
 			continue;
 		}
 
 		/* Files that cannot be copied bit-for-bit... */
-		if (!strcmp(src->buf + src_baselen, "/info/alternates")) {
+		if (!strcmp(iter->relative_path, "info/alternates")) {
 			copy_alternates(src, dest, src_repo);
 			continue;
 		}
@@ -463,7 +468,11 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
 		if (copy_file_with_time(dest->buf, src->buf, 0666))
 			die_errno(_("failed to copy file to '%s'"), dest->buf);
 	}
-	closedir(dir);
+
+	if (iter_status != ITER_DONE) {
+		strbuf_setlen(src, src_len);
+		die(_("failed to iterate over '%s'"), src->buf);
+	}
 }
 
 static void clone_local(const char *src_repo, const char *dest_repo)
@@ -481,7 +490,7 @@ static void clone_local(const char *src_repo, const char *dest_repo)
 		get_common_dir(&dest, dest_repo);
 		strbuf_addstr(&src, "/objects");
 		strbuf_addstr(&dest, "/objects");
-		copy_or_link_directory(&src, &dest, src_repo, src.len);
+		copy_or_link_directory(&src, &dest, src_repo);
 		strbuf_release(&src);
 		strbuf_release(&dest);
 	}
-- 
2.20.1


  parent reply	other threads:[~2019-02-23 19:03 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-23 19:03 [GSoC][PATCH 0/3] clone: convert explicit dir traversal to dir-iterator Matheus Tavares
2019-02-23 19:03 ` [GSoC][PATCH 1/3] dir-iterator: add pedantic option to dir_iterator_begin Matheus Tavares
2019-02-23 21:35   ` Thomas Gummerer
2019-02-24  8:35     ` Christian Couder
2019-02-24 17:43       ` Matheus Tavares Bernardino
2019-02-24 21:06         ` Thomas Gummerer
2019-02-23 19:03 ` [GSoC][PATCH 2/3] clone: extract function from copy_or_link_directory Matheus Tavares
2019-02-24  8:38   ` Christian Couder
2019-02-23 19:03 ` Matheus Tavares [this message]
2019-02-23 21:48   ` [GSoC][PATCH 3/3] clone: use dir-iterator to avoid explicit dir traversal Thomas Gummerer
2019-02-24 18:19     ` Matheus Tavares Bernardino
2019-02-23 22:40   ` Ævar Arnfjörð Bjarmason
2019-02-24  9:41     ` Christian Couder
2019-02-24 14:45       ` Ævar Arnfjörð Bjarmason
2019-02-25  9:45         ` Duy Nguyen
2019-02-26  0:26           ` [WIP RFC PATCH 0/7] clone: dir iterator refactoring with tests Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 1/7] dir-iterator: add pedantic option to dir_iterator_begin Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 2/7] dir-iterator: use stat() instead of lstat() Ævar Arnfjörð Bjarmason
2019-02-26  1:53             ` Matheus Tavares Bernardino
2019-02-26  0:26           ` [WIP RFC PATCH 3/7] clone: extract function from copy_or_link_directory Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 4/7] clone: test for our behavior on odd objects/* content Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 5/7] clone: use dir-iterator to avoid explicit dir traversal Ævar Arnfjörð Bjarmason
2019-02-26  3:48             ` Matheus Tavares Bernardino
2019-02-26 11:33               ` Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 6/7] clone: stop ignoring dotdirs in --local etc. clone Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 7/7] clone: break cloning repos that have symlinks in them Ævar Arnfjörð Bjarmason
2019-02-25  2:31       ` [GSoC][PATCH 3/3] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares Bernardino
2019-02-25 10:25         ` Ævar Arnfjörð Bjarmason
2019-02-25 20:40           ` Christian Couder
2019-02-26 10:33         ` Christian Couder
2019-02-23 19:07 ` [GSoC][PATCH 0/3] clone: convert explicit dir traversal to dir-iterator Matheus Tavares Bernardino
2019-02-23 20:10   ` Ævar Arnfjörð Bjarmason
2019-02-23 21:59 ` Thomas Gummerer
2019-02-24 16:34   ` Matheus Tavares Bernardino
2019-02-24 21:07     ` Thomas Gummerer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190223190309.6728-4-matheus.bernardino@usp.br \
    --to=matheus.bernardino@usp.br \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    --cc=t.gummerer@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).