From: Thomas Gummerer <t.gummerer@gmail.com>
To: Matheus Tavares <matheus.bernardino@usp.br>
Cc: git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>,
"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: [GSoC][PATCH 3/3] clone: use dir-iterator to avoid explicit dir traversal
Date: Sat, 23 Feb 2019 21:48:56 +0000
Message-ID: <20190223214856.GQ6085@hank.intra.tgummerer.com>
In-Reply-To: <20190223190309.6728-4-matheus.bernardino@usp.br>
On 02/23, Matheus Tavares wrote:
> Replace usage of opendir/readdir/closedir API to traverse directories
> recursively, at copy_or_link_directory function, by the dir-iterator
> API. This simplifies the code and avoids recursive calls to
> copy_or_link_directory.
>
> This process also brings some safe behaviour changes to
> copy_or_link_directory:
> - It will no longer follow symbolic links. This is not a problem,
> since the function is only used to copy .git/objects directory, and
> symbolic links are not expected there.
> - Hidden directories won't be skipped anymore. In fact, it is odd that
> the function currently skips hidden directories but not hidden files.
> The reason for that could be unintentional: probably the intention
> was to skip '.' and '..' only, but it ended up accidentally skipping
> all directories starting with '.'. Again, it must not be a problem
> not to skip hidden dirs since hidden dirs/files are not expected at
> .git/objects.
> - Now, copy_or_link_directory will call die() in case of an error on
> opendir, readdir or lstat, inside dir_iterator_advance. That means
> it will abort in case of an error trying to fetch any iteration
> entry.
>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
> Changes in v2:
> - Improved patch message
> - Removed a now unused variable
s/variable/parameter/ I believe?
> - Put warning on stat error back
> - Added pedantic option to dir-iterator initialization
> - Modified copy_or_link_directory not to skip hidden paths
Thanks, these descriptions are very useful for reviewers who had a
look at previous rounds.
> builtin/clone.c | 47 ++++++++++++++++++++++++++++-------------------
> 1 file changed, 28 insertions(+), 19 deletions(-)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 862d2ea69c..515dc91d63 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -23,6 +23,8 @@
> #include "transport.h"
> #include "strbuf.h"
> #include "dir.h"
> +#include "dir-iterator.h"
> +#include "iterator.h"
> #include "sigchain.h"
> #include "branch.h"
> #include "remote.h"
> @@ -411,42 +413,45 @@ static void mkdir_if_missing(const char *pathname, mode_t mode)
> }
>
> static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
> - const char *src_repo, int src_baselen)
> + const char *src_repo)
> {
> - struct dirent *de;
> - struct stat buf;
> int src_len, dest_len;
> - DIR *dir;
> -
> - dir = opendir(src->buf);
> - if (!dir)
> - die_errno(_("failed to open '%s'"), src->buf);
> + struct dir_iterator *iter;
> + int iter_status;
> + struct stat st;
>
> mkdir_if_missing(dest->buf, 0777);
>
> + iter = dir_iterator_begin(src->buf, 1);
> +
> strbuf_addch(src, '/');
> src_len = src->len;
> strbuf_addch(dest, '/');
> dest_len = dest->len;
>
> - while ((de = readdir(dir)) != NULL) {
> + while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) {
> strbuf_setlen(src, src_len);
> - strbuf_addstr(src, de->d_name);
> + strbuf_addstr(src, iter->relative_path);
> strbuf_setlen(dest, dest_len);
> - strbuf_addstr(dest, de->d_name);
> - if (stat(src->buf, &buf)) {
> + strbuf_addstr(dest, iter->relative_path);
> +
> + /*
> + * dir_iterator_advance already calls lstat to populate iter->st
> + * but, unlike stat, lstat does not checks for permissions on
> + * the given path.
> + */
Hmm, lstat does check the permissions on the path; it just doesn't
follow symlinks. I think I actually misled you in my previous review
here, and was reading the code in dir-iterator.c all wrong.

I thought it said "if (errno == ENOENT) warning(...)", however the
condition is "errno != ENOENT", which is why I thought we were losing
warnings when errno == EACCES for example.

As we decided that we would no longer follow symlinks now, I think we
can actually get rid of the stat call here. Sorry about the confusion.
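
To illustrate the stat/lstat difference outside of git, here is a
minimal sketch. The helper name probe_dangling_symlink and the /tmp
path are made up for this example and are not part of the patch or of
git. It creates a dangling symlink and records what each call reports:
stat follows the link and fails, while lstat describes the link itself.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of git): create a dangling symlink in a
 * fresh temporary directory and record how stat() and lstat() treat it.
 * Returns 0 if the probe ran, -1 if the setup itself failed.
 */
static int probe_dangling_symlink(int *stat_ret, int *stat_errno,
				  int *lstat_ret, int *is_symlink)
{
	char dir[] = "/tmp/lstat-demo-XXXXXX";	/* assumed writable */
	char link_path[64];
	struct stat st;

	if (!mkdtemp(dir))
		return -1;
	snprintf(link_path, sizeof(link_path), "%s/link", dir);
	if (symlink("does-not-exist", link_path)) {
		rmdir(dir);
		return -1;
	}

	/* stat() resolves the link, so it fails on the missing target. */
	errno = 0;
	*stat_ret = stat(link_path, &st);
	*stat_errno = errno;

	/* lstat() stops at the link and describes the link itself. */
	*lstat_ret = lstat(link_path, &st);
	*is_symlink = !*lstat_ret && S_ISLNK(st.st_mode);

	unlink(link_path);
	rmdir(dir);
	return 0;
}
```

Calling the helper from a small main() and printing the four values
should show stat failing with ENOENT and lstat succeeding with
S_ISLNK set. Both calls still need search permission on the leading
directories, which is why the extra stat adds nothing once symlinks
are no longer followed.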
> + if (stat(src->buf, &st)) {
> warning (_("failed to stat %s\n"), src->buf);
> continue;
> }
> - if (S_ISDIR(buf.st_mode)) {
> - if (de->d_name[0] != '.')
> - copy_or_link_directory(src, dest,
> - src_repo, src_baselen);
> +
> + if (S_ISDIR(iter->st.st_mode)) {
> + mkdir_if_missing(dest->buf, 0777);
> continue;
> }
>
> /* Files that cannot be copied bit-for-bit... */
> - if (!strcmp(src->buf + src_baselen, "/info/alternates")) {
> + if (!strcmp(iter->relative_path, "info/alternates")) {
> copy_alternates(src, dest, src_repo);
> continue;
> }
> @@ -463,7 +468,11 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
> if (copy_file_with_time(dest->buf, src->buf, 0666))
> die_errno(_("failed to copy file to '%s'"), dest->buf);
> }
> - closedir(dir);
> +
> + if (iter_status != ITER_DONE) {
> + strbuf_setlen(src, src_len);
> + die(_("failed to iterate over '%s'"), src->buf);
> + }
> }
>
> static void clone_local(const char *src_repo, const char *dest_repo)
> @@ -481,7 +490,7 @@ static void clone_local(const char *src_repo, const char *dest_repo)
> get_common_dir(&dest, dest_repo);
> strbuf_addstr(&src, "/objects");
> strbuf_addstr(&dest, "/objects");
> - copy_or_link_directory(&src, &dest, src_repo, src.len);
> + copy_or_link_directory(&src, &dest, src_repo);
> strbuf_release(&src);
> strbuf_release(&dest);
> }
> --
> 2.20.1
>
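
As a side note on the hidden-directory point in the commit message,
the difference between the old check and a check for only '.' and
'..' can be sketched as below. Both function names are hypothetical
and exist only for this illustration; the first mirrors the
"de->d_name[0] != '.'" test the patch removes.

```c
#include <string.h>

/*
 * Mirrors the removed test: any directory name starting with '.' is
 * skipped, which covers "." and ".." but also names like ".hidden".
 */
static int old_check_skips(const char *name)
{
	return name[0] == '.';
}

/*
 * Hypothetical stricter test: skip only the two special entries,
 * which is what the old code presumably meant to do.
 */
static int is_dot_or_dotdot_name(const char *name)
{
	return !strcmp(name, ".") || !strcmp(name, "..");
}
```

For a name such as ".hidden" the two functions disagree: the old check
skips it, the stricter one does not.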