git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git <git@vger.kernel.org>, "Junio C Hamano" <gitster@pobox.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Thomas Gummerer" <t.gummerer@gmail.com>,
	"Christian Couder" <christian.couder@gmail.com>
Subject: Re: [WIP RFC PATCH 5/7] clone: use dir-iterator to avoid explicit dir traversal
Date: Tue, 26 Feb 2019 00:48:44 -0300	[thread overview]
Message-ID: <CAHd-oW7HvhkORZogjb0nPioe3KWmXDDACF6Tu-Kmayc7=1z7nA@mail.gmail.com> (raw)
In-Reply-To: <20190226002625.13022-6-avarab@gmail.com>

On Mon, Feb 25, 2019 at 9:26 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> From: Matheus Tavares <matheus.bernardino@usp.br>
>
> Replace usage of opendir/readdir/closedir API to traverse directories
> recursively, at copy_or_link_directory function, by the dir-iterator
> API. This simplifies the code and avoid recursive calls to
> copy_or_link_directory.
>
> [Ævar: This should be bug-compatible with the existing "clone"
> behavior. The whole bit here with "iter->relative_path[0] == '.'" is a
> dirty hack. We don't copy dot-dirs, and then later on just blindly
> ignore ENOENT errors as we descend into them. That case really wants
> to be a is_dotdir_or_file_within() test instead]
>
> Now, copy_or_link_directory will call die() in case of an error on
> openddir, readdir or lstat, inside dir_iterator_advance. That means it
> will abort in case of an error trying to fetch any iteration entry.
>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/clone.c | 55 +++++++++++++++++++++++++++++--------------------
>  1 file changed, 33 insertions(+), 22 deletions(-)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 862d2ea69c..c32e9022b3 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -23,6 +23,8 @@
>  #include "transport.h"
>  #include "strbuf.h"
>  #include "dir.h"
> +#include "dir-iterator.h"
> +#include "iterator.h"
>  #include "sigchain.h"
>  #include "branch.h"
>  #include "remote.h"
> @@ -411,42 +413,47 @@ static void mkdir_if_missing(const char *pathname, mode_t mode)
>  }
>
>  static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
> -                                  const char *src_repo, int src_baselen)
> +                                  const char *src_repo)
>  {
> -       struct dirent *de;
> -       struct stat buf;
>         int src_len, dest_len;
> -       DIR *dir;
> -
> -       dir = opendir(src->buf);
> -       if (!dir)
> -               die_errno(_("failed to open '%s'"), src->buf);
> +       struct dir_iterator *iter;
> +       int iter_status;
> +       struct stat st;
>
>         mkdir_if_missing(dest->buf, 0777);
>
> +       iter = dir_iterator_begin(src->buf, 1);
> +
>         strbuf_addch(src, '/');
>         src_len = src->len;
>         strbuf_addch(dest, '/');
>         dest_len = dest->len;
>
> -       while ((de = readdir(dir)) != NULL) {
> +       while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) {
>                 strbuf_setlen(src, src_len);
> -               strbuf_addstr(src, de->d_name);
> +               strbuf_addstr(src, iter->relative_path);
>                 strbuf_setlen(dest, dest_len);
> -               strbuf_addstr(dest, de->d_name);
> -               if (stat(src->buf, &buf)) {
> +               strbuf_addstr(dest, iter->relative_path);
> +
> +               /*
> +                * dir_iterator_advance already calls lstat to populate iter->st
> +                * but, unlike stat, lstat does not checks for permissions on
> +                * the given path.
> +                */
> +               if (stat(src->buf, &st)) {
>                         warning (_("failed to stat %s\n"), src->buf);
>                         continue;
>                 }
> -               if (S_ISDIR(buf.st_mode)) {
> -                       if (de->d_name[0] != '.')
> -                               copy_or_link_directory(src, dest,
> -                                                      src_repo, src_baselen);
> +
> +               if (S_ISDIR(iter->st.st_mode)) {
> +                       if (iter->relative_path[0] == '.')

I think it should be iter->basename[0] here, instead, right?

I also have a more conceptual question here: This additions (or the
is_dotdir_of_file_within as suggested) are just to make patch
compatible with the current behaviour, but are going to be removed
soon after. As this would be kind of a noise, wouldn't it be better to
have a patch before this, already correcting copy_or_link_directory's
behaviour on hidden dirs and them this?

> +                               continue;
> +                       mkdir_if_missing(dest->buf, 0777);
>                         continue;
>                 }
>
>                 /* Files that cannot be copied bit-for-bit... */
> -               if (!strcmp(src->buf + src_baselen, "/info/alternates")) {
> +               if (!strcmp(iter->relative_path, "info/alternates")) {
>                         copy_alternates(src, dest, src_repo);
>                         continue;
>                 }
> @@ -456,14 +463,18 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
>                 if (!option_no_hardlinks) {
>                         if (!link(src->buf, dest->buf))
>                                 continue;
> -                       if (option_local > 0)
> -                               die_errno(_("failed to create link '%s'"), dest->buf);
> +                       if (option_local > 0 && errno != ENOENT)
> +                               warning_errno(_("failed to create link '%s'"), dest->buf);
>                         option_no_hardlinks = 1;
>                 }
> -               if (copy_file_with_time(dest->buf, src->buf, 0666))
> +               if (copy_file_with_time(dest->buf, src->buf, 0666) && errno != ENOENT)
>                         die_errno(_("failed to copy file to '%s'"), dest->buf);
>         }
> -       closedir(dir);
> +
> +       if (iter_status != ITER_DONE) {
> +               strbuf_setlen(src, src_len);
> +               die(_("failed to iterate over '%s'"), src->buf);
> +       }
>  }
>
>  static void clone_local(const char *src_repo, const char *dest_repo)
> @@ -481,7 +492,7 @@ static void clone_local(const char *src_repo, const char *dest_repo)
>                 get_common_dir(&dest, dest_repo);
>                 strbuf_addstr(&src, "/objects");
>                 strbuf_addstr(&dest, "/objects");
> -               copy_or_link_directory(&src, &dest, src_repo, src.len);
> +               copy_or_link_directory(&src, &dest, src_repo);
>                 strbuf_release(&src);
>                 strbuf_release(&dest);
>         }
> --
> 2.21.0.rc2.1.g2d5e20a900.dirty
>

  reply	other threads:[~2019-02-26  3:48 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-23 19:03 [GSoC][PATCH 0/3] clone: convert explicit dir traversal to dir-iterator Matheus Tavares
2019-02-23 19:03 ` [GSoC][PATCH 1/3] dir-iterator: add pedantic option to dir_iterator_begin Matheus Tavares
2019-02-23 21:35   ` Thomas Gummerer
2019-02-24  8:35     ` Christian Couder
2019-02-24 17:43       ` Matheus Tavares Bernardino
2019-02-24 21:06         ` Thomas Gummerer
2019-02-23 19:03 ` [GSoC][PATCH 2/3] clone: extract function from copy_or_link_directory Matheus Tavares
2019-02-24  8:38   ` Christian Couder
2019-02-23 19:03 ` [GSoC][PATCH 3/3] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-02-23 21:48   ` Thomas Gummerer
2019-02-24 18:19     ` Matheus Tavares Bernardino
2019-02-23 22:40   ` Ævar Arnfjörð Bjarmason
2019-02-24  9:41     ` Christian Couder
2019-02-24 14:45       ` Ævar Arnfjörð Bjarmason
2019-02-25  9:45         ` Duy Nguyen
2019-02-26  0:26           ` [WIP RFC PATCH 0/7] clone: dir iterator refactoring with tests Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 1/7] dir-iterator: add pedantic option to dir_iterator_begin Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 2/7] dir-iterator: use stat() instead of lstat() Ævar Arnfjörð Bjarmason
2019-02-26  1:53             ` Matheus Tavares Bernardino
2019-02-26  0:26           ` [WIP RFC PATCH 3/7] clone: extract function from copy_or_link_directory Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 4/7] clone: test for our behavior on odd objects/* content Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 5/7] clone: use dir-iterator to avoid explicit dir traversal Ævar Arnfjörð Bjarmason
2019-02-26  3:48             ` Matheus Tavares Bernardino [this message]
2019-02-26 11:33               ` Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 6/7] clone: stop ignoring dotdirs in --local etc. clone Ævar Arnfjörð Bjarmason
2019-02-26  0:26           ` [WIP RFC PATCH 7/7] clone: break cloning repos that have symlinks in them Ævar Arnfjörð Bjarmason
2019-02-25  2:31       ` [GSoC][PATCH 3/3] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares Bernardino
2019-02-25 10:25         ` Ævar Arnfjörð Bjarmason
2019-02-25 20:40           ` Christian Couder
2019-02-26 10:33         ` Christian Couder
2019-02-23 19:07 ` [GSoC][PATCH 0/3] clone: convert explicit dir traversal to dir-iterator Matheus Tavares Bernardino
2019-02-23 20:10   ` Ævar Arnfjörð Bjarmason
2019-02-23 21:59 ` Thomas Gummerer
2019-02-24 16:34   ` Matheus Tavares Bernardino
2019-02-24 21:07     ` Thomas Gummerer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHd-oW7HvhkORZogjb0nPioe3KWmXDDACF6Tu-Kmayc7=1z7nA@mail.gmail.com' \
    --to=matheus.bernardino@usp.br \
    --cc=avarab@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    --cc=t.gummerer@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).