* [GSoC][RFC PATCH] clone: use dir-iterator to avoid explicit dir traversal
@ 2019-02-13 20:55 Matheus Tavares
2019-02-14 21:16 ` Christian Couder
0 siblings, 1 reply; 4+ messages in thread
From: Matheus Tavares @ 2019-02-13 20:55 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano
Replace usage of opendir/readdir/closedir API to traverse directories
recursively, at copy_or_link_directory function, by the dir-iterator
API.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
This is my microproject for GSoC 2019. It's still a RFC because I have
some questions. Any help will be much appreciated.
There're three places inside copy_or_link_directory's loop where
die_errno() is called. Should I call dir_iterator_abort, at these
places, before die_errno() is called (to free resources)?
And if so, should I check dir_iterator_abort's return code? It's said at
dir-iterator.h that dir_iterator_abort returns ITER_ERROR on error, but
by the code, we can see that it only possibly returns ITER_DONE.
Finally, if this call and check both needs to be done, there'll be code
replication in this three places. Should I add a goto and do all this
stuff at the function end? (But then, I would have to store die_errno's
error messages and errno code in temporary variables. Is this approach
good?)
builtin/clone.c | 65 +++++++++++++++++++++++++++++--------------------
1 file changed, 38 insertions(+), 27 deletions(-)
diff --git a/builtin/clone.c b/builtin/clone.c
index 50bde99618..f5208ad9ca 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -23,6 +23,8 @@
#include "transport.h"
#include "strbuf.h"
#include "dir.h"
+#include "dir-iterator.h"
+#include "iterator.h"
#include "sigchain.h"
#include "branch.h"
#include "remote.h"
@@ -392,50 +394,55 @@ static void copy_alternates(struct strbuf *src, struct strbuf *dst,
fclose(in);
}
-static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
- const char *src_repo, int src_baselen)
+static void mkdir_if_missing(const char *pathname, mode_t mode)
{
- struct dirent *de;
+ /*
+ * Tries to create a dir at pathname. If pathname already exists and
+ * is a dir, do nothing.
+ */
struct stat buf;
- int src_len, dest_len;
- DIR *dir;
- dir = opendir(src->buf);
- if (!dir)
- die_errno(_("failed to open '%s'"), src->buf);
-
- if (mkdir(dest->buf, 0777)) {
+ if (mkdir(pathname, mode)) {
if (errno != EEXIST)
- die_errno(_("failed to create directory '%s'"), dest->buf);
- else if (stat(dest->buf, &buf))
- die_errno(_("failed to stat '%s'"), dest->buf);
+ die_errno(_("failed to create directory '%s'"),
+ pathname);
+ else if (stat(pathname, &buf))
+ die_errno(_("failed to stat '%s'"), pathname);
else if (!S_ISDIR(buf.st_mode))
- die(_("%s exists and is not a directory"), dest->buf);
+ die(_("%s exists and is not a directory"), pathname);
}
+}
+
+static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
+ const char *src_repo, int src_baselen)
+{
+ int src_len, dest_len;
+ struct dir_iterator *iter;
+ int iter_status;
+
+ mkdir_if_missing(dest->buf, 0777);
+
+ iter = dir_iterator_begin(src->buf);
strbuf_addch(src, '/');
src_len = src->len;
strbuf_addch(dest, '/');
dest_len = dest->len;
- while ((de = readdir(dir)) != NULL) {
+ while ((iter_status = dir_iterator_advance(iter)) == ITER_OK) {
strbuf_setlen(src, src_len);
- strbuf_addstr(src, de->d_name);
+ strbuf_addstr(src, iter->relative_path);
strbuf_setlen(dest, dest_len);
- strbuf_addstr(dest, de->d_name);
- if (stat(src->buf, &buf)) {
- warning (_("failed to stat %s\n"), src->buf);
- continue;
- }
- if (S_ISDIR(buf.st_mode)) {
- if (de->d_name[0] != '.')
- copy_or_link_directory(src, dest,
- src_repo, src_baselen);
+ strbuf_addstr(dest, iter->relative_path);
+
+ if (S_ISDIR(iter->st.st_mode)) {
+ if (iter->basename[0] != '.')
+ mkdir_if_missing(dest->buf, 0777);
continue;
}
/* Files that cannot be copied bit-for-bit... */
- if (!strcmp(src->buf + src_baselen, "/info/alternates")) {
+ if (!strcmp(iter->relative_path, "info/alternates")) {
copy_alternates(src, dest, src_repo);
continue;
}
@@ -452,7 +459,11 @@ static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
if (copy_file_with_time(dest->buf, src->buf, 0666))
die_errno(_("failed to copy file to '%s'"), dest->buf);
}
- closedir(dir);
+
+ if (iter_status != ITER_DONE) {
+ strbuf_setlen(src, src_len);
+ die(_("failed to iterate over '%s'"), src->buf);
+ }
}
static void clone_local(const char *src_repo, const char *dest_repo)
--
2.20.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [GSoC][RFC PATCH] clone: use dir-iterator to avoid explicit dir traversal
2019-02-13 20:55 [GSoC][RFC PATCH] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
@ 2019-02-14 21:16 ` Christian Couder
2019-02-14 22:03 ` Matheus Tavares Bernardino
0 siblings, 1 reply; 4+ messages in thread
From: Christian Couder @ 2019-02-14 21:16 UTC (permalink / raw)
To: Matheus Tavares; +Cc: git, Junio C Hamano
On Thu, Feb 14, 2019 at 1:16 PM Matheus Tavares
<matheus.bernardino@usp.br> wrote:
>
> Replace usage of opendir/readdir/closedir API to traverse directories
> recursively, at copy_or_link_directory function, by the dir-iterator
> API.
>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
> This is my microproject for GSoC 2019. It's still a RFC because I have
> some questions. Any help will be much appreciated.
Thanks for working on a microproject!
> There're three places inside copy_or_link_directory's loop where
> die_errno() is called. Should I call dir_iterator_abort, at these
> places, before die_errno() is called (to free resources)?
I don't think it's necessary. We care about freeing resources when we
report errors (for example by returning an error code from inside a
function), but not when we are exiting.
> -static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
> - const char *src_repo, int src_baselen)
> +static void mkdir_if_missing(const char *pathname, mode_t mode)
It looks like your patch is both splitting copy_or_link_directory()
into 2 functions and converting it to use the dir-iterator API. Maybe
it would be better to have 2 patches instead, one for splitting it and
one for converting it.
> {
> - struct dirent *de;
> + /*
> + * Tries to create a dir at pathname. If pathname already exists and
> + * is a dir, do nothing.
> + */
I think we usually put such comments just before the function. And
maybe it could be shortened to "Create a dir at pathname unless
there's already one"
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [GSoC][RFC PATCH] clone: use dir-iterator to avoid explicit dir traversal
2019-02-14 21:16 ` Christian Couder
@ 2019-02-14 22:03 ` Matheus Tavares Bernardino
2019-02-15 8:59 ` Christian Couder
0 siblings, 1 reply; 4+ messages in thread
From: Matheus Tavares Bernardino @ 2019-02-14 22:03 UTC (permalink / raw)
To: Christian Couder; +Cc: git, Junio C Hamano
On Thu, Feb 14, 2019 at 7:16 PM Christian Couder
<christian.couder@gmail.com> wrote:
>
> On Thu, Feb 14, 2019 at 1:16 PM Matheus Tavares
> <matheus.bernardino@usp.br> wrote:
> >
> > Replace usage of opendir/readdir/closedir API to traverse directories
> > recursively, at copy_or_link_directory function, by the dir-iterator
> > API.
> >
> > Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> > ---
> > This is my microproject for GSoC 2019. It's still a RFC because I have
> > some questions. Any help will be much appreciated.
>
> Thanks for working on a microproject!
>
Hi, Christian. Thank you for the review and comments.
> > There're three places inside copy_or_link_directory's loop where
> > die_errno() is called. Should I call dir_iterator_abort, at these
> > places, before die_errno() is called (to free resources)?
>
> I don't think it's necessary. We care about freeing resources when we
> report errors (for example by returning an error code from inside a
> function), but not when we are exiting.
>
Ok, thanks!
> > -static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
> > - const char *src_repo, int src_baselen)
> > +static void mkdir_if_missing(const char *pathname, mode_t mode)
>
> It looks like your patch is both splitting copy_or_link_directory()
> into 2 functions and converting it to use the dir-iterator API. Maybe
> it would be better to have 2 patches instead, one for splitting it and
> one for converting it.
>
Got it. As the justification for splitting the function was to use the
extracted part in the section that was previously recursive, I thought
both changes should be in the same patch. But I really was in doubt
about that. Should I split it into two patches and mention that
justification at the first one? Or just split?
> > {
> > - struct dirent *de;
> > + /*
> > + * Tries to create a dir at pathname. If pathname already exists and
> > + * is a dir, do nothing.
> > + */
>
> I think we usually put such comments just before the function. And
> maybe it could be shortened to "Create a dir at pathname unless
> there's already one"
Right, the shortened version does sounds much better, thanks! About
the comment placement, I followed what I saw in other functions from
the same file ("copy_alternates", for example). But also, I couldn't
find any instruction about that at Documentation/CodingGuidelines. So
should I move it as you suggested?
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [GSoC][RFC PATCH] clone: use dir-iterator to avoid explicit dir traversal
2019-02-14 22:03 ` Matheus Tavares Bernardino
@ 2019-02-15 8:59 ` Christian Couder
0 siblings, 0 replies; 4+ messages in thread
From: Christian Couder @ 2019-02-15 8:59 UTC (permalink / raw)
To: Matheus Tavares Bernardino; +Cc: git, Junio C Hamano
On Thu, Feb 14, 2019 at 11:04 PM Matheus Tavares Bernardino
<matheus.bernardino@usp.br> wrote:
>
> On Thu, Feb 14, 2019 at 7:16 PM Christian Couder
> <christian.couder@gmail.com> wrote:
> > > -static void copy_or_link_directory(struct strbuf *src, struct strbuf *dest,
> > > - const char *src_repo, int src_baselen)
> > > +static void mkdir_if_missing(const char *pathname, mode_t mode)
> >
> > It looks like your patch is both splitting copy_or_link_directory()
> > into 2 functions and converting it to use the dir-iterator API. Maybe
> > it would be better to have 2 patches instead, one for splitting it and
> > one for converting it.
> >
>
> Got it. As the justification for splitting the function was to use the
> extracted part in the section that was previously recursive, I thought
> both changes should be in the same patch. But I really was in doubt
> about that. Should I split it into two patches and mention that
> justification at the first one? Or just split?
If 2 patches instead of 1 makes it easier to review and understand
what's going on, then I think you should indeed send 2 patches and
mention the justification for splitting the function in the commit
message of the first patch.
> > I think we usually put such comments just before the function. And
> > maybe it could be shortened to "Create a dir at pathname unless
> > there's already one"
>
> Right, the shortened version does sounds much better, thanks! About
> the comment placement, I followed what I saw in other functions from
> the same file ("copy_alternates", for example).
Then it's ok to place it like you did. Sorry about the noise.
> But also, I couldn't
> find any instruction about that at Documentation/CodingGuidelines. So
> should I move it as you suggested?
I think we have not been very consistent over the years. Recently I
think we have tried to add or move API documentation inside header
files, and in general before functions in the code, but yeah it might
not have been documented.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2019-02-15 9:00 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-13 20:55 [GSoC][RFC PATCH] clone: use dir-iterator to avoid explicit dir traversal Matheus Tavares
2019-02-14 21:16 ` Christian Couder
2019-02-14 22:03 ` Matheus Tavares Bernardino
2019-02-15 8:59 ` Christian Couder
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).