git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Brandon Williams <bmwill@google.com>
Cc: git@vger.kernel.org
Subject: Re: [RFC/PATCH] ls-files: adding support for submodules
Date: Fri, 9 Sep 2016 18:35:14 -0400	[thread overview]
Message-ID: <20160909223513.3rirneqxmrcyi4k4@sigill.intra.peff.net> (raw)
In-Reply-To: <1473458004-41460-1-git-send-email-bmwill@google.com>

On Fri, Sep 09, 2016 at 02:53:24PM -0700, Brandon Williams wrote:

> Allow ls-files to recognize submodules in order to retrieve a list of
> files from a repository's submodules.  This is done by forking off a
> process to recursively call ls-files on all submodules.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
> Hey git developers!
> 
> I'm new to the community and this is the first patch for an open source project
> that I have worked on.
> 
> I'm looking forward to working on the project!

Welcome. :)

Submodules are not really my area of expertise, so I don't have any
commentary on the goal of the patch, except that it sounds reasonable to
my layman's ears.

The implementation looks fairly clean. A few comments:

> +static void show_gitlink(const struct cache_entry *ce)
> +{
> +	struct child_process cp = CHILD_PROCESS_INIT;
> +	struct strbuf buf = STRBUF_INIT;
> +	struct strbuf name = STRBUF_INIT;
> +	int submodule_name_len;
> +	FILE *fp;
> +
> +	argv_array_push(&cp.args, "ls-files");
> +	argv_array_push(&cp.args, "--recurse-submodules");
> +	cp.git_cmd = 1;
> +	cp.dir = ce->name;
> +	cp.out = -1;
> +	start_command(&cp);
> +	fp = fdopen(cp.out, "r");

You should error-check the result of start_command(). I guess the
reasonable outcome would be to die(), as it is a sign that we could not
fork, find git, etc.

Ditto for fdopen (you can use xfdopen for that).

> +	/*
> +	 * The ls-files child process produces filenames relative to
> +	 * the submodule. Prefix each line with the submodule path
> +	 * to make it relative to the current repository.
> +	 */
> +	strbuf_addstr(&name, ce->name);
> +	strbuf_addch(&name, '/');
> +	submodule_name_len = name.len;
> +	while (strbuf_getline(&buf, fp) != EOF) {
> +		strbuf_addbuf(&name, &buf);
> +		write_name(name.buf);
> +		strbuf_setlen(&name, submodule_name_len);
> +	}

What happens if the filename in the submodule needs quoting? You'll get
the quoted value in your buffer, and then re-quote it again in
write_name().

The simplest thing would probably be to use "ls-files -z" for the
recursive invocation, and then split on NUL bytes (we have
strbuf_getline_nul for that).

> +	finish_command(&cp);

What should happen if finish_command() tells us that the ls-files
sub-process reported an error? It may not be worth aborting the rest of
the listing, but we might want to propagate that in our own return code.

> +	strbuf_release(&buf);
> +	strbuf_release(&name);
> +	fclose(fp);
> +}

A minor style nit, but I would generally fclose(fp) before running
finish_command() (i.e., resource clean up in the reverse order of
allocation). It doesn't matter in this case because "fp" is output from
the process, and we know we've already read to EOF. For other cases, it
could cause a deadlock (e.g., we end up in wait() for the child process
to finish, but it is blocked in write() waiting for us to read). So I
think it's a good habit to get into.

> @@ -519,6 +566,17 @@ int cmd_ls_files(int argc, const char **argv, const char *cmd_prefix)
>  	if (require_work_tree && !is_inside_work_tree())
>  		setup_work_tree();
>  
> +	if (recurse_submodules &&
> +	    (show_stage || show_deleted || show_others || show_unmerged ||
> +	     show_killed || show_modified || show_resolve_undo ||
> +	     show_valid_bit || show_tag || show_eol))
> +		die("ls-files --recurse-submodules can only be used in "
> +		    "--cached mode");

Woah, that list of variables is getting rather long. This is not a
problem introduced by your patch, so it's not a blocker. But I wonder if
some of them are mutually exclusive and could be collapsed to a single
variable.

I guess the reason for this "only with --cached" is that you do not
propagate the options down to the recursive process. If we were to do
that, then this big list of restrictions would go away. I'd be OK with
starting with more limited functionality like your patch, though. I
think doing the recursive thing correctly would also involve parsing the
output of each to append the filename prefix.

So I suppose another option would be to teach ls-files a "prefix" option
to add to each filename, and just pass in the submodule path. Then you
can let the sub-processes write directly to the common stdout, and I
think it would be safe to blindly pass the parent argv into the child
processes.

-Peff

  reply	other threads:[~2016-09-09 22:35 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-09-09 21:53 [RFC/PATCH] ls-files: adding support for submodules Brandon Williams
2016-09-09 22:35 ` Jeff King [this message]
2016-09-09 22:40   ` Junio C Hamano
2016-09-09 23:01 ` Junio C Hamano
2016-09-09 23:47   ` Stefan Beller
2016-09-11 22:10     ` Junio C Hamano
2016-09-12  0:51       ` Jeff King
2016-09-12  0:52         ` Jeff King
2016-09-12 16:01           ` Junio C Hamano
     [not found]             ` <CAKoko1oSEac_Nr1SkRB=dM_r3Jnew1Et2ZKj716iU3JLyHe2GQ@mail.gmail.com>
2016-09-12 17:39               ` Brandon Williams
2016-09-12  1:47       ` Junio C Hamano
2016-09-13  0:33 ` [PATCH v2] " Brandon Williams
2016-09-13  3:35   ` Brandon Williams
2016-09-13 16:31   ` Junio C Hamano
2016-09-15 20:51     ` Junio C Hamano
2016-09-15 20:51       ` [PATCH 1/2] SQUASH??? Junio C Hamano
2016-09-15 20:51       ` [PATCH 2/2] SQUASH??? Undecided Junio C Hamano
2016-09-15 21:12         ` Stefan Beller
2016-09-15 21:37           ` Brandon Williams
2016-09-15 20:57     ` [PATCH v2] ls-files: adding support for submodules Junio C Hamano
2016-09-15 20:58     ` Junio C Hamano
2016-09-15 20:58     ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160909223513.3rirneqxmrcyi4k4@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=bmwill@google.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).