git@vger.kernel.org list mirror (unofficial, one of many)
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Shourya Shukla <shouryashukla.oo@gmail.com>
Cc: git@vger.kernel.org, christian.couder@gmail.com,
	kaartic.sivaraam@gmail.com, Johannes.Schindelin@gmx.de,
	liu.denton@gmail.com, Prathamesh Chavan <pc44800@gmail.com>,
	Christian Couder <chriscool@tuxfamily.org>,
	Stefan Beller <stefanbeller@gmail.com>
Subject: Re: [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C
Date: Wed, 07 Oct 2020 15:19:46 -0700
Message-ID: <xmqqo8ldznjx.fsf@gitster.c.googlers.com> (raw)
In-Reply-To: <20201007074538.25891-3-shouryashukla.oo@gmail.com> (Shourya Shukla's message of "Wed, 7 Oct 2020 13:15:37 +0530")

Shourya Shukla <shouryashukla.oo@gmail.com> writes:

> +static void fprintf_submodule_remote(const char *str)

The fact that the helper happens to use fprintf() to do its job is
much less important than it writes to the standard error stream.
Name it after what it does than how it does so.  Is there a word
that explains at a higher-level concept than "print to stderr" that
this function tries to achieve?  

Same question for the name of the only caller of this function,
modify_remote_v().  That name does not mean anything to readers
other than that it futz with output from "remote -v" command, which
is the least interesting piece of information.  What does it try to
achieve by using "remote -v"?  Can we name the function after that?

> +{
> +	const char *p = str;
> +	const char *start;
> +	const char *end;
> +	char *name, *url;
> +
> +	start = p;
> +	while (*p != ' ')
> +		p++;
> +	end = p;
> +	name = xstrndup(start, end - start);
> +	while(*p == ' ')
> +		p++;

Perhaps make a small helper out of these seven lines, so that the
caller can say something like

    p = str;
    name = parse_token(&p);
    url = parse_token(&p);

This one you should be able to do without any extra allocation,
though.  Just write a parse_token() that finds start and length,
prepare "char *name; int namelen" and the same pair for URL,
and then

	fprintf(stderr, "  %.*s\t%.*s\n",
		namelen, name, urllen, url);

> +	fprintf(stderr, "  %s\t%s\n", name, url);
> +	free(name);
> +	free(url);
> +}

> +static int check_sm_exists(unsigned int force, const char *path) {
> +
> +	int cache_pos, dir_in_cache = 0;

Have a blank line here to separate decl and the first statement.

> +	if (read_cache() < 0)
> +		die(_("index file corrupt"));
> +
> +	cache_pos = cache_name_pos(path, strlen(path));
> +	if(cache_pos < 0 && (directory_exists_in_index(&the_index,
> +	   path, strlen(path)) == index_directory))
> +		dir_in_cache = 1;

Funny line wrapping.  Try to cut long line at an operator as close
to the root of the parse tree (in this case, &&) as possible, i.e.

	if (cache_pos < 0 &&
	    directory_exists_in_index(&the_index, path, strlen(path)) == index_directory)

It is OK to further wrap after == if the second line bothers you.

A bigger question.  Can the path be a regular file but at a higher
stage because we are in the middle of a conflicted merge?  We'd get
cache_pos that is negative in that case, too, and we definitely would
want to say the path already exists in the index in such a case, but ...

> +
> +	if (!force) {
> +		if (cache_pos >= 0 || dir_in_cache)
> +			die(_("'%s' already exists in the index"), path);

... the current code may not trigger this die() in such a case, no?

> +	} else {
> +		struct cache_entry *ce = NULL;
> +		if (cache_pos >= 0)
> +			ce = the_index.cache[cache_pos];
> +		if (dir_in_cache || (ce && !S_ISGITLINK(ce->ce_mode)))
> +			die(_("'%s' already exists in the index and is not a "
> +			      "submodule"), path);

Likewise here.  cache_pos < 0 does not automatically mean it does
not exist.  It tells you that it does not exist as a merged entry.

> +	}
> +	return 0;
> +}
> +
> +static void modify_remote_v(struct strbuf *sb)

This roughly corresponds to this part of the original

    grep '(fetch)' | sed -e s,^,"  ", -e s,' (fetch)',,

I actualy would suggest moving the "git remote -v" invocation and
capturing of its output to this helper function and name it after
what it does, which seems to be "show fetch remotes" to me.

> +{
> +	int i;
> +	for (i = 0; i < sb->len; i++) {
> +		const char *start = sb->buf + i;
> +		const char *end = start;
> +		while (sb->buf[i++] != '\n')
> +			end++;
> +		if (!strcmp("fetch", xstrndup(end - 6, 5)))

The original makes sure the 'fetch' appears inside "()" but this
does not.  Any reason why we want to do it differently?

> +			fprintf_submodule_remote(xstrndup(start, end - start - 7));

The result of xstrndup() is leaking here.

In any case, with a helper function like parse_token() suggested
before, you can get rid of the fprintf_submodule_remote() helper and
open code it here, without any temporary allocation and freeing.
You'd have the start of each line of "git remote -v" output (so you
know where it starts and it ends), and a parser that roughly does
this:

	/*
	 * cp points at the current location, and ep points the
	 * end of the buffer.  find the tail of the current string
	 * and store its length in *len, skip over whitespaces and
	 * return the location to be used as the new cp.
	 */
	const char *parse_token(char *cp, char *ep, int *len);

and make the latter half of this function (former half would be
spawning "remote -v" and capturing its output in sb) a loop whose
body may look like

	{
		char *end, *name, *url, *tail;
		int namelen, urllen;

		end = strchrnul(start, '\n');
		name = start;
		url = parse_token(name, end, &namelen);
		tail = parse_token(url, end, &urllen);
		if (!memcmp(tail, "(fetch)", 7))
			fprintf(stderr, ...); /* see above */
		start = *end ? end + 1 : end;
	}

> +static int add_submodule(struct add_data *info)
> +{
> +	/* perhaps the path exists and is already a git repo, else clone it */
> +	if (is_directory(info->sm_path)) {
> +		char *sub_git_path = xstrfmt("%s/.git", info->sm_path);
> +		if (is_directory(sub_git_path) || file_exists(sub_git_path))
> +			printf(_("Adding existing repo at '%s' to the index\n"),
> +				 info->sm_path);
> +		else
> +			die(_("'%s' already exists and is not a valid git repo"),
> +			      info->sm_path);
> +		free(sub_git_path);
> +	} else {
> +		struct strvec clone_args = STRVEC_INIT;
> +		struct child_process cp = CHILD_PROCESS_INIT;
> +		char *submodule_git_dir = xstrfmt(".git/modules/%s", info->sm_name);
> +
> +		if (is_directory(submodule_git_dir)) {
> +			if (!info->force) {

As I said, it would be a better organization to have a helper
function that does what is done from here ...

> +				struct child_process cp_rem = CHILD_PROCESS_INIT;
> +				struct strbuf sb_rem = STRBUF_INIT;
> +				cp_rem.git_cmd = 1;
> +				fprintf(stderr, _("A git directory for '%s' is "
> +					"found locally with remote(s):\n"),
> +					info->sm_name);
> +				strvec_pushf(&cp_rem.env_array,
> +					     "GIT_DIR=%s", submodule_git_dir);
> +				strvec_push(&cp_rem.env_array, "GIT_WORK_TREE=.");
> +				strvec_pushl(&cp_rem.args, "remote", "-v", NULL);
> +				if (!capture_command(&cp_rem, &sb_rem, 0)) {
> +					modify_remote_v(&sb_rem);
> +				}

... up to here.  Can you say what the purpose of that helper
function?  I'd say it is for a given git repository (specified by
submodule_git_dir, which we will pass as the parameter to that
helper), report the names and URLs of fetch remotes defined in that
repository.  So, perhaps its signature might be:

	static void report_fetch_remotes(FILE *output, const char *git_dir);

where we would make a call to it from here like so:

				report_fetch_remotes(stderr, submodule_git_dir);

> +				error(_("If you want to reuse this local git "
> +				      "directory instead of cloning again from\n "
> +				      "  %s\n"
> +				      "use the '--force' option. If the local "
> +				      "git directory is not the correct repo\n"
> +				      "or you are unsure what this means choose "
> +				      "another name with the '--name' option."),
> +				      info->realrepo);
> +				return 1;
> +			} else {
> +				printf(_("Reactivating local git directory for "
> +					 "submodule '%s'."), info->sm_path);
> +			}
> +		}
> +		free(submodule_git_dir);

I think I've reviewed up to this point this time around.

Thanks.


  parent reply	other threads:[~2020-10-07 22:19 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-07  7:45 [PATCH v2 0/3] submodule: port subcommand add " Shourya Shukla
2020-10-07  7:45 ` [PATCH v2 1/3] dir: change the scope of function 'directory_exists_in_index()' Shourya Shukla
2020-10-07 18:05   ` Junio C Hamano
2020-10-12 10:11     ` Shourya Shukla
2020-11-18 23:25   ` Emily Shaffer
2020-10-07  7:45 ` [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C Shourya Shukla
2020-10-07 18:37   ` Junio C Hamano
2020-10-07 22:19   ` Junio C Hamano [this message]
2020-10-08 17:19   ` Junio C Hamano
2020-10-09  5:09   ` Junio C Hamano
2020-11-18 23:13     ` Jonathan Tan
2020-11-19  7:44       ` Ævar Arnfjörð Bjarmason
2020-11-19 12:38         ` Johannes Schindelin
2020-11-19 20:37           ` Junio C Hamano
2020-10-07  7:45 ` [PATCH v2 3/3] t7400: add test to check 'submodule add' for tracked paths Shourya Shukla
2020-11-19  0:03 ` [PATCH v2 0/3] submodule: port subcommand add from shell to C Josh Steadmon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqo8ldznjx.fsf@gitster.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=chriscool@tuxfamily.org \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=kaartic.sivaraam@gmail.com \
    --cc=liu.denton@gmail.com \
    --cc=pc44800@gmail.com \
    --cc=shouryashukla.oo@gmail.com \
    --cc=stefanbeller@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org list mirror (unofficial, one of many)

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V1 git git/ https://public-inbox.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors.
Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.io/gmane.comp.version-control.git
 note: .onion URLs require Tor: https://www.torproject.org/

code repositories for the project(s) associated with this inbox:

	https://80x24.org/mirrors/git.git

AGPL code for this site: git clone https://public-inbox.org/public-inbox.git