git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v3 0/3] submodule: port subcommand add from shell to C
@ 2020-12-14 23:19 Shourya Shukla
  2020-12-14 23:19 ` [PATCH v3 1/3] dir: change the scope of function 'directory_exists_in_index()' Shourya Shukla
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Shourya Shukla @ 2020-12-14 23:19 UTC (permalink / raw)
  To: git
  Cc: gitster, Johannes.Schindelin, liu.denton, christian.couder,
	kaartic.sivaraam, Shourya Shukla

Greetings,

This is the v3 of the patch series with the same title. You may view
the v2 here:
https://lore.kernel.org/git/20201007074538.25891-1-shouryashukla.oo@gmail.com/

I have applied almost all of the changes asked before, except a few
which confused me a little. It would be great if I could get some help
about them:

    1. In this mail: https://lore.kernel.org/git/xmqqo8ldznjx.fsf@gitster.c.googlers.com/
       Junio asked me to accomodate for a merge in progress since
       'cache_pos < 0' does not necessarily mean that the path exists.
       I was wondering which function would be the most appropriate for
       the if-statements:
            if (!force) {
		        if (cache_pos >= 0 || dir_in_cache)
            }
       I was thinking of going with 'read_cache_unmerged()'. I thought
       of this by seeing what is triggered in case of a merge conflict
       and cam across this. What is your opinion on this?

    2. In this mail: https://lore.kernel.org/git/xmqqimbky6st.fsf@gitster.c.googlers.com/
       In this section:
            /* strip trailing '/' */
	        if (is_dir_sep(sm_path[strlen(sm_path) -1]))
		        sm_path[strlen(sm_path) - 1] = '\0';

       Junio makes a reasonable argument that we need to make sure that
       multiple trailing slashes are eliminated but my code only takes
       care of a single trailing slash. I was looking into the code of
       'normalize_path_copy()' and saw that the function it essentially
       calls: 'normalize_path_copy_len()' does not perform anything on
       the trailing slashes and this behaviour is mentioned as a
       NEEDSWORK.

       I was thinking of correcting the above function instead of
       putting in an extra loop. Is this feasible?

    3. In the following segment:
        /*
         * NEEDSWORK: In a multi-working-tree world, this needs to be
         * set in the per-worktree config.
         */
        if (!git_config_get_string("submodule.active", &var) && var) {

        There was a comment: "What if this were a valueless true
        ("[submodule] active\n" without "= true")?  Wouldn't get_string()
        fail?"

        I was under the impression that even if the above failed, it
        will not really affect the big picture since at the we will set
        'submodule.name.active" as true irrespective of the above value.
        Is this correct?

Feedback and reviews are appreciated.

Regards,
Shourya Shukla

Shourya Shukla (3):
  dir: change the scope of function 'directory_exists_in_index()'
  submodule: port submodule subcommand 'add' from shell to C
  t7400: add test to check 'submodule add' for tracked paths

 builtin/submodule--helper.c | 410 +++++++++++++++++++++++++++++++++++-
 dir.c                       |  30 ++-
 dir.h                       |   9 +
 git-submodule.sh            | 161 +-------------
 t/t7400-submodule-basic.sh  |  13 +-
 5 files changed, 443 insertions(+), 180 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v3 1/3] dir: change the scope of function 'directory_exists_in_index()'
  2020-12-14 23:19 [PATCH v3 0/3] submodule: port subcommand add from shell to C Shourya Shukla
@ 2020-12-14 23:19 ` Shourya Shukla
  2020-12-19  0:08   ` Johannes Schindelin
  2020-12-14 23:19 ` [PATCH v3 2/3] submodule: port submodule subcommand 'add' from shell to C Shourya Shukla
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Shourya Shukla @ 2020-12-14 23:19 UTC (permalink / raw)
  To: git
  Cc: gitster, Johannes.Schindelin, liu.denton, christian.couder,
	kaartic.sivaraam, Shourya Shukla

Change the scope of the function 'directory_exists_in_index()' as well
as declare it in 'dir.h'.

Since the return type of the function is the enumerator 'exist_status',
change its scope as well and declare it in 'dir.h'. While at it, rename
the members of the aforementioned enum so as to avoid any naming clashes
or confusions later on.

Helped-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Shourya Shukla <periperidip@gmail.com>
---
 builtin/submodule--helper.c | 408 ++++++++++++++++++++++++++++++++++++
 dir.c                       |  30 ++-
 dir.h                       |   9 +
 3 files changed, 429 insertions(+), 18 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index c30896c897..4dfad35d77 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2744,6 +2744,414 @@ static int module_set_branch(int argc, const char **argv, const char *prefix)
 	return !!ret;
 }
 
+struct add_data {
+	const char *prefix;
+	const char *branch;
+	const char *reference_path;
+	const char *sm_path;
+	const char *sm_name;
+	const char *repo;
+	const char *realrepo;
+	int depth;
+	unsigned int force: 1;
+	unsigned int quiet: 1;
+	unsigned int progress: 1;
+	unsigned int dissociate: 1;
+};
+#define ADD_DATA_INIT { 0 }
+
+/*
+ * Guess the directory name from the repository URL by performing the
+ * operations below in the following order:
+ *
+ * - If the URL ends with '/', remove that.
+ *
+ * - If the result of the above ends with zero or more ':', followed
+ *  by zero or more '/', followed by ".git", drop the matching part.
+ *
+ * - If the result of the above has '/' or ':' in it, remove everything
+ *  before it and '/' or ':' itself.
+ */
+static char *guess_dir_name(const char *repo)
+{
+	const char *start, *end;
+
+	start = repo;
+	end = repo + strlen(repo);
+
+	/* remove the trailing '/' */
+	if (repo < end - 1 && end[-1] == '/')
+		end--;
+
+	/* remove the trailing ':', '/' and '.git' */
+	if (repo < end - 4 && !memcmp(".git", end - 4, 4)) {
+		end -= 4;
+		while (repo < end - 1 && end[-1] == '/')
+			end--;
+		while (repo < end - 1 && end[-1] == ':')
+			end--;
+	}
+
+	/* find the last ':' or '/' */
+	for (start = end - 1; repo <= start; start--) {
+		if (*start == '/' || *start == ':')
+			break;
+	}
+	/* exclude '/' or ':' itself */
+	start++;
+
+	return xmemdupz(start, end - start);
+}
+
+static int can_create_submodule(unsigned int force, const char *path)
+{
+	int cache_pos, dir_in_cache = 0;
+
+	if (read_cache() < 0)
+		die(_("index file corrupt"));
+
+	cache_pos = cache_name_pos(path, strlen(path));
+	if(cache_pos < 0 &&
+	   directory_exists_in_index(&the_index, path, strlen(path)) == is_cache_directory)
+		dir_in_cache = 1;
+
+	if (!force) {
+		if (cache_pos >= 0 || dir_in_cache)
+			die(_("'%s' already exists in the index"), path);
+	} else {
+		struct cache_entry *ce = NULL;
+		if (cache_pos >= 0)
+			ce = the_index.cache[cache_pos];
+		if (dir_in_cache || (ce && !S_ISGITLINK(ce->ce_mode)))
+			die(_("'%s' already exists in the index and is not a "
+			      "submodule"), path);
+	}
+	return 0;
+}
+
+static const char *parse_token(const char *cp, int *len)
+{
+	const char *p = cp, *start, *end;
+	char *str;
+
+	start = p;
+	while (*p != ' ')
+		p++;
+	end = p;
+	str = xstrndup(start, end - start);
+
+	while(*p == ' ')
+		p++;
+
+	return str;
+}
+
+static void report_fetch_remotes(FILE *output, const char *sm_name, const char *git_dir)
+{
+	struct child_process cp_rem = CHILD_PROCESS_INIT;
+	struct strbuf sb_rem = STRBUF_INIT;
+
+	cp_rem.git_cmd = 1;
+	fprintf(stderr, _("A git directory for '%s' is "
+		"found locally with remote(s):\n"), sm_name);
+	strvec_pushf(&cp_rem.env_array,
+		     "GIT_DIR=%s", git_dir);
+	strvec_push(&cp_rem.env_array, "GIT_WORK_TREE=.");
+	strvec_pushl(&cp_rem.args, "remote", "-v", NULL);
+	if (!capture_command(&cp_rem, &sb_rem, 0)) {
+		int i;
+
+		for (i = 0; i < sb_rem.len; i++) {
+			char *start = sb_rem.buf + i, *end = start;
+			const char *name = start, *url, *tail;
+			int namelen, urllen;
+
+			while (sb_rem.buf[i++] != '\n')
+				end++;
+			url = parse_token(name, &namelen);
+			tail = parse_token(url, &urllen);
+			if (!memcmp(tail, "(fetch)", 7))
+				fprintf(stderr, "  %s\t%s\n", name, url);
+			start = *end ? end + 1 : end;
+		}
+	}
+}
+
+static int add_submodule(struct add_data *info)
+{
+	/* perhaps the path exists and is already a git repo, else clone it */
+	if (is_directory(info->sm_path)) {
+		char *sub_git_path = xstrfmt("%s/.git", info->sm_path);
+		if (is_directory(sub_git_path) || file_exists(sub_git_path))
+			printf(_("Adding existing repo at '%s' to the index\n"),
+				 info->sm_path);
+		else
+			die(_("'%s' already exists and is not a valid git repo"),
+			      info->sm_path);
+		free(sub_git_path);
+	} else {
+		struct strvec clone_args = STRVEC_INIT;
+		struct child_process cp = CHILD_PROCESS_INIT;
+		char *submodule_git_dir = xstrfmt(".git/modules/%s", info->sm_name);
+
+		if (is_directory(submodule_git_dir)) {
+			if (!info->force) {
+				report_fetch_remotes(stderr, info->sm_name,
+						     submodule_git_dir);
+				error(_("If you want to reuse this local git "
+				      "directory instead of cloning again from\n "
+				      "  %s\n"
+				      "use the '--force' option. If the local "
+				      "git directory is not the correct repo\n"
+				      "or you are unsure what this means choose "
+				      "another name with the '--name' option."),
+				      info->realrepo);
+				return 1;
+			} else {
+				printf(_("Reactivating local git directory for "
+					 "submodule '%s'."), info->sm_path);
+			}
+		}
+		free(submodule_git_dir);
+
+		strvec_push(&clone_args, "clone");
+
+		if (info->quiet)
+			strvec_push(&clone_args, "--quiet");
+
+		if (info->progress)
+			strvec_push(&clone_args, "--progress");
+
+		if (info->prefix)
+			strvec_pushl(&clone_args, "--prefix", info->prefix, NULL);
+		strvec_pushl(&clone_args, "--path", info->sm_path, "--name",
+			     info->sm_name, "--url", info->realrepo, NULL);
+		if (info->reference_path)
+			strvec_pushl(&clone_args, "--reference",
+				     info->reference_path, NULL);
+		if (info->dissociate)
+			strvec_push(&clone_args, "--dissociate");
+
+		if (info->depth >= 0)
+			strvec_pushf(&clone_args, "--depth=%d", info->depth);
+
+		if (module_clone(clone_args.nr, clone_args.v, info->prefix)) {
+			strvec_clear(&clone_args);
+			return -1;
+		}
+
+		prepare_submodule_repo_env(&cp.env_array);
+		cp.git_cmd = 1;
+		cp.dir = info->sm_path;
+		strvec_pushl(&cp.args, "checkout", "-f", "-q", NULL);
+
+		if (info->branch) {
+			strvec_pushl(&cp.args, "-B", info->branch, NULL);
+			strvec_pushf(&cp.args, "origin/%s", info->branch);
+		}
+
+		if (run_command(&cp))
+			die(_("unable to checkout submodule '%s'"), info->sm_path);
+	}
+	return 0;
+}
+
+static void config_added_submodule(struct add_data *info)
+{
+	char *key, *var = NULL;
+	struct child_process cp = CHILD_PROCESS_INIT;
+	struct child_process cp2 = CHILD_PROCESS_INIT;
+
+	key = xstrfmt("submodule.%s.url", info->sm_name);
+	git_config_set_gently(key, info->realrepo);
+	free(key);
+
+	cp.git_cmd = 1;
+	strvec_pushl(&cp.args, "add", "--no-warn-embedded-repo", NULL);
+	if (info->force)
+		strvec_push(&cp.args, "--force");
+	strvec_pushl(&cp.args, "--", info->sm_path, NULL);
+
+	if (run_command(&cp))
+		die(_("failed to add submodule '%s'"), info->sm_path);
+
+	key = xstrfmt("submodule.%s.path", info->sm_name);
+	config_set_in_gitmodules_file_gently(key, info->sm_path);
+	free(key);
+	key = xstrfmt("submodule.%s.url", info->sm_name);
+	config_set_in_gitmodules_file_gently(key, info->repo);
+	free(key);
+	key = xstrfmt("submodule.%s.branch", info->sm_name);
+	if (info->branch)
+		config_set_in_gitmodules_file_gently(key, info->branch);
+	free(key);
+
+	cp2.git_cmd = 1;
+	strvec_pushl(&cp2.args, "add", "--force", NULL);
+	strvec_pushl(&cp2.args, "--", ".gitmodules", NULL);
+
+	if (run_command(&cp2))
+		die(_("Failed to register submodule '%s'"), info->sm_path);
+
+	/*
+	 * NEEDSWORK: In a multi-working-tree world, this needs to be
+	 * set in the per-worktree config.
+	 */
+	if (!git_config_get_string("submodule.active", &var) && var) {
+
+		/*
+		 * If the submodule being adding isn't already covered by the
+		 * current configured pathspec, set the submodule's active flag
+		 */
+		if (!is_submodule_active(the_repository, info->sm_path)) {
+			key = xstrfmt("submodule.%s.active", info->sm_name);
+			git_config_set_gently(key, "true");
+			free(key);
+		}
+	} else {
+		key = xstrfmt("submodule.%s.active", info->sm_name);
+		git_config_set_gently(key, "true");
+		free(key);
+	}
+}
+
+static int module_add(int argc, const char **argv, const char *prefix)
+{
+	const char *branch = NULL, *custom_name = NULL, *realrepo = NULL;
+	const char *reference_path = NULL, *repo = NULL, *name = NULL;
+	char *sm_path;
+	int force = 0, quiet = 0, depth = -1, progress = 0, dissociate = 0;
+	struct add_data info = ADD_DATA_INIT;
+	struct strbuf sb = STRBUF_INIT;
+
+	struct option options[] = {
+		OPT_STRING('b', "branch", &branch, N_("branch"),
+			   N_("branch of repository to add as submodule")),
+		OPT_BOOL('f', "force", &force, N_("allow adding an otherwise "
+						  "ignored submodule path")),
+		OPT__QUIET(&quiet, N_("print only error messages")),
+		OPT_BOOL(0, "progress", &progress, N_("force cloning progress")),
+		OPT_STRING(0, "reference", &reference_path, N_("repository"),
+			   N_("reference repository")),
+		OPT_BOOL(0, "dissociate", &dissociate, N_("borrow the objects from reference repositories")),
+		OPT_STRING(0, "name", &custom_name, N_("name"),
+			   N_("sets the submodule’s name to the given string "
+			      "instead of defaulting to its path")),
+		OPT_INTEGER(0, "depth", &depth, N_("depth for shallow clones")),
+		OPT_END()
+	};
+
+	const char *const usage[] = {
+		N_("git submodule--helper add [<options>] [--] [<path>]"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, options, usage, 0);
+
+	if (!is_writing_gitmodules_ok())
+		die(_("please make sure that the .gitmodules file is in the working tree"));
+
+	if (prefix && *prefix && reference_path &&
+	    !is_absolute_path(reference_path))
+		reference_path = xstrfmt("%s%s", prefix, reference_path);
+
+	if (argc == 0 || argc > 2) {
+		usage_with_options(usage, options);
+	} else if (argc == 1) {
+		repo = argv[0];
+		sm_path = guess_dir_name(repo);
+	} else {
+		repo = argv[0];
+		sm_path = xstrdup(argv[1]);
+	}
+
+	if (prefix && *prefix && !is_absolute_path(sm_path))
+		sm_path = xstrfmt("%s%s", prefix, sm_path);
+
+	/* assure repo is absolute or relative to parent */
+	if (starts_with_dot_dot_slash(repo) || starts_with_dot_slash(repo)) {
+		char *remote = get_default_remote();
+		char *remoteurl;
+		struct strbuf sb = STRBUF_INIT;
+
+		if (prefix)
+			die(_("relative path can only be used from the toplevel "
+			      "of the working tree"));
+		/* dereference source url relative to parent's url */
+		strbuf_addf(&sb, "remote.%s.url", remote);
+		if (git_config_get_string(sb.buf, &remoteurl))
+			remoteurl = xgetcwd();
+		realrepo = relative_url(remoteurl, repo, NULL);
+
+		free(remoteurl);
+		free(remote);
+	} else if (is_dir_sep(repo[0]) || strchr(repo, ':')) {
+		realrepo = repo;
+	} else {
+		die(_("repo URL: '%s' must be absolute or begin with ./|../"),
+		      repo);
+	}
+
+	/*
+	 * normalize path:
+	 * multiple //; leading ./; /./; /../;
+	 */
+	normalize_path_copy(sm_path, sm_path);
+	/* strip trailing '/' */
+	if (is_dir_sep(sm_path[strlen(sm_path) -1]))
+		sm_path[strlen(sm_path) - 1] = '\0';
+
+	if (can_create_submodule(force, sm_path))
+		return 1;
+
+	strbuf_addstr(&sb, sm_path);
+	if (is_nonbare_repository_dir(&sb)) {
+		struct object_id oid;
+		if (resolve_gitlink_ref(sm_path, "HEAD", &oid) < 0)
+			die(_("'%s' does not have a commit checked out"), sm_path);
+	}
+
+	if (!force) {
+		int exit_code = -1;
+		struct strbuf sb = STRBUF_INIT;
+		struct child_process cp = CHILD_PROCESS_INIT;
+		cp.git_cmd = 1;
+		cp.no_stdout = 1;
+		strvec_pushl(&cp.args, "add", "--dry-run", "--ignore-missing",
+			     "--no-warn-embedded-repo", sm_path, NULL);
+		if ((exit_code = pipe_command(&cp, NULL, 0, NULL, 0, &sb, 0))) {
+			strbuf_complete_line(&sb);
+			fputs(sb.buf, stderr);
+			return exit_code;
+		}
+		strbuf_release(&sb);
+	}
+
+	name = custom_name ? custom_name : sm_path;
+	if (check_submodule_name(name))
+		die(_("'%s' is not a valid submodule name"), name);
+
+	info.prefix = prefix;
+	info.sm_name = name;
+	info.sm_path = sm_path;
+	info.repo = repo;
+	info.realrepo = realrepo;
+	info.reference_path = reference_path;
+	info.branch = branch;
+	info.depth = depth;
+	info.progress = !!progress;
+	info.dissociate = !!dissociate;
+	info.force = !!force;
+	info.quiet = !!quiet;
+
+	if (add_submodule(&info))
+		return 1;
+	config_added_submodule(&info);
+	free(sm_path);
+
+	return 0;
+}
+
 #define SUPPORT_SUPER_PREFIX (1<<0)
 
 struct cmd_struct {
diff --git a/dir.c b/dir.c
index d637461da5..f37de276f2 100644
--- a/dir.c
+++ b/dir.c
@@ -1655,12 +1655,6 @@ struct dir_entry *dir_add_ignored(struct dir_struct *dir,
 	return dir->ignored[dir->ignored_nr++] = dir_entry_new(pathname, len);
 }
 
-enum exist_status {
-	index_nonexistent = 0,
-	index_directory,
-	index_gitdir
-};
-
 /*
  * Do not use the alphabetically sorted index to look up
  * the directory name; instead, use the case insensitive
@@ -1672,13 +1666,13 @@ static enum exist_status directory_exists_in_index_icase(struct index_state *ist
 	struct cache_entry *ce;
 
 	if (index_dir_exists(istate, dirname, len))
-		return index_directory;
+		return is_cache_directory;
 
 	ce = index_file_exists(istate, dirname, len, ignore_case);
 	if (ce && S_ISGITLINK(ce->ce_mode))
-		return index_gitdir;
+		return is_cache_gitdir;
 
-	return index_nonexistent;
+	return is_cache_absent;
 }
 
 /*
@@ -1688,8 +1682,8 @@ static enum exist_status directory_exists_in_index_icase(struct index_state *ist
  * the files it contains) will sort with the '/' at the
  * end.
  */
-static enum exist_status directory_exists_in_index(struct index_state *istate,
-						   const char *dirname, int len)
+enum exist_status directory_exists_in_index(struct index_state *istate,
+					    const char *dirname, int len)
 {
 	int pos;
 
@@ -1709,11 +1703,11 @@ static enum exist_status directory_exists_in_index(struct index_state *istate,
 		if (endchar > '/')
 			break;
 		if (endchar == '/')
-			return index_directory;
+			return is_cache_directory;
 		if (!endchar && S_ISGITLINK(ce->ce_mode))
-			return index_gitdir;
+			return is_cache_gitdir;
 	}
-	return index_nonexistent;
+	return is_cache_absent;
 }
 
 /*
@@ -1767,11 +1761,11 @@ static enum path_treatment treat_directory(struct dir_struct *dir,
 	/* The "len-1" is to strip the final '/' */
 	enum exist_status status = directory_exists_in_index(istate, dirname, len-1);
 
-	if (status == index_directory)
+	if (status == is_cache_directory)
 		return path_recurse;
-	if (status == index_gitdir)
+	if (status == is_cache_gitdir)
 		return path_none;
-	if (status != index_nonexistent)
+	if (status != is_cache_absent)
 		BUG("Unhandled value for directory_exists_in_index: %d\n", status);
 
 	/*
@@ -2190,7 +2184,7 @@ static enum path_treatment treat_path(struct dir_struct *dir,
 	if ((dir->flags & DIR_COLLECT_KILLED_ONLY) &&
 	    (dtype == DT_DIR) &&
 	    !has_path_in_index &&
-	    (directory_exists_in_index(istate, path->buf, path->len) == index_nonexistent))
+	    (directory_exists_in_index(istate, path->buf, path->len) == is_cache_absent))
 		return path_none;
 
 	excluded = is_excluded(dir, istate, path->buf, &dtype);
diff --git a/dir.h b/dir.h
index a3c40dec51..af817a21b2 100644
--- a/dir.h
+++ b/dir.h
@@ -370,6 +370,15 @@ int read_directory(struct dir_struct *, struct index_state *istate,
 		   const char *path, int len,
 		   const struct pathspec *pathspec);
 
+enum exist_status {
+	is_cache_absent = 0,
+	is_cache_directory,
+	is_cache_gitdir
+};
+
+enum exist_status directory_exists_in_index(struct index_state *istate,
+					    const char *dirname, int len);
+
 enum pattern_match_result {
 	UNDECIDED = -1,
 	NOT_MATCHED = 0,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 2/3] submodule: port submodule subcommand 'add' from shell to C
  2020-12-14 23:19 [PATCH v3 0/3] submodule: port subcommand add from shell to C Shourya Shukla
  2020-12-14 23:19 ` [PATCH v3 1/3] dir: change the scope of function 'directory_exists_in_index()' Shourya Shukla
@ 2020-12-14 23:19 ` Shourya Shukla
  2020-12-14 23:19 ` [PATCH v3 3/3] t7400: add test to check 'submodule add' for tracked paths Shourya Shukla
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Shourya Shukla @ 2020-12-14 23:19 UTC (permalink / raw)
  To: git
  Cc: gitster, Johannes.Schindelin, liu.denton, christian.couder,
	kaartic.sivaraam, Shourya Shukla, Christian Couder, Stefan Beller,
	Prathamesh Chavan

Convert submodule subcommand 'add' to a builtin and call it via
'git-submodule.sh'.

Also, since the command die()s out in case of absence of commits in the
submodule, the keyword 'fatal' is prefixed in the error messages.
Therefore, prepend the keyword in the expected output of test t7400.6.

While at it, eliminate the extra preprocessor directive
`#include "dir.h"` at the start of 'submodule--helper.c'.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Stefan Beller <stefanbeller@gmail.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Shourya Shukla <periperidip@gmail.com>
---
 builtin/submodule--helper.c |   2 +-
 git-submodule.sh            | 161 +-----------------------------------
 t/t7400-submodule-basic.sh  |   2 +-
 3 files changed, 3 insertions(+), 162 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 4dfad35d77..4f1d892b9a 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -19,7 +19,6 @@
 #include "diffcore.h"
 #include "diff.h"
 #include "object-store.h"
-#include "dir.h"
 #include "advice.h"
 
 #define OPT_QUIET (1 << 0)
@@ -3185,6 +3184,7 @@ static struct cmd_struct commands[] = {
 	{"config", module_config, 0},
 	{"set-url", module_set_url, 0},
 	{"set-branch", module_set_branch, 0},
+	{"add", module_add, SUPPORT_SUPER_PREFIX},
 };
 
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
diff --git a/git-submodule.sh b/git-submodule.sh
index eb90f18229..b586f9532d 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -145,166 +145,7 @@ cmd_add()
 		shift
 	done
 
-	if ! git submodule--helper config --check-writeable >/dev/null 2>&1
-	then
-		 die "$(eval_gettext "please make sure that the .gitmodules file is in the working tree")"
-	fi
-
-	if test -n "$reference_path"
-	then
-		is_absolute_path "$reference_path" ||
-		reference_path="$wt_prefix$reference_path"
-
-		reference="--reference=$reference_path"
-	fi
-
-	repo=$1
-	sm_path=$2
-
-	if test -z "$sm_path"; then
-		sm_path=$(printf '%s\n' "$repo" |
-			sed -e 's|/$||' -e 's|:*/*\.git$||' -e 's|.*[/:]||g')
-	fi
-
-	if test -z "$repo" || test -z "$sm_path"; then
-		usage
-	fi
-
-	is_absolute_path "$sm_path" || sm_path="$wt_prefix$sm_path"
-
-	# assure repo is absolute or relative to parent
-	case "$repo" in
-	./*|../*)
-		test -z "$wt_prefix" ||
-		die "$(gettext "Relative path can only be used from the toplevel of the working tree")"
-
-		# dereference source url relative to parent's url
-		realrepo=$(git submodule--helper resolve-relative-url "$repo") || exit
-		;;
-	*:*|/*)
-		# absolute url
-		realrepo=$repo
-		;;
-	*)
-		die "$(eval_gettext "repo URL: '\$repo' must be absolute or begin with ./|../")"
-	;;
-	esac
-
-	# normalize path:
-	# multiple //; leading ./; /./; /../; trailing /
-	sm_path=$(printf '%s/\n' "$sm_path" |
-		sed -e '
-			s|//*|/|g
-			s|^\(\./\)*||
-			s|/\(\./\)*|/|g
-			:start
-			s|\([^/]*\)/\.\./||
-			tstart
-			s|/*$||
-		')
-	if test -z "$force"
-	then
-		git ls-files --error-unmatch "$sm_path" > /dev/null 2>&1 &&
-		die "$(eval_gettext "'\$sm_path' already exists in the index")"
-	else
-		git ls-files -s "$sm_path" | sane_grep -v "^160000" > /dev/null 2>&1 &&
-		die "$(eval_gettext "'\$sm_path' already exists in the index and is not a submodule")"
-	fi
-
-	if test -d "$sm_path" &&
-		test -z $(git -C "$sm_path" rev-parse --show-cdup 2>/dev/null)
-	then
-	    git -C "$sm_path" rev-parse --verify -q HEAD >/dev/null ||
-	    die "$(eval_gettext "'\$sm_path' does not have a commit checked out")"
-	fi
-
-	if test -z "$force"
-	then
-	    dryerr=$(git add --dry-run --ignore-missing --no-warn-embedded-repo "$sm_path" 2>&1 >/dev/null)
-	    res=$?
-	    if test $res -ne 0
-	    then
-		 echo >&2 "$dryerr"
-		 exit $res
-	    fi
-	fi
-
-	if test -n "$custom_name"
-	then
-		sm_name="$custom_name"
-	else
-		sm_name="$sm_path"
-	fi
-
-	if ! git submodule--helper check-name "$sm_name"
-	then
-		die "$(eval_gettext "'$sm_name' is not a valid submodule name")"
-	fi
-
-	# perhaps the path exists and is already a git repo, else clone it
-	if test -e "$sm_path"
-	then
-		if test -d "$sm_path"/.git || test -f "$sm_path"/.git
-		then
-			eval_gettextln "Adding existing repo at '\$sm_path' to the index"
-		else
-			die "$(eval_gettext "'\$sm_path' already exists and is not a valid git repo")"
-		fi
-
-	else
-		if test -d ".git/modules/$sm_name"
-		then
-			if test -z "$force"
-			then
-				eval_gettextln >&2 "A git directory for '\$sm_name' is found locally with remote(s):"
-				GIT_DIR=".git/modules/$sm_name" GIT_WORK_TREE=. git remote -v | grep '(fetch)' | sed -e s,^,"  ", -e s,' (fetch)',, >&2
-				die "$(eval_gettextln "\
-If you want to reuse this local git directory instead of cloning again from
-  \$realrepo
-use the '--force' option. If the local git directory is not the correct repo
-or you are unsure what this means choose another name with the '--name' option.")"
-			else
-				eval_gettextln "Reactivating local git directory for submodule '\$sm_name'."
-			fi
-		fi
-		git submodule--helper clone ${GIT_QUIET:+--quiet} ${progress:+"--progress"} --prefix "$wt_prefix" --path "$sm_path" --name "$sm_name" --url "$realrepo" ${reference:+"$reference"} ${dissociate:+"--dissociate"} ${depth:+"$depth"} || exit
-		(
-			sanitize_submodule_env
-			cd "$sm_path" &&
-			# ash fails to wordsplit ${branch:+-b "$branch"...}
-			case "$branch" in
-			'') git checkout -f -q ;;
-			?*) git checkout -f -q -B "$branch" "origin/$branch" ;;
-			esac
-		) || die "$(eval_gettext "Unable to checkout submodule '\$sm_path'")"
-	fi
-	git config submodule."$sm_name".url "$realrepo"
-
-	git add --no-warn-embedded-repo $force "$sm_path" ||
-	die "$(eval_gettext "Failed to add submodule '\$sm_path'")"
-
-	git submodule--helper config submodule."$sm_name".path "$sm_path" &&
-	git submodule--helper config submodule."$sm_name".url "$repo" &&
-	if test -n "$branch"
-	then
-		git submodule--helper config submodule."$sm_name".branch "$branch"
-	fi &&
-	git add --force .gitmodules ||
-	die "$(eval_gettext "Failed to register submodule '\$sm_path'")"
-
-	# NEEDSWORK: In a multi-working-tree world, this needs to be
-	# set in the per-worktree config.
-	if git config --get submodule.active >/dev/null
-	then
-		# If the submodule being adding isn't already covered by the
-		# current configured pathspec, set the submodule's active flag
-		if ! git submodule--helper is-active "$sm_path"
-		then
-			git config submodule."$sm_name".active "true"
-		fi
-	else
-		git config submodule."$sm_name".active "true"
-	fi
+	git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper add ${force:+--force} ${GIT_QUIET:+--quiet} ${progress:+--progress} ${branch:+--branch "$branch"} ${reference_path:+--reference "$reference_path"} ${dissociate:+--dissociate} ${custom_name:+--name "$custom_name"} ${depth:+"$depth"} -- "$@"
 }
 
 #
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index fec7e0299d..4ab8298385 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -48,7 +48,7 @@ test_expect_success 'submodule update aborts on missing gitmodules url' '
 
 test_expect_success 'add aborts on repository with no commits' '
 	cat >expect <<-\EOF &&
-	'"'repo-no-commits'"' does not have a commit checked out
+	fatal: '"'repo-no-commits'"' does not have a commit checked out
 	EOF
 	git init repo-no-commits &&
 	test_must_fail git submodule add ../a ./repo-no-commits 2>actual &&
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v3 3/3] t7400: add test to check 'submodule add' for tracked paths
  2020-12-14 23:19 [PATCH v3 0/3] submodule: port subcommand add from shell to C Shourya Shukla
  2020-12-14 23:19 ` [PATCH v3 1/3] dir: change the scope of function 'directory_exists_in_index()' Shourya Shukla
  2020-12-14 23:19 ` [PATCH v3 2/3] submodule: port submodule subcommand 'add' from shell to C Shourya Shukla
@ 2020-12-14 23:19 ` Shourya Shukla
  2020-12-15 21:44 ` [PATCH v3 0/3] submodule: port subcommand add from shell to C Junio C Hamano
  2020-12-22 23:42 ` Junio C Hamano
  4 siblings, 0 replies; 10+ messages in thread
From: Shourya Shukla @ 2020-12-14 23:19 UTC (permalink / raw)
  To: git
  Cc: gitster, Johannes.Schindelin, liu.denton, christian.couder,
	kaartic.sivaraam, Shourya Shukla

Add test to check if 'git submodule add' works on paths which are
tracked by Git.

Helped-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Shourya Shukla <periperidip@gmail.com>
---
 t/t7400-submodule-basic.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 4ab8298385..d9317192e0 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -193,6 +193,17 @@ test_expect_success 'submodule add to .gitignored path with --force' '
 	)
 '
 
+test_expect_success 'submodule add to path with tracked contents fails' '
+	(
+		cd addtest-ignore &&
+		mkdir track &&
+		git add -f track &&
+		git commit -m "add tracked path" &&
+		! git submodule add "$submodurl" submod >output 2>&1 &&
+		test_file_not_empty output
+	)
+'
+
 test_expect_success 'submodule add to reconfigure existing submodule with --force' '
 	(
 		cd addtest-ignore &&
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 0/3] submodule: port subcommand add from shell to C
  2020-12-14 23:19 [PATCH v3 0/3] submodule: port subcommand add from shell to C Shourya Shukla
                   ` (2 preceding siblings ...)
  2020-12-14 23:19 ` [PATCH v3 3/3] t7400: add test to check 'submodule add' for tracked paths Shourya Shukla
@ 2020-12-15 21:44 ` Junio C Hamano
  2020-12-17 14:16   ` Shourya Shukla
  2020-12-22 23:42 ` Junio C Hamano
  4 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2020-12-15 21:44 UTC (permalink / raw)
  To: Shourya Shukla
  Cc: git, Johannes.Schindelin, liu.denton, christian.couder,
	kaartic.sivaraam

Shourya Shukla <periperidip@gmail.com> writes:

>     3. In the following segment:
>         /*
>          * NEEDSWORK: In a multi-working-tree world, this needs to be
>          * set in the per-worktree config.
>          */
>         if (!git_config_get_string("submodule.active", &var) && var) {
>
>         There was a comment: "What if this were a valueless true
>         ("[submodule] active\n" without "= true")?  Wouldn't get_string()
>         fail?"
>
>         I was under the impression that even if the above failed, it
>         will not really affect the big picture since at the we will set
>         'submodule.name.active" as true irrespective of the above value.
>         Is this correct?

Let's see what kind of value the "submodule.active" variable is
meant to be set to.  Documentation/config/submodule.txt has this:

    submodule.active::
            A repeated field which contains a pathspec used to match against a
            submodule's path to determine if the submodule is of interest to git
            commands. See linkgit:gitsubmodules[7] for details.

It definitely is a string value, and making it a valueless true is
an error in the configuration.  I wonder if we want to diagnose such
an error, or can we just pretend we didn't see it and keep going?

Also the "var" (one of the values set for this multi-valued
variable) is never used in the body of the "if" statement.  The
other user of "submodule.active" in module_init() seems to use
config_get_value_multi() on it.  The new code may deserve a comment
to explain why that is OK to (1) grab just a single value out of the
multi-valued variable, and (2) not even look at its value.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 0/3] submodule: port subcommand add from shell to C
  2020-12-15 21:44 ` [PATCH v3 0/3] submodule: port subcommand add from shell to C Junio C Hamano
@ 2020-12-17 14:16   ` Shourya Shukla
  2020-12-17 22:20     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Shourya Shukla @ 2020-12-17 14:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, liu.denton, christian.couder, kaartic.sivaraam

On 15/12 01:44, Junio C Hamano wrote:
> Shourya Shukla <periperidip@gmail.com> writes:
> 
> >     3. In the following segment:
> >         /*
> >          * NEEDSWORK: In a multi-working-tree world, this needs to be
> >          * set in the per-worktree config.
> >          */
> >         if (!git_config_get_string("submodule.active", &var) && var) {
> >
> >         There was a comment: "What if this were a valueless true
> >         ("[submodule] active\n" without "= true")?  Wouldn't get_string()
> >         fail?"
> >
> >         I was under the impression that even if the above failed, it
> >         will not really affect the big picture since at the we will set
> >         'submodule.name.active" as true irrespective of the above value.
> >         Is this correct?
> 
> Let's see what kind of value the "submodule.active" variable is
> meant to be set to.  Documentation/config/submodule.txt has this:
> 
>     submodule.active::
>             A repeated field which contains a pathspec used to match against a
>             submodule's path to determine if the submodule is of interest to git
>             commands. See linkgit:gitsubmodules[7] for details.
> 
> It definitely is a string value, and making it a valueless true is
> an error in the configuration.

I think that we did not _make_ it a valueless true. It was already there
and we somehow managed to check it. If you mean that we should ensure
that we set it to "true" so that any such errors don't happen later on,
then that is a different thing.

> I wonder if we want to diagnose such
> an error, or can we just pretend we didn't see it and keep going?

I guess we could pretend we did not see it since it isn't affecting the
run of the sub-command. If you think otherwise, please suggest.

> Also the "var" (one of the values set for this multi-valued
> variable) is never used in the body of the "if" statement.  The
> other user of "submodule.active" in module_init() seems to use
> config_get_value_multi() on it.  The new code may deserve a comment
> to explain why that is OK to (1) grab just a single value out of the
> multi-valued variable, and (2) not even look at its value.

Understood. So a comment along the lines of:

	/*
	 * Since we are fetching information only about one submodule,
	 * we need not fetch a  list of submodules to check the activity
	 * status of a single submodule.
	 *
	 * In case of a valueless true, i.e, '[submodule] active\n'
	 * without '= true', we need not worry about any errors since
	 * irrespective of the above value, we will set
	 * 'submodule.<name>.active' as true.
	 */

will work? Also, could you please comment on the other two issues I
mentioned in the cover letter so I might as well start work on v4 of
this patch?

Regards,
Shourya Shukla


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 0/3] submodule: port subcommand add from shell to C
  2020-12-17 14:16   ` Shourya Shukla
@ 2020-12-17 22:20     ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2020-12-17 22:20 UTC (permalink / raw)
  To: Shourya Shukla; +Cc: git, liu.denton, christian.couder, kaartic.sivaraam

Shourya Shukla <periperidip@gmail.com> writes:

> On 15/12 01:44, Junio C Hamano wrote:
>> Shourya Shukla <periperidip@gmail.com> writes:
>> 
>> >     3. In the following segment:
>> >         /*
>> >          * NEEDSWORK: In a multi-working-tree world, this needs to be
>> >          * set in the per-worktree config.
>> >          */
>> >         if (!git_config_get_string("submodule.active", &var) && var) {
>> >
>> >         There was a comment: "What if this were a valueless true
>> >         ("[submodule] active\n" without "= true")?  Wouldn't get_string()
>> >         fail?"
>> >
>> >         I was under the impression that even if the above failed, it
>> >         will not really affect the big picture since at the we will set
>> >         'submodule.name.active" as true irrespective of the above value.
>> >         Is this correct?
>> 
>> Let's see what kind of value the "submodule.active" variable is
>> meant to be set to.  Documentation/config/submodule.txt has this:
>> 
>>     submodule.active::
>>             A repeated field which contains a pathspec used to match against a
>>             submodule's path to determine if the submodule is of interest to git
>>             commands. See linkgit:gitsubmodules[7] for details.
>> 
>> It definitely is a string value, and making it a valueless true is
>> an error in the configuration.
>
> I think that we did not _make_ it a valueless true. It was already there
> and we somehow managed to check it. If you mean that we should ensure
> that we set it to "true" so that any such errors don't happen later on,
> then that is a different thing.

Let me rephrase.  When a user has "[submodule] active" in his or her
configuration file, it is a configuration error.  When Git reads
"submodule.active" configuration variable to make a decision (like
the above code) and finds that the user has such an error, the user
would appreciate if the error is pointed out, so that it can be
corrected, rather than silently ignored.

>> Also the "var" (one of the values set for this multi-valued
>> variable) is never used in the body of the "if" statement.  The
>> other user of "submodule.active" in module_init() seems to use
>> config_get_value_multi() on it.  The new code may deserve a comment
>> to explain why that is OK to (1) grab just a single value out of the
>> multi-valued variable, and (2) not even look at its value.
>
> Understood. So a comment along the lines of:
>
> 	/*
> 	 * Since we are fetching information only about one submodule,
> 	 * we need not fetch a  list of submodules to check the activity
> 	 * status of a single submodule.

Makes me wonder if I am getting the semantics of submodule.active
variable right.

From the three-line description in the documentation (see above), I
would have guessed that if we have three values for
submodule.active, e.g.

	[submodule]
		active = $a
		active = $b
		active = $c

then when deciding if we want to see if a submodule at a $sm_path,
we'd see if $path matches any one of $a, $b, or $c and if it does,
it is determined that the submodule is "of interest to git
commands".

Yes, we may be fetching information only about one submodule at
$sm_path, but given the explanation of how the configuration
variable is designed to work, how can we _not_ fetch the list and
check all of them?

So the comment above (for that matter, the one below that talks
about valuless true) does not make any sense to me, sorry.

> 	 * In case of a valueless true, i.e, '[submodule] active\n'
> 	 * without '= true', we need not worry about any errors since
> 	 * irrespective of the above value, we will set
> 	 * 'submodule.<name>.active' as true.
> 	 */
>
> will work? 

The real reason why it is OK to just check existence of submodule.active
variable without seeing any value of them is because the check is done
to see if this call is needed at all:

	git submodule--helper is-active "$sm_path"

This "helper" eventually calls submodule.c::is_submodule_active()
that does the real check---it gets the multi-valued submodule.active
and checks them against the path to determine if the submodule is
"of interest".

On the other hand, when we know submodule.active does not exist, all
submodules are of interest when it comes to "submodule add".  That
is how

-	if git config --get submodule.active >/dev/null
-	then
-		# If the submodule being adding isn't already covered by the
-		# current configured pathspec, set the submodule's active flag
-		if ! git submodule--helper is-active "$sm_path"
-		then
-			git config submodule."$sm_name".active "true"
-		fi
-	else
-		git config submodule."$sm_name".active "true"
-	fi

taken from your [2/3] that ignored the values of submodule.active is
"correct"; we know that the real work is done elsewhere--we are only
learning if the is-active check is necessary.

I think explaining why it works correctly to show future readers
that the code was written by folks who knew what they were doing
would be worth the effort to help future code evolution.

This, from your [1/3], is a faithful translation of the above,
but ...

+	if (!git_config_get_string("submodule.active", &var) && var) {
+
+		/*
+		 * If the submodule being adding isn't already covered by the
+		 * current configured pathspec, set the submodule's active flag
+		 */
+		if (!is_submodule_active(the_repository, info->sm_path)) {
+			key = xstrfmt("submodule.%s.active", info->sm_name);
+			git_config_set_gently(key, "true");
+			free(key);
+		}
+	} else {
+		key = xstrfmt("submodule.%s.active", info->sm_name);
+		git_config_set_gently(key, "true");
+		free(key);
+	}

... by knowing why we check submodule.active but discard its value,
future developers (read: I think this is outside the scope of this
series) can rewrite it like so to make it more readable and reduce
the repeated code to set submodule.$sm_name.active to true.

        /*
	 * If submodule.active does not exist, we will activate this
	 * module unconditionally.
	 *
         * Otherwise, is_submodule_active() is asked to determine if
         * the path currently is of interest; because it will obtain
         * and iterate over this multi-valued variable by itself, we
         * do not need its values we obtain from git_config_get_string()
         * call here.  We are only checking if we need to ask the
         * is_submodule_active() helper function.  We explicitly set
	 * the submodule.$sm_name.active if submodule.active patterns
	 * do not cover the path (i.e. is_submodule_active() says "no".
         */
	if (git_config_get_string("submodule.active", &var) ||
	    !is_submodule_active(the_repository, info->sm_path)) {
		key = xstrfmt(...);
		git_config_set_gently(key, "true");
		free(key);
	}

and a comment like this would help such readers, for example.

By the way, as you might have noticed, your [1/3] contains a lot of
material that ought to be part of [2/3], doesn't it?  [1/3] was
supposed to be just borrowing helper from dir.c but has the new
"add" code implemented in the same patch.

> Also, could you please comment on the other two issues I
> mentioned in the cover letter so I might as well start work on v4 of
> this patch?

I'll leave the other two to other reviewers and mentors for now, but
may come back to them if I beat them.

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] dir: change the scope of function 'directory_exists_in_index()'
  2020-12-14 23:19 ` [PATCH v3 1/3] dir: change the scope of function 'directory_exists_in_index()' Shourya Shukla
@ 2020-12-19  0:08   ` Johannes Schindelin
  2020-12-19  0:47     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Johannes Schindelin @ 2020-12-19  0:08 UTC (permalink / raw)
  To: Shourya Shukla
  Cc: git, gitster, liu.denton, christian.couder, kaartic.sivaraam

Hi Shourya,

On Tue, 15 Dec 2020, Shourya Shukla wrote:

> Change the scope of the function 'directory_exists_in_index()' as well
> as declare it in 'dir.h'.
>
> Since the return type of the function is the enumerator 'exist_status',
> change its scope as well and declare it in 'dir.h'. While at it, rename
> the members of the aforementioned enum so as to avoid any naming clashes
> or confusions later on.

This makes it sound as if only existing code was adjusted, in a minimal
way, but no new code was introduced. But that's not true:

>
> Helped-by: Christian Couder <christian.couder@gmail.com>
> Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
> Signed-off-by: Shourya Shukla <periperidip@gmail.com>
> ---
>  builtin/submodule--helper.c | 408 ++++++++++++++++++++++++++++++++++++
>  dir.c                       |  30 ++-
>  dir.h                       |   9 +
>  3 files changed, 429 insertions(+), 18 deletions(-)

Tons of new code there. And unfortunately...

>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index c30896c897..4dfad35d77 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -2744,6 +2744,414 @@ static int module_set_branch(int argc, const char **argv, const char *prefix)
>  	return !!ret;
>  }
>
> +struct add_data {
> +	const char *prefix;
> +	const char *branch;
> +	const char *reference_path;
> +	const char *sm_path;
> +	const char *sm_name;
> +	const char *repo;
> +	const char *realrepo;
> +	int depth;
> +	unsigned int force: 1;
> +	unsigned int quiet: 1;
> +	unsigned int progress: 1;
> +	unsigned int dissociate: 1;
> +};
> +#define ADD_DATA_INIT { 0 }
> +
> +/*
> + * Guess the directory name from the repository URL by performing the
> + * operations below in the following order:
> + *
> + * - If the URL ends with '/', remove that.
> + *
> + * - If the result of the above ends with zero or more ':', followed
> + *  by zero or more '/', followed by ".git", drop the matching part.
> + *
> + * - If the result of the above has '/' or ':' in it, remove everything
> + *  before it and '/' or ':' itself.
> + */
> +static char *guess_dir_name(const char *repo)
> +{
> +	const char *start, *end;
> +
> +	start = repo;
> +	end = repo + strlen(repo);
> +
> +	/* remove the trailing '/' */
> +	if (repo < end - 1 && end[-1] == '/')
> +		end--;
> +
> +	/* remove the trailing ':', '/' and '.git' */
> +	if (repo < end - 4 && !memcmp(".git", end - 4, 4)) {
> +		end -= 4;
> +		while (repo < end - 1 && end[-1] == '/')
> +			end--;
> +		while (repo < end - 1 && end[-1] == ':')
> +			end--;
> +	}
> +
> +	/* find the last ':' or '/' */
> +	for (start = end - 1; repo <= start; start--) {
> +		if (*start == '/' || *start == ':')
> +			break;
> +	}
> +	/* exclude '/' or ':' itself */
> +	start++;
> +
> +	return xmemdupz(start, end - start);
> +}
> +
> +static int can_create_submodule(unsigned int force, const char *path)
> +{
> +	int cache_pos, dir_in_cache = 0;
> +
> +	if (read_cache() < 0)
> +		die(_("index file corrupt"));
> +
> +	cache_pos = cache_name_pos(path, strlen(path));
> +	if(cache_pos < 0 &&
> +	   directory_exists_in_index(&the_index, path, strlen(path)) == is_cache_directory)
> +		dir_in_cache = 1;
> +
> +	if (!force) {
> +		if (cache_pos >= 0 || dir_in_cache)
> +			die(_("'%s' already exists in the index"), path);
> +	} else {
> +		struct cache_entry *ce = NULL;
> +		if (cache_pos >= 0)
> +			ce = the_index.cache[cache_pos];
> +		if (dir_in_cache || (ce && !S_ISGITLINK(ce->ce_mode)))
> +			die(_("'%s' already exists in the index and is not a "
> +			      "submodule"), path);
> +	}
> +	return 0;
> +}
> +
> +static const char *parse_token(const char *cp, int *len)
> +{
> +	const char *p = cp, *start, *end;
> +	char *str;
> +
> +	start = p;
> +	while (*p != ' ')
> +		p++;
> +	end = p;
> +	str = xstrndup(start, end - start);
> +
> +	while(*p == ' ')
> +		p++;
> +
> +	return str;
> +}

This function is not careful enough to avoid buffer overruns. It even
triggers a segmentation fault in our test suite:
https://github.com/gitgitgadget/git/runs/1574891976?check_suite_focus=true#step:6:3152

I need this to make it pass (only tested locally so far, but I trust you
to take the baton from here):

-- snipsnap --
From c28c0cd3ac21d546394335957fbaa350ab287c3f Mon Sep 17 00:00:00 2001
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sat, 19 Dec 2020 01:02:04 +0100
Subject: [PATCH] fixup??? dir: change the scope of function
 'directory_exists_in_index()'

This fixes the segmentation fault reported in the linux-musl job of our
CI builds. Valgrind has this to say about it:

==32354==
==32354== Process terminating with default action of signal 11 (SIGSEGV)
==32354==  Access not within mapped region at address 0x5C73000
==32354==    at 0x202F5A: parse_token (submodule--helper.c:2837)
==32354==    by 0x20319B: report_fetch_remotes (submodule--helper.c:2871)
==32354==    by 0x2033FD: add_submodule (submodule--helper.c:2898)
==32354==    by 0x204612: module_add (submodule--helper.c:3146)
==32354==    by 0x20478A: cmd_submodule__helper (submodule--helper.c:3202)
==32354==    by 0x12655E: run_builtin (git.c:458)
==32354==    by 0x1269B4: handle_builtin (git.c:712)
==32354==    by 0x126C79: run_argv (git.c:779)
==32354==    by 0x12715C: cmd_main (git.c:913)
==32354==    by 0x2149A2: main (common-main.c:52)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/submodule--helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 4f1d892b9a9..29a6f80b937 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2834,12 +2834,12 @@ static const char *parse_token(const char *cp, int *len)
 	char *str;

 	start = p;
-	while (*p != ' ')
+	while (*p && *p != ' ')
 		p++;
 	end = p;
 	str = xstrndup(start, end - start);

-	while(*p == ' ')
+	while(*p && *p == ' ')
 		p++;

 	return str;
--
2.29.2.windows.1.1.g3464b98ce68


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 1/3] dir: change the scope of function 'directory_exists_in_index()'
  2020-12-19  0:08   ` Johannes Schindelin
@ 2020-12-19  0:47     ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2020-12-19  0:47 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Shourya Shukla, git, liu.denton, christian.couder,
	kaartic.sivaraam

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi Shourya,
>
> On Tue, 15 Dec 2020, Shourya Shukla wrote:
>
>> Change the scope of the function 'directory_exists_in_index()' as well
>> as declare it in 'dir.h'.
>>
>> Since the return type of the function is the enumerator 'exist_status',
>> change its scope as well and declare it in 'dir.h'. While at it, rename
>> the members of the aforementioned enum so as to avoid any naming clashes
>> or confusions later on.
>
> This makes it sound as if only existing code was adjusted, in a minimal
> way, but no new code was introduced. But that's not true:

I noticed it last night, too---I suspect it was a mistake made while
shuffling changes across steps with rebase -i.

>> Helped-by: Christian Couder <christian.couder@gmail.com>
>> Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
>> Signed-off-by: Shourya Shukla <periperidip@gmail.com>
>> ---
>>  builtin/submodule--helper.c | 408 ++++++++++++++++++++++++++++++++++++
>>  dir.c                       |  30 ++-
>>  dir.h                       |   9 +
>>  3 files changed, 429 insertions(+), 18 deletions(-)
>
> Tons of new code there. And unfortunately...

>> +static const char *parse_token(const char *cp, int *len)
>> +{
>> +	const char *p = cp, *start, *end;
>> +	char *str;
>> +
>> +	start = p;
>> +	while (*p != ' ')
>> +		p++;
>> +	end = p;
>> +	str = xstrndup(start, end - start);
>> +
>> +	while(*p == ' ')
>> +		p++;
>> +
>> +	return str;
>> +}
>
> This function is not careful enough to avoid buffer overruns. It even
> triggers a segmentation fault in our test suite:
> https://github.com/gitgitgadget/git/runs/1574891976?check_suite_focus=true#step:6:3152

I notice that len is not used at all ;-)

> I need this to make it pass (only tested locally so far, but I trust you
> to take the baton from here):
>
> -- snipsnap --
> From c28c0cd3ac21d546394335957fbaa350ab287c3f Mon Sep 17 00:00:00 2001
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> Date: Sat, 19 Dec 2020 01:02:04 +0100
> Subject: [PATCH] fixup??? dir: change the scope of function
>  'directory_exists_in_index()'
>
> This fixes the segmentation fault reported in the linux-musl job of our
> CI builds. Valgrind has this to say about it:
>
> ==32354==
> ==32354== Process terminating with default action of signal 11 (SIGSEGV)
> ==32354==  Access not within mapped region at address 0x5C73000
> ==32354==    at 0x202F5A: parse_token (submodule--helper.c:2837)
> ==32354==    by 0x20319B: report_fetch_remotes (submodule--helper.c:2871)
> ==32354==    by 0x2033FD: add_submodule (submodule--helper.c:2898)
> ==32354==    by 0x204612: module_add (submodule--helper.c:3146)
> ==32354==    by 0x20478A: cmd_submodule__helper (submodule--helper.c:3202)
> ==32354==    by 0x12655E: run_builtin (git.c:458)
> ==32354==    by 0x1269B4: handle_builtin (git.c:712)
> ==32354==    by 0x126C79: run_argv (git.c:779)
> ==32354==    by 0x12715C: cmd_main (git.c:913)
> ==32354==    by 0x2149A2: main (common-main.c:52)
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  builtin/submodule--helper.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index 4f1d892b9a9..29a6f80b937 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -2834,12 +2834,12 @@ static const char *parse_token(const char *cp, int *len)
>  	char *str;
>
>  	start = p;
> -	while (*p != ' ')
> +	while (*p && *p != ' ')
>  		p++;
>  	end = p;
>  	str = xstrndup(start, end - start);
>
> -	while(*p == ' ')
> +	while(*p && *p == ' ')
>  		p++;
>
>  	return str;
> --
> 2.29.2.windows.1.1.g3464b98ce68

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v3 0/3] submodule: port subcommand add from shell to C
  2020-12-14 23:19 [PATCH v3 0/3] submodule: port subcommand add from shell to C Shourya Shukla
                   ` (3 preceding siblings ...)
  2020-12-15 21:44 ` [PATCH v3 0/3] submodule: port subcommand add from shell to C Junio C Hamano
@ 2020-12-22 23:42 ` Junio C Hamano
  4 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2020-12-22 23:42 UTC (permalink / raw)
  To: Shourya Shukla
  Cc: git, Johannes.Schindelin, liu.denton, christian.couder,
	kaartic.sivaraam

Shourya Shukla <periperidip@gmail.com> writes:

> Feedback and reviews are appreciated.
>
> Regards,
> Shourya Shukla
>
> Shourya Shukla (3):
>   dir: change the scope of function 'directory_exists_in_index()'
>   submodule: port submodule subcommand 'add' from shell to C
>   t7400: add test to check 'submodule add' for tracked paths

Sorry for not being a feedback nor a review, but we are seeing a
segfault from "git submodule add" when the topic is tested with the
rest of 'seen':

  https://github.com/git/git/runs/1597682274#step:6:3155

It seems that you need to be logged in to see the full CI output to
the line level when visiting the above URL.

Thanks.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-12-22 23:46 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-14 23:19 [PATCH v3 0/3] submodule: port subcommand add from shell to C Shourya Shukla
2020-12-14 23:19 ` [PATCH v3 1/3] dir: change the scope of function 'directory_exists_in_index()' Shourya Shukla
2020-12-19  0:08   ` Johannes Schindelin
2020-12-19  0:47     ` Junio C Hamano
2020-12-14 23:19 ` [PATCH v3 2/3] submodule: port submodule subcommand 'add' from shell to C Shourya Shukla
2020-12-14 23:19 ` [PATCH v3 3/3] t7400: add test to check 'submodule add' for tracked paths Shourya Shukla
2020-12-15 21:44 ` [PATCH v3 0/3] submodule: port subcommand add from shell to C Junio C Hamano
2020-12-17 14:16   ` Shourya Shukla
2020-12-17 22:20     ` Junio C Hamano
2020-12-22 23:42 ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).