git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v2 0/3] submodule: port subcommand add from shell to C
@ 2020-10-07  7:45 Shourya Shukla
  2020-10-07  7:45 ` [PATCH v2 1/3] dir: change the scope of function 'directory_exists_in_index()' Shourya Shukla
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Shourya Shukla @ 2020-10-07  7:45 UTC (permalink / raw)
  To: git
  Cc: gitster, christian.couder, kaartic.sivaraam, Johannes.Schindelin,
	liu.denton, Shourya Shukla

Hello all,

This is the v2 of the patch with the same title, delivered more than a
month ago as a part of my GSoC. Link to v1:
https://lore.kernel.org/git/20200824090359.403944-1-shouryashukla.oo@gmail.com/

The changelog is as follows:

    1. Introduce PATCH[1/3](dir: change the scope of function
       'directory_exists_in_index()', 2020-10-06). This was done since
       the above mentioned function will be used in the patch that
       follows.

    2. There are multiple changes in this commit:

            A. Improve the part which checks if the 'path' given as
               argument exists or not. Implementing Kaartic's
               suggestions on the patch, I had to make sure that the
               case for checking if the path has tracked contents or
               not also works.

            B. Also, wrap the aforementioned segment in a function
               since it became very long. The function is called
               'check_sm_exists()'.

            C. Also, use the function 'is_nonbare_repository_dir()'
               instead of 'is_directory()' when trying to resolve
               gitlink.

            D. Append keyword 'fatal' in front of the expected output of
               test t7400.6 since the command die()s out in case of
               absence of commits in a submodule.

            E. Remove the extra `#include "dir.h"` from
               'submodule--helper.c'.

    3. Introduce PATCH[3/3] (t7400: add test to check 'submodule add'
       for tracked paths, 2020-10-07). Kaartic pointed out that a test
       for path with tracked contents did not exist and hence it was
       necessary to write one. Therefore, this commit introduces a new
       test 't7400.18: submodule add to path with tracked contents
       fails'.

Comments and feedback are appreciated. Sorry for the month long delay, I
was on a vacation.

I am attaching a range-diff between v1 and v2 at the end of this mail.

Regards,
Shourya Shukla
-----

-:  ---------- > 1:  bdac00494e dir: change the scope of function 'directory_exists_in_index()'
1:  b08d81e179 ! 2:  3e20d0fe04 submodule: port submodule subcommand 'add' from shell to C
    @@ Commit message
         'git-submodule.sh'.

         Also, since the command die()s out in case of absence of commits in the
    -    submodule and exits with exit status 1 when we try adding a submodule
    -    which is mentioned in .gitignore, the keyword 'fatal' is prefixed in the
    -    error messages. Therefore, prepend the keyword in the expected outputs
    -    of tests t7400.6 and t7400.16.
    +    submodule, the keyword 'fatal' is prefixed in the error messages.
    +    Therefore, prepend the keyword in the expected output of test t7400.6.
    +
    +    While at it, eliminate the extra preprocessor directive
    +    `#include "dir.h"` at the start of 'submodule--helper.c'.

         Mentored-by: Christian Couder <chriscool@tuxfamily.org>
         Mentored-by: Stefan Beller <stefanbeller@gmail.com>
    @@ Commit message
         Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>

      ## builtin/submodule--helper.c ##
    +@@
    + #include "diffcore.h"
    + #include "diff.h"
    + #include "object-store.h"
    +-#include "dir.h"
    + #include "advice.h"
    +
    + #define OPT_QUIET (1 << 0)
     @@ builtin/submodule--helper.c: static int module_set_branch(int argc, const char **argv, const char *prefix)
        return !!ret;
      }
    @@ builtin/submodule--helper.c: static int module_set_branch(int argc, const char *
     +  free(url);
     +}
     +
    ++static int check_sm_exists(unsigned int force, const char *path) {
    ++
    ++  int cache_pos, dir_in_cache = 0;
    ++  if (read_cache() < 0)
    ++          die(_("index file corrupt"));
    ++
    ++  cache_pos = cache_name_pos(path, strlen(path));
    ++  if(cache_pos < 0 && (directory_exists_in_index(&the_index,
    ++     path, strlen(path)) == index_directory))
    ++          dir_in_cache = 1;
    ++
    ++  if (!force) {
    ++          if (cache_pos >= 0 || dir_in_cache)
    ++                  die(_("'%s' already exists in the index"), path);
    ++  } else {
    ++          struct cache_entry *ce = NULL;
    ++          if (cache_pos >= 0)
    ++                  ce = the_index.cache[cache_pos];
    ++          if (dir_in_cache || (ce && !S_ISGITLINK(ce->ce_mode)))
    ++                  die(_("'%s' already exists in the index and is not a "
    ++                        "submodule"), path);
    ++  }
    ++  return 0;
    ++}
    ++
     +static void modify_remote_v(struct strbuf *sb)
     +{
     +  int i;
    @@ builtin/submodule--helper.c: static int module_set_branch(int argc, const char *
     +  if (is_dir_sep(path[strlen(path) -1]))
     +          path[strlen(path) - 1] = '\0';
     +
    -+  if (!force) {
    -+          if (is_directory(path) && submodule_from_path(the_repository, &null_oid, path))
    -+                  die(_("'%s' already exists in the index"), path);
    -+  } else {
    -+          int err;
    -+          if (index_name_pos(&the_index, path, strlen(path)) >= 0 &&
    -+              !is_submodule_populated_gently(path, &err))
    -+                  die(_("'%s' already exists in the index and is not a "
    -+                        "submodule"), path);
    -+  }
    ++  if (check_sm_exists(force, path))
    ++          return 1;
     +
     +  strbuf_addstr(&sb, path);
    -+  if (is_directory(path)) {
    ++  if (is_nonbare_repository_dir(&sb)) {
     +          struct object_id oid;
     +          if (resolve_gitlink_ref(path, "HEAD", &oid) < 0)
     +                  die(_("'%s' does not have a commit checked out"), path);
    @@ builtin/submodule--helper.c: static int module_set_branch(int argc, const char *
     +          cp.no_stdout = 1;
     +          strvec_pushl(&cp.args, "add", "--dry-run", "--ignore-missing",
     +                       "--no-warn-embedded-repo", path, NULL);
    -+          if (pipe_command(&cp, NULL, 0, NULL, 0, &sb, 0))
    -+                  die(_("%s"), sb.buf);
    ++          if (pipe_command(&cp, NULL, 0, NULL, 0, &sb, 0)) {
    ++                  fprintf(stderr, _("%s"), sb.buf);
    ++                  return 1;
    ++          }
     +          strbuf_release(&sb);
     +  }
     +
    @@ t/t7400-submodule-basic.sh: test_expect_success 'submodule update aborts on miss
        EOF
        git init repo-no-commits &&
        test_must_fail git submodule add ../a ./repo-no-commits 2>actual &&
    -@@ t/t7400-submodule-basic.sh: test_expect_success 'submodule add to .gitignored path fails' '
    -   (
    -           cd addtest-ignore &&
    -           cat <<-\EOF >expect &&
    --          The following paths are ignored by one of your .gitignore files:
    -+          fatal: The following paths are ignored by one of your .gitignore files:
    -           submod
    -           hint: Use -f if you really want to add them.
    -           hint: Turn this message off by running
    -           hint: "git config advice.addIgnoredFile false"
    -+
    -           EOF
    -           # Does not use test_commit due to the ignore
    -           echo "*" > .gitignore &&
-:  ---------- > 3:  98b05eb46d t7400: add test to check 'submodule add' for tracked paths
-----

Prathamesh Chavan (1):
  submodule: port submodule subcommand 'add' from shell to C

Shourya Shukla (2):
  dir: change the scope of function 'directory_exists_in_index()'
  t7400: add test to check 'submodule add' for tracked paths

 builtin/submodule--helper.c | 391 +++++++++++++++++++++++++++++++++++-
 dir.c                       |  10 +-
 dir.h                       |   9 +
 git-submodule.sh            | 161 +--------------
 t/t7400-submodule-basic.sh  |  13 +-
 5 files changed, 414 insertions(+), 170 deletions(-)

-- 
2.28.0


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH v2 1/3] dir: change the scope of function 'directory_exists_in_index()'
  2020-10-07  7:45 [PATCH v2 0/3] submodule: port subcommand add from shell to C Shourya Shukla
@ 2020-10-07  7:45 ` Shourya Shukla
  2020-10-07 18:05   ` Junio C Hamano
  2020-11-18 23:25   ` Emily Shaffer
  2020-10-07  7:45 ` [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C Shourya Shukla
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 16+ messages in thread
From: Shourya Shukla @ 2020-10-07  7:45 UTC (permalink / raw)
  To: git
  Cc: gitster, christian.couder, kaartic.sivaraam, Johannes.Schindelin,
	liu.denton, Shourya Shukla

Change the scope of the function 'directory_exists_in_index()' as well
as declare it in 'dir.h'.

Since the return type of the function is the enumerator 'exist_status',
change its scope as well and declare it in 'dir.h'.

Helped-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
---
 dir.c | 10 ++--------
 dir.h |  9 +++++++++
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/dir.c b/dir.c
index 78387110e6..e67cf52fec 100644
--- a/dir.c
+++ b/dir.c
@@ -1655,12 +1655,6 @@ struct dir_entry *dir_add_ignored(struct dir_struct *dir,
 	return dir->ignored[dir->ignored_nr++] = dir_entry_new(pathname, len);
 }
 
-enum exist_status {
-	index_nonexistent = 0,
-	index_directory,
-	index_gitdir
-};
-
 /*
  * Do not use the alphabetically sorted index to look up
  * the directory name; instead, use the case insensitive
@@ -1688,8 +1682,8 @@ static enum exist_status directory_exists_in_index_icase(struct index_state *ist
  * the files it contains) will sort with the '/' at the
  * end.
  */
-static enum exist_status directory_exists_in_index(struct index_state *istate,
-						   const char *dirname, int len)
+enum exist_status directory_exists_in_index(struct index_state *istate,
+					    const char *dirname, int len)
 {
 	int pos;
 
diff --git a/dir.h b/dir.h
index a3c40dec51..e46f240528 100644
--- a/dir.h
+++ b/dir.h
@@ -370,6 +370,15 @@ int read_directory(struct dir_struct *, struct index_state *istate,
 		   const char *path, int len,
 		   const struct pathspec *pathspec);
 
+enum exist_status {
+	index_nonexistent = 0,
+	index_directory,
+	index_gitdir
+};
+
+enum exist_status directory_exists_in_index(struct index_state *istate,
+					    const char *dirname, int len);
+
 enum pattern_match_result {
 	UNDECIDED = -1,
 	NOT_MATCHED = 0,
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C
  2020-10-07  7:45 [PATCH v2 0/3] submodule: port subcommand add from shell to C Shourya Shukla
  2020-10-07  7:45 ` [PATCH v2 1/3] dir: change the scope of function 'directory_exists_in_index()' Shourya Shukla
@ 2020-10-07  7:45 ` Shourya Shukla
  2020-10-07 18:37   ` Junio C Hamano
                     ` (3 more replies)
  2020-10-07  7:45 ` [PATCH v2 3/3] t7400: add test to check 'submodule add' for tracked paths Shourya Shukla
  2020-11-19  0:03 ` [PATCH v2 0/3] submodule: port subcommand add from shell to C Josh Steadmon
  3 siblings, 4 replies; 16+ messages in thread
From: Shourya Shukla @ 2020-10-07  7:45 UTC (permalink / raw)
  To: git
  Cc: gitster, christian.couder, kaartic.sivaraam, Johannes.Schindelin,
	liu.denton, Prathamesh Chavan, Christian Couder, Stefan Beller,
	Shourya Shukla

From: Prathamesh Chavan <pc44800@gmail.com>

Convert submodule subcommand 'add' to a builtin and call it via
'git-submodule.sh'.

Also, since the command die()s out in case of absence of commits in the
submodule, the keyword 'fatal' is prefixed in the error messages.
Therefore, prepend the keyword in the expected output of test t7400.6.

While at it, eliminate the extra preprocessor directive
`#include "dir.h"` at the start of 'submodule--helper.c'.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Stefan Beller <stefanbeller@gmail.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
---
 builtin/submodule--helper.c | 391 +++++++++++++++++++++++++++++++++++-
 git-submodule.sh            | 161 +--------------
 t/t7400-submodule-basic.sh  |   2 +-
 3 files changed, 392 insertions(+), 162 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index de5ad73bb8..ec0a50d032 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -19,7 +19,6 @@
 #include "diffcore.h"
 #include "diff.h"
 #include "object-store.h"
-#include "dir.h"
 #include "advice.h"
 
 #define OPT_QUIET (1 << 0)
@@ -2744,6 +2743,395 @@ static int module_set_branch(int argc, const char **argv, const char *prefix)
 	return !!ret;
 }
 
+struct add_data {
+	const char *prefix;
+	const char *branch;
+	const char *reference_path;
+	const char *sm_path;
+	const char *sm_name;
+	const char *repo;
+	const char *realrepo;
+	int depth;
+	unsigned int force: 1;
+	unsigned int quiet: 1;
+	unsigned int progress: 1;
+	unsigned int dissociate: 1;
+};
+#define ADD_DATA_INIT { NULL, NULL, NULL, NULL, NULL, NULL, NULL, 0, 0, 0, 0, 0 }
+
+/*
+ * Guess dir name from repository: strip leading '.*[/:]',
+ * strip trailing '[:/]*.git'.
+ */
+static char *guess_dir_name(const char *repo)
+{
+	const char *p, *start, *end, *limit;
+	int after_slash_or_colon;
+
+	after_slash_or_colon = 0;
+	limit = repo + strlen(repo);
+	start = repo;
+	end = limit;
+	for (p = repo; p < limit; p++) {
+		if (starts_with(p, ".git")) {
+			/* strip trailing '[:/]*.git' */
+			if (!after_slash_or_colon)
+				end = p;
+			p += 3;
+		} else if (*p == '/' || *p == ':') {
+			/* strip leading '.*[/:]' */
+			if (end == limit)
+				end = p;
+			after_slash_or_colon = 1;
+		} else if (after_slash_or_colon) {
+			start = p;
+			end = limit;
+			after_slash_or_colon = 0;
+		}
+	}
+	return xstrndup(start, end - start);
+}
+
+static void fprintf_submodule_remote(const char *str)
+{
+	const char *p = str;
+	const char *start;
+	const char *end;
+	char *name, *url;
+
+	start = p;
+	while (*p != ' ')
+		p++;
+	end = p;
+	name = xstrndup(start, end - start);
+
+	while(*p == ' ')
+		p++;
+	start = p;
+	while (*p != ' ')
+		p++;
+	end = p;
+	url = xstrndup(start, end - start);
+
+	fprintf(stderr, "  %s\t%s\n", name, url);
+	free(name);
+	free(url);
+}
+
+static int check_sm_exists(unsigned int force, const char *path) {
+
+	int cache_pos, dir_in_cache = 0;
+	if (read_cache() < 0)
+		die(_("index file corrupt"));
+
+	cache_pos = cache_name_pos(path, strlen(path));
+	if(cache_pos < 0 && (directory_exists_in_index(&the_index,
+	   path, strlen(path)) == index_directory))
+		dir_in_cache = 1;
+
+	if (!force) {
+		if (cache_pos >= 0 || dir_in_cache)
+			die(_("'%s' already exists in the index"), path);
+	} else {
+		struct cache_entry *ce = NULL;
+		if (cache_pos >= 0)
+			ce = the_index.cache[cache_pos];
+		if (dir_in_cache || (ce && !S_ISGITLINK(ce->ce_mode)))
+			die(_("'%s' already exists in the index and is not a "
+			      "submodule"), path);
+	}
+	return 0;
+}
+
+static void modify_remote_v(struct strbuf *sb)
+{
+	int i;
+	for (i = 0; i < sb->len; i++) {
+		const char *start = sb->buf + i;
+		const char *end = start;
+		while (sb->buf[i++] != '\n')
+			end++;
+		if (!strcmp("fetch", xstrndup(end - 6, 5)))
+			fprintf_submodule_remote(xstrndup(start, end - start - 7));
+	}
+}
+
+static int add_submodule(struct add_data *info)
+{
+	/* perhaps the path exists and is already a git repo, else clone it */
+	if (is_directory(info->sm_path)) {
+		char *sub_git_path = xstrfmt("%s/.git", info->sm_path);
+		if (is_directory(sub_git_path) || file_exists(sub_git_path))
+			printf(_("Adding existing repo at '%s' to the index\n"),
+				 info->sm_path);
+		else
+			die(_("'%s' already exists and is not a valid git repo"),
+			      info->sm_path);
+		free(sub_git_path);
+	} else {
+		struct strvec clone_args = STRVEC_INIT;
+		struct child_process cp = CHILD_PROCESS_INIT;
+		char *submodule_git_dir = xstrfmt(".git/modules/%s", info->sm_name);
+
+		if (is_directory(submodule_git_dir)) {
+			if (!info->force) {
+				struct child_process cp_rem = CHILD_PROCESS_INIT;
+				struct strbuf sb_rem = STRBUF_INIT;
+				cp_rem.git_cmd = 1;
+				fprintf(stderr, _("A git directory for '%s' is "
+					"found locally with remote(s):\n"),
+					info->sm_name);
+				strvec_pushf(&cp_rem.env_array,
+					     "GIT_DIR=%s", submodule_git_dir);
+				strvec_push(&cp_rem.env_array, "GIT_WORK_TREE=.");
+				strvec_pushl(&cp_rem.args, "remote", "-v", NULL);
+				if (!capture_command(&cp_rem, &sb_rem, 0)) {
+					modify_remote_v(&sb_rem);
+				}
+				error(_("If you want to reuse this local git "
+				      "directory instead of cloning again from\n "
+				      "  %s\n"
+				      "use the '--force' option. If the local "
+				      "git directory is not the correct repo\n"
+				      "or you are unsure what this means choose "
+				      "another name with the '--name' option."),
+				      info->realrepo);
+				return 1;
+			} else {
+				printf(_("Reactivating local git directory for "
+					 "submodule '%s'."), info->sm_path);
+			}
+		}
+		free(submodule_git_dir);
+
+		strvec_push(&clone_args, "clone");
+
+		if (info->quiet)
+			strvec_push(&clone_args, "--quiet");
+
+		if (info->progress)
+			strvec_push(&clone_args, "--progress");
+
+		if (info->prefix)
+			strvec_pushl(&clone_args, "--prefix", info->prefix, NULL);
+		strvec_pushl(&clone_args, "--path", info->sm_path, "--name",
+			     info->sm_name, "--url", info->realrepo, NULL);
+		if (info->reference_path)
+			strvec_pushl(&clone_args, "--reference",
+				     info->reference_path, NULL);
+		if (info->dissociate)
+			strvec_push(&clone_args, "--dissociate");
+
+		if (info->depth >= 0)
+			strvec_pushf(&clone_args, "--depth=%d", info->depth);
+
+		if (module_clone(clone_args.nr, clone_args.v, info->prefix)) {
+			strvec_clear(&clone_args);
+			return -1;
+		}
+
+		prepare_submodule_repo_env(&cp.env_array);
+		cp.git_cmd = 1;
+		cp.dir = info->sm_path;
+		strvec_pushl(&cp.args, "checkout", "-f", "-q", NULL);
+
+		if (info->branch) {
+			strvec_pushl(&cp.args, "-B", info->branch, NULL);
+			strvec_pushf(&cp.args, "origin/%s", info->branch);
+		}
+
+		if (run_command(&cp))
+			die(_("unable to checkout submodule '%s'"), info->sm_path);
+	}
+	return 0;
+}
+
+static void config_added_submodule(struct add_data *info)
+{
+	char *key, *var = NULL;
+	struct child_process cp = CHILD_PROCESS_INIT;
+
+	key = xstrfmt("submodule.%s.url", info->sm_name);
+	git_config_set_gently(key, info->realrepo);
+	free(key);
+
+	cp.git_cmd = 1;
+	strvec_pushl(&cp.args, "add", "--no-warn-embedded-repo", NULL);
+	if (info->force)
+		strvec_push(&cp.args, "--force");
+	strvec_pushl(&cp.args, "--", info->sm_path, ".gitmodules", NULL);
+
+	key = xstrfmt("submodule.%s.path", info->sm_name);
+	git_config_set_in_file_gently(".gitmodules", key, info->sm_path);
+	free(key);
+	key = xstrfmt("submodule.%s.url", info->sm_name);
+	git_config_set_in_file_gently(".gitmodules", key, info->repo);
+	free(key);
+	key = xstrfmt("submodule.%s.branch", info->sm_name);
+	if (info->branch)
+		git_config_set_in_file_gently(".gitmodules", key, info->branch);
+	free(key);
+
+	if (run_command(&cp))
+		die(_("failed to add submodule '%s'"), info->sm_path);
+
+	/*
+	 * NEEDSWORK: In a multi-working-tree world, this needs to be
+	 * set in the per-worktree config.
+	 */
+	if (!git_config_get_string("submodule.active", &var) && var) {
+
+		/*
+		 * If the submodule being adding isn't already covered by the
+		 * current configured pathspec, set the submodule's active flag
+		 */
+		if (!is_submodule_active(the_repository, info->sm_path)) {
+			key = xstrfmt("submodule.%s.active", info->sm_name);
+			git_config_set_gently(key, "true");
+			free(key);
+		}
+	} else {
+		key = xstrfmt("submodule.%s.active", info->sm_name);
+		git_config_set_gently(key, "true");
+		free(key);
+	}
+}
+
+static int module_add(int argc, const char **argv, const char *prefix)
+{
+	const char *branch = NULL, *custom_name = NULL, *realrepo = NULL;
+	const char *reference_path = NULL, *repo = NULL, *name = NULL;
+	char *path;
+	int force = 0, quiet = 0, depth = -1, progress = 0, dissociate = 0;
+	struct add_data info = ADD_DATA_INIT;
+	struct strbuf sb = STRBUF_INIT;
+
+	struct option options[] = {
+		OPT_STRING('b', "branch", &branch, N_("branch"),
+			   N_("branch of repository to add as submodule")),
+		OPT_BOOL('f', "force", &force, N_("allow adding an otherwise "
+						  "ignored submodule path")),
+		OPT__QUIET(&quiet, N_("print only error messages")),
+		OPT_BOOL(0, "progress", &progress, N_("force cloning progress")),
+		OPT_STRING(0, "reference", &reference_path, N_("repository"),
+			   N_("reference repository")),
+		OPT_BOOL(0, "dissociate", &dissociate, N_("borrow the objects from reference repositories")),
+		OPT_STRING(0, "name", &custom_name, N_("name"),
+			   N_("sets the submodule’s name to the given string "
+			      "instead of defaulting to its path")),
+		OPT_INTEGER(0, "depth", &depth, N_("depth for shallow clones")),
+		OPT_END()
+	};
+
+	const char *const usage[] = {
+		N_("git submodule--helper add [<options>] [--] [<path>]"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, options, usage, 0);
+
+	if (!is_writing_gitmodules_ok())
+		die(_("please make sure that the .gitmodules file is in the working tree"));
+
+	if (reference_path && !is_absolute_path(reference_path) && prefix)
+		reference_path = xstrfmt("%s%s", prefix, reference_path);
+
+	if (argc == 0 || argc > 2) {
+		usage_with_options(usage, options);
+	} else if (argc == 1) {
+		repo = argv[0];
+		path = guess_dir_name(repo);
+	} else {
+		repo = argv[0];
+		path = xstrdup(argv[1]);
+	}
+
+	if (!is_absolute_path(path) && prefix)
+		path = xstrfmt("%s%s", prefix, path);
+
+	/* assure repo is absolute or relative to parent */
+	if (starts_with_dot_dot_slash(repo) || starts_with_dot_slash(repo)) {
+		char *remote = get_default_remote();
+		char *remoteurl;
+		struct strbuf sb = STRBUF_INIT;
+
+		if (prefix)
+			die(_("relative path can only be used from the toplevel "
+			      "of the working tree"));
+		/* dereference source url relative to parent's url */
+		strbuf_addf(&sb, "remote.%s.url", remote);
+		if (git_config_get_string(sb.buf, &remoteurl))
+			remoteurl = xgetcwd();
+		realrepo = relative_url(remoteurl, repo, NULL);
+
+		free(remoteurl);
+		free(remote);
+	} else if (is_dir_sep(repo[0]) || strchr(repo, ':')) {
+		realrepo = repo;
+	} else {
+		die(_("repo URL: '%s' must be absolute or begin with ./|../"),
+		      repo);
+	}
+
+	/*
+	 * normalize path:
+	 * multiple //; leading ./; /./; /../;
+	 */
+	normalize_path_copy(path, path);
+	/* strip trailing '/' */
+	if (is_dir_sep(path[strlen(path) -1]))
+		path[strlen(path) - 1] = '\0';
+
+	if (check_sm_exists(force, path))
+		return 1;
+
+	strbuf_addstr(&sb, path);
+	if (is_nonbare_repository_dir(&sb)) {
+		struct object_id oid;
+		if (resolve_gitlink_ref(path, "HEAD", &oid) < 0)
+			die(_("'%s' does not have a commit checked out"), path);
+	}
+
+	if (!force) {
+		struct strbuf sb = STRBUF_INIT;
+		struct child_process cp = CHILD_PROCESS_INIT;
+		cp.git_cmd = 1;
+		cp.no_stdout = 1;
+		strvec_pushl(&cp.args, "add", "--dry-run", "--ignore-missing",
+			     "--no-warn-embedded-repo", path, NULL);
+		if (pipe_command(&cp, NULL, 0, NULL, 0, &sb, 0)) {
+			fprintf(stderr, _("%s"), sb.buf);
+			return 1;
+		}
+		strbuf_release(&sb);
+	}
+
+	name = custom_name ? custom_name : path;
+	if (check_submodule_name(name))
+		die(_("'%s' is not a valid submodule name"), name);
+
+	info.prefix = prefix;
+	info.sm_name = name;
+	info.sm_path = path;
+	info.repo = repo;
+	info.realrepo = realrepo;
+	info.reference_path = reference_path;
+	info.branch = branch;
+	info.depth = depth;
+	info.progress = !!progress;
+	info.dissociate = !!dissociate;
+	info.force = !!force;
+	info.quiet = !!quiet;
+
+	if (add_submodule(&info))
+		return 1;
+	config_added_submodule(&info);
+
+	free(path);
+
+	return 0;
+}
+
 #define SUPPORT_SUPER_PREFIX (1<<0)
 
 struct cmd_struct {
@@ -2777,6 +3165,7 @@ static struct cmd_struct commands[] = {
 	{"config", module_config, 0},
 	{"set-url", module_set_url, 0},
 	{"set-branch", module_set_branch, 0},
+	{"add", module_add, SUPPORT_SUPER_PREFIX},
 };
 
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
diff --git a/git-submodule.sh b/git-submodule.sh
index 7ce52872b7..f1cbe4934a 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -146,166 +146,7 @@ cmd_add()
 		shift
 	done
 
-	if ! git submodule--helper config --check-writeable >/dev/null 2>&1
-	then
-		 die "$(eval_gettext "please make sure that the .gitmodules file is in the working tree")"
-	fi
-
-	if test -n "$reference_path"
-	then
-		is_absolute_path "$reference_path" ||
-		reference_path="$wt_prefix$reference_path"
-
-		reference="--reference=$reference_path"
-	fi
-
-	repo=$1
-	sm_path=$2
-
-	if test -z "$sm_path"; then
-		sm_path=$(printf '%s\n' "$repo" |
-			sed -e 's|/$||' -e 's|:*/*\.git$||' -e 's|.*[/:]||g')
-	fi
-
-	if test -z "$repo" || test -z "$sm_path"; then
-		usage
-	fi
-
-	is_absolute_path "$sm_path" || sm_path="$wt_prefix$sm_path"
-
-	# assure repo is absolute or relative to parent
-	case "$repo" in
-	./*|../*)
-		test -z "$wt_prefix" ||
-		die "$(gettext "Relative path can only be used from the toplevel of the working tree")"
-
-		# dereference source url relative to parent's url
-		realrepo=$(git submodule--helper resolve-relative-url "$repo") || exit
-		;;
-	*:*|/*)
-		# absolute url
-		realrepo=$repo
-		;;
-	*)
-		die "$(eval_gettext "repo URL: '\$repo' must be absolute or begin with ./|../")"
-	;;
-	esac
-
-	# normalize path:
-	# multiple //; leading ./; /./; /../; trailing /
-	sm_path=$(printf '%s/\n' "$sm_path" |
-		sed -e '
-			s|//*|/|g
-			s|^\(\./\)*||
-			s|/\(\./\)*|/|g
-			:start
-			s|\([^/]*\)/\.\./||
-			tstart
-			s|/*$||
-		')
-	if test -z "$force"
-	then
-		git ls-files --error-unmatch "$sm_path" > /dev/null 2>&1 &&
-		die "$(eval_gettext "'\$sm_path' already exists in the index")"
-	else
-		git ls-files -s "$sm_path" | sane_grep -v "^160000" > /dev/null 2>&1 &&
-		die "$(eval_gettext "'\$sm_path' already exists in the index and is not a submodule")"
-	fi
-
-	if test -d "$sm_path" &&
-		test -z $(git -C "$sm_path" rev-parse --show-cdup 2>/dev/null)
-	then
-	    git -C "$sm_path" rev-parse --verify -q HEAD >/dev/null ||
-	    die "$(eval_gettext "'\$sm_path' does not have a commit checked out")"
-	fi
-
-	if test -z "$force"
-	then
-	    dryerr=$(git add --dry-run --ignore-missing --no-warn-embedded-repo "$sm_path" 2>&1 >/dev/null)
-	    res=$?
-	    if test $res -ne 0
-	    then
-		 echo >&2 "$dryerr"
-		 exit $res
-	    fi
-	fi
-
-	if test -n "$custom_name"
-	then
-		sm_name="$custom_name"
-	else
-		sm_name="$sm_path"
-	fi
-
-	if ! git submodule--helper check-name "$sm_name"
-	then
-		die "$(eval_gettext "'$sm_name' is not a valid submodule name")"
-	fi
-
-	# perhaps the path exists and is already a git repo, else clone it
-	if test -e "$sm_path"
-	then
-		if test -d "$sm_path"/.git || test -f "$sm_path"/.git
-		then
-			eval_gettextln "Adding existing repo at '\$sm_path' to the index"
-		else
-			die "$(eval_gettext "'\$sm_path' already exists and is not a valid git repo")"
-		fi
-
-	else
-		if test -d ".git/modules/$sm_name"
-		then
-			if test -z "$force"
-			then
-				eval_gettextln >&2 "A git directory for '\$sm_name' is found locally with remote(s):"
-				GIT_DIR=".git/modules/$sm_name" GIT_WORK_TREE=. git remote -v | grep '(fetch)' | sed -e s,^,"  ", -e s,' (fetch)',, >&2
-				die "$(eval_gettextln "\
-If you want to reuse this local git directory instead of cloning again from
-  \$realrepo
-use the '--force' option. If the local git directory is not the correct repo
-or you are unsure what this means choose another name with the '--name' option.")"
-			else
-				eval_gettextln "Reactivating local git directory for submodule '\$sm_name'."
-			fi
-		fi
-		git submodule--helper clone ${GIT_QUIET:+--quiet} ${progress:+"--progress"} --prefix "$wt_prefix" --path "$sm_path" --name "$sm_name" --url "$realrepo" ${reference:+"$reference"} ${dissociate:+"--dissociate"} ${depth:+"$depth"} || exit
-		(
-			sanitize_submodule_env
-			cd "$sm_path" &&
-			# ash fails to wordsplit ${branch:+-b "$branch"...}
-			case "$branch" in
-			'') git checkout -f -q ;;
-			?*) git checkout -f -q -B "$branch" "origin/$branch" ;;
-			esac
-		) || die "$(eval_gettext "Unable to checkout submodule '\$sm_path'")"
-	fi
-	git config submodule."$sm_name".url "$realrepo"
-
-	git add --no-warn-embedded-repo $force "$sm_path" ||
-	die "$(eval_gettext "Failed to add submodule '\$sm_path'")"
-
-	git submodule--helper config submodule."$sm_name".path "$sm_path" &&
-	git submodule--helper config submodule."$sm_name".url "$repo" &&
-	if test -n "$branch"
-	then
-		git submodule--helper config submodule."$sm_name".branch "$branch"
-	fi &&
-	git add --force .gitmodules ||
-	die "$(eval_gettext "Failed to register submodule '\$sm_path'")"
-
-	# NEEDSWORK: In a multi-working-tree world, this needs to be
-	# set in the per-worktree config.
-	if git config --get submodule.active >/dev/null
-	then
-		# If the submodule being adding isn't already covered by the
-		# current configured pathspec, set the submodule's active flag
-		if ! git submodule--helper is-active "$sm_path"
-		then
-			git config submodule."$sm_name".active "true"
-		fi
-	else
-		git config submodule."$sm_name".active "true"
-	fi
+	git ${wt_prefix:+-C "$wt_prefix"} ${prefix:+--super-prefix "$prefix"} submodule--helper add ${force:+--force} ${GIT_QUIET:+--quiet} ${progress:+--progress} ${branch:+--branch "$branch"} ${reference_path:+--reference "$reference_path"} ${dissociate:+--dissociate} ${custom_name:+--name "$custom_name"} ${depth:+"$depth"} -- "$@"
 }
 
 #
diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index fec7e0299d..4ab8298385 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -48,7 +48,7 @@ test_expect_success 'submodule update aborts on missing gitmodules url' '
 
 test_expect_success 'add aborts on repository with no commits' '
 	cat >expect <<-\EOF &&
-	'"'repo-no-commits'"' does not have a commit checked out
+	fatal: '"'repo-no-commits'"' does not have a commit checked out
 	EOF
 	git init repo-no-commits &&
 	test_must_fail git submodule add ../a ./repo-no-commits 2>actual &&
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH v2 3/3] t7400: add test to check 'submodule add' for tracked paths
  2020-10-07  7:45 [PATCH v2 0/3] submodule: port subcommand add from shell to C Shourya Shukla
  2020-10-07  7:45 ` [PATCH v2 1/3] dir: change the scope of function 'directory_exists_in_index()' Shourya Shukla
  2020-10-07  7:45 ` [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C Shourya Shukla
@ 2020-10-07  7:45 ` Shourya Shukla
  2020-11-19  0:03 ` [PATCH v2 0/3] submodule: port subcommand add from shell to C Josh Steadmon
  3 siblings, 0 replies; 16+ messages in thread
From: Shourya Shukla @ 2020-10-07  7:45 UTC (permalink / raw)
  To: git
  Cc: gitster, christian.couder, kaartic.sivaraam, Johannes.Schindelin,
	liu.denton, Shourya Shukla

Add test to check if 'git submodule add' works on paths which are
tracked by Git.

Helped-by: Christian Couder <christian.couder@gmail.com>
Helped-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
---
 t/t7400-submodule-basic.sh | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 4ab8298385..d9317192e0 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -193,6 +193,17 @@ test_expect_success 'submodule add to .gitignored path with --force' '
 	)
 '
 
+test_expect_success 'submodule add to path with tracked contents fails' '
+	(
+		cd addtest-ignore &&
+		mkdir track &&
+		git add -f track &&
+		git commit -m "add tracked path" &&
+		! git submodule add "$submodurl" submod >output 2>&1 &&
+		test_file_not_empty output
+	)
+'
+
 test_expect_success 'submodule add to reconfigure existing submodule with --force' '
 	(
 		cd addtest-ignore &&
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] dir: change the scope of function 'directory_exists_in_index()'
  2020-10-07  7:45 ` [PATCH v2 1/3] dir: change the scope of function 'directory_exists_in_index()' Shourya Shukla
@ 2020-10-07 18:05   ` Junio C Hamano
  2020-10-12 10:11     ` Shourya Shukla
  2020-11-18 23:25   ` Emily Shaffer
  1 sibling, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2020-10-07 18:05 UTC (permalink / raw)
  To: Shourya Shukla
  Cc: git, christian.couder, kaartic.sivaraam, Johannes.Schindelin,
	liu.denton

Shourya Shukla <shouryashukla.oo@gmail.com> writes:

> diff --git a/dir.h b/dir.h
> index a3c40dec51..e46f240528 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -370,6 +370,15 @@ int read_directory(struct dir_struct *, struct index_state *istate,
>  		   const char *path, int len,
>  		   const struct pathspec *pathspec);
>  
> +enum exist_status {
> +	index_nonexistent = 0,
> +	index_directory,
> +	index_gitdir
> +};

These were adequate as private names used within the wall of dir.c,
but I doubt that they are named specific enough to stand out as
public symbols.

Unlike say "index_state" (the name of a struct type), whose nature
is quite global to any code that wants to access the in-core index,
"index_directory" is *NOT* such a name.  It is only of interest to
those who want to see "I have a directory name---does it appear as a
directory in the index?".  It is not even interesting to those who
want to ask similar and related questions like "I have this
pathname---does it appear as anything in the index, and if so what
type of entry is it?".  A worse part of this is that even if such a
helper function file_exists_in_index() were to be written, the
"exist_status" enum won't be usable to return the answer that
question, but yet the enum squats on a perfectly good name to
express "status" for the whole class that it does not represent.

So, NAK.  We need to come up with a better name for these symbols if
we were to expose them to the outside world.  The only good name
this patch makes public is "directory_exists_in_index()", which is
specific enough.

> +enum exist_status directory_exists_in_index(struct index_state *istate,
> +					    const char *dirname, int len);
> +
>  enum pattern_match_result {
>  	UNDECIDED = -1,
>  	NOT_MATCHED = 0,

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C
  2020-10-07  7:45 ` [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C Shourya Shukla
@ 2020-10-07 18:37   ` Junio C Hamano
  2020-10-07 22:19   ` Junio C Hamano
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2020-10-07 18:37 UTC (permalink / raw)
  To: Shourya Shukla
  Cc: git, christian.couder, kaartic.sivaraam, Johannes.Schindelin,
	liu.denton, Prathamesh Chavan, Christian Couder, Stefan Beller

Shourya Shukla <shouryashukla.oo@gmail.com> writes:

> From: Prathamesh Chavan <pc44800@gmail.com>
>
> Convert submodule subcommand 'add' to a builtin and call it via
> 'git-submodule.sh'.
>
> Also, since the command die()s out in case of absence of commits in the
> submodule, the keyword 'fatal' is prefixed in the error messages.
> Therefore, prepend the keyword in the expected output of test t7400.6.
>
> While at it, eliminate the extra preprocessor directive
> `#include "dir.h"` at the start of 'submodule--helper.c'.
>
> Mentored-by: Christian Couder <chriscool@tuxfamily.org>
> Mentored-by: Stefan Beller <stefanbeller@gmail.com>
> Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
> Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
> Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com>
> ---
>  builtin/submodule--helper.c | 391 +++++++++++++++++++++++++++++++++++-
>  git-submodule.sh            | 161 +--------------
>  t/t7400-submodule-basic.sh  |   2 +-
>  3 files changed, 392 insertions(+), 162 deletions(-)

Whoa.  That looks like a huge change.  Makes me wonder if we want
this split into multiple pieces, but let's read on.

> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
> index de5ad73bb8..ec0a50d032 100644
> --- a/builtin/submodule--helper.c
> +++ b/builtin/submodule--helper.c
> @@ -19,7 +19,6 @@
>  #include "diffcore.h"
>  #include "diff.h"
>  #include "object-store.h"
> -#include "dir.h"
>  #include "advice.h"
>  
>  #define OPT_QUIET (1 << 0)
> @@ -2744,6 +2743,395 @@ static int module_set_branch(int argc, const char **argv, const char *prefix)
>  	return !!ret;
>  }
>  
> +struct add_data {
> +	const char *prefix;
> +	const char *branch;
> +	const char *reference_path;
> +	const char *sm_path;
> +	const char *sm_name;
> +	const char *repo;
> +	const char *realrepo;
> +	int depth;
> +	unsigned int force: 1;
> +	unsigned int quiet: 1;
> +	unsigned int progress: 1;
> +	unsigned int dissociate: 1;
> +};
> +#define ADD_DATA_INIT { NULL, NULL, NULL, NULL, NULL, NULL, NULL, 0, 0, 0, 0, 0 }

This is used in a context like this:

	struct add_data data = ADD_DATA_INIT;

It is a tangent, but wouldn't

	#define ADD_DATA_INIT { 0 }

be a more appropriate way to express that there is nothing other
than the initialization to zero values going on?

> +/*
> + * Guess dir name from repository: strip leading '.*[/:]',
> + * strip trailing '[:/]*.git'.
> + */

The original also strips trailing '/'.  The original does these in
order:

 - if $repo ends with '/', remove that.  The above description does
   not mention it.

 - if the result of the above ends with zero or more ':', followed
   by zero or more '/', followed by ".git", drop the matching part.
   The above description sounds as if it will remove ":/:/:.git"
   from the end (and the code seems to have the same bug, as
   after_slash_or_colon won't allow the code to know if the previous
   character before ".git" was slash or colon).

 - if the result of the above has '/' or ':' in it, remove everything
   before it and '/' or ':' itself.

> +static char *guess_dir_name(const char *repo)
> +{
> +	const char *p, *start, *end, *limit;
> +	int after_slash_or_colon;
> +
> +	after_slash_or_colon = 0;
> +	limit = repo + strlen(repo);
> +	start = repo;
> +	end = limit;
> +	for (p = repo; p < limit; p++) {
> +		if (starts_with(p, ".git")) {
> +			/* strip trailing '[:/]*.git' */
> +			if (!after_slash_or_colon)
> +				end = p;
> +			p += 3;
> +		} else if (*p == '/' || *p == ':') {
> +			/* strip leading '.*[/:]' */
> +			if (end == limit)
> +				end = p;
> +			after_slash_or_colon = 1;
> +		} else if (after_slash_or_colon) {
> +			start = p;
> +			end = limit;
> +			after_slash_or_colon = 0;
> +		}
> +	}
> +	return xstrndup(start, end - start);

So, this looks quite bogus and unnatural.  Checking for ".git" at
every position in the string is meaningless.

I wonder if something along the following (beware: not even compile
tested or checked for off-by-ones) would be easier to follow and
more faithful conversion to the original.

	ep = repo + strlen(repo);

        /*
         * eat trailing slashes - a conversion less faithful to
         * the original may want to loop to cull duplicated trailing
	 * slashes, but we can leave it as user-error for now.
	 */
	if (repo < ep - 1 && ep[-1] == '/')
		ep--;

	/* eat ":*/*\.git" at the tail */
	if (repo < ep - 4 && !memcmp(".git", ep - 4, 4)) {
		ep -= 4;
		while (repo < ep - 1 && ep[-1] == '/')
			ep--;
		while (repo < ep - 1 && ep[-1] == ':')
			ep--;
	}

	/* find the last ':' or '/' */
	for (sp = ep - 1; repo <= sp; sp--) {
		if (*sp == '/' || *sp == ':')
			break;
	}
        sp++; /* exclude '/' or ':' itself */

        /* sp point at the beginning, and ep points at the end */
	return xmemdupz(sp, ep - sp);

> +}

That's it for now; I didn't look at the remainder of this patch
during this sitting before I have to move on, but I may revisit the
rest at some other time.

Thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C
  2020-10-07  7:45 ` [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C Shourya Shukla
  2020-10-07 18:37   ` Junio C Hamano
@ 2020-10-07 22:19   ` Junio C Hamano
  2020-10-08 17:19   ` Junio C Hamano
  2020-10-09  5:09   ` Junio C Hamano
  3 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2020-10-07 22:19 UTC (permalink / raw)
  To: Shourya Shukla
  Cc: git, christian.couder, kaartic.sivaraam, Johannes.Schindelin,
	liu.denton, Prathamesh Chavan, Christian Couder, Stefan Beller

Shourya Shukla <shouryashukla.oo@gmail.com> writes:

> +static void fprintf_submodule_remote(const char *str)

The fact that the helper happens to use fprintf() to do its job is
much less important than it writes to the standard error stream.
Name it after what it does than how it does so.  Is there a word
that explains at a higher-level concept than "print to stderr" that
this function tries to achieve?  

Same question for the name of the only caller of this function,
modify_remote_v().  That name does not mean anything to readers
other than that it futz with output from "remote -v" command, which
is the least interesting piece of information.  What does it try to
achieve by using "remote -v"?  Can we name the function after that?

> +{
> +	const char *p = str;
> +	const char *start;
> +	const char *end;
> +	char *name, *url;
> +
> +	start = p;
> +	while (*p != ' ')
> +		p++;
> +	end = p;
> +	name = xstrndup(start, end - start);
> +	while(*p == ' ')
> +		p++;

Perhaps make a small helper out of these seven lines, so that the
caller can say something like

    p = str;
    name = parse_token(&p);
    url = parse_token(&p);

This one you should be able to do without any extra allocation,
though.  Just write a parse_token() that finds start and length,
prepare "char *name; int namelen" and the same pair for URL,
and then

	fprintf(stderr, "  %.*s\t%.*s\n",
		namelen, name, urllen, url);

> +	fprintf(stderr, "  %s\t%s\n", name, url);
> +	free(name);
> +	free(url);
> +}

> +static int check_sm_exists(unsigned int force, const char *path) {
> +
> +	int cache_pos, dir_in_cache = 0;

Have a blank line here to separate decl and the first statement.

> +	if (read_cache() < 0)
> +		die(_("index file corrupt"));
> +
> +	cache_pos = cache_name_pos(path, strlen(path));
> +	if(cache_pos < 0 && (directory_exists_in_index(&the_index,
> +	   path, strlen(path)) == index_directory))
> +		dir_in_cache = 1;

Funny line wrapping.  Try to cut long line at an operator as close
to the root of the parse tree (in this case, &&) as possible, i.e.

	if (cache_pos < 0 &&
	    directory_exists_in_index(&the_index, path, strlen(path)) == index_directory)

It is OK to further wrap after == if the second line bothers you.

A bigger question.  Can the path be a regular file but at a higher
stage because we are in the middle of a conflicted merge?  We'd get
cache_pos that is negative in that case, too, and we definitely would
want to say the path already exists in the index in such a case, but ...

> +
> +	if (!force) {
> +		if (cache_pos >= 0 || dir_in_cache)
> +			die(_("'%s' already exists in the index"), path);

... the current code may not trigger this die() in such a case, no?

> +	} else {
> +		struct cache_entry *ce = NULL;
> +		if (cache_pos >= 0)
> +			ce = the_index.cache[cache_pos];
> +		if (dir_in_cache || (ce && !S_ISGITLINK(ce->ce_mode)))
> +			die(_("'%s' already exists in the index and is not a "
> +			      "submodule"), path);

Likewise here.  cache_pos < 0 does not automatically mean it does
not exist.  It tells you that it does not exist as a merged entry.

> +	}
> +	return 0;
> +}
> +
> +static void modify_remote_v(struct strbuf *sb)

This roughly corresponds to this part of the original

    grep '(fetch)' | sed -e s,^,"  ", -e s,' (fetch)',,

I actualy would suggest moving the "git remote -v" invocation and
capturing of its output to this helper function and name it after
what it does, which seems to be "show fetch remotes" to me.

> +{
> +	int i;
> +	for (i = 0; i < sb->len; i++) {
> +		const char *start = sb->buf + i;
> +		const char *end = start;
> +		while (sb->buf[i++] != '\n')
> +			end++;
> +		if (!strcmp("fetch", xstrndup(end - 6, 5)))

The original makes sure the 'fetch' appears inside "()" but this
does not.  Any reason why we want to do it differently?

> +			fprintf_submodule_remote(xstrndup(start, end - start - 7));

The result of xstrndup() is leaking here.

In any case, with a helper function like parse_token() suggested
before, you can get rid of the fprintf_submodule_remote() helper and
open code it here, without any temporary allocation and freeing.
You'd have the start of each line of "git remote -v" output (so you
know where it starts and it ends), and a parser that roughly does
this:

	/*
	 * cp points at the current location, and ep points the
	 * end of the buffer.  find the tail of the current string
	 * and store its length in *len, skip over whitespaces and
	 * return the location to be used as the new cp.
	 */
	const char *parse_token(char *cp, char *ep, int *len);

and make the latter half of this function (former half would be
spawning "remote -v" and capturing its output in sb) a loop whose
body may look like

	{
		char *end, *name, *url, *tail;
		int namelen, urllen;

		end = strchrnul(start, '\n');
		name = start;
		url = parse_token(name, end, &namelen);
		tail = parse_token(url, end, &urllen);
		if (!memcmp(tail, "(fetch)", 7))
			fprintf(stderr, ...); /* see above */
		start = *end ? end + 1 : end;
	}

> +static int add_submodule(struct add_data *info)
> +{
> +	/* perhaps the path exists and is already a git repo, else clone it */
> +	if (is_directory(info->sm_path)) {
> +		char *sub_git_path = xstrfmt("%s/.git", info->sm_path);
> +		if (is_directory(sub_git_path) || file_exists(sub_git_path))
> +			printf(_("Adding existing repo at '%s' to the index\n"),
> +				 info->sm_path);
> +		else
> +			die(_("'%s' already exists and is not a valid git repo"),
> +			      info->sm_path);
> +		free(sub_git_path);
> +	} else {
> +		struct strvec clone_args = STRVEC_INIT;
> +		struct child_process cp = CHILD_PROCESS_INIT;
> +		char *submodule_git_dir = xstrfmt(".git/modules/%s", info->sm_name);
> +
> +		if (is_directory(submodule_git_dir)) {
> +			if (!info->force) {

As I said, it would be a better organization to have a helper
function that does what is done from here ...

> +				struct child_process cp_rem = CHILD_PROCESS_INIT;
> +				struct strbuf sb_rem = STRBUF_INIT;
> +				cp_rem.git_cmd = 1;
> +				fprintf(stderr, _("A git directory for '%s' is "
> +					"found locally with remote(s):\n"),
> +					info->sm_name);
> +				strvec_pushf(&cp_rem.env_array,
> +					     "GIT_DIR=%s", submodule_git_dir);
> +				strvec_push(&cp_rem.env_array, "GIT_WORK_TREE=.");
> +				strvec_pushl(&cp_rem.args, "remote", "-v", NULL);
> +				if (!capture_command(&cp_rem, &sb_rem, 0)) {
> +					modify_remote_v(&sb_rem);
> +				}

... up to here.  Can you say what the purpose of that helper
function?  I'd say it is for a given git repository (specified by
submodule_git_dir, which we will pass as the parameter to that
helper), report the names and URLs of fetch remotes defined in that
repository.  So, perhaps its signature might be:

	static void report_fetch_remotes(FILE *output, const char *git_dir);

where we would make a call to it from here like so:

				report_fetch_remotes(stderr, submodule_git_dir);

> +				error(_("If you want to reuse this local git "
> +				      "directory instead of cloning again from\n "
> +				      "  %s\n"
> +				      "use the '--force' option. If the local "
> +				      "git directory is not the correct repo\n"
> +				      "or you are unsure what this means choose "
> +				      "another name with the '--name' option."),
> +				      info->realrepo);
> +				return 1;
> +			} else {
> +				printf(_("Reactivating local git directory for "
> +					 "submodule '%s'."), info->sm_path);
> +			}
> +		}
> +		free(submodule_git_dir);

I think I've reviewed up to this point this time around.

Thanks.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C
  2020-10-07  7:45 ` [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C Shourya Shukla
  2020-10-07 18:37   ` Junio C Hamano
  2020-10-07 22:19   ` Junio C Hamano
@ 2020-10-08 17:19   ` Junio C Hamano
  2020-10-09  5:09   ` Junio C Hamano
  3 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2020-10-08 17:19 UTC (permalink / raw)
  To: Shourya Shukla
  Cc: git, christian.couder, kaartic.sivaraam, Johannes.Schindelin,
	liu.denton, Prathamesh Chavan, Christian Couder, Stefan Beller

Shourya Shukla <shouryashukla.oo@gmail.com> writes:

> +static void config_added_submodule(struct add_data *info)
> +{

This one I may take a look at later, but won't review in this
message.

> +}
> +
> +static int module_add(int argc, const char **argv, const char *prefix)
> +{
> +	const char *branch = NULL, *custom_name = NULL, *realrepo = NULL;
> +	const char *reference_path = NULL, *repo = NULL, *name = NULL;
> +	char *path;
> +	int force = 0, quiet = 0, depth = -1, progress = 0, dissociate = 0;
> +	struct add_data info = ADD_DATA_INIT;
> +	struct strbuf sb = STRBUF_INIT;
> +
> +	struct option options[] = {
> +		OPT_STRING('b', "branch", &branch, N_("branch"),
> +			   N_("branch of repository to add as submodule")),
> +		OPT_BOOL('f', "force", &force, N_("allow adding an otherwise "
> +						  "ignored submodule path")),
> +		OPT__QUIET(&quiet, N_("print only error messages")),
> +		OPT_BOOL(0, "progress", &progress, N_("force cloning progress")),
> +		OPT_STRING(0, "reference", &reference_path, N_("repository"),
> +			   N_("reference repository")),
> +		OPT_BOOL(0, "dissociate", &dissociate, N_("borrow the objects from reference repositories")),
> +		OPT_STRING(0, "name", &custom_name, N_("name"),
> +			   N_("sets the submodule’s name to the given string "
> +			      "instead of defaulting to its path")),
> +		OPT_INTEGER(0, "depth", &depth, N_("depth for shallow clones")),
> +		OPT_END()
> +	};
> +
> +	const char *const usage[] = {
> +		N_("git submodule--helper add [<options>] [--] [<path>]"),
> +		NULL
> +	};
> +
> +	argc = parse_options(argc, argv, prefix, options, usage, 0);
> +
> +	if (!is_writing_gitmodules_ok())
> +		die(_("please make sure that the .gitmodules file is in the working tree"));
> +
> +	if (reference_path && !is_absolute_path(reference_path) && prefix)

Checking "*prefix" lets us avoid an unnecessary allocation, i.e.

	if (prefix && *prefix &&
	    reference_path && !is_absolute_path(reference_path))

> +		reference_path = xstrfmt("%s%s", prefix, reference_path);
> +
> +	if (argc == 0 || argc > 2) {

Nice that you are checking excess args, which the original didn't do.

> +		usage_with_options(usage, options);
> +	} else if (argc == 1) {
> +		repo = argv[0];
> +		path = guess_dir_name(repo);

We've reviewed the function already.  Good.

> +	} else {
> +		repo = argv[0];
> +		path = xstrdup(argv[1]);

OK.  So after this if/else if/else cascade, path is an allocated
piece of memory we could later free() whichever branch is taken.

> +	}
> +
> +	if (!is_absolute_path(path) && prefix)
> +		path = xstrfmt("%s%s", prefix, path);

This also makes path freeable, but the original path is leaked.

	if (prefix && *prefix && !is_absolute_path(path)) {
		free(path);
		path = xstrfmt(...);
	}

Is there a reason (does not have to be a strong reason) why we use
'path', not 'sm_path', as the variable name that corresponds to
$sm_path in the original, by the way?

> +	/* assure repo is absolute or relative to parent */
> +	if (starts_with_dot_dot_slash(repo) || starts_with_dot_slash(repo)) {
> +		char *remote = get_default_remote();
> +		char *remoteurl;
> +		struct strbuf sb = STRBUF_INIT;
> +
> +		if (prefix)
> +			die(_("relative path can only be used from the toplevel "
> +			      "of the working tree"));

This is 'git submodule--helper resolve-relative-url "$repo"' in the
original.

> +		/* dereference source url relative to parent's url */
> +		strbuf_addf(&sb, "remote.%s.url", remote);
> +		if (git_config_get_string(sb.buf, &remoteurl))
> +			remoteurl = xgetcwd();
> +		realrepo = relative_url(remoteurl, repo, NULL);

And this is copied-and-pasted from resolve_relative_url() function
found in builtins/submodule--helper.c.

relative_url() returns an allocated memory so we can free() realrepo
if we took this branch.

> +		free(remoteurl);
> +		free(remote);
> +	} else if (is_dir_sep(repo[0]) || strchr(repo, ':')) {
> +		realrepo = repo;

This repo came from argv[0] so we cannot free realrepo if we took
this branch.  Are we willing to leak realrepo we obtained from the
other branch?

> +	} else {
> +		die(_("repo URL: '%s' must be absolute or begin with ./|../"),
> +		      repo);
> +	}
> +
> +	/*
> +	 * normalize path:
> +	 * multiple //; leading ./; /./; /../;
> +	 */
> +	normalize_path_copy(path, path);

It's nice that a handy (almost) equivalent helper is already
available ;-)

> +	/* strip trailing '/' */
> +	if (is_dir_sep(path[strlen(path) -1]))
> +		path[strlen(path) - 1] = '\0';

The original dealt with multiple trailing '/' but this one does not.
Shouldn't it loop starting at the end?

> +	if (check_sm_exists(force, path))
> +		return 1;

OK.  I think we reviewed the function.  Seeing it in the context of
the calling site makes us realize that it has a wrong name.  "check
submodule exists" sounds as if we expect a submodule to exist at the
path, and it is an error for a submodule not to be there, but that
is not what this caller (which is the only caller of the helper)
wants to check.  And more importantly, the helper reacts to anything
sitting at the path, not just submoudle.

So what does the helper really do?  I think it checks if it is OK to
create a submodule there.  IOW, "exists" part of the name is what
makes it a misnomer.  Perhaps "can_create_submodule()"?

> +	strbuf_addstr(&sb, path);
> +	if (is_nonbare_repository_dir(&sb)) {
> +		struct object_id oid;
> +		if (resolve_gitlink_ref(path, "HEAD", &oid) < 0)
> +			die(_("'%s' does not have a commit checked out"), path);
> +	}
> +
> +	if (!force) {
> +		struct strbuf sb = STRBUF_INIT;
> +		struct child_process cp = CHILD_PROCESS_INIT;
> +		cp.git_cmd = 1;
> +		cp.no_stdout = 1;
> +		strvec_pushl(&cp.args, "add", "--dry-run", "--ignore-missing",
> +			     "--no-warn-embedded-repo", path, NULL);
> +		if (pipe_command(&cp, NULL, 0, NULL, 0, &sb, 0)) {
> +			fprintf(stderr, _("%s"), sb.buf);

Sorry, but I cannot guess what this _("%s") is trying to achieve.
Shouldn't it be
			strbuf_complete_line(&sb);
			fputs(sb.buf, stderr);
instead?

> +			return 1;

The original honors the exit code from the dry-run and relays it to
the user.  Is this a regression, or nobody care what exit status
they get as long as it is not zero?

> +		}
> +		strbuf_release(&sb);
> +	}
> +
> +	name = custom_name ? custom_name : path;
> +	if (check_submodule_name(name))
> +		die(_("'%s' is not a valid submodule name"), name);
> +
> +	info.prefix = prefix;
> +	info.sm_name = name;
> +	info.sm_path = path;
> +	info.repo = repo;
> +	info.realrepo = realrepo;
> +	info.reference_path = reference_path;
> +	info.branch = branch;
> +	info.depth = depth;
> +	info.progress = !!progress;
> +	info.dissociate = !!dissociate;
> +	info.force = !!force;
> +	info.quiet = !!quiet;
> +
> +	if (add_submodule(&info))
> +		return 1;

I think we've reviewed this funciton already.

> +	config_added_submodule(&info);
> +
> +	free(path);

Looking a bit uneven wrt to leak handling.

> +	return 0;
> +}

Thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C
  2020-10-07  7:45 ` [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C Shourya Shukla
                     ` (2 preceding siblings ...)
  2020-10-08 17:19   ` Junio C Hamano
@ 2020-10-09  5:09   ` Junio C Hamano
  2020-11-18 23:13     ` Jonathan Tan
  3 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2020-10-09  5:09 UTC (permalink / raw)
  To: Shourya Shukla
  Cc: git, christian.couder, kaartic.sivaraam, Johannes.Schindelin,
	liu.denton, Prathamesh Chavan, Christian Couder, Stefan Beller

Shourya Shukla <shouryashukla.oo@gmail.com> writes:

> +static void config_added_submodule(struct add_data *info)
> +{
> +	char *key, *var = NULL;
> +	struct child_process cp = CHILD_PROCESS_INIT;
> +
> +	key = xstrfmt("submodule.%s.url", info->sm_name);
> +	git_config_set_gently(key, info->realrepo);
> +	free(key);
> +
> +	cp.git_cmd = 1;
> +	strvec_pushl(&cp.args, "add", "--no-warn-embedded-repo", NULL);
> +	if (info->force)
> +		strvec_push(&cp.args, "--force");
> +	strvec_pushl(&cp.args, "--", info->sm_path, ".gitmodules", NULL);

Hmph, you lost me.  I think this ought to correspond to this part of
the original:

	git add --no-warn-embedded-repo $force "$sm_path" ||
	die "$(eval_gettext "Failed to add submodule '\$sm_path'")"

I can see that adding "--" before $sm_path may be an improvement,
but why do we also add .gitmodules here, and ...

> +	key = xstrfmt("submodule.%s.path", info->sm_name);
> +	git_config_set_in_file_gently(".gitmodules", key, info->sm_path);
> +	free(key);
> +	key = xstrfmt("submodule.%s.url", info->sm_name);
> +	git_config_set_in_file_gently(".gitmodules", key, info->repo);
> +	free(key);
> +	key = xstrfmt("submodule.%s.branch", info->sm_name);
> +	if (info->branch)
> +		git_config_set_in_file_gently(".gitmodules", key, info->branch);
> +	free(key);

... perform quite a lot of configuration writing, before actually
spawning the "git add" and make sure it succeeds?  The original
won't futz with any of these .gitmodules entries if "git add" of the
$sm_path fails and that is a good discipline to follow.

> +	if (run_command(&cp))
> +		die(_("failed to add submodule '%s'"), info->sm_path);
> +
> +	/*
> +	 * NEEDSWORK: In a multi-working-tree world, this needs to be
> +	 * set in the per-worktree config.
> +	 */
> +	if (!git_config_get_string("submodule.active", &var) && var) {

What if this were a valueless true ("[submodule] active\n" without
"= true")?  Wouldn't get_string() fail?

> +		/*
> +		 * If the submodule being adding isn't already covered by the
> +		 * current configured pathspec, set the submodule's active flag
> +		 */
> +		if (!is_submodule_active(the_repository, info->sm_path)) {
> +			key = xstrfmt("submodule.%s.active", info->sm_name);
> +			git_config_set_gently(key, "true");
> +			free(key);
> +		}
> +	} else {
> +		key = xstrfmt("submodule.%s.active", info->sm_name);
> +		git_config_set_gently(key, "true");
> +		free(key);
> +	}
> +}
> +
> + ...
> +	info.prefix = prefix;
> +	info.sm_name = name;
> +	info.sm_path = path;
> +	info.repo = repo;
> +	info.realrepo = realrepo;
> +	info.reference_path = reference_path;
> +	info.branch = branch;
> +	info.depth = depth;
> +	info.progress = !!progress;
> +	info.dissociate = !!dissociate;
> +	info.force = !!force;
> +	info.quiet = !!quiet;
> +
> +	if (add_submodule(&info))
> +		return 1;
> +	config_added_submodule(&info);

Whew.

This was way too big to be reviewed in a single sitting.  I do not
know offhand if there is a better way to structure the changes into
a more digestible pieces to help prevent reviewers from overlooking
potential mistakes, though.

Thanks.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] dir: change the scope of function 'directory_exists_in_index()'
  2020-10-07 18:05   ` Junio C Hamano
@ 2020-10-12 10:11     ` Shourya Shukla
  0 siblings, 0 replies; 16+ messages in thread
From: Shourya Shukla @ 2020-10-12 10:11 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, christian.couder, kaartic.sivaraam, Johannes.Schindelin,
	liu.denton

On 07/10 11:05, Junio C Hamano wrote:
> Shourya Shukla <shouryashukla.oo@gmail.com> writes:
> 
> > diff --git a/dir.h b/dir.h
> > index a3c40dec51..e46f240528 100644
> > --- a/dir.h
> > +++ b/dir.h
> > @@ -370,6 +370,15 @@ int read_directory(struct dir_struct *, struct index_state *istate,
> >  		   const char *path, int len,
> >  		   const struct pathspec *pathspec);
> >  
> > +enum exist_status {
> > +	index_nonexistent = 0,
> > +	index_directory,
> > +	index_gitdir
> > +};
> 
> These were adequate as private names used within the wall of dir.c,
> but I doubt that they are named specific enough to stand out as
> public symbols.
> 
> Unlike say "index_state" (the name of a struct type), whose nature
> is quite global to any code that wants to access the in-core index,
> "index_directory" is *NOT* such a name.  It is only of interest to
> those who want to see "I have a directory name---does it appear as a
> directory in the index?".  It is not even interesting to those who
> want to ask similar and related questions like "I have this
> pathname---does it appear as anything in the index, and if so what
> type of entry is it?".  A worse part of this is that even if such a
> helper function file_exists_in_index() were to be written, the
> "exist_status" enum won't be usable to return the answer that
> question, but yet the enum squats on a perfectly good name to
> express "status" for the whole class that it does not represent.
> 
> So, NAK.  We need to come up with a better name for these symbols if
> we were to expose them to the outside world.  The only good name
> this patch makes public is "directory_exists_in_index()", which is
> specific enough.

Understandable. So, how would it be if the function I wrote in
'submodule--helper.c' could instead be written here (in dir.c) and
would help to check if the path is there in the index or not. It could be anything
ranging from a filename to a SM. Since the return type will be an
integer, this PATCH 1/3 can be eliminated and we won't need to change
the scope of the 'directory_exists_in_index()' function and the enum can
stay visible only in dir.c

What are your thoughts on this?


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C
  2020-10-09  5:09   ` Junio C Hamano
@ 2020-11-18 23:13     ` Jonathan Tan
  2020-11-19  7:44       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 16+ messages in thread
From: Jonathan Tan @ 2020-11-18 23:13 UTC (permalink / raw)
  To: gitster
  Cc: shouryashukla.oo, git, christian.couder, kaartic.sivaraam,
	Johannes.Schindelin, liu.denton, pc44800, chriscool, stefanbeller,
	Jonathan Tan

> Whew.
> 
> This was way too big to be reviewed in a single sitting.  I do not
> know offhand if there is a better way to structure the changes into
> a more digestible pieces to help prevent reviewers from overlooking
> potential mistakes, though.
> 
> Thanks.

I just took a look at this, and one thing that would have helped is if
you ported the end of the function first in a commit, and work your way
backwards (in one or more commits).

After reading through the whole thing, I saw that this is mostly a
straightforward start-to-finish port (besides factoring out code into
functions), but it would be much easier for reviewers to conceptualize
and discuss the different parts if they were already divided.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 1/3] dir: change the scope of function 'directory_exists_in_index()'
  2020-10-07  7:45 ` [PATCH v2 1/3] dir: change the scope of function 'directory_exists_in_index()' Shourya Shukla
  2020-10-07 18:05   ` Junio C Hamano
@ 2020-11-18 23:25   ` Emily Shaffer
  1 sibling, 0 replies; 16+ messages in thread
From: Emily Shaffer @ 2020-11-18 23:25 UTC (permalink / raw)
  To: Shourya Shukla
  Cc: git, gitster, christian.couder, kaartic.sivaraam,
	Johannes.Schindelin, liu.denton

Hi,

On Wed, Oct 07, 2020 at 01:15:36PM +0530, Shourya Shukla wrote:
> 
> Change the scope of the function 'directory_exists_in_index()' as well
> as declare it in 'dir.h'.
> 
> Since the return type of the function is the enumerator 'exist_status',
> change its scope as well and declare it in 'dir.h'.

I don't have comments about the diff itself beyond what Junio mentioned
- it's very simple. But I do think this commit message needs a rewrite.

Your commit message summarizes the diff - which isn't useful, because
the diff itself is very simple. But what it fails to do is what I'm a
lot more interested in, reading this change: *why* do you want to make
this function and enum reusable? I think you mention it in the cover
letter, but it's not explained at all here.

Explaining the motivation in the cover letter also would help us
understand whether it is better to make the enum public, like your diff
proposes, or to wrap or change the function and avoid exposing the enum,
like you suggested in reply to Junio's comment.

Lastly, saying something like "This change is needed so that git commit
can sort ducks by feather length" helps avoid
https://en.wikipedia.org/wiki/XY_problem - that is, maybe we already
have another tool which is more appropriate, and which you missed; and
knowing your motivation, someone can point you in that direction
instead.

The same comment holds true for your patch 3, as well.

Thanks for your effort on this series.

 - Emily

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 0/3] submodule: port subcommand add from shell to C
  2020-10-07  7:45 [PATCH v2 0/3] submodule: port subcommand add from shell to C Shourya Shukla
                   ` (2 preceding siblings ...)
  2020-10-07  7:45 ` [PATCH v2 3/3] t7400: add test to check 'submodule add' for tracked paths Shourya Shukla
@ 2020-11-19  0:03 ` Josh Steadmon
  3 siblings, 0 replies; 16+ messages in thread
From: Josh Steadmon @ 2020-11-19  0:03 UTC (permalink / raw)
  To: Shourya Shukla
  Cc: git, gitster, christian.couder, kaartic.sivaraam,
	Johannes.Schindelin, liu.denton

Hi Shourya,

Thank you for this series! Please see the comments below:


On 2020.10.07 13:15, Shourya Shukla wrote:
> Hello all,
> 
> This is the v2 of the patch with the same title, delivered more than a
> month ago as a part of my GSoC. Link to v1:
> https://lore.kernel.org/git/20200824090359.403944-1-shouryashukla.oo@gmail.com/

Since GSoC has ended for the year, I wanted to point out the
git-mentoring@googlegroups.com list, where you can find additional
mentors if you like.

> The changelog is as follows:
> 
>     1. Introduce PATCH[1/3](dir: change the scope of function
>        'directory_exists_in_index()', 2020-10-06). This was done since
>        the above mentioned function will be used in the patch that
>        follows.
> 
>     2. There are multiple changes in this commit:
> 
>             A. Improve the part which checks if the 'path' given as
>                argument exists or not. Implementing Kaartic's
>                suggestions on the patch, I had to make sure that the
>                case for checking if the path has tracked contents or
>                not also works.
> 
>             B. Also, wrap the aforementioned segment in a function
>                since it became very long. The function is called
>                'check_sm_exists()'.
> 
>             C. Also, use the function 'is_nonbare_repository_dir()'
>                instead of 'is_directory()' when trying to resolve
>                gitlink.
> 
>             D. Append keyword 'fatal' in front of the expected output of
>                test t7400.6 since the command die()s out in case of
>                absence of commits in a submodule.
> 
>             E. Remove the extra `#include "dir.h"` from
>                'submodule--helper.c'.
> 
>     3. Introduce PATCH[3/3] (t7400: add test to check 'submodule add'
>        for tracked paths, 2020-10-07). Kaartic pointed out that a test
>        for path with tracked contents did not exist and hence it was
>        necessary to write one. Therefore, this commit introduces a new
>        test 't7400.18: submodule add to path with tracked contents
>        fails'.

Generally, we want to avoid describing in detail what the code does;
hopefully, the code can speak for itself. It may be a better use of the
cover letter to describe the motivation for the series as a whole.
Reviewers will not necessarily have background on what you want to
accomplish. We came up with a few factors that might have inspired this
change, but we're not sure which you intended to address:

* Increase efficiency by reducing the number of processes forked and the
  use of the shell.

* Make the submodule code easier to maintain (since the project probably
  has more C experts than shell experts).

* Improve the user experience with submodules by giving the
  submodule-add code access to C internals, and vice versa.

Knowing what you want to accomplish can make it easier for reviewers. Of
course, you'll also want to include important context in your commit
messages as well, so that it's available in the history if future
debugging is necessary.


Thanks again for the series, and please feel free to follow up if you
have any questions
-- Josh

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C
  2020-11-18 23:13     ` Jonathan Tan
@ 2020-11-19  7:44       ` Ævar Arnfjörð Bjarmason
  2020-11-19 12:38         ` Johannes Schindelin
  0 siblings, 1 reply; 16+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2020-11-19  7:44 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: gitster, shouryashukla.oo, git, christian.couder,
	kaartic.sivaraam, Johannes.Schindelin, liu.denton, pc44800,
	chriscool, stefanbeller


On Thu, Nov 19 2020, Jonathan Tan wrote:

>> Whew.
>> 
>> This was way too big to be reviewed in a single sitting.  I do not
>> know offhand if there is a better way to structure the changes into
>> a more digestible pieces to help prevent reviewers from overlooking
>> potential mistakes, though.
>> 
>> Thanks.
>
> I just took a look at this, and one thing that would have helped is if
> you ported the end of the function first in a commit, and work your way
> backwards (in one or more commits).
>
> After reading through the whole thing, I saw that this is mostly a
> straightforward start-to-finish port (besides factoring out code into
> functions), but it would be much easier for reviewers to conceptualize
> and discuss the different parts if they were already divided.

Having done some minor changes to git-submodule.sh recently, I wondered
if we weren't at the point where it would be a nice approach to invert
the C/sh helper relationship.

I.e. write git-submodule.c, which would be the small entry point, it
would then mostly dispatch to a submodule--helper, which would in turn
mostly dispatch to a new submodule--helper-sh (containing most of the
current git-submodule.sh code), which in turn would re-dispatch to the C
submodule--helper (which as an aside, then sometimes calls itself via
process invocation).

It's quite a bit of spaghetti code, but means that there's a straighter
path to porting some of the setup code such as the "--check-writeable",
is_absolute_path() etc. being changed at the start of the change here to
git-submodule.sh.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C
  2020-11-19  7:44       ` Ævar Arnfjörð Bjarmason
@ 2020-11-19 12:38         ` Johannes Schindelin
  2020-11-19 20:37           ` Junio C Hamano
  0 siblings, 1 reply; 16+ messages in thread
From: Johannes Schindelin @ 2020-11-19 12:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jonathan Tan, gitster, shouryashukla.oo, git, christian.couder,
	kaartic.sivaraam, liu.denton, pc44800, chriscool, stefanbeller

[-- Attachment #1: Type: text/plain, Size: 2412 bytes --]

Hi Ævar,

On Thu, 19 Nov 2020, Ævar Arnfjörð Bjarmason wrote:

>
> On Thu, Nov 19 2020, Jonathan Tan wrote:
>
> >> Whew.
> >>
> >> This was way too big to be reviewed in a single sitting.  I do not
> >> know offhand if there is a better way to structure the changes into
> >> a more digestible pieces to help prevent reviewers from overlooking
> >> potential mistakes, though.
> >>
> >> Thanks.
> >
> > I just took a look at this, and one thing that would have helped is if
> > you ported the end of the function first in a commit, and work your way
> > backwards (in one or more commits).
> >
> > After reading through the whole thing, I saw that this is mostly a
> > straightforward start-to-finish port (besides factoring out code into
> > functions), but it would be much easier for reviewers to conceptualize
> > and discuss the different parts if they were already divided.
>
> Having done some minor changes to git-submodule.sh recently, I wondered
> if we weren't at the point where it would be a nice approach to invert
> the C/sh helper relationship.
>
> I.e. write git-submodule.c, which would be the small entry point, it
> would then mostly dispatch to a submodule--helper, which would in turn
> mostly dispatch to a new submodule--helper-sh (containing most of the
> current git-submodule.sh code), which in turn would re-dispatch to the C
> submodule--helper (which as an aside, then sometimes calls itself via
> process invocation).
>
> It's quite a bit of spaghetti code, but means that there's a straighter
> path to porting some of the setup code such as the "--check-writeable",
> is_absolute_path() etc. being changed at the start of the change here to
> git-submodule.sh.

Looking at
https://github.com/gitgitgadget/git/blob/ss/submodule-add-in-c/git-submodule.sh,
I see that while there are still 794 lines, most of it is just mostly
redundant option parsing. The only function left to convert is
`cmd_update()`, and the first half was already converted to C long ago,
via the `git submodule--helper update-clone` subcommand:
https://github.com/gitgitgadget/git/blob/ss/submodule-add-in-c/git-submodule.sh#L269-L530

At this stage I suspect that having a little more patience would make more
sense. After `cmd_update` is converted, we can simply finish the
conversion, without having to keep the shell script around.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C
  2020-11-19 12:38         ` Johannes Schindelin
@ 2020-11-19 20:37           ` Junio C Hamano
  0 siblings, 0 replies; 16+ messages in thread
From: Junio C Hamano @ 2020-11-19 20:37 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Ævar Arnfjörð Bjarmason, Jonathan Tan,
	shouryashukla.oo, git, christian.couder, kaartic.sivaraam,
	liu.denton, pc44800, chriscool, stefanbeller

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> At this stage I suspect that having a little more patience would make more
> sense. After `cmd_update` is converted, we can simply finish the
> conversion, without having to keep the shell script around.

That matches how I view the current state.  We seem to be reviewer
bandwidth limited and folks who took a look at this series to help
moving it forward certainly deserve a lot of gratitude.

Thanks.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2020-11-19 20:41 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-07  7:45 [PATCH v2 0/3] submodule: port subcommand add from shell to C Shourya Shukla
2020-10-07  7:45 ` [PATCH v2 1/3] dir: change the scope of function 'directory_exists_in_index()' Shourya Shukla
2020-10-07 18:05   ` Junio C Hamano
2020-10-12 10:11     ` Shourya Shukla
2020-11-18 23:25   ` Emily Shaffer
2020-10-07  7:45 ` [PATCH v2 2/3] submodule: port submodule subcommand 'add' from shell to C Shourya Shukla
2020-10-07 18:37   ` Junio C Hamano
2020-10-07 22:19   ` Junio C Hamano
2020-10-08 17:19   ` Junio C Hamano
2020-10-09  5:09   ` Junio C Hamano
2020-11-18 23:13     ` Jonathan Tan
2020-11-19  7:44       ` Ævar Arnfjörð Bjarmason
2020-11-19 12:38         ` Johannes Schindelin
2020-11-19 20:37           ` Junio C Hamano
2020-10-07  7:45 ` [PATCH v2 3/3] t7400: add test to check 'submodule add' for tracked paths Shourya Shukla
2020-11-19  0:03 ` [PATCH v2 0/3] submodule: port subcommand add from shell to C Josh Steadmon

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).