git@vger.kernel.org mailing list mirror (one of many)
* [PATCH 0/3] Convert grep to recurse in-process
@ 2017-07-11 22:04 Brandon Williams
  2017-07-11 22:04 ` [PATCH 1/3] repo_read_index: don't discard the index Brandon Williams
                   ` (4 more replies)
  0 siblings, 5 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-11 22:04 UTC
  To: git; +Cc: sbeller, Brandon Williams

This series uses the new 'struct repository' to convert grep to recurse
into submodules in-process, much like how ls-files was converted to
recurse in-process.  The result is a much smaller code footprint, since
grep no longer needs to compile an argv array of options to pass when
launching a process to operate on a submodule.
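
As a rough illustration of the idea (a toy sketch, not code from this
series): once everything a search needs hangs off a repository handle,
recursing into a submodule can be an ordinary function call.  The names
'toy_repo' and 'toy_grep' below are hypothetical stand-ins, and the
sketch matches on file names rather than file contents to stay short.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for 'struct repository'; not git's definition. */
struct toy_repo {
	const char *submodule_prefix;  /* e.g. "sub/", or NULL at the top level */
	const char **files;            /* NULL-terminated list of file names */
	struct toy_repo **submodules;  /* NULL-terminated list, may be NULL */
};

/*
 * Count the files whose name contains 'needle', printing each hit with
 * the prefix of the repository it lives in.  Submodules are handled by
 * a plain recursive call instead of by spawning a child process.
 */
static int toy_grep(const struct toy_repo *repo, const char *needle)
{
	int hits = 0;
	size_t i;

	assert(repo);
	for (i = 0; repo->files && repo->files[i]; i++) {
		if (strstr(repo->files[i], needle)) {
			printf("%s%s\n",
			       repo->submodule_prefix ? repo->submodule_prefix : "",
			       repo->files[i]);
			hits++;
		}
	}
	for (i = 0; repo->submodules && repo->submodules[i]; i++)
		hits += toy_grep(repo->submodules[i], needle);
	return hits;
}
```

Because the options travel as function arguments, nothing has to be
serialized back into command-line flags for a child process.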

Brandon Williams (3):
  repo_read_index: don't discard the index
  setup: have the_repository use the_index
  grep: recurse in-process using 'struct repository'

 Documentation/git-grep.txt |   7 -
 builtin/grep.c             | 390 +++++++++------------------------------------
 cache.h                    |   1 -
 git.c                      |   2 +-
 grep.c                     |  13 --
 grep.h                     |   1 -
 repository.c               |   2 -
 setup.c                    |  13 +-
 8 files changed, 82 insertions(+), 347 deletions(-)

-- 
2.13.2.932.g7449e964c-goog



* [PATCH 1/3] repo_read_index: don't discard the index
  2017-07-11 22:04 [PATCH 0/3] Convert grep to recurse in-process Brandon Williams
@ 2017-07-11 22:04 ` Brandon Williams
  2017-07-11 23:51   ` Jonathan Nieder
  2017-07-11 23:58   ` Stefan Beller
  2017-07-11 22:04 ` [PATCH 2/3] setup: have the_repository use the_index Brandon Williams
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-11 22:04 UTC
  To: git; +Cc: sbeller, Brandon Williams

Have 'repo_read_index()' behave more like the other functions in the
read_index family: don't discard the index if it has already been
populated.
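
The intended behavior can be modeled with a small toy sketch.  All of
the 'toy_*' names are hypothetical, and the reader here is only assumed
to return early once the index has been populated; this is not the code
in repository.c.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the in-core index state. */
struct toy_index {
	int initialized;  /* set once entries have been loaded */
	int cache_nr;     /* number of entries currently held */
};

struct toy_repo {
	struct toy_index *index;
};

/* Pretend to load three entries from disk, unless already loaded. */
static int toy_read_index_from(struct toy_index *istate)
{
	if (istate->initialized)
		return istate->cache_nr;
	istate->cache_nr = 3;
	istate->initialized = 1;
	return istate->cache_nr;
}

/* Allocate the index on first use; never throw away a populated one. */
static int toy_repo_read_index(struct toy_repo *repo)
{
	assert(repo);
	if (!repo->index) {
		repo->index = calloc(1, sizeof(*repo->index));
		if (!repo->index)
			return -1;
	}
	/* No discard here: an already-populated index is kept as-is. */
	return toy_read_index_from(repo->index);
}
```

In this model a second call returns whatever is already held in memory,
including changes made since the first read, instead of resetting the
state and reloading it.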

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 repository.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/repository.c b/repository.c
index edca90740..8e60af1d5 100644
--- a/repository.c
+++ b/repository.c
@@ -235,8 +235,6 @@ int repo_read_index(struct repository *repo)
 {
 	if (!repo->index)
 		repo->index = xcalloc(1, sizeof(*repo->index));
-	else
-		discard_index(repo->index);
 
 	return read_index_from(repo->index, repo->index_file);
 }
-- 
2.13.2.932.g7449e964c-goog



* [PATCH 2/3] setup: have the_repository use the_index
  2017-07-11 22:04 [PATCH 0/3] Convert grep to recurse in-process Brandon Williams
  2017-07-11 22:04 ` [PATCH 1/3] repo_read_index: don't discard the index Brandon Williams
@ 2017-07-11 22:04 ` Brandon Williams
  2017-07-12  0:00   ` Jonathan Nieder
  2017-07-12  0:11   ` Junio C Hamano
  2017-07-11 22:04 ` [PATCH 3/3] grep: recurse in-process using 'struct repository' Brandon Williams
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-11 22:04 UTC
  To: git; +Cc: sbeller, Brandon Williams

Have the index state which is stored in 'the_repository' be a pointer to
the in-core index 'the_index'.  This makes it easier to begin
transitioning more parts of the code base to operate on a 'struct
repository'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 setup.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/setup.c b/setup.c
index 860507e1f..b370bf3c1 100644
--- a/setup.c
+++ b/setup.c
@@ -1123,6 +1123,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
 			setup_git_env();
 		}
 	}
+	the_repository->index = &the_index;
 
 	strbuf_release(&dir);
 	strbuf_release(&gitdir);
-- 
2.13.2.932.g7449e964c-goog



* [PATCH 3/3] grep: recurse in-process using 'struct repository'
  2017-07-11 22:04 [PATCH 0/3] Convert grep to recurse in-process Brandon Williams
  2017-07-11 22:04 ` [PATCH 1/3] repo_read_index: don't discard the index Brandon Williams
  2017-07-11 22:04 ` [PATCH 2/3] setup: have the_repository use the_index Brandon Williams
@ 2017-07-11 22:04 ` Brandon Williams
  2017-07-11 22:44   ` Jacob Keller
                     ` (2 more replies)
  2017-07-12  7:42 ` [PATCH 0/3] Convert grep to recurse in-process Jeff King
  2017-07-14 22:28 ` [PATCH v2 " Brandon Williams
  4 siblings, 3 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-11 22:04 UTC
  To: git; +Cc: sbeller, Brandon Williams

Convert grep to use 'struct repository' which enables recursing into
submodules to be handled in-process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/git-grep.txt |   7 -
 builtin/grep.c             | 390 +++++++++------------------------------------
 cache.h                    |   1 -
 git.c                      |   2 +-
 grep.c                     |  13 --
 grep.h                     |   1 -
 setup.c                    |  12 +-
 7 files changed, 81 insertions(+), 345 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 5033483db..720c7850e 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -95,13 +95,6 @@ OPTIONS
 	<tree> option the prefix of all submodule output will be the name of
 	the parent project's <tree> object.
 
---parent-basename <basename>::
-	For internal use only.  In order to produce uniform output with the
-	--recurse-submodules option, this option can be used to provide the
-	basename of a parent's <tree> object to a submodule so the submodule
-	can prefix its output with the parent's name rather than the SHA1 of
-	the submodule.
-
 -a::
 --text::
 	Process binary files as if they were text.
diff --git a/builtin/grep.c b/builtin/grep.c
index fa351c49f..0b0a8459e 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -28,13 +28,7 @@ static char const * const grep_usage[] = {
 	NULL
 };
 
-static const char *super_prefix;
 static int recurse_submodules;
-static struct argv_array submodule_options = ARGV_ARRAY_INIT;
-static const char *parent_basename;
-
-static int grep_submodule_launch(struct grep_opt *opt,
-				 const struct grep_source *gs);
 
 #define GREP_NUM_THREADS_DEFAULT 8
 static int num_threads;
@@ -186,10 +180,7 @@ static void *run(void *arg)
 			break;
 
 		opt->output_priv = w;
-		if (w->source.type == GREP_SOURCE_SUBMODULE)
-			hit |= grep_submodule_launch(opt, &w->source);
-		else
-			hit |= grep_source(opt, &w->source);
+		hit |= grep_source(opt, &w->source);
 		grep_source_clear_data(&w->source);
 		work_done(w);
 	}
@@ -327,21 +318,13 @@ static int grep_oid(struct grep_opt *opt, const struct object_id *oid,
 {
 	struct strbuf pathbuf = STRBUF_INIT;
 
-	if (super_prefix) {
-		strbuf_add(&pathbuf, filename, tree_name_len);
-		strbuf_addstr(&pathbuf, super_prefix);
-		strbuf_addstr(&pathbuf, filename + tree_name_len);
+	if (opt->relative && opt->prefix_length) {
+		quote_path_relative(filename + tree_name_len, opt->prefix, &pathbuf);
+		strbuf_insert(&pathbuf, 0, filename, tree_name_len);
 	} else {
 		strbuf_addstr(&pathbuf, filename);
 	}
 
-	if (opt->relative && opt->prefix_length) {
-		char *name = strbuf_detach(&pathbuf, NULL);
-		quote_path_relative(name + tree_name_len, opt->prefix, &pathbuf);
-		strbuf_insert(&pathbuf, 0, name, tree_name_len);
-		free(name);
-	}
-
 #ifndef NO_PTHREADS
 	if (num_threads) {
 		add_work(opt, GREP_SOURCE_OID, pathbuf.buf, path, oid);
@@ -366,14 +349,10 @@ static int grep_file(struct grep_opt *opt, const char *filename)
 {
 	struct strbuf buf = STRBUF_INIT;
 
-	if (super_prefix)
-		strbuf_addstr(&buf, super_prefix);
-	strbuf_addstr(&buf, filename);
-
 	if (opt->relative && opt->prefix_length) {
-		char *name = strbuf_detach(&buf, NULL);
-		quote_path_relative(name, opt->prefix, &buf);
-		free(name);
+		quote_path_relative(filename, opt->prefix, &buf);
+	} else {
+		strbuf_addstr(&buf, filename);
 	}
 
 #ifndef NO_PTHREADS
@@ -421,284 +400,80 @@ static void run_pager(struct grep_opt *opt, const char *prefix)
 		exit(status);
 }
 
-static void compile_submodule_options(const struct grep_opt *opt,
-				      const char **argv,
-				      int cached, int untracked,
-				      int opt_exclude, int use_index,
-				      int pattern_type_arg)
-{
-	struct grep_pat *pattern;
-
-	if (recurse_submodules)
-		argv_array_push(&submodule_options, "--recurse-submodules");
-
-	if (cached)
-		argv_array_push(&submodule_options, "--cached");
-	if (!use_index)
-		argv_array_push(&submodule_options, "--no-index");
-	if (untracked)
-		argv_array_push(&submodule_options, "--untracked");
-	if (opt_exclude > 0)
-		argv_array_push(&submodule_options, "--exclude-standard");
-
-	if (opt->invert)
-		argv_array_push(&submodule_options, "-v");
-	if (opt->ignore_case)
-		argv_array_push(&submodule_options, "-i");
-	if (opt->word_regexp)
-		argv_array_push(&submodule_options, "-w");
-	switch (opt->binary) {
-	case GREP_BINARY_NOMATCH:
-		argv_array_push(&submodule_options, "-I");
-		break;
-	case GREP_BINARY_TEXT:
-		argv_array_push(&submodule_options, "-a");
-		break;
-	default:
-		break;
-	}
-	if (opt->allow_textconv)
-		argv_array_push(&submodule_options, "--textconv");
-	if (opt->max_depth != -1)
-		argv_array_pushf(&submodule_options, "--max-depth=%d",
-				 opt->max_depth);
-	if (opt->linenum)
-		argv_array_push(&submodule_options, "-n");
-	if (!opt->pathname)
-		argv_array_push(&submodule_options, "-h");
-	if (!opt->relative)
-		argv_array_push(&submodule_options, "--full-name");
-	if (opt->name_only)
-		argv_array_push(&submodule_options, "-l");
-	if (opt->unmatch_name_only)
-		argv_array_push(&submodule_options, "-L");
-	if (opt->null_following_name)
-		argv_array_push(&submodule_options, "-z");
-	if (opt->count)
-		argv_array_push(&submodule_options, "-c");
-	if (opt->file_break)
-		argv_array_push(&submodule_options, "--break");
-	if (opt->heading)
-		argv_array_push(&submodule_options, "--heading");
-	if (opt->pre_context)
-		argv_array_pushf(&submodule_options, "--before-context=%d",
-				 opt->pre_context);
-	if (opt->post_context)
-		argv_array_pushf(&submodule_options, "--after-context=%d",
-				 opt->post_context);
-	if (opt->funcname)
-		argv_array_push(&submodule_options, "-p");
-	if (opt->funcbody)
-		argv_array_push(&submodule_options, "-W");
-	if (opt->all_match)
-		argv_array_push(&submodule_options, "--all-match");
-	if (opt->debug)
-		argv_array_push(&submodule_options, "--debug");
-	if (opt->status_only)
-		argv_array_push(&submodule_options, "-q");
-
-	switch (pattern_type_arg) {
-	case GREP_PATTERN_TYPE_BRE:
-		argv_array_push(&submodule_options, "-G");
-		break;
-	case GREP_PATTERN_TYPE_ERE:
-		argv_array_push(&submodule_options, "-E");
-		break;
-	case GREP_PATTERN_TYPE_FIXED:
-		argv_array_push(&submodule_options, "-F");
-		break;
-	case GREP_PATTERN_TYPE_PCRE:
-		argv_array_push(&submodule_options, "-P");
-		break;
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		break;
-	default:
-		die("BUG: Added a new grep pattern type without updating switch statement");
-	}
-
-	for (pattern = opt->pattern_list; pattern != NULL;
-	     pattern = pattern->next) {
-		switch (pattern->token) {
-		case GREP_PATTERN:
-			argv_array_pushf(&submodule_options, "-e%s",
-					 pattern->pattern);
-			break;
-		case GREP_AND:
-		case GREP_OPEN_PAREN:
-		case GREP_CLOSE_PAREN:
-		case GREP_NOT:
-		case GREP_OR:
-			argv_array_push(&submodule_options, pattern->pattern);
-			break;
-		/* BODY and HEAD are not used by git-grep */
-		case GREP_PATTERN_BODY:
-		case GREP_PATTERN_HEAD:
-			break;
-		}
-	}
-
-	/*
-	 * Limit number of threads for child process to use.
-	 * This is to prevent potential fork-bomb behavior of git-grep as each
-	 * submodule process has its own thread pool.
-	 */
-	argv_array_pushf(&submodule_options, "--threads=%d",
-			 (num_threads + 1) / 2);
-
-	/* Add Pathspecs */
-	argv_array_push(&submodule_options, "--");
-	for (; *argv; argv++)
-		argv_array_push(&submodule_options, *argv);
-}
+static int grep_cache(struct grep_opt *opt, struct repository *repo,
+		      const struct pathspec *pathspec, int cached);
+static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
+		     struct tree_desc *tree, struct strbuf *base, int tn_len,
+		     int check_attr, struct repository *repo);
 
-/*
- * Launch child process to grep contents of a submodule
- */
-static int grep_submodule_launch(struct grep_opt *opt,
-				 const struct grep_source *gs)
+static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
+			  const struct pathspec *pathspec,
+			  const struct object_id *oid,
+			  const char *filename, const char *path)
 {
-	struct child_process cp = CHILD_PROCESS_INIT;
-	int status, i;
-	const char *end_of_base;
-	const char *name;
-	struct strbuf child_output = STRBUF_INIT;
-
-	end_of_base = strchr(gs->name, ':');
-	if (gs->identifier && end_of_base)
-		name = end_of_base + 1;
-	else
-		name = gs->name;
-
-	prepare_submodule_repo_env(&cp.env_array);
-	argv_array_push(&cp.env_array, GIT_DIR_ENVIRONMENT);
-
-	if (opt->relative && opt->prefix_length)
-		argv_array_pushf(&cp.env_array, "%s=%s",
-				 GIT_TOPLEVEL_PREFIX_ENVIRONMENT,
-				 opt->prefix);
-
-	/* Add super prefix */
-	argv_array_pushf(&cp.args, "--super-prefix=%s%s/",
-			 super_prefix ? super_prefix : "",
-			 name);
-	argv_array_push(&cp.args, "grep");
+	struct repository submodule;
+	int hit;
 
-	/*
-	 * Add basename of parent project
-	 * When performing grep on a tree object the filename is prefixed
-	 * with the object's name: 'tree-name:filename'.  In order to
-	 * provide uniformity of output we want to pass the name of the
-	 * parent project's object name to the submodule so the submodule can
-	 * prefix its output with the parent's name and not its own OID.
-	 */
-	if (gs->identifier && end_of_base)
-		argv_array_pushf(&cp.args, "--parent-basename=%.*s",
-				 (int) (end_of_base - gs->name),
-				 gs->name);
+	if (!is_submodule_active(superproject, path))
+		return 0;
 
-	/* Add options */
-	for (i = 0; i < submodule_options.argc; i++) {
-		/*
-		 * If there is a tree identifier for the submodule, add the
-		 * rev after adding the submodule options but before the
-		 * pathspecs.  To do this we listen for the '--' and insert the
-		 * oid before pushing the '--' onto the child process argv
-		 * array.
-		 */
-		if (gs->identifier &&
-		    !strcmp("--", submodule_options.argv[i])) {
-			argv_array_push(&cp.args, oid_to_hex(gs->identifier));
-		}
+	if (repo_submodule_init(&submodule, superproject, path))
+		return 0;
 
-		argv_array_push(&cp.args, submodule_options.argv[i]);
-	}
+	repo_read_gitmodules(&submodule);
 
-	cp.git_cmd = 1;
-	cp.dir = gs->path;
+	/* add objects to alternates */
+	add_to_alternates_memory(submodule.objectdir);
 
-	/*
-	 * Capture output to output buffer and check the return code from the
-	 * child process.  A '0' indicates a hit, a '1' indicates no hit and
-	 * anything else is an error.
-	 */
-	status = capture_command(&cp, &child_output, 0);
-	if (status && (status != 1)) {
-		/* flush the buffer */
-		write_or_die(1, child_output.buf, child_output.len);
-		die("process for submodule '%s' failed with exit code: %d",
-		    gs->name, status);
-	}
+	if (oid) {
+		struct object *object;
+		struct tree_desc tree;
+		void *data;
+		unsigned long size;
+		struct strbuf base = STRBUF_INIT;
 
-	opt->output(opt, child_output.buf, child_output.len);
-	strbuf_release(&child_output);
-	/* invert the return code to make a hit equal to 1 */
-	return !status;
-}
+		object = parse_object_or_die(oid, oid_to_hex(oid));
 
-/*
- * Prep grep structures for a submodule grep
- * oid: the oid of the submodule or NULL if using the working tree
- * filename: name of the submodule including tree name of parent
- * path: location of the submodule
- */
-static int grep_submodule(struct grep_opt *opt, const struct object_id *oid,
-			  const char *filename, const char *path)
-{
-	if (!is_submodule_active(the_repository, path))
-		return 0;
-	if (!is_submodule_populated_gently(path, NULL)) {
-		/*
-		 * If searching history, check for the presence of the
-		 * submodule's gitdir before skipping the submodule.
-		 */
-		if (oid) {
-			const struct submodule *sub =
-					submodule_from_path(null_sha1, path);
-			if (sub)
-				path = git_path("modules/%s", sub->name);
-
-			if (!(is_directory(path) && is_git_directory(path)))
-				return 0;
-		} else {
-			return 0;
-		}
-	}
+		grep_read_lock();
+		data = read_object_with_reference(object->oid.hash, tree_type,
+						  &size, NULL);
+		grep_read_unlock();
 
-#ifndef NO_PTHREADS
-	if (num_threads) {
-		add_work(opt, GREP_SOURCE_SUBMODULE, filename, path, oid);
-		return 0;
-	} else
-#endif
-	{
-		struct grep_source gs;
-		int hit;
+		if (!data)
+			die(_("unable to read tree (%s)"), oid_to_hex(&object->oid));
 
-		grep_source_init(&gs, GREP_SOURCE_SUBMODULE,
-				 filename, path, oid);
-		hit = grep_submodule_launch(opt, &gs);
+		strbuf_addstr(&base, filename);
+		strbuf_addch(&base, '/');
 
-		grep_source_clear(&gs);
-		return hit;
+		init_tree_desc(&tree, data, size);
+		hit = grep_tree(opt, pathspec, &tree, &base, base.len,
+				object->type == OBJ_COMMIT, &submodule);
+		strbuf_release(&base);
+		free(data);
+	} else {
+		hit = grep_cache(opt, &submodule, pathspec, 1);
 	}
+
+	repo_clear(&submodule);
+	return hit;
 }
 
-static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
-		      int cached)
+static int grep_cache(struct grep_opt *opt, struct repository *repo,
+		      const struct pathspec *pathspec, int cached)
 {
 	int hit = 0;
 	int nr;
 	struct strbuf name = STRBUF_INIT;
 	int name_base_len = 0;
-	if (super_prefix) {
-		name_base_len = strlen(super_prefix);
-		strbuf_addstr(&name, super_prefix);
+	if (repo->submodule_prefix) {
+		name_base_len = strlen(repo->submodule_prefix);
+		strbuf_addstr(&name, repo->submodule_prefix);
 	}
 
-	read_cache();
+	repo_read_index(repo);
 
-	for (nr = 0; nr < active_nr; nr++) {
-		const struct cache_entry *ce = active_cache[nr];
+	for (nr = 0; nr < repo->index->cache_nr; nr++) {
+		const struct cache_entry *ce = repo->index->cache[nr];
 		strbuf_setlen(&name, name_base_len);
 		strbuf_addstr(&name, ce->name);
 
@@ -715,14 +490,14 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
 			    ce_skip_worktree(ce)) {
 				if (ce_stage(ce) || ce_intent_to_add(ce))
 					continue;
-				hit |= grep_oid(opt, &ce->oid, ce->name,
-						 0, ce->name);
+				hit |= grep_oid(opt, &ce->oid, name.buf,
+						 0, name.buf);
 			} else {
-				hit |= grep_file(opt, ce->name);
+				hit |= grep_file(opt, name.buf);
 			}
 		} else if (recurse_submodules && S_ISGITLINK(ce->ce_mode) &&
 			   submodule_path_match(pathspec, name.buf, NULL)) {
-			hit |= grep_submodule(opt, NULL, ce->name, ce->name);
+			hit |= grep_submodule(opt, repo, pathspec, NULL, ce->name, ce->name);
 		} else {
 			continue;
 		}
@@ -730,8 +505,8 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
 		if (ce_stage(ce)) {
 			do {
 				nr++;
-			} while (nr < active_nr &&
-				 !strcmp(ce->name, active_cache[nr]->name));
+			} while (nr < repo->index->cache_nr &&
+				 !strcmp(ce->name, repo->index->cache[nr]->name));
 			nr--; /* compensate for loop control */
 		}
 		if (hit && opt->status_only)
@@ -744,7 +519,7 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
 
 static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 		     struct tree_desc *tree, struct strbuf *base, int tn_len,
-		     int check_attr)
+		     int check_attr, struct repository *repo)
 {
 	int hit = 0;
 	enum interesting match = entry_not_interesting;
@@ -752,8 +527,8 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 	int old_baselen = base->len;
 	struct strbuf name = STRBUF_INIT;
 	int name_base_len = 0;
-	if (super_prefix) {
-		strbuf_addstr(&name, super_prefix);
+	if (repo->submodule_prefix) {
+		strbuf_addstr(&name, repo->submodule_prefix);
 		name_base_len = name.len;
 	}
 
@@ -791,11 +566,11 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 			strbuf_addch(base, '/');
 			init_tree_desc(&sub, data, size);
 			hit |= grep_tree(opt, pathspec, &sub, base, tn_len,
-					 check_attr);
+					 check_attr, repo);
 			free(data);
 		} else if (recurse_submodules && S_ISGITLINK(entry.mode)) {
-			hit |= grep_submodule(opt, entry.oid, base->buf,
-					      base->buf + tn_len);
+			hit |= grep_submodule(opt, repo, pathspec, entry.oid,
+					      base->buf, base->buf + tn_len);
 		}
 
 		strbuf_setlen(base, old_baselen);
@@ -809,7 +584,8 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 }
 
 static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
-		       struct object *obj, const char *name, const char *path)
+		       struct object *obj, const char *name, const char *path,
+		       struct repository *repo)
 {
 	if (obj->type == OBJ_BLOB)
 		return grep_oid(opt, &obj->oid, name, 0, path);
@@ -828,10 +604,6 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		if (!data)
 			die(_("unable to read tree (%s)"), oid_to_hex(&obj->oid));
 
-		/* Use parent's name as base when recursing submodules */
-		if (recurse_submodules && parent_basename)
-			name = parent_basename;
-
 		len = name ? strlen(name) : 0;
 		strbuf_init(&base, PATH_MAX + len + 1);
 		if (len) {
@@ -840,7 +612,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		}
 		init_tree_desc(&tree, data, size);
 		hit = grep_tree(opt, pathspec, &tree, &base, base.len,
-				obj->type == OBJ_COMMIT);
+				obj->type == OBJ_COMMIT, repo);
 		strbuf_release(&base);
 		free(data);
 		return hit;
@@ -849,6 +621,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 }
 
 static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
+			struct repository *repo,
 			const struct object_array *list)
 {
 	unsigned int i;
@@ -864,7 +637,8 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 			submodule_free();
 			gitmodules_config_sha1(real_obj->oid.hash);
 		}
-		if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].path)) {
+		if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].path,
+				repo)) {
 			hit = 1;
 			if (opt->status_only)
 				break;
@@ -1005,9 +779,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			    N_("ignore files specified via '.gitignore'"), 1),
 		OPT_BOOL(0, "recurse-submodules", &recurse_submodules,
 			 N_("recursively search in each submodule")),
-		OPT_STRING(0, "parent-basename", &parent_basename,
-			   N_("basename"),
-			   N_("prepend parent project's basename to output")),
 		OPT_GROUP(""),
 		OPT_BOOL('v', "invert-match", &opt.invert,
 			N_("show non-matching lines")),
@@ -1112,7 +883,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	init_grep_defaults();
 	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, prefix);
-	super_prefix = get_super_prefix();
 
 	/*
 	 * If there is no -- then the paths must exist in the working
@@ -1274,9 +1044,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 
 	if (recurse_submodules) {
 		gitmodules_config();
-		compile_submodule_options(&opt, argv + i, cached, untracked,
-					  opt_exclude, use_index,
-					  pattern_type_arg);
 	}
 
 	if (show_in_pager && (cached || list.nr))
@@ -1320,11 +1087,12 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		if (!cached)
 			setup_work_tree();
 
-		hit = grep_cache(&opt, &pathspec, cached);
+		hit = grep_cache(&opt, the_repository, &pathspec, cached);
 	} else {
 		if (cached)
 			die(_("both --cached and trees are given."));
-		hit = grep_objects(&opt, &pathspec, &list);
+
+		hit = grep_objects(&opt, &pathspec, the_repository, &list);
 	}
 
 	if (num_threads)
diff --git a/cache.h b/cache.h
index 71fe09264..71af91c43 100644
--- a/cache.h
+++ b/cache.h
@@ -417,7 +417,6 @@ static inline enum object_type object_type(unsigned int mode)
 #define GIT_WORK_TREE_ENVIRONMENT "GIT_WORK_TREE"
 #define GIT_PREFIX_ENVIRONMENT "GIT_PREFIX"
 #define GIT_SUPER_PREFIX_ENVIRONMENT "GIT_INTERNAL_SUPER_PREFIX"
-#define GIT_TOPLEVEL_PREFIX_ENVIRONMENT "GIT_INTERNAL_TOPLEVEL_PREFIX"
 #define DEFAULT_GIT_DIR_ENVIRONMENT ".git"
 #define DB_ENVIRONMENT "GIT_OBJECT_DIRECTORY"
 #define INDEX_ENVIRONMENT "GIT_INDEX_FILE"
diff --git a/git.c b/git.c
index 489aab4d8..9dd9aead6 100644
--- a/git.c
+++ b/git.c
@@ -392,7 +392,7 @@ static struct cmd_struct commands[] = {
 	{ "fsck-objects", cmd_fsck, RUN_SETUP },
 	{ "gc", cmd_gc, RUN_SETUP },
 	{ "get-tar-commit-id", cmd_get_tar_commit_id },
-	{ "grep", cmd_grep, RUN_SETUP_GENTLY | SUPPORT_SUPER_PREFIX },
+	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY },
diff --git a/grep.c b/grep.c
index 98733db62..78680da5c 100644
--- a/grep.c
+++ b/grep.c
@@ -1919,16 +1919,6 @@ void grep_source_init(struct grep_source *gs, enum grep_source_type type,
 	case GREP_SOURCE_FILE:
 		gs->identifier = xstrdup(identifier);
 		break;
-	case GREP_SOURCE_SUBMODULE:
-		if (!identifier) {
-			gs->identifier = NULL;
-			break;
-		}
-		/*
-		 * FALL THROUGH
-		 * If the identifier is non-NULL (in the submodule case) it
-		 * will be a SHA1 that needs to be copied.
-		 */
 	case GREP_SOURCE_OID:
 		gs->identifier = oiddup(identifier);
 		break;
@@ -1951,7 +1941,6 @@ void grep_source_clear_data(struct grep_source *gs)
 	switch (gs->type) {
 	case GREP_SOURCE_FILE:
 	case GREP_SOURCE_OID:
-	case GREP_SOURCE_SUBMODULE:
 		FREE_AND_NULL(gs->buf);
 		gs->size = 0;
 		break;
@@ -2022,8 +2011,6 @@ static int grep_source_load(struct grep_source *gs)
 		return grep_source_load_oid(gs);
 	case GREP_SOURCE_BUF:
 		return gs->buf ? 0 : -1;
-	case GREP_SOURCE_SUBMODULE:
-		break;
 	}
 	die("BUG: invalid grep_source type to load");
 }
diff --git a/grep.h b/grep.h
index b8f93bfc2..d405e568f 100644
--- a/grep.h
+++ b/grep.h
@@ -194,7 +194,6 @@ struct grep_source {
 		GREP_SOURCE_OID,
 		GREP_SOURCE_FILE,
 		GREP_SOURCE_BUF,
-		GREP_SOURCE_SUBMODULE,
 	} type;
 	void *identifier;
 
diff --git a/setup.c b/setup.c
index b370bf3c1..27266b0ac 100644
--- a/setup.c
+++ b/setup.c
@@ -1027,7 +1027,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
 {
 	static struct strbuf cwd = STRBUF_INIT;
 	struct strbuf dir = STRBUF_INIT, gitdir = STRBUF_INIT;
-	const char *prefix, *env_prefix;
+	const char *prefix;
 
 	/*
 	 * We may have read an incomplete configuration before
@@ -1085,16 +1085,6 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		die("BUG: unhandled setup_git_directory_1() result");
 	}
 
-	/*
-	 * NEEDSWORK: This was a hack in order to get ls-files and grep to have
-	 * properly formated output when recursing submodules.  Once ls-files
-	 * and grep have been changed to perform this recursing in-process this
-	 * needs to be removed.
-	 */
-	env_prefix = getenv(GIT_TOPLEVEL_PREFIX_ENVIRONMENT);
-	if (env_prefix)
-		prefix = env_prefix;
-
 	if (prefix)
 		setenv(GIT_PREFIX_ENVIRONMENT, prefix, 1);
 	else
-- 
2.13.2.932.g7449e964c-goog



* Re: [PATCH 3/3] grep: recurse in-process using 'struct repository'
  2017-07-11 22:04 ` [PATCH 3/3] grep: recurse in-process using 'struct repository' Brandon Williams
@ 2017-07-11 22:44   ` Jacob Keller
  2017-07-12 18:54     ` Brandon Williams
  2017-07-12  0:04   ` Stefan Beller
  2017-07-12  0:25   ` Jonathan Nieder
  2 siblings, 1 reply; 68+ messages in thread
From: Jacob Keller @ 2017-07-11 22:44 UTC
  To: Brandon Williams; +Cc: Git mailing list, Stefan Beller

On Tue, Jul 11, 2017 at 3:04 PM, Brandon Williams <bmwill@google.com> wrote:
> Convert grep to use 'struct repository' which enables recursing into
> submodules to be handled in-process.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  Documentation/git-grep.txt |   7 -
>  builtin/grep.c             | 390 +++++++++------------------------------------
>  cache.h                    |   1 -
>  git.c                      |   2 +-
>  grep.c                     |  13 --
>  grep.h                     |   1 -
>  setup.c                    |  12 +-
>  7 files changed, 81 insertions(+), 345 deletions(-)
>

No real in-depth comments here, but it's nice to see how much code
reduction this has enabled!

Thanks,
Jake

> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
> index 5033483db..720c7850e 100644
> --- a/Documentation/git-grep.txt
> +++ b/Documentation/git-grep.txt
> @@ -95,13 +95,6 @@ OPTIONS
>         <tree> option the prefix of all submodule output will be the name of
>         the parent project's <tree> object.
>
> ---parent-basename <basename>::
> -       For internal use only.  In order to produce uniform output with the
> -       --recurse-submodules option, this option can be used to provide the
> -       basename of a parent's <tree> object to a submodule so the submodule
> -       can prefix its output with the parent's name rather than the SHA1 of
> -       the submodule.
> -
>  -a::
>  --text::
>         Process binary files as if they were text.
> diff --git a/builtin/grep.c b/builtin/grep.c
> index fa351c49f..0b0a8459e 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -28,13 +28,7 @@ static char const * const grep_usage[] = {
>         NULL
>  };
>
> -static const char *super_prefix;
>  static int recurse_submodules;
> -static struct argv_array submodule_options = ARGV_ARRAY_INIT;
> -static const char *parent_basename;
> -
> -static int grep_submodule_launch(struct grep_opt *opt,
> -                                const struct grep_source *gs);
>
>  #define GREP_NUM_THREADS_DEFAULT 8
>  static int num_threads;
> @@ -186,10 +180,7 @@ static void *run(void *arg)
>                         break;
>
>                 opt->output_priv = w;
> -               if (w->source.type == GREP_SOURCE_SUBMODULE)
> -                       hit |= grep_submodule_launch(opt, &w->source);
> -               else
> -                       hit |= grep_source(opt, &w->source);
> +               hit |= grep_source(opt, &w->source);
>                 grep_source_clear_data(&w->source);
>                 work_done(w);
>         }
> @@ -327,21 +318,13 @@ static int grep_oid(struct grep_opt *opt, const struct object_id *oid,
>  {
>         struct strbuf pathbuf = STRBUF_INIT;
>
> -       if (super_prefix) {
> -               strbuf_add(&pathbuf, filename, tree_name_len);
> -               strbuf_addstr(&pathbuf, super_prefix);
> -               strbuf_addstr(&pathbuf, filename + tree_name_len);
> +       if (opt->relative && opt->prefix_length) {
> +               quote_path_relative(filename + tree_name_len, opt->prefix, &pathbuf);
> +               strbuf_insert(&pathbuf, 0, filename, tree_name_len);
>         } else {
>                 strbuf_addstr(&pathbuf, filename);
>         }
>
> -       if (opt->relative && opt->prefix_length) {
> -               char *name = strbuf_detach(&pathbuf, NULL);
> -               quote_path_relative(name + tree_name_len, opt->prefix, &pathbuf);
> -               strbuf_insert(&pathbuf, 0, name, tree_name_len);
> -               free(name);
> -       }
> -
>  #ifndef NO_PTHREADS
>         if (num_threads) {
>                 add_work(opt, GREP_SOURCE_OID, pathbuf.buf, path, oid);
> @@ -366,14 +349,10 @@ static int grep_file(struct grep_opt *opt, const char *filename)
>  {
>         struct strbuf buf = STRBUF_INIT;
>
> -       if (super_prefix)
> -               strbuf_addstr(&buf, super_prefix);
> -       strbuf_addstr(&buf, filename);
> -
>         if (opt->relative && opt->prefix_length) {
> -               char *name = strbuf_detach(&buf, NULL);
> -               quote_path_relative(name, opt->prefix, &buf);
> -               free(name);
> +               quote_path_relative(filename, opt->prefix, &buf);
> +       } else {
> +               strbuf_addstr(&buf, filename);
>         }
>
>  #ifndef NO_PTHREADS
> @@ -421,284 +400,80 @@ static void run_pager(struct grep_opt *opt, const char *prefix)
>                 exit(status);
>  }
>
> -static void compile_submodule_options(const struct grep_opt *opt,
> -                                     const char **argv,
> -                                     int cached, int untracked,
> -                                     int opt_exclude, int use_index,
> -                                     int pattern_type_arg)
> -{
> -       struct grep_pat *pattern;
> -
> -       if (recurse_submodules)
> -               argv_array_push(&submodule_options, "--recurse-submodules");
> -
> -       if (cached)
> -               argv_array_push(&submodule_options, "--cached");
> -       if (!use_index)
> -               argv_array_push(&submodule_options, "--no-index");
> -       if (untracked)
> -               argv_array_push(&submodule_options, "--untracked");
> -       if (opt_exclude > 0)
> -               argv_array_push(&submodule_options, "--exclude-standard");
> -
> -       if (opt->invert)
> -               argv_array_push(&submodule_options, "-v");
> -       if (opt->ignore_case)
> -               argv_array_push(&submodule_options, "-i");
> -       if (opt->word_regexp)
> -               argv_array_push(&submodule_options, "-w");
> -       switch (opt->binary) {
> -       case GREP_BINARY_NOMATCH:
> -               argv_array_push(&submodule_options, "-I");
> -               break;
> -       case GREP_BINARY_TEXT:
> -               argv_array_push(&submodule_options, "-a");
> -               break;
> -       default:
> -               break;
> -       }
> -       if (opt->allow_textconv)
> -               argv_array_push(&submodule_options, "--textconv");
> -       if (opt->max_depth != -1)
> -               argv_array_pushf(&submodule_options, "--max-depth=%d",
> -                                opt->max_depth);
> -       if (opt->linenum)
> -               argv_array_push(&submodule_options, "-n");
> -       if (!opt->pathname)
> -               argv_array_push(&submodule_options, "-h");
> -       if (!opt->relative)
> -               argv_array_push(&submodule_options, "--full-name");
> -       if (opt->name_only)
> -               argv_array_push(&submodule_options, "-l");
> -       if (opt->unmatch_name_only)
> -               argv_array_push(&submodule_options, "-L");
> -       if (opt->null_following_name)
> -               argv_array_push(&submodule_options, "-z");
> -       if (opt->count)
> -               argv_array_push(&submodule_options, "-c");
> -       if (opt->file_break)
> -               argv_array_push(&submodule_options, "--break");
> -       if (opt->heading)
> -               argv_array_push(&submodule_options, "--heading");
> -       if (opt->pre_context)
> -               argv_array_pushf(&submodule_options, "--before-context=%d",
> -                                opt->pre_context);
> -       if (opt->post_context)
> -               argv_array_pushf(&submodule_options, "--after-context=%d",
> -                                opt->post_context);
> -       if (opt->funcname)
> -               argv_array_push(&submodule_options, "-p");
> -       if (opt->funcbody)
> -               argv_array_push(&submodule_options, "-W");
> -       if (opt->all_match)
> -               argv_array_push(&submodule_options, "--all-match");
> -       if (opt->debug)
> -               argv_array_push(&submodule_options, "--debug");
> -       if (opt->status_only)
> -               argv_array_push(&submodule_options, "-q");
> -
> -       switch (pattern_type_arg) {
> -       case GREP_PATTERN_TYPE_BRE:
> -               argv_array_push(&submodule_options, "-G");
> -               break;
> -       case GREP_PATTERN_TYPE_ERE:
> -               argv_array_push(&submodule_options, "-E");
> -               break;
> -       case GREP_PATTERN_TYPE_FIXED:
> -               argv_array_push(&submodule_options, "-F");
> -               break;
> -       case GREP_PATTERN_TYPE_PCRE:
> -               argv_array_push(&submodule_options, "-P");
> -               break;
> -       case GREP_PATTERN_TYPE_UNSPECIFIED:
> -               break;
> -       default:
> -               die("BUG: Added a new grep pattern type without updating switch statement");
> -       }
> -
> -       for (pattern = opt->pattern_list; pattern != NULL;
> -            pattern = pattern->next) {
> -               switch (pattern->token) {
> -               case GREP_PATTERN:
> -                       argv_array_pushf(&submodule_options, "-e%s",
> -                                        pattern->pattern);
> -                       break;
> -               case GREP_AND:
> -               case GREP_OPEN_PAREN:
> -               case GREP_CLOSE_PAREN:
> -               case GREP_NOT:
> -               case GREP_OR:
> -                       argv_array_push(&submodule_options, pattern->pattern);
> -                       break;
> -               /* BODY and HEAD are not used by git-grep */
> -               case GREP_PATTERN_BODY:
> -               case GREP_PATTERN_HEAD:
> -                       break;
> -               }
> -       }
> -
> -       /*
> -        * Limit number of threads for child process to use.
> -        * This is to prevent potential fork-bomb behavior of git-grep as each
> -        * submodule process has its own thread pool.
> -        */
> -       argv_array_pushf(&submodule_options, "--threads=%d",
> -                        (num_threads + 1) / 2);
> -
> -       /* Add Pathspecs */
> -       argv_array_push(&submodule_options, "--");
> -       for (; *argv; argv++)
> -               argv_array_push(&submodule_options, *argv);
> -}
> +static int grep_cache(struct grep_opt *opt, struct repository *repo,
> +                     const struct pathspec *pathspec, int cached);
> +static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
> +                    struct tree_desc *tree, struct strbuf *base, int tn_len,
> +                    int check_attr, struct repository *repo);
>
> -/*
> - * Launch child process to grep contents of a submodule
> - */
> -static int grep_submodule_launch(struct grep_opt *opt,
> -                                const struct grep_source *gs)
> +static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
> +                         const struct pathspec *pathspec,
> +                         const struct object_id *oid,
> +                         const char *filename, const char *path)
>  {
> -       struct child_process cp = CHILD_PROCESS_INIT;
> -       int status, i;
> -       const char *end_of_base;
> -       const char *name;
> -       struct strbuf child_output = STRBUF_INIT;
> -
> -       end_of_base = strchr(gs->name, ':');
> -       if (gs->identifier && end_of_base)
> -               name = end_of_base + 1;
> -       else
> -               name = gs->name;
> -
> -       prepare_submodule_repo_env(&cp.env_array);
> -       argv_array_push(&cp.env_array, GIT_DIR_ENVIRONMENT);
> -
> -       if (opt->relative && opt->prefix_length)
> -               argv_array_pushf(&cp.env_array, "%s=%s",
> -                                GIT_TOPLEVEL_PREFIX_ENVIRONMENT,
> -                                opt->prefix);
> -
> -       /* Add super prefix */
> -       argv_array_pushf(&cp.args, "--super-prefix=%s%s/",
> -                        super_prefix ? super_prefix : "",
> -                        name);
> -       argv_array_push(&cp.args, "grep");
> +       struct repository submodule;
> +       int hit;
>
> -       /*
> -        * Add basename of parent project
> -        * When performing grep on a tree object the filename is prefixed
> -        * with the object's name: 'tree-name:filename'.  In order to
> -        * provide uniformity of output we want to pass the name of the
> -        * parent project's object name to the submodule so the submodule can
> -        * prefix its output with the parent's name and not its own OID.
> -        */
> -       if (gs->identifier && end_of_base)
> -               argv_array_pushf(&cp.args, "--parent-basename=%.*s",
> -                                (int) (end_of_base - gs->name),
> -                                gs->name);
> +       if (!is_submodule_active(superproject, path))
> +               return 0;
>
> -       /* Add options */
> -       for (i = 0; i < submodule_options.argc; i++) {
> -               /*
> -                * If there is a tree identifier for the submodule, add the
> -                * rev after adding the submodule options but before the
> -                * pathspecs.  To do this we listen for the '--' and insert the
> -                * oid before pushing the '--' onto the child process argv
> -                * array.
> -                */
> -               if (gs->identifier &&
> -                   !strcmp("--", submodule_options.argv[i])) {
> -                       argv_array_push(&cp.args, oid_to_hex(gs->identifier));
> -               }
> +       if (repo_submodule_init(&submodule, superproject, path))
> +               return 0;
>
> -               argv_array_push(&cp.args, submodule_options.argv[i]);
> -       }
> +       repo_read_gitmodules(&submodule);
>
> -       cp.git_cmd = 1;
> -       cp.dir = gs->path;
> +       /* add objects to alternates */
> +       add_to_alternates_memory(submodule.objectdir);
>
> -       /*
> -        * Capture output to output buffer and check the return code from the
> -        * child process.  A '0' indicates a hit, a '1' indicates no hit and
> -        * anything else is an error.
> -        */
> -       status = capture_command(&cp, &child_output, 0);
> -       if (status && (status != 1)) {
> -               /* flush the buffer */
> -               write_or_die(1, child_output.buf, child_output.len);
> -               die("process for submodule '%s' failed with exit code: %d",
> -                   gs->name, status);
> -       }
> +       if (oid) {
> +               struct object *object;
> +               struct tree_desc tree;
> +               void *data;
> +               unsigned long size;
> +               struct strbuf base = STRBUF_INIT;
>
> -       opt->output(opt, child_output.buf, child_output.len);
> -       strbuf_release(&child_output);
> -       /* invert the return code to make a hit equal to 1 */
> -       return !status;
> -}
> +               object = parse_object_or_die(oid, oid_to_hex(oid));
>
> -/*
> - * Prep grep structures for a submodule grep
> - * oid: the oid of the submodule or NULL if using the working tree
> - * filename: name of the submodule including tree name of parent
> - * path: location of the submodule
> - */
> -static int grep_submodule(struct grep_opt *opt, const struct object_id *oid,
> -                         const char *filename, const char *path)
> -{
> -       if (!is_submodule_active(the_repository, path))
> -               return 0;
> -       if (!is_submodule_populated_gently(path, NULL)) {
> -               /*
> -                * If searching history, check for the presence of the
> -                * submodule's gitdir before skipping the submodule.
> -                */
> -               if (oid) {
> -                       const struct submodule *sub =
> -                                       submodule_from_path(null_sha1, path);
> -                       if (sub)
> -                               path = git_path("modules/%s", sub->name);
> -
> -                       if (!(is_directory(path) && is_git_directory(path)))
> -                               return 0;
> -               } else {
> -                       return 0;
> -               }
> -       }
> +               grep_read_lock();
> +               data = read_object_with_reference(object->oid.hash, tree_type,
> +                                                 &size, NULL);
> +               grep_read_unlock();
>
> -#ifndef NO_PTHREADS
> -       if (num_threads) {
> -               add_work(opt, GREP_SOURCE_SUBMODULE, filename, path, oid);
> -               return 0;
> -       } else
> -#endif
> -       {
> -               struct grep_source gs;
> -               int hit;
> +               if (!data)
> +                       die(_("unable to read tree (%s)"), oid_to_hex(&object->oid));
>
> -               grep_source_init(&gs, GREP_SOURCE_SUBMODULE,
> -                                filename, path, oid);
> -               hit = grep_submodule_launch(opt, &gs);
> +               strbuf_addstr(&base, filename);
> +               strbuf_addch(&base, '/');
>
> -               grep_source_clear(&gs);
> -               return hit;
> +               init_tree_desc(&tree, data, size);
> +               hit = grep_tree(opt, pathspec, &tree, &base, base.len,
> +                               object->type == OBJ_COMMIT, &submodule);
> +               strbuf_release(&base);
> +               free(data);
> +       } else {
> +               hit = grep_cache(opt, &submodule, pathspec, 1);
>         }
> +
> +       repo_clear(&submodule);
> +       return hit;
>  }
>
> -static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
> -                     int cached)
> +static int grep_cache(struct grep_opt *opt, struct repository *repo,
> +                     const struct pathspec *pathspec, int cached)
>  {
>         int hit = 0;
>         int nr;
>         struct strbuf name = STRBUF_INIT;
>         int name_base_len = 0;
> -       if (super_prefix) {
> -               name_base_len = strlen(super_prefix);
> -               strbuf_addstr(&name, super_prefix);
> +       if (repo->submodule_prefix) {
> +               name_base_len = strlen(repo->submodule_prefix);
> +               strbuf_addstr(&name, repo->submodule_prefix);
>         }
>
> -       read_cache();
> +       repo_read_index(repo);
>
> -       for (nr = 0; nr < active_nr; nr++) {
> -               const struct cache_entry *ce = active_cache[nr];
> +       for (nr = 0; nr < repo->index->cache_nr; nr++) {
> +               const struct cache_entry *ce = repo->index->cache[nr];
>                 strbuf_setlen(&name, name_base_len);
>                 strbuf_addstr(&name, ce->name);
>
> @@ -715,14 +490,14 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
>                             ce_skip_worktree(ce)) {
>                                 if (ce_stage(ce) || ce_intent_to_add(ce))
>                                         continue;
> -                               hit |= grep_oid(opt, &ce->oid, ce->name,
> -                                                0, ce->name);
> +                               hit |= grep_oid(opt, &ce->oid, name.buf,
> +                                                0, name.buf);
>                         } else {
> -                               hit |= grep_file(opt, ce->name);
> +                               hit |= grep_file(opt, name.buf);
>                         }
>                 } else if (recurse_submodules && S_ISGITLINK(ce->ce_mode) &&
>                            submodule_path_match(pathspec, name.buf, NULL)) {
> -                       hit |= grep_submodule(opt, NULL, ce->name, ce->name);
> +                       hit |= grep_submodule(opt, repo, pathspec, NULL, ce->name, ce->name);
>                 } else {
>                         continue;
>                 }
> @@ -730,8 +505,8 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
>                 if (ce_stage(ce)) {
>                         do {
>                                 nr++;
> -                       } while (nr < active_nr &&
> -                                !strcmp(ce->name, active_cache[nr]->name));
> +                       } while (nr < repo->index->cache_nr &&
> +                                !strcmp(ce->name, repo->index->cache[nr]->name));
>                         nr--; /* compensate for loop control */
>                 }
>                 if (hit && opt->status_only)
> @@ -744,7 +519,7 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
>
>  static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
>                      struct tree_desc *tree, struct strbuf *base, int tn_len,
> -                    int check_attr)
> +                    int check_attr, struct repository *repo)
>  {
>         int hit = 0;
>         enum interesting match = entry_not_interesting;
> @@ -752,8 +527,8 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
>         int old_baselen = base->len;
>         struct strbuf name = STRBUF_INIT;
>         int name_base_len = 0;
> -       if (super_prefix) {
> -               strbuf_addstr(&name, super_prefix);
> +       if (repo->submodule_prefix) {
> +               strbuf_addstr(&name, repo->submodule_prefix);
>                 name_base_len = name.len;
>         }
>
> @@ -791,11 +566,11 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
>                         strbuf_addch(base, '/');
>                         init_tree_desc(&sub, data, size);
>                         hit |= grep_tree(opt, pathspec, &sub, base, tn_len,
> -                                        check_attr);
> +                                        check_attr, repo);
>                         free(data);
>                 } else if (recurse_submodules && S_ISGITLINK(entry.mode)) {
> -                       hit |= grep_submodule(opt, entry.oid, base->buf,
> -                                             base->buf + tn_len);
> +                       hit |= grep_submodule(opt, repo, pathspec, entry.oid,
> +                                             base->buf, base->buf + tn_len);
>                 }
>
>                 strbuf_setlen(base, old_baselen);
> @@ -809,7 +584,8 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
>  }
>
>  static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
> -                      struct object *obj, const char *name, const char *path)
> +                      struct object *obj, const char *name, const char *path,
> +                      struct repository *repo)
>  {
>         if (obj->type == OBJ_BLOB)
>                 return grep_oid(opt, &obj->oid, name, 0, path);
> @@ -828,10 +604,6 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
>                 if (!data)
>                         die(_("unable to read tree (%s)"), oid_to_hex(&obj->oid));
>
> -               /* Use parent's name as base when recursing submodules */
> -               if (recurse_submodules && parent_basename)
> -                       name = parent_basename;
> -
>                 len = name ? strlen(name) : 0;
>                 strbuf_init(&base, PATH_MAX + len + 1);
>                 if (len) {
> @@ -840,7 +612,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
>                 }
>                 init_tree_desc(&tree, data, size);
>                 hit = grep_tree(opt, pathspec, &tree, &base, base.len,
> -                               obj->type == OBJ_COMMIT);
> +                               obj->type == OBJ_COMMIT, repo);
>                 strbuf_release(&base);
>                 free(data);
>                 return hit;
> @@ -849,6 +621,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
>  }
>
>  static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
> +                       struct repository *repo,
>                         const struct object_array *list)
>  {
>         unsigned int i;
> @@ -864,7 +637,8 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
>                         submodule_free();
>                         gitmodules_config_sha1(real_obj->oid.hash);
>                 }
> -               if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].path)) {
> +               if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].path,
> +                               repo)) {
>                         hit = 1;
>                         if (opt->status_only)
>                                 break;
> @@ -1005,9 +779,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>                             N_("ignore files specified via '.gitignore'"), 1),
>                 OPT_BOOL(0, "recurse-submodules", &recurse_submodules,
>                          N_("recursively search in each submodule")),
> -               OPT_STRING(0, "parent-basename", &parent_basename,
> -                          N_("basename"),
> -                          N_("prepend parent project's basename to output")),
>                 OPT_GROUP(""),
>                 OPT_BOOL('v', "invert-match", &opt.invert,
>                         N_("show non-matching lines")),
> @@ -1112,7 +883,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>         init_grep_defaults();
>         git_config(grep_cmd_config, NULL);
>         grep_init(&opt, prefix);
> -       super_prefix = get_super_prefix();
>
>         /*
>          * If there is no -- then the paths must exist in the working
> @@ -1274,9 +1044,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>
>         if (recurse_submodules) {
>                 gitmodules_config();
> -               compile_submodule_options(&opt, argv + i, cached, untracked,
> -                                         opt_exclude, use_index,
> -                                         pattern_type_arg);
>         }
>
>         if (show_in_pager && (cached || list.nr))
> @@ -1320,11 +1087,12 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>                 if (!cached)
>                         setup_work_tree();
>
> -               hit = grep_cache(&opt, &pathspec, cached);
> +               hit = grep_cache(&opt, the_repository, &pathspec, cached);
>         } else {
>                 if (cached)
>                         die(_("both --cached and trees are given."));
> -               hit = grep_objects(&opt, &pathspec, &list);
> +
> +               hit = grep_objects(&opt, &pathspec, the_repository, &list);
>         }
>
>         if (num_threads)
> diff --git a/cache.h b/cache.h
> index 71fe09264..71af91c43 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -417,7 +417,6 @@ static inline enum object_type object_type(unsigned int mode)
>  #define GIT_WORK_TREE_ENVIRONMENT "GIT_WORK_TREE"
>  #define GIT_PREFIX_ENVIRONMENT "GIT_PREFIX"
>  #define GIT_SUPER_PREFIX_ENVIRONMENT "GIT_INTERNAL_SUPER_PREFIX"
> -#define GIT_TOPLEVEL_PREFIX_ENVIRONMENT "GIT_INTERNAL_TOPLEVEL_PREFIX"
>  #define DEFAULT_GIT_DIR_ENVIRONMENT ".git"
>  #define DB_ENVIRONMENT "GIT_OBJECT_DIRECTORY"
>  #define INDEX_ENVIRONMENT "GIT_INDEX_FILE"
> diff --git a/git.c b/git.c
> index 489aab4d8..9dd9aead6 100644
> --- a/git.c
> +++ b/git.c
> @@ -392,7 +392,7 @@ static struct cmd_struct commands[] = {
>         { "fsck-objects", cmd_fsck, RUN_SETUP },
>         { "gc", cmd_gc, RUN_SETUP },
>         { "get-tar-commit-id", cmd_get_tar_commit_id },
> -       { "grep", cmd_grep, RUN_SETUP_GENTLY | SUPPORT_SUPER_PREFIX },
> +       { "grep", cmd_grep, RUN_SETUP_GENTLY },
>         { "hash-object", cmd_hash_object },
>         { "help", cmd_help },
>         { "index-pack", cmd_index_pack, RUN_SETUP_GENTLY },
> diff --git a/grep.c b/grep.c
> index 98733db62..78680da5c 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -1919,16 +1919,6 @@ void grep_source_init(struct grep_source *gs, enum grep_source_type type,
>         case GREP_SOURCE_FILE:
>                 gs->identifier = xstrdup(identifier);
>                 break;
> -       case GREP_SOURCE_SUBMODULE:
> -               if (!identifier) {
> -                       gs->identifier = NULL;
> -                       break;
> -               }
> -               /*
> -                * FALL THROUGH
> -                * If the identifier is non-NULL (in the submodule case) it
> -                * will be a SHA1 that needs to be copied.
> -                */
>         case GREP_SOURCE_OID:
>                 gs->identifier = oiddup(identifier);
>                 break;
> @@ -1951,7 +1941,6 @@ void grep_source_clear_data(struct grep_source *gs)
>         switch (gs->type) {
>         case GREP_SOURCE_FILE:
>         case GREP_SOURCE_OID:
> -       case GREP_SOURCE_SUBMODULE:
>                 FREE_AND_NULL(gs->buf);
>                 gs->size = 0;
>                 break;
> @@ -2022,8 +2011,6 @@ static int grep_source_load(struct grep_source *gs)
>                 return grep_source_load_oid(gs);
>         case GREP_SOURCE_BUF:
>                 return gs->buf ? 0 : -1;
> -       case GREP_SOURCE_SUBMODULE:
> -               break;
>         }
>         die("BUG: invalid grep_source type to load");
>  }
> diff --git a/grep.h b/grep.h
> index b8f93bfc2..d405e568f 100644
> --- a/grep.h
> +++ b/grep.h
> @@ -194,7 +194,6 @@ struct grep_source {
>                 GREP_SOURCE_OID,
>                 GREP_SOURCE_FILE,
>                 GREP_SOURCE_BUF,
> -               GREP_SOURCE_SUBMODULE,
>         } type;
>         void *identifier;
>
> diff --git a/setup.c b/setup.c
> index b370bf3c1..27266b0ac 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -1027,7 +1027,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
>  {
>         static struct strbuf cwd = STRBUF_INIT;
>         struct strbuf dir = STRBUF_INIT, gitdir = STRBUF_INIT;
> -       const char *prefix, *env_prefix;
> +       const char *prefix;
>
>         /*
>          * We may have read an incomplete configuration before
> @@ -1085,16 +1085,6 @@ const char *setup_git_directory_gently(int *nongit_ok)
>                 die("BUG: unhandled setup_git_directory_1() result");
>         }
>
> -       /*
> -        * NEEDSWORK: This was a hack in order to get ls-files and grep to have
> -        * properly formated output when recursing submodules.  Once ls-files
> -        * and grep have been changed to perform this recursing in-process this
> -        * needs to be removed.
> -        */
> -       env_prefix = getenv(GIT_TOPLEVEL_PREFIX_ENVIRONMENT);
> -       if (env_prefix)
> -               prefix = env_prefix;
> -
>         if (prefix)
>                 setenv(GIT_PREFIX_ENVIRONMENT, prefix, 1);
>         else
> --
> 2.13.2.932.g7449e964c-goog
>


* Re: [PATCH 1/3] repo_read_index: don't discard the index
  2017-07-11 22:04 ` [PATCH 1/3] repo_read_index: don't discard the index Brandon Williams
@ 2017-07-11 23:51   ` Jonathan Nieder
  2017-07-12 17:27     ` Brandon Williams
  2017-07-11 23:58   ` Stefan Beller
  1 sibling, 1 reply; 68+ messages in thread
From: Jonathan Nieder @ 2017-07-11 23:51 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller

Brandon Williams wrote:

> Have 'repo_read_index()' behave more like the other read_index family of
> functions and don't discard the index if it has already been populated.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  repository.c | 2 --
>  1 file changed, 2 deletions(-)

How did you discover this?  E.g. was it from code inspection or does
this make the function more convenient to use for some kinds of callers?

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>


* Re: [PATCH 1/3] repo_read_index: don't discard the index
  2017-07-11 22:04 ` [PATCH 1/3] repo_read_index: don't discard the index Brandon Williams
  2017-07-11 23:51   ` Jonathan Nieder
@ 2017-07-11 23:58   ` Stefan Beller
  2017-07-12 17:23     ` Brandon Williams
  1 sibling, 1 reply; 68+ messages in thread
From: Stefan Beller @ 2017-07-11 23:58 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git@vger.kernel.org

On Tue, Jul 11, 2017 at 3:04 PM, Brandon Williams <bmwill@google.com> wrote:
> Have 'repo_read_index()' behave more like the other read_index family of
> functions and don't discard the index if it has already been populated.

instead rely on the quick return of read_index_from, which has

    /* istate->initialized covers both .git/index and .git/sharedindex.xxx */
    if (istate->initialized)
        return istate->cache_nr;

so that we do not get memory leaks or other issues. We do not have many
callers yet, so we can still change the contract of the
'repo_read_index' function easily. However, having to go through all
the callers and then read the implementation hints at a desire to have
repo_read_index documented in repository.h.
(There is a hint in struct repository that its 'index' field can be
initialized using repo_read_index, but what does repo_read_index
actually do?)
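
To make the contract discussed above concrete, here is a minimal sketch
of the early-return behavior. The names (toy_index, toy_read_index,
load_count) are hypothetical stand-ins and not git's actual
struct index_state or read_index_from(); only the 'initialized' check
mirrors the snippet quoted above.

```c
/* Hypothetical stand-in for an in-core index; not git's struct index_state. */
struct toy_index {
	int initialized;        /* set once the index has been loaded */
	unsigned int cache_nr;  /* number of entries currently loaded */
	int load_count;         /* how often we actually "read from disk" */
};

/*
 * Populate the index unless it is already populated.  A repeated call
 * returns early instead of discarding and reloading the entries.
 */
static unsigned int toy_read_index(struct toy_index *istate,
				   unsigned int entries_on_disk)
{
	if (istate->initialized)
		return istate->cache_nr;

	istate->cache_nr = entries_on_disk; /* pretend to parse the file */
	istate->initialized = 1;
	istate->load_count++;
	return istate->cache_nr;
}
```

With this shape a caller can invoke the read function unconditionally;
whether the on-disk data changed in between is deliberately ignored in
this sketch.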


* Re: [PATCH 2/3] setup: have the_repository use the_index
  2017-07-11 22:04 ` [PATCH 2/3] setup: have the_repository use the_index Brandon Williams
@ 2017-07-12  0:00   ` Jonathan Nieder
  2017-07-12  0:07     ` Stefan Beller
  2017-07-12 17:30     ` Brandon Williams
  2017-07-12  0:11   ` Junio C Hamano
  1 sibling, 2 replies; 68+ messages in thread
From: Jonathan Nieder @ 2017-07-12  0:00 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller

Hi,

Brandon Williams wrote:

> Have the index state which is stored in 'the_repository' be a pointer to
> the in-core instead 'the_index'.  This makes it easier to begin
> transitioning more parts of the code base to operate on a 'struct
> repository'.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  setup.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/setup.c b/setup.c
> index 860507e1f..b370bf3c1 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -1123,6 +1123,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
>  			setup_git_env();
>  		}
>  	}
> +	the_repository->index = &the_index;

I wonder if this can be done sooner.  For example, does the following
work?  This way, 'the_repository->index == &the_index' would be an
invariant that always holds, even in the early setup stage before
setup_git_directory_gently has run completely.

Thanks,
Jonathan

diff --git i/repository.c w/repository.c
index edca907404..bdc1f93282 100644
--- i/repository.c
+++ w/repository.c
@@ -4,7 +4,7 @@
 #include "submodule-config.h"
 
 /* The main repository */
-static struct repository the_repo;
+static struct repository the_repo = { .index = &the_index };
 struct repository *the_repository = &the_repo;
 
 static char *git_path_from_env(const char *envvar, const char *git_dir,
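
The point of the suggestion above is that a static initializer sets the
pointer before any code runs, so there is no window in which it is
NULL. A minimal sketch, assuming C99 designated initializers are
acceptable; the toy_ types and globals are hypothetical stand-ins and
not git's definitions:

```c
/* Hypothetical stand-ins; not git's struct index_state / struct repository. */
struct toy_index_state {
	unsigned int cache_nr;
};

struct toy_repository {
	const char *gitdir;
	struct toy_index_state *index;
};

/* A global in-core index, analogous in role to 'the_index'. */
struct toy_index_state toy_the_index;

/*
 * The address of an object with static storage duration is a constant
 * expression, so it may appear in a static initializer.  Members that
 * are not named here (gitdir) are zero-initialized.
 */
static struct toy_repository toy_the_repo = { .index = &toy_the_index };
struct toy_repository *toy_the_repository = &toy_the_repo;

/* Returns 1 if the invariant holds; true even before any setup code ran. */
int toy_invariant_holds(void)
{
	return toy_the_repository->index == &toy_the_index;
}
```

Compared with assigning the pointer at the end of a setup function, this
form needs no particular call order to make the invariant hold.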


* Re: [PATCH 3/3] grep: recurse in-process using 'struct repository'
  2017-07-11 22:04 ` [PATCH 3/3] grep: recurse in-process using 'struct repository' Brandon Williams
  2017-07-11 22:44   ` Jacob Keller
@ 2017-07-12  0:04   ` Stefan Beller
  2017-07-12 18:56     ` Brandon Williams
  2017-07-12  0:25   ` Jonathan Nieder
  2 siblings, 1 reply; 68+ messages in thread
From: Stefan Beller @ 2017-07-12  0:04 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git@vger.kernel.org

On Tue, Jul 11, 2017 at 3:04 PM, Brandon Williams <bmwill@google.com> wrote:

> +       if (repo_submodule_init(&submodule, superproject, path))
> +               return 0;

What happens if we take the "return 0" path here? Would we rather want
to print an error?
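
One way to picture the alternative being asked about is to report the
failure before skipping the submodule, rather than returning silently.
This is only a sketch: toy_submodule_init() and toy_grep_submodule()
are hypothetical stand-ins and not the functions from the patch, and
whether a warning is actually wanted here is the open question.

```c
#include <stdio.h>
#include <string.h>

/* Counts emitted warnings so that callers can observe the behavior. */
static int toy_warnings;

/* Hypothetical stand-in for submodule setup: nonzero means failure. */
static int toy_submodule_init(const char *path)
{
	/* Pretend that only the submodule at "populated" can be set up. */
	return strcmp(path, "populated") ? -1 : 0;
}

/*
 * Returns 1 on a hit and 0 otherwise.  Unlike a bare 'return 0', a
 * failed init leaves a trace on stderr, so the skip is visible.
 */
static int toy_grep_submodule(const char *path)
{
	if (toy_submodule_init(path)) {
		fprintf(stderr,
			"warning: skipping submodule '%s': could not initialize\n",
			path);
		toy_warnings++;
		return 0;
	}
	return 1; /* pretend the search found a match */
}
```

The return value is the same in both variants; only the diagnostic
differs, which may or may not be desirable for submodules that are
simply not checked out.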

> +       /* add objects to alternates */
> +       add_to_alternates_memory(submodule.objectdir);

Not trying to make my object series more important than it is... but
we really don't want to spread this add_to_alternates_memory hack. :/

I agree with Jacob that a patch with such a diffstat is a joy to review. :)

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 2/3] setup: have the_repository use the_index
  2017-07-12  0:00   ` Jonathan Nieder
@ 2017-07-12  0:07     ` Stefan Beller
  2017-07-12 17:30     ` Brandon Williams
  1 sibling, 0 replies; 68+ messages in thread
From: Stefan Beller @ 2017-07-12  0:07 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Brandon Williams, git@vger.kernel.org

On Tue, Jul 11, 2017 at 5:00 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:

>  /* The main repository */
> -static struct repository the_repo;
> +static struct repository the_repo = { .index = &the_index };

https://public-inbox.org/git/20170710070342.txmlwwq6gvjkwtw7@sigill.intra.peff.net/
specifically said we'd not use all the features today
but want to have the test balloon long enough up in
the air? (So this is just a critique of the syntax, I agree on the content)

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 2/3] setup: have the_repository use the_index
  2017-07-11 22:04 ` [PATCH 2/3] setup: have the_repository use the_index Brandon Williams
  2017-07-12  0:00   ` Jonathan Nieder
@ 2017-07-12  0:11   ` Junio C Hamano
  2017-07-12 18:01     ` Brandon Williams
  1 sibling, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2017-07-12  0:11 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller

Brandon Williams <bmwill@google.com> writes:

> Have the index state which is stored in 'the_repository' be a pointer to
> the in-core instead 'the_index'.  This makes it easier to begin
> transitioning more parts of the code base to operate on a 'struct
> repository'.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  setup.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/setup.c b/setup.c
> index 860507e1f..b370bf3c1 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -1123,6 +1123,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
>  			setup_git_env();
>  		}
>  	}
> +	the_repository->index = &the_index;
>  
>  	strbuf_release(&dir);
>  	strbuf_release(&gitdir);

I would have expected this to be going in the different direction,
i.e. there is an embedded instance of index_state in a repository
object, and the_repository.index is defined to be the old the_index,
i.e.

	#define the_index (the_repository.index)

When a Git command recurses into submodules in-core using
the_repository that represents the top-level superproject and
another repository object that represents a submodule, don't we want
the repository object for the submodule to also have its own default
index without having to allocate one and point at it with the index
field?

I dunno.  Being able to leave the index field NULL lets you say
"this is a bare repository and there is no place for the index file
for it", but even if we never write out the in-core index to an
index file on disk, being able to populate the in-core index that is
default for the repository object from a tree-ish and iterating over
it (e.g.  running in-core merge of trees) is a useful thing to do,
so I do not consider "index field can be set to NULL to signify a
bare repository" a very strong plus.


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 3/3] grep: recurse in-process using 'struct repository'
  2017-07-11 22:04 ` [PATCH 3/3] grep: recurse in-process using 'struct repository' Brandon Williams
  2017-07-11 22:44   ` Jacob Keller
  2017-07-12  0:04   ` Stefan Beller
@ 2017-07-12  0:25   ` Jonathan Nieder
  2017-07-12 18:49     ` Brandon Williams
  2 siblings, 1 reply; 68+ messages in thread
From: Jonathan Nieder @ 2017-07-12  0:25 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller

Hi,

Brandon Williams wrote:

> Convert grep to use 'struct repository' which enables recursing into
> submodules to be handled in-process.

\o/

This will be even nicer with the changes described at
https://public-inbox.org/git/20170706202739.6056-1-sbeller@google.com/.
Until then, I fear it will cause a regression --- see (*) below.

[...]
>  Documentation/git-grep.txt |   7 -
>  builtin/grep.c             | 390 +++++++++------------------------------------
>  cache.h                    |   1 -
>  git.c                      |   2 +-
>  grep.c                     |  13 --
>  grep.h                     |   1 -
>  setup.c                    |  12 +-
>  7 files changed, 81 insertions(+), 345 deletions(-)

Yay, tests still pass.

[..]
> --- a/Documentation/git-grep.txt
> +++ b/Documentation/git-grep.txt
> @@ -95,13 +95,6 @@ OPTIONS
>  	<tree> option the prefix of all submodule output will be the name of
>  	the parent project's <tree> object.
>  
> ---parent-basename <basename>::
> -	For internal use only.  In order to produce uniform output with the
> -	--recurse-submodules option, this option can be used to provide the
> -	basename of a parent's <tree> object to a submodule so the submodule
> -	can prefix its output with the parent's name rather than the SHA1 of
> -	the submodule.

Being able to get rid of this is a very nice change.

[...]
> +++ b/builtin/grep.c
[...]
> @@ -366,14 +349,10 @@ static int grep_file(struct grep_opt *opt, const char *filename)
>  {
>  	struct strbuf buf = STRBUF_INIT;
>  
> -	if (super_prefix)
> -		strbuf_addstr(&buf, super_prefix);
> -	strbuf_addstr(&buf, filename);
> -
>  	if (opt->relative && opt->prefix_length) {
> -		char *name = strbuf_detach(&buf, NULL);
> -		quote_path_relative(name, opt->prefix, &buf);
> -		free(name);
> +		quote_path_relative(filename, opt->prefix, &buf);
> +	} else {
> +		strbuf_addstr(&buf, filename);
>  	}

style micronit: can avoid these braces since both branches are
single-line.

[...]
> @@ -421,284 +400,80 @@ static void run_pager(struct grep_opt *opt, const char *prefix)
>  		exit(status);
>  }
>  
> -static void compile_submodule_options(const struct grep_opt *opt,
> -				      const char **argv,
> -				      int cached, int untracked,
> -				      int opt_exclude, int use_index,
> -				      int pattern_type_arg)
> -{
[...]
> -	/*
> -	 * Limit number of threads for child process to use.
> -	 * This is to prevent potential fork-bomb behavior of git-grep as each
> -	 * submodule process has its own thread pool.
> -	 */
> -	argv_array_pushf(&submodule_options, "--threads=%d",
> -			 (num_threads + 1) / 2);

Being able to get rid of this is another very nice change.

[...]
> +	/* add objects to alternates */
> +	add_to_alternates_memory(submodule.objectdir);

(*) This sets up a single in-memory object store with all the
processed submodules.  Processed objects are never freed.
This means that if I run a command like

	git grep --recurse-submodules -e neverfound HEAD

in a project with many submodules then memory consumption scales in
the same way as if the project were all one repository.  By contrast,
without this patch, git is able to take advantage of the implicit
free() when each child exits to limit its memory usage.

Worse, this increases the number of pack files git has to pay
attention to, up to the sum of the numbers of pack files in all the
repositories processed so far.  A single object lookup can take
O(number of packs * log(number of objects in each pack)) time.  That
means performance is likely to suffer as the number of submodules
increases (n^2 performance) even on systems with a lot of memory.
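
As a rough illustration of that scaling argument, here is a minimal
cost model in C.  It is only a sketch: the function names and the
"every submodule adds the same number of packs" simplification are
assumptions made for the example, not anything taken from git.

```c
/* Floor of log2(n); returns 0 for n <= 1.  Stands in for binary-search depth. */
static unsigned long log2_floor(unsigned long n)
{
	unsigned long bits = 0;

	while (n > 1) {
		n >>= 1;
		bits++;
	}
	return bits;
}

/* Modelled cost of one lookup: probe every known pack, binary-search each. */
unsigned long lookup_cost(unsigned long nr_packs, unsigned long objects_per_pack)
{
	return nr_packs * log2_floor(objects_per_pack);
}

/*
 * One process for everything: packs from earlier submodules stay registered,
 * so the i-th submodule pays for i * packs_per_submodule packs per lookup.
 */
unsigned long cost_shared_store(unsigned long nr_submodules,
				unsigned long packs_per_submodule,
				unsigned long lookups_per_submodule,
				unsigned long objects_per_pack)
{
	unsigned long total = 0, i;

	for (i = 1; i <= nr_submodules; i++)
		total += lookups_per_submodule *
			 lookup_cost(i * packs_per_submodule, objects_per_pack);
	return total;
}

/* One child process per submodule: each one only ever sees its own packs. */
unsigned long cost_separate_processes(unsigned long nr_submodules,
				      unsigned long packs_per_submodule,
				      unsigned long lookups_per_submodule,
				      unsigned long objects_per_pack)
{
	return nr_submodules * lookups_per_submodule *
	       lookup_cost(packs_per_submodule, objects_per_pack);
}
```

In this model the shared-store total grows with 1 + 2 + ... + n, which
is where the quadratic behaviour comes from, while the per-process
total grows linearly in the number of submodules.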

Once the object store is part of the repository struct and freeable,
those problems go away and this patch becomes a no-brainer.

What should happen until then?  Should this go in "next" so we can get
experience with it but with care not to let it graduate to "master"?

Aside from those two concerns, this patch looks very good from a quick
skim, though I haven't reviewed it closely line-by-line.  Once we know
how to go forward, I'm happy to look at it again.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 0/3] Convert grep to recurse in-process
  2017-07-11 22:04 [PATCH 0/3] Convert grep to recurse in-process Brandon Williams
                   ` (2 preceding siblings ...)
  2017-07-11 22:04 ` [PATCH 3/3] grep: recurse in-process using 'struct repository' Brandon Williams
@ 2017-07-12  7:42 ` Jeff King
  2017-07-12 18:06   ` Brandon Williams
  2017-07-12 18:09   ` Jonathan Nieder
  2017-07-14 22:28 ` [PATCH v2 " Brandon Williams
  4 siblings, 2 replies; 68+ messages in thread
From: Jeff King @ 2017-07-12  7:42 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller

On Tue, Jul 11, 2017 at 03:04:05PM -0700, Brandon Williams wrote:

> This series utilizes the new 'struct repository' in order to convert grep to be
> able to recurse into submodules in-process much like how ls-files was converted
> to recuse in-process.  The result is a much smaller code footprint due to not
> needing to compile an argv array of options to be used when launched a process
> for operating on a submodule.

I didn't follow the rest of the "struct repository" series closely, but
I don't feel like we ever reached a resolution on how config would be
handled. I notice that the in-process "ls-files" behaves differently
than the old one when config differs between the submodule and the
parent repository. As we convert more commands (that use more config)
this will become more likely to be noticed by somebody.

Do we have a plan for dealing with this? Is our solution just "recursed
operations always respect the parent config, deal with it"?

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/3] repo_read_index: don't discard the index
  2017-07-11 23:58   ` Stefan Beller
@ 2017-07-12 17:23     ` Brandon Williams
  0 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-12 17:23 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org

On 07/11, Stefan Beller wrote:
> On Tue, Jul 11, 2017 at 3:04 PM, Brandon Williams <bmwill@google.com> wrote:
> > Have 'repo_read_index()' behave more like the other read_index family of
> > functions and don't discard the index if it has already been populated.
> 
> instead rely on the quick return of read_index_from which has
> 
>     /* istate->initialized covers both .git/index and .git/sharedindex.xxx */
>     if (istate->initialized)
>         return istate->cache_nr;
> 

I can include this in the commit msg.


> such that we do not have memory leaks or other issues. Currently
> we do not have a lot of callers, such that we can change the contract
> of the 'repo_read_index' function easily. However going through all
> the callers and then looking at the implementation, may hint at a
> desire to have repo_read_index documented in repository.h
> (There is a hint in struct repository, that its field index can be initialized
> using repo_read_index, but what does repo_read_index actually do?)

And I can document the function better in repository.h :)

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 1/3] repo_read_index: don't discard the index
  2017-07-11 23:51   ` Jonathan Nieder
@ 2017-07-12 17:27     ` Brandon Williams
  0 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-12 17:27 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, sbeller

On 07/11, Jonathan Nieder wrote:
> Brandon Williams wrote:
> 
> > Have 'repo_read_index()' behave more like the other read_index family of
> > functions and don't discard the index if it has already been populated.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  repository.c | 2 --
> >  1 file changed, 2 deletions(-)
> 
> How did you discover this?  E.g. was it from code inspection or does
> this make the function more convenient to use for some kinds of callers?

When working on another series I realized that some code paths may end
up calling read_index() a bunch and I want to prevent discarding and
then re-reading the same index over and over again if/when those calls
are migrated to using repo_read_index.
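
A minimal sketch of the behaviour being aimed for, assuming a
stand-in struct and a fake loader (none of these names are git's real
ones): a second read of an already-populated index returns early
instead of discarding and reloading.

```c
/* Stand-in for 'struct index_state'; only the fields the sketch needs. */
struct index_sketch {
	int initialized;
	unsigned int cache_nr;
	unsigned int loads;	/* counts how often the "disk" was actually read */
};

/* Hypothetical loader: pretends the on-disk index holds three entries. */
static unsigned int load_from_disk(struct index_sketch *istate)
{
	istate->loads++;
	return 3;
}

/* Read the index once; later calls reuse what is already in memory. */
unsigned int read_index_sketch(struct index_sketch *istate)
{
	if (istate->initialized)
		return istate->cache_nr;

	istate->cache_nr = load_from_disk(istate);
	istate->initialized = 1;
	return istate->cache_nr;
}
```

Calling read_index_sketch() repeatedly on the same struct bumps
'loads' only once, which is the property that makes repeated calls
cheap.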

> 
> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 2/3] setup: have the_repository use the_index
  2017-07-12  0:00   ` Jonathan Nieder
  2017-07-12  0:07     ` Stefan Beller
@ 2017-07-12 17:30     ` Brandon Williams
  1 sibling, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-12 17:30 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, sbeller

On 07/11, Jonathan Nieder wrote:
> Hi,
> 
> Brandon Williams wrote:
> 
> > Have the index state which is stored in 'the_repository' be a pointer to
> > the in-core instead 'the_index'.  This makes it easier to begin
> > transitioning more parts of the code base to operate on a 'struct
> > repository'.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  setup.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/setup.c b/setup.c
> > index 860507e1f..b370bf3c1 100644
> > --- a/setup.c
> > +++ b/setup.c
> > @@ -1123,6 +1123,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
> >  			setup_git_env();
> >  		}
> >  	}
> > +	the_repository->index = &the_index;
> 
> I wonder if this can be done sooner.  For example, does the following
> work?  This way, 'the_repository->index == &the_index' would be an
> invariant that always holds, even in the early setup stage before
> setup_git_directory_gently has run completely.
> 
> Thanks,
> Jonathan
> 
> diff --git i/repository.c w/repository.c
> index edca907404..bdc1f93282 100644
> --- i/repository.c
> +++ w/repository.c
> @@ -4,7 +4,7 @@
>  #include "submodule-config.h"
>  
>  /* The main repository */
> -static struct repository the_repo;
> +static struct repository the_repo = { .index = &the_index };
>  struct repository *the_repository = &the_repo;
>  
>  static char *git_path_from_env(const char *envvar, const char *git_dir,

I agree with your approach, though as Stefan pointed out we may not be
able to use the syntax just yet...but we should still be able to use the
bulky old syntax for the time being.
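
For reference, the two spellings could look roughly like this.  The
types below are stand-ins invented for the example (the real 'struct
repository' has more fields, in a different order), so treat it as a
sketch of the syntax difference only.

```c
#include <stddef.h>

/* Stand-in types for illustration; not git's real definitions. */
struct index_state_sketch {
	unsigned int cache_nr;
};

struct repository_sketch {
	char *gitdir;
	struct index_state_sketch *index;
};

static struct index_state_sketch the_index_sketch;

/* C99 designated initializer: names the field, everything else is zeroed. */
static struct repository_sketch repo_c99 = { .index = &the_index_sketch };

/* Older positional form: every earlier field must be spelled out in order. */
static struct repository_sketch repo_c89 = { NULL, &the_index_sketch };

/* Returns 1 when both spellings produced the same initial state. */
int both_point_at_the_index(void)
{
	return repo_c99.index == &the_index_sketch &&
	       repo_c89.index == &the_index_sketch &&
	       repo_c99.gitdir == NULL &&
	       repo_c89.gitdir == NULL;
}
```

The positional form has to be kept in sync by hand whenever a field is
added before 'index', which is the "bulky" part.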

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 2/3] setup: have the_repository use the_index
  2017-07-12  0:11   ` Junio C Hamano
@ 2017-07-12 18:01     ` Brandon Williams
  2017-07-12 20:38       ` Junio C Hamano
  0 siblings, 1 reply; 68+ messages in thread
From: Brandon Williams @ 2017-07-12 18:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, sbeller

On 07/11, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > Have the index state which is stored in 'the_repository' be a pointer to
> > the in-core instead 'the_index'.  This makes it easier to begin
> > transitioning more parts of the code base to operate on a 'struct
> > repository'.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  setup.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/setup.c b/setup.c
> > index 860507e1f..b370bf3c1 100644
> > --- a/setup.c
> > +++ b/setup.c
> > @@ -1123,6 +1123,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
> >  			setup_git_env();
> >  		}
> >  	}
> > +	the_repository->index = &the_index;
> >  
> >  	strbuf_release(&dir);
> >  	strbuf_release(&gitdir);
> 
> I would have expected this to be going in the different direction,
> i.e. there is an embedded instance of index_state in a repository
> object, and the_repository.index is defined to be the old the_index,
> i.e.
> 
> 	#define the_index (the_repository.index)
> 
> When a Git command that recurses into submodules in-core using
> the_repository that represents the top-level superproject and
> another repository object that represents a submodule, don't we want
> the repository object for the submodule also have its own default
> index without having to allocate one and point at it with the index
> field?

For all intents and purposes the index struct that is stored in 'struct
repository' is an embedded instance, it's just stored as a pointer
instead of being a direct part of the struct itself.  As far as
submodules are concerned, that's why 'repo_read_index' exists since it
will allocate the index struct if needed and then populate it with the
index file associated with that repository.  So the 'struct repository'
owns that instance of 'struct index_state'.

Since it is a pointer then using a '#define' to replace 'the_index'
(which is not a pointer) would be a little more challenging.
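
To make the difference concrete, here is a sketch of both layouts
with made-up types and macro names.  The dereferencing macro for the
pointer case is just one possible spelling and an assumption on my
part, not something proposed in this series.

```c
/* Stand-in for 'struct index_state'. */
struct index_state_sketch {
	unsigned int cache_nr;
};

/* Layout A: the index is embedded, which is what the '#define' assumes. */
struct repo_embedded {
	struct index_state_sketch index;
};

/* Layout B: the index is held by pointer, as in the patch under review. */
struct repo_pointer {
	struct index_state_sketch *index;
};

static struct repo_embedded embedded_repo;
static struct index_state_sketch standalone_index;
static struct repo_pointer pointer_repo = { &standalone_index };

/* With layout A the macro names an object directly. */
#define EMBEDDED_INDEX (embedded_repo.index)

/* With layout B the macro must dereference, and is invalid while index is NULL. */
#define POINTER_INDEX (*pointer_repo.index)

/* Touch both through their macros and report the combined entry count. */
unsigned int bump_both(void)
{
	EMBEDDED_INDEX.cache_nr++;
	POINTER_INDEX.cache_nr++;
	return embedded_repo.index.cache_nr + standalone_index.cache_nr;
}
```

The pointer variant only works once something has made the pointer
non-NULL, which is the extra care the embedded layout does not need.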

> 
> I dunno.  Being able to leave the index field NULL lets you say
> "this is a bare repository and there is no place for the index file
> for it", but even if we never write out the in-core index to an
> index file on disk, being able to populate the in-core index that is
> default for the repository object from a tree-ish and iterating over
> it (e.g.  running in-core merge of trees) is a useful thing to do,
> so I do not consider "index field can be set to NULL to signify a
> bare repository" a very strong plus.
> 

I'd probably say that we'd eventually need a bit field in 'struct
repository' which indicates if a repository is bare.  I think we already
have a global variable somewhere which stores this sort of information.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 0/3] Convert grep to recurse in-process
  2017-07-12  7:42 ` [PATCH 0/3] Convert grep to recurse in-process Jeff King
@ 2017-07-12 18:06   ` Brandon Williams
  2017-07-12 18:17     ` Jeff King
  2017-07-12 18:09   ` Jonathan Nieder
  1 sibling, 1 reply; 68+ messages in thread
From: Brandon Williams @ 2017-07-12 18:06 UTC (permalink / raw)
  To: Jeff King; +Cc: git, sbeller

On 07/12, Jeff King wrote:
> On Tue, Jul 11, 2017 at 03:04:05PM -0700, Brandon Williams wrote:
> 
> > This series utilizes the new 'struct repository' in order to convert grep to be
> > able to recurse into submodules in-process much like how ls-files was converted
> > to recuse in-process.  The result is a much smaller code footprint due to not
> > needing to compile an argv array of options to be used when launched a process
> > for operating on a submodule.
> 
> I didn't follow the rest of the "struct repository" series closely, but
> I don't feel like we ever reached a resolution on how config would be
> handled. I notice that the in-process "ls-files" behaves differently
> than the old one when config differs between the submodule and the
> parent repository. As we convert more commands (that use more config)
> this will become more likely to be noticed by somebody.
> 
> Do we have a plan for dealing with this? Is our solution just "recursed
> operations always respect the parent config, deal with it"?

Each 'struct repository' does have its own config so we could
potentially want a config in a submodule to override some config in the
superproject.  Though for right now it may be simpler to not worry about
doing this overriding, mostly because you would only want to allow
overriding of some configuration and not all configuration.  One example
would be the number of threads allowed in grep: it doesn't make much
sense to let a submodule's configuration of this trump the
superproject's since the command was invoked in the context of the
superproject.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 0/3] Convert grep to recurse in-process
  2017-07-12  7:42 ` [PATCH 0/3] Convert grep to recurse in-process Jeff King
  2017-07-12 18:06   ` Brandon Williams
@ 2017-07-12 18:09   ` Jonathan Nieder
  2017-07-12 18:17     ` Stefan Beller
  2017-07-12 18:27     ` Jeff King
  1 sibling, 2 replies; 68+ messages in thread
From: Jonathan Nieder @ 2017-07-12 18:09 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, git, sbeller

Hi,

Jeff King wrote:

> I didn't follow the rest of the "struct repository" series closely, but
> I don't feel like we ever reached a resolution on how config would be
> handled. I notice that the in-process "ls-files" behaves differently
> than the old one when config differs between the submodule and the
> parent repository. As we convert more commands (that use more config)
> this will become more likely to be noticed by somebody.
>
> Do we have a plan for dealing with this? Is our solution just "recursed
> operations always respect the parent config, deal with it"?

For settings like branch.<name>.remote, I don't think anyone would
disagree that the right thing to do is to use the per-repository
config of the submodule.  The repository object is already able to
handle per-repository config, so this just involves callers being
careful not to cache values locally in a way that conflates
repositories.  It should be pretty straightforward (for commands like
"git fetch --recurse-submodules", for example).

For settings like grep.patternType, on the other hand, it would be
very strange for the behavior to change when grep crosses the
submodule boundary.  So I think using the parent project config is the
right thing to do and the old behavior was simply wrong.  In other
words, I don't think this is so much a case of "deal with it" as
"sorry we got the behavior so wrong before --- we've finally fixed it
now".

But this is subtle.  Maybe some notes in the config documentation for
relevant settings would help.  That would make the intended behavior
clearer and make debugging easier for users.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 0/3] Convert grep to recurse in-process
  2017-07-12 18:09   ` Jonathan Nieder
@ 2017-07-12 18:17     ` Stefan Beller
  2017-07-12 18:27     ` Jeff King
  1 sibling, 0 replies; 68+ messages in thread
From: Stefan Beller @ 2017-07-12 18:17 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Jeff King, Brandon Williams, git@vger.kernel.org

On Wed, Jul 12, 2017 at 11:09 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Hi,
>
> Jeff King wrote:
>
>> I didn't follow the rest of the "struct repository" series closely, but
>> I don't feel like we ever reached a resolution on how config would be
>> handled. I notice that the in-process "ls-files" behaves differently
>> than the old one when config differs between the submodule and the
>> parent repository. As we convert more commands (that use more config)
>> this will become more likely to be noticed by somebody.
>>
>> Do we have a plan for dealing with this? Is our solution just "recursed
>> operations always respect the parent config, deal with it"?
>
> For settings like branch.<name>.remote, I don't think anyone would
> disagree that the right thing to do is to use the per-repository
> config of the submodule.  The repository object is already able to
> handle per-repository config, so this just involves callers being
> careful not to cache values locally in a way that conflates
> repositories.  It should be pretty straightforward (for commands like
> "git fetch --recurse-submodules", for example).
>
> For settings like grep.patternType, on the other hand, it would be
> very strange for the behavior to change when grep crosses the
> submodule boundary.

That is because this option relates to the input given.
Other options relate to the items to be processed, such as color.grep.*
which I would not find strange if they were respected in the submodule.

> So I think using the parent project config is the
> right thing to do and the old behavior was simply wrong.  In other
> words, I don't think this is so much a case of "deal with it" as
> "sorry we got the behavior so wrong before --- we've finally fixed it
> now".

In an ideal world we should pay partial attention to the config
in the submodule, and for each config key we'd have to determine
if it rather relates to the general runtime (grep.threads), the command
line options (grep.patternType) or to the specific content inside the repo
(coloring, whether we show the line number).

For now and in reality we can ignore the content specific settings
and claim that any setting here is related to runtime and command line
options, both of which a user may expect the superproject to win in
case of differing configurations.
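
The "ideal world" split could be sketched as a small classifier.  The
enum, the function names and the choice to treat every unlisted key as
content-specific are assumptions for the example; keys are compared
verbatim for brevity.

```c
#include <string.h>

/* The three kinds of setting distinguished above. */
enum config_scope_sketch {
	SCOPE_RUNTIME,		/* e.g. grep.threads */
	SCOPE_COMMAND_LINE,	/* e.g. grep.patternType */
	SCOPE_CONTENT		/* e.g. coloring, line numbers */
};

/* Classify a key; anything not listed is treated as content-specific here. */
enum config_scope_sketch classify_grep_key(const char *key)
{
	if (!strcmp(key, "grep.threads"))
		return SCOPE_RUNTIME;
	if (!strcmp(key, "grep.patternType"))
		return SCOPE_COMMAND_LINE;
	return SCOPE_CONTENT;
}

/* Runtime and command-line style settings follow the superproject. */
int superproject_wins(const char *key)
{
	return classify_grep_key(key) != SCOPE_CONTENT;
}
```

Under the "for now" simplification, superproject_wins() would simply
return 1 for every key.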

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 0/3] Convert grep to recurse in-process
  2017-07-12 18:06   ` Brandon Williams
@ 2017-07-12 18:17     ` Jeff King
  2017-07-12 18:24       ` Jonathan Nieder
  0 siblings, 1 reply; 68+ messages in thread
From: Jeff King @ 2017-07-12 18:17 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller

On Wed, Jul 12, 2017 at 11:06:03AM -0700, Brandon Williams wrote:

> > I didn't follow the rest of the "struct repository" series closely, but
> > I don't feel like we ever reached a resolution on how config would be
> > handled. I notice that the in-process "ls-files" behaves differently
> > than the old one when config differs between the submodule and the
> > parent repository. As we convert more commands (that use more config)
> > this will become more likely to be noticed by somebody.
> > 
> > Do we have a plan for dealing with this? Is our solution just "recursed
> > operations always respect the parent config, deal with it"?
> 
> Each 'struct repository' does have its own config so we could
> potentially want a config in a submodule to override some config in the
> superproject.  Though for right now it may be simpler to not worry about
> doing this overriding, mostly because you would only want to allow
> overriding of some configuration and not all configuration.  One example
> would be the number of threads allowed in grep, it doesn't make much
> sense to let a submodule's configuration of this to trump the
> superproject's since the command was invoked in the context of the
> superproject.

I'm not sure I agree 100% with that example. What makes threads special,
I think, is not the config but the total count spread across all of the
recursive processes. So it's not that we don't want to respect submodule
config so much as we want to take the submodule config into account, but
throttle it based on what other threads are running.

So if your superproject says "1" and the submodule says "8", I'd expect
"8" threads to run in the submodule. If you're already running another 3
threads on behalf of another submodule, I think it would be reasonable
to do some job control and only give the submodule 5 slots. But I don't
think that should happen at the config layer. It should probably happen
when the submodule decides to spawn threads, and it should ask of the
superproject "how many slots am I allowed?".
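
Something like the following is what I have in mind, as a sketch
only.  The struct and function names are hypothetical, and the overall
capacity of 8 used in the note below is an assumption chosen to match
the numbers above.

```c
/* Hypothetical process-wide budget owned by the top-level invocation. */
struct job_slots {
	int capacity;	/* total threads allowed across all repositories */
	int in_use;	/* threads currently handed out */
};

/* Grant up to 'wanted' slots without exceeding what is left; returns the grant. */
int acquire_slots(struct job_slots *pool, int wanted)
{
	int available = pool->capacity - pool->in_use;
	int granted;

	if (wanted < 0)
		wanted = 0;
	if (available < 0)
		available = 0;
	granted = wanted < available ? wanted : available;
	pool->in_use += granted;
	return granted;
}

/* Hand slots back once the submodule's threads have finished. */
void release_slots(struct job_slots *pool, int count)
{
	if (count < 0)
		count = 0;
	if (count > pool->in_use)
		count = pool->in_use;
	pool->in_use -= count;
}
```

With a capacity of 8 and 3 slots already taken, a submodule asking for
8 is granted 5.  A real implementation would also need locking around
the pool, which the sketch leaves out.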

I think that's probably one of the more complicated cases, and I don't
think it really needs to be done on day one. Setting differing thread
counts is even more unlikely than the rest of the config. I suspect
doing job management in general would come up first, because people
don't want to fork-bomb themselves.

Anyway, that got pretty far afield. What I was trying to say is that I
think you can treat the config uniformly, without making special
exceptions for things like grep.threads.

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 0/3] Convert grep to recurse in-process
  2017-07-12 18:17     ` Jeff King
@ 2017-07-12 18:24       ` Jonathan Nieder
  2017-07-12 18:33         ` Jeff King
  0 siblings, 1 reply; 68+ messages in thread
From: Jonathan Nieder @ 2017-07-12 18:24 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, git, sbeller

Jeff King wrote:
> On Wed, Jul 12, 2017 at 11:06:03AM -0700, Brandon Williams wrote:

>> Each 'struct repository' does have its own config so we could
>> potentially want a config in a submodule to override some config in the
>> superproject.  Though for right now it may be simpler to not worry about
>> doing this overriding, mostly because you would only want to allow
>> overriding of some configuration and not all configuration.  One example
>> would be the number of threads allowed in grep,
[...]
> I think that's probably one of the more complicated cases, and I don't
> think it really needs to be done on day one.

That's fair.  Could you give an example of a simpler case?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 0/3] Convert grep to recurse in-process
  2017-07-12 18:09   ` Jonathan Nieder
  2017-07-12 18:17     ` Stefan Beller
@ 2017-07-12 18:27     ` Jeff King
  1 sibling, 0 replies; 68+ messages in thread
From: Jeff King @ 2017-07-12 18:27 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Brandon Williams, git, sbeller

On Wed, Jul 12, 2017 at 11:09:23AM -0700, Jonathan Nieder wrote:

> > I didn't follow the rest of the "struct repository" series closely, but
> > I don't feel like we ever reached a resolution on how config would be
> > handled. I notice that the in-process "ls-files" behaves differently
> > than the old one when config differs between the submodule and the
> > parent repository. As we convert more commands (that use more config)
> > this will become more likely to be noticed by somebody.
> >
> > Do we have a plan for dealing with this? Is our solution just "recursed
> > operations always respect the parent config, deal with it"?
> 
> For settings like branch.<name>.remote, I don't think anyone would
> disagree that the right thing to do is to use the per-repository
> config of the submodule.  The repository object is already able to
> handle per-repository config, so this just involves callers being
> careful not to cache values locally in a way that conflates
> repositories.  It should be pretty straightforward (for commands like
> "git fetch --recurse-submodules", for example).

I agree that's the right approach. What I'm worried about is that I see
in-process work proceeding without the "callers being careful" part
being audited, which can lead to regressions (e.g., ls-files with
core.quotepath is "broken" in next right now). Though at least the
regression would be limited to people using submodules.

> For settings like grep.patternType, on the other hand, it would be
> very strange for the behavior to change when grep crosses the
> submodule boundary.  So I think using the parent project config is the
> right thing to do and the old behavior was simply wrong.  In other
> words, I don't think this is so much a case of "deal with it" as
> "sorry we got the behavior so wrong before --- we've finally fixed it
> now".

I think that's not actually about the parent project's config, as much
as it is about the parameters of the current operation. I.e., the
argument is that this particular grep operation is using a particular
pattern type, no matter how we arrived at that decision, and it should
be used for all of the recursive bits. So whether the superproject has
its own grep.patternType set, or whether the user used "-P" on the
command line, the result is the same: we need to tell the submodule to
ignore any config and use the parameters we feed it.

In a multi-process model, that should happen by converting all of the
bits in "struct grep_opt" back into command-line parameters and feeding
them to the recursive processes (which would then give them precedence
over any config). But I'm pretty sure we don't do that.

In the in-process model, that would hopefully be a bit simpler, as we'd
just pass in a pre-made grep_opt.
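
Roughly like this, with stand-in types (the real 'struct grep_opt'
and 'struct repository' look nothing like these; the names and fields
are invented for the sketch):

```c
#include <stddef.h>

enum pattern_type_sketch {
	PATTERN_BASIC,
	PATTERN_EXTENDED,
	PATTERN_PCRE
};

/* Options decided once, from config and command line, at the top level. */
struct grep_opt_sketch {
	enum pattern_type_sketch pattern_type;
};

struct repo_sketch {
	enum pattern_type_sketch configured_type;	/* per-repo config, not consulted below */
	struct repo_sketch **submodules;
	size_t nr_submodules;
	enum pattern_type_sketch used_type;		/* records what the search used */
};

/* Search 'repo' and every nested submodule with the same pre-made options. */
void grep_recursive(struct repo_sketch *repo, const struct grep_opt_sketch *opt)
{
	size_t i;

	repo->used_type = opt->pattern_type;
	for (i = 0; i < repo->nr_submodules; i++)
		grep_recursive(repo->submodules[i], opt);
}
```

Because the options travel as a const pointer, nothing in a submodule
can quietly swap in its own pattern type halfway through the
operation.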

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 0/3] Convert grep to recurse in-process
  2017-07-12 18:24       ` Jonathan Nieder
@ 2017-07-12 18:33         ` Jeff King
  0 siblings, 0 replies; 68+ messages in thread
From: Jeff King @ 2017-07-12 18:33 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Brandon Williams, git, sbeller

On Wed, Jul 12, 2017 at 11:24:47AM -0700, Jonathan Nieder wrote:

> Jeff King wrote:
> > On Wed, Jul 12, 2017 at 11:06:03AM -0700, Brandon Williams wrote:
> 
> >> Each 'struct repository' does have its own config so we could
> >> potentially want a config in a submodule to override some config in the
> >> superproject.  Though for right now it may be simpler to not worry about
> >> doing this overriding, mostly because you would only want to allow
> >> overriding of some configuration and not all configuration.  One example
> >> would be the number of threads allowed in grep,
> [...]
> > I think that's probably one of the more complicated cases, and I don't
> > think it really needs to be done on day one.
> 
> That's fair.  Could you give an example of a simpler case?

I think core.quotepath that I gave is one example that I'd expect to
work differently than it does in 'next'. Though I really think that's a
fairly uninteresting one, as it's extremely unlikely to be set on a
per-repo basis.

Something like color.grep.* is in the same boat (I think it ought to
respect submodule config, but I'd be surprised if anybody cares).

I know those are kind of lame examples, and if they remain broken
forever, I don't know if anybody would care. I'm more concerned about us
missing a case that causes a regression (and unlike some regressions,
where you say "oops, a bug" and fix it, I'd worry that we'll have made a
big jump to an in-process model, and the only fixes are "implement a
complete split-config solution" or "revert back to multiple processes").

I do agree that not having concrete examples makes it harder to reason
about the desired semantics.

-Peff

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 3/3] grep: recurse in-process using 'struct repository'
  2017-07-12  0:25   ` Jonathan Nieder
@ 2017-07-12 18:49     ` Brandon Williams
  0 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-12 18:49 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, sbeller

On 07/11, Jonathan Nieder wrote:
> Hi,
> 
> Brandon Williams wrote:
> 
> > Convert grep to use 'struct repository' which enables recursing into
> > submodules to be handled in-process.
> 
> \o/
> 
> This will be even nicer with the changes described at
> https://public-inbox.org/git/20170706202739.6056-1-sbeller@google.com/.
> Until then, I fear it will cause a regression --- see (*) below.
> 
> [...]
> >  Documentation/git-grep.txt |   7 -
> >  builtin/grep.c             | 390 +++++++++------------------------------------
> >  cache.h                    |   1 -
> >  git.c                      |   2 +-
> >  grep.c                     |  13 --
> >  grep.h                     |   1 -
> >  setup.c                    |  12 +-
> >  7 files changed, 81 insertions(+), 345 deletions(-)
> 
> Yay, tests still pass.
> 
> [..]
> > --- a/Documentation/git-grep.txt
> > +++ b/Documentation/git-grep.txt
> > @@ -95,13 +95,6 @@ OPTIONS
> >  	<tree> option the prefix of all submodule output will be the name of
> >  	the parent project's <tree> object.
> >  
> > ---parent-basename <basename>::
> > -	For internal use only.  In order to produce uniform output with the
> > -	--recurse-submodules option, this option can be used to provide the
> > -	basename of a parent's <tree> object to a submodule so the submodule
> > -	can prefix its output with the parent's name rather than the SHA1 of
> > -	the submodule.
> 
> Being able to get rid of this is a very nice change.
> 
> [...]
> > +++ b/builtin/grep.c
> [...]
> > @@ -366,14 +349,10 @@ static int grep_file(struct grep_opt *opt, const char *filename)
> >  {
> >  	struct strbuf buf = STRBUF_INIT;
> >  
> > -	if (super_prefix)
> > -		strbuf_addstr(&buf, super_prefix);
> > -	strbuf_addstr(&buf, filename);
> > -
> >  	if (opt->relative && opt->prefix_length) {
> > -		char *name = strbuf_detach(&buf, NULL);
> > -		quote_path_relative(name, opt->prefix, &buf);
> > -		free(name);
> > +		quote_path_relative(filename, opt->prefix, &buf);
> > +	} else {
> > +		strbuf_addstr(&buf, filename);
> >  	}
> 
> style micronit: can avoid these braces since both branches are
> single-line.

I didn't notice that among all the deleted lines; I'll fix it in the
next version.

> 
> [...]
> > @@ -421,284 +400,80 @@ static void run_pager(struct grep_opt *opt, const char *prefix)
> >  		exit(status);
> >  }
> >  
> > -static void compile_submodule_options(const struct grep_opt *opt,
> > -				      const char **argv,
> > -				      int cached, int untracked,
> > -				      int opt_exclude, int use_index,
> > -				      int pattern_type_arg)
> > -{
> [...]
> > -	/*
> > -	 * Limit number of threads for child process to use.
> > -	 * This is to prevent potential fork-bomb behavior of git-grep as each
> > -	 * submodule process has its own thread pool.
> > -	 */
> > -	argv_array_pushf(&submodule_options, "--threads=%d",
> > -			 (num_threads + 1) / 2);
> 
> Being able to get rid of this is another very nice change.
> 
> [...]
> > +	/* add objects to alternates */
> > +	add_to_alternates_memory(submodule.objectdir);
> 
> (*) This sets up a single in-memory object store with all the
> processed submodules.  Processed objects are never freed.
> This means that if I run a command like
> 
> 	git grep --recurse-submodules -e neverfound HEAD
> 
> in a project with many submodules then memory consumption scales in
> the same way as if the project were all one repository.  By contrast,
> without this patch, git is able to take advantage of the implicit
> free() when each child exits to limit its memory usage.
> 
> Worse, this increases the number of pack files git has to pay
> attention to, to the sum of the numbers of pack files in all the
> repositories processed so far.  A single object lookup can take
> O(number of packs * log(number of objects in each pack)) time.  That
> means performance is likely to suffer as the number of submodules
> increases (n^2 performance) even on systems with a lot of memory.
> 
> Once the object store is part of the repository struct and freeable,
> those problems go away and this patch becomes a no-brainer.
> 
> What should happen until then?  Should this go in "next" so we can get
> experience with it but with care not to let it graduate to "master"?

I agree that this is an issue and that we need to address it by having
an object store per repository.  That is being worked on (by Stefan),
but I don't know how long it will take to become a reality.  So the
question ends up being: do we care more about the state of the code and
cleaning up a lot of 'hacks' that I introduced to get grep working with
submodules, or do we care more about performance?  I don't know which is
the right answer, but I'd personally like to see the hacks I added
removed sooner rather than later.  I also think that having the code in
this state would make it easier to transition once we have
per-repository object stores.

Either way I should add a NEEDSWORK comment here to indicate that it
should be removed once per-repo object-stores exist.
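
To make the quadratic growth described in the quoted text concrete, here is a rough worst-case model in C. The function names and the cost model are assumptions for illustration only: every lookup is assumed to probe every pack currently visible, and any caching git may do is ignored.

```c
#include <assert.h>

/*
 * Shared in-memory object store: each processed submodule adds its packs
 * to the set that every later lookup has to consider.
 */
static unsigned long shared_store_probes(unsigned long nr_submodules,
					 unsigned long packs_per_repo,
					 unsigned long lookups_per_repo)
{
	unsigned long visible_packs = 0, probes = 0, i;

	for (i = 0; i < nr_submodules; i++) {
		visible_packs += packs_per_repo;
		probes += lookups_per_repo * visible_packs;
	}
	return probes;
}

/* Per-repository object store: only the repository's own packs are visible. */
static unsigned long per_repo_store_probes(unsigned long nr_submodules,
					   unsigned long packs_per_repo,
					   unsigned long lookups_per_repo)
{
	return nr_submodules * lookups_per_repo * packs_per_repo;
}
```

Under this model the shared store costs lookups * packs * n(n+1)/2 probes for n submodules, while per-repository stores cost lookups * packs * n.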

> 
> Aside from those two concerns, this patch looks very good from a quick
> skim, though I haven't reviewed it closely line-by-line.  Once we know
> how to go forward, I'm happy to look at it again.
> 
> Thanks,
> Jonathan

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 3/3] grep: recurse in-process using 'struct repository'
  2017-07-11 22:44   ` Jacob Keller
@ 2017-07-12 18:54     ` Brandon Williams
  0 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-12 18:54 UTC (permalink / raw)
  To: Jacob Keller; +Cc: Git mailing list, Stefan Beller

On 07/11, Jacob Keller wrote:
> On Tue, Jul 11, 2017 at 3:04 PM, Brandon Williams <bmwill@google.com> wrote:
> > Convert grep to use 'struct repository' which enables recursing into
> > submodules to be handled in-process.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  Documentation/git-grep.txt |   7 -
> >  builtin/grep.c             | 390 +++++++++------------------------------------
> >  cache.h                    |   1 -
> >  git.c                      |   2 +-
> >  grep.c                     |  13 --
> >  grep.h                     |   1 -
> >  setup.c                    |  12 +-
> >  7 files changed, 81 insertions(+), 345 deletions(-)
> >
> 
> No real in-depth comments here, but it's nice to see how much code
> reduction this has enabled!

Yeah overall, with this and the ls-files conversion, I'm really pleased
with how much cleaner the code looks moving to working in-process.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 3/3] grep: recurse in-process using 'struct repository'
  2017-07-12  0:04   ` Stefan Beller
@ 2017-07-12 18:56     ` Brandon Williams
  0 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-12 18:56 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org

On 07/11, Stefan Beller wrote:
> On Tue, Jul 11, 2017 at 3:04 PM, Brandon Williams <bmwill@google.com> wrote:
> 
> > +       if (repo_submodule_init(&submodule, superproject, path))
> > +               return 0;
> 
> What happens if we go through the "return 0", do we rather want to
> print an error ?

Should just indicate that there is no hit in the submodule, but if we
couldn't init the submodule maybe you're right and we should issue a
warning.

> 
> > +       /* add objects to alternates */
> > +       add_to_alternates_memory(submodule.objectdir);
> 
> Not trying to make my object series more important than it is... but
> we really don't want to spread this add_to_alternates_memory hack. :/

Nope your object series is definitely important IMO.  As I commented in
my reply to Jonathan, I'm not sure if we want to wait till that becomes
a reality or not.

> 
> I agree with Jacob that a patch with such a diffstat is a joy to review. :)
> 
> Thanks,
> Stefan

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 2/3] setup: have the_repository use the_index
  2017-07-12 18:01     ` Brandon Williams
@ 2017-07-12 20:38       ` Junio C Hamano
  2017-07-12 21:33         ` Jonathan Nieder
  0 siblings, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2017-07-12 20:38 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller

Brandon Williams <bmwill@google.com> writes:

> For all intents and purposes the index struct that is stored in 'struct
> repository' is an embedded instance, it's just stored as a pointer
> instead of being a direct part of the struct itself.

The question really is this.  In order to realize the intents and
purposes to have an embedded instance, the most natural way to
implement it is to actually embed an instance.  There must be some
advantage of substituting that most natural implementation with a
pointer that needs to point at a separately allocated memory;
otherwise the implementation chosen is a poor imitation of the real
thing.  One downside of not actually embedding an instance is that
the code needs to do something different from what it has always
done to answer "is the index already populated?".  In the normal
codepath, it would look at istate.initialized but for the index
state that is emulatedly-embedded in the repository, it would also
have to see if the pointer is NULL.

And I do not see why a pointer to an allocated struct was chosen,
and what advantage we wanted to extract from that design decision.

> As far as
> submodules are concerned, that's why 'repo_read_index' exists since it
> will allocate the index struct if needed and then populate it with the
> index file associated with that repository.  So the 'struct repository'
> owns that instance of 'struct index_state'.

All of the above merely makes an excuse why you can work around the
fact that the index field is a pointer.  It does not say why it is
better not to embed the real thing at all.

> Since it is a pointer then using a '#define' to replace 'the_index'
> (which is not a pointer) would be a little more challenging.

The above is merely realizing another downside that stems from the
earlier design decision that the index field is not a real embedded
structure, but is a pointer.  It does not explain why it is better
to have a pointer to an allocated structure in the first place.

I am not (yet) telling you to fix the design to have a pointer
"index" by replacing it with an embedded structure.  I may actually
do so later, but I am first trying to find out if it is a right
design decision with some advantage.  If there is some advantage to
have it as a pointer to an allocated structure, then perhaps we may
even want to do the conversion the other way by declaring the_index
is always a pointer to an allocated structure that _could_ be NULL.
We can even lose the istate.initialized bit if we did so, as the way
to answer "is the index already populated?" in the new world order
would be to see if it is NULL.

But if there isn't any advantage, then I _would_ tell you that the
design to have it as a pointer in the repository structure _is_ a
mistake.  But I do not think I was given sufficient information to
decide it yet.
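
The difference in the "is the index already populated?" check can be sketched in C as follows. `struct mini_index`, `struct repo_embedded` and `struct repo_pointer` are hypothetical, heavily reduced stand-ins, not the real 'struct index_state' or 'struct repository'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much reduced stand-in for git's 'struct index_state'. */
struct mini_index {
	unsigned int cache_nr;
	unsigned int initialized : 1;
};

/* Variant A: the index is embedded in the repository object. */
struct repo_embedded {
	struct mini_index index;
};

/* Variant B: the index is a pointer that may still be NULL. */
struct repo_pointer {
	struct mini_index *index;
};

/* Embedded: one flag answers "is the index already populated?". */
static int embedded_index_populated(const struct repo_embedded *repo)
{
	return repo->index.initialized;
}

/* Pointer: the NULL check has to come first, then the same flag. */
static int pointer_index_populated(const struct repo_pointer *repo)
{
	return repo->index && repo->index->initialized;
}
```

With the pointer variant every caller has to remember the extra NULL test, which is the kind of difference from the long-standing 'the_index' convention discussed above.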

Thanks.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 2/3] setup: have the_repository use the_index
  2017-07-12 20:38       ` Junio C Hamano
@ 2017-07-12 21:33         ` Jonathan Nieder
  2017-07-12 21:40           ` Junio C Hamano
  0 siblings, 1 reply; 68+ messages in thread
From: Jonathan Nieder @ 2017-07-12 21:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Brandon Williams, git, sbeller

Hi,

Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:

>> Since it is a pointer then using a '#define' to replace 'the_index'
>> (which is not a pointer) would be a little more challenging.
>
> The above is merely realizing another downside that stems from the
> earlier design decision that the index field is not a real embedded
> structure, but is a pointer.  It does not explain why it is better
> to have a pointer to an allocated structure in the first place.
>
> I am not (yet) telling you to fix the design to have a pointer
> "index" by replacing it with an embedded structure.  I may actually
> do so later, but I am first trying to find out if it is a right
> design decision with some advantage.

Consider a command that doesn't need to access the index at all (e.g.,
"git grep --recurse-submodules -e foo HEAD").

In favor of using an embedding instead of a pointer, there is the
advantage that it makes initialization simpler.  (It also involves a
tiny speedup by avoiding a pointer indirection on access, but that's
more negligible.)  For that reason it was a good choice when there was
only one repository in memory: using such a small bounded portion of
.bss space in exchange for some convenience is a good trade.

When a process has multiple repositories in memory (for example one
per thread), the trade-off becomes different.  Instead of .bss, the
unused embedded index is on the stack or heap.  Using embedding would
mean that instead of an unused extra word in the per-repository
structure we get an unused ~24 words.

An argument could be made that we wouldn't want to waste either 1 word
or 24 words per in-memory repository object --- we'd want to waste 0
words and separately keep a map from repositories to index_state that
only gets populated when needed.  That complicates index access a bit
too much for my taste.  1 word instead of 0 or 24 seems like a
sensible compromise.

All that said, I don't have a strong opinion on this.  Both the 1-word
approach (a pointer) and 24-word approach (embedding) are tolerable
and there are reasons to prefer each.
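
The size trade-off can be illustrated with a small C sketch. The 24-word `struct fat_index` is a hypothetical placeholder sized to match the estimate above; the real 'struct index_state' layout is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical index state occupying 24 machine words. */
struct fat_index {
	void *words[24];
};

struct repo_with_pointer {
	const char *gitdir;
	struct fat_index *index;	/* 1 word, allocated only when needed */
};

struct repo_with_embedding {
	const char *gitdir;
	struct fat_index index;		/* 24 words, present even if unused */
};

/* Extra bytes each repository object costs when the index is embedded. */
static size_t embedding_overhead(void)
{
	return sizeof(struct repo_with_embedding) -
	       sizeof(struct repo_with_pointer);
}
```

On common platforms where all object pointers have the same size, the overhead comes out to 23 words per repository object that never touches its index.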

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 2/3] setup: have the_repository use the_index
  2017-07-12 21:33         ` Jonathan Nieder
@ 2017-07-12 21:40           ` Junio C Hamano
  2017-07-18 21:34             ` Junio C Hamano
  0 siblings, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2017-07-12 21:40 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Brandon Williams, git, sbeller

Jonathan Nieder <jrnieder@gmail.com> writes:

>
> All that said, I don't have a strong opinion on this.  Both the 1-word
> approach (a pointer) and 24-word approach (embedding) are tolerable
> and there are reasons to prefer each.

I do not care too much about 24-word wastage.  If this were not "a
pointer pretending to be embedded object", the fix in 1/3 wouldn't
have been necessary.  I am worried about this being an invitation
for such unnecessary bugs.

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 0/3] Convert grep to recurse in-process
  2017-07-11 22:04 [PATCH 0/3] Convert grep to recurse in-process Brandon Williams
                   ` (3 preceding siblings ...)
  2017-07-12  7:42 ` [PATCH 0/3] Convert grep to recurse in-process Jeff King
@ 2017-07-14 22:28 ` Brandon Williams
  2017-07-14 22:28   ` [PATCH v2 1/3] repo_read_index: don't discard the index Brandon Williams
                     ` (3 more replies)
  4 siblings, 4 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-14 22:28 UTC (permalink / raw)
  To: git; +Cc: sbeller, peff, gitster, jrnieder, jacob.keller, Brandon Williams

Changes in v2:
* small style nits fixed.
* Comment describing function contract for repo_read_index
* NEEDSWORK comment in grep to describe the issue with adding the submodule's
  object store as an alternate.
* the_repository->index = &the_index was removed from setup.c and instead this
  is done as a static initializer.

Brandon Williams (3):
  repo_read_index: don't discard the index
  repository: have the_repository use the_index
  grep: recurse in-process using 'struct repository'

 Documentation/git-grep.txt |   7 -
 builtin/grep.c             | 396 ++++++++++-----------------------------------
 cache.h                    |   1 -
 git.c                      |   2 +-
 grep.c                     |  13 --
 grep.h                     |   1 -
 repository.c               |   6 +-
 repository.h               |   8 +
 setup.c                    |  12 +-
 9 files changed, 99 insertions(+), 347 deletions(-)

-- 
2.13.2.932.g7449e964c-goog


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v2 1/3] repo_read_index: don't discard the index
  2017-07-14 22:28 ` [PATCH v2 " Brandon Williams
@ 2017-07-14 22:28   ` Brandon Williams
  2017-07-14 22:28   ` [PATCH v2 2/3] repository: have the_repository use the_index Brandon Williams
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-14 22:28 UTC (permalink / raw)
  To: git; +Cc: sbeller, peff, gitster, jrnieder, jacob.keller, Brandon Williams

Have 'repo_read_index()' behave more like the other read_index family of
functions and don't discard the index if it has already been populated
and instead rely on the quick return of read_index_from which has:

  /* istate->initialized covers both .git/index and .git/sharedindex.xxx */
  if (istate->initialized)
    return istate->cache_nr;

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 repository.c | 2 --
 repository.h | 8 ++++++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/repository.c b/repository.c
index edca90740..8e60af1d5 100644
--- a/repository.c
+++ b/repository.c
@@ -235,8 +235,6 @@ int repo_read_index(struct repository *repo)
 {
 	if (!repo->index)
 		repo->index = xcalloc(1, sizeof(*repo->index));
-	else
-		discard_index(repo->index);
 
 	return read_index_from(repo->index, repo->index_file);
 }
diff --git a/repository.h b/repository.h
index 417787f3e..7f5e24a0a 100644
--- a/repository.h
+++ b/repository.h
@@ -92,6 +92,14 @@ extern int repo_submodule_init(struct repository *submodule,
 			       const char *path);
 extern void repo_clear(struct repository *repo);
 
+/*
+ * Populates the repository's index from its index_file, an index struct will
+ * be allocated if needed.
+ *
+ * Return the number of index entries in the populated index or a value less
+ * than zero if an error occurred.  If the repository's index has already been
+ * populated then the number of entries will simply be returned.
+ */
 extern int repo_read_index(struct repository *repo);
 
 #endif /* REPOSITORY_H */
-- 
2.13.2.932.g7449e964c-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 2/3] repository: have the_repository use the_index
  2017-07-14 22:28 ` [PATCH v2 " Brandon Williams
  2017-07-14 22:28   ` [PATCH v2 1/3] repo_read_index: don't discard the index Brandon Williams
@ 2017-07-14 22:28   ` Brandon Williams
  2017-07-14 22:28   ` [PATCH v2 3/3] grep: recurse in-process using 'struct repository' Brandon Williams
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
  3 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-14 22:28 UTC (permalink / raw)
  To: git; +Cc: sbeller, peff, gitster, jrnieder, jacob.keller, Brandon Williams

Have the index state which is stored in 'the_repository' be a pointer to
the in-core index 'the_index'.  This makes it easier to begin
transitioning more parts of the code base to operate on a 'struct
repository'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 repository.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/repository.c b/repository.c
index 8e60af1d5..c0e0e0e7e 100644
--- a/repository.c
+++ b/repository.c
@@ -4,7 +4,9 @@
 #include "submodule-config.h"
 
 /* The main repository */
-static struct repository the_repo;
+static struct repository the_repo = {
+	NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, &the_index, 0, 0
+};
 struct repository *the_repository = &the_repo;
 
 static char *git_path_from_env(const char *envvar, const char *git_dir,
-- 
2.13.2.932.g7449e964c-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v2 3/3] grep: recurse in-process using 'struct repository'
  2017-07-14 22:28 ` [PATCH v2 " Brandon Williams
  2017-07-14 22:28   ` [PATCH v2 1/3] repo_read_index: don't discard the index Brandon Williams
  2017-07-14 22:28   ` [PATCH v2 2/3] repository: have the_repository use the_index Brandon Williams
@ 2017-07-14 22:28   ` Brandon Williams
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
  3 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-14 22:28 UTC (permalink / raw)
  To: git; +Cc: sbeller, peff, gitster, jrnieder, jacob.keller, Brandon Williams

Convert grep to use 'struct repository' which enables recursing into
submodules to be handled in-process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/git-grep.txt |   7 -
 builtin/grep.c             | 396 ++++++++++-----------------------------------
 cache.h                    |   1 -
 git.c                      |   2 +-
 grep.c                     |  13 --
 grep.h                     |   1 -
 setup.c                    |  12 +-
 7 files changed, 88 insertions(+), 344 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 5033483db..720c7850e 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -95,13 +95,6 @@ OPTIONS
 	<tree> option the prefix of all submodule output will be the name of
 	the parent project's <tree> object.
 
---parent-basename <basename>::
-	For internal use only.  In order to produce uniform output with the
-	--recurse-submodules option, this option can be used to provide the
-	basename of a parent's <tree> object to a submodule so the submodule
-	can prefix its output with the parent's name rather than the SHA1 of
-	the submodule.
-
 -a::
 --text::
 	Process binary files as if they were text.
diff --git a/builtin/grep.c b/builtin/grep.c
index fa351c49f..728755d6d 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -28,13 +28,7 @@ static char const * const grep_usage[] = {
 	NULL
 };
 
-static const char *super_prefix;
 static int recurse_submodules;
-static struct argv_array submodule_options = ARGV_ARRAY_INIT;
-static const char *parent_basename;
-
-static int grep_submodule_launch(struct grep_opt *opt,
-				 const struct grep_source *gs);
 
 #define GREP_NUM_THREADS_DEFAULT 8
 static int num_threads;
@@ -186,10 +180,7 @@ static void *run(void *arg)
 			break;
 
 		opt->output_priv = w;
-		if (w->source.type == GREP_SOURCE_SUBMODULE)
-			hit |= grep_submodule_launch(opt, &w->source);
-		else
-			hit |= grep_source(opt, &w->source);
+		hit |= grep_source(opt, &w->source);
 		grep_source_clear_data(&w->source);
 		work_done(w);
 	}
@@ -327,21 +318,13 @@ static int grep_oid(struct grep_opt *opt, const struct object_id *oid,
 {
 	struct strbuf pathbuf = STRBUF_INIT;
 
-	if (super_prefix) {
-		strbuf_add(&pathbuf, filename, tree_name_len);
-		strbuf_addstr(&pathbuf, super_prefix);
-		strbuf_addstr(&pathbuf, filename + tree_name_len);
+	if (opt->relative && opt->prefix_length) {
+		quote_path_relative(filename + tree_name_len, opt->prefix, &pathbuf);
+		strbuf_insert(&pathbuf, 0, filename, tree_name_len);
 	} else {
 		strbuf_addstr(&pathbuf, filename);
 	}
 
-	if (opt->relative && opt->prefix_length) {
-		char *name = strbuf_detach(&pathbuf, NULL);
-		quote_path_relative(name + tree_name_len, opt->prefix, &pathbuf);
-		strbuf_insert(&pathbuf, 0, name, tree_name_len);
-		free(name);
-	}
-
 #ifndef NO_PTHREADS
 	if (num_threads) {
 		add_work(opt, GREP_SOURCE_OID, pathbuf.buf, path, oid);
@@ -366,15 +349,10 @@ static int grep_file(struct grep_opt *opt, const char *filename)
 {
 	struct strbuf buf = STRBUF_INIT;
 
-	if (super_prefix)
-		strbuf_addstr(&buf, super_prefix);
-	strbuf_addstr(&buf, filename);
-
-	if (opt->relative && opt->prefix_length) {
-		char *name = strbuf_detach(&buf, NULL);
-		quote_path_relative(name, opt->prefix, &buf);
-		free(name);
-	}
+	if (opt->relative && opt->prefix_length)
+		quote_path_relative(filename, opt->prefix, &buf);
+	else
+		strbuf_addstr(&buf, filename);
 
 #ifndef NO_PTHREADS
 	if (num_threads) {
@@ -421,284 +399,89 @@ static void run_pager(struct grep_opt *opt, const char *prefix)
 		exit(status);
 }
 
-static void compile_submodule_options(const struct grep_opt *opt,
-				      const char **argv,
-				      int cached, int untracked,
-				      int opt_exclude, int use_index,
-				      int pattern_type_arg)
-{
-	struct grep_pat *pattern;
-
-	if (recurse_submodules)
-		argv_array_push(&submodule_options, "--recurse-submodules");
-
-	if (cached)
-		argv_array_push(&submodule_options, "--cached");
-	if (!use_index)
-		argv_array_push(&submodule_options, "--no-index");
-	if (untracked)
-		argv_array_push(&submodule_options, "--untracked");
-	if (opt_exclude > 0)
-		argv_array_push(&submodule_options, "--exclude-standard");
-
-	if (opt->invert)
-		argv_array_push(&submodule_options, "-v");
-	if (opt->ignore_case)
-		argv_array_push(&submodule_options, "-i");
-	if (opt->word_regexp)
-		argv_array_push(&submodule_options, "-w");
-	switch (opt->binary) {
-	case GREP_BINARY_NOMATCH:
-		argv_array_push(&submodule_options, "-I");
-		break;
-	case GREP_BINARY_TEXT:
-		argv_array_push(&submodule_options, "-a");
-		break;
-	default:
-		break;
-	}
-	if (opt->allow_textconv)
-		argv_array_push(&submodule_options, "--textconv");
-	if (opt->max_depth != -1)
-		argv_array_pushf(&submodule_options, "--max-depth=%d",
-				 opt->max_depth);
-	if (opt->linenum)
-		argv_array_push(&submodule_options, "-n");
-	if (!opt->pathname)
-		argv_array_push(&submodule_options, "-h");
-	if (!opt->relative)
-		argv_array_push(&submodule_options, "--full-name");
-	if (opt->name_only)
-		argv_array_push(&submodule_options, "-l");
-	if (opt->unmatch_name_only)
-		argv_array_push(&submodule_options, "-L");
-	if (opt->null_following_name)
-		argv_array_push(&submodule_options, "-z");
-	if (opt->count)
-		argv_array_push(&submodule_options, "-c");
-	if (opt->file_break)
-		argv_array_push(&submodule_options, "--break");
-	if (opt->heading)
-		argv_array_push(&submodule_options, "--heading");
-	if (opt->pre_context)
-		argv_array_pushf(&submodule_options, "--before-context=%d",
-				 opt->pre_context);
-	if (opt->post_context)
-		argv_array_pushf(&submodule_options, "--after-context=%d",
-				 opt->post_context);
-	if (opt->funcname)
-		argv_array_push(&submodule_options, "-p");
-	if (opt->funcbody)
-		argv_array_push(&submodule_options, "-W");
-	if (opt->all_match)
-		argv_array_push(&submodule_options, "--all-match");
-	if (opt->debug)
-		argv_array_push(&submodule_options, "--debug");
-	if (opt->status_only)
-		argv_array_push(&submodule_options, "-q");
-
-	switch (pattern_type_arg) {
-	case GREP_PATTERN_TYPE_BRE:
-		argv_array_push(&submodule_options, "-G");
-		break;
-	case GREP_PATTERN_TYPE_ERE:
-		argv_array_push(&submodule_options, "-E");
-		break;
-	case GREP_PATTERN_TYPE_FIXED:
-		argv_array_push(&submodule_options, "-F");
-		break;
-	case GREP_PATTERN_TYPE_PCRE:
-		argv_array_push(&submodule_options, "-P");
-		break;
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		break;
-	default:
-		die("BUG: Added a new grep pattern type without updating switch statement");
-	}
-
-	for (pattern = opt->pattern_list; pattern != NULL;
-	     pattern = pattern->next) {
-		switch (pattern->token) {
-		case GREP_PATTERN:
-			argv_array_pushf(&submodule_options, "-e%s",
-					 pattern->pattern);
-			break;
-		case GREP_AND:
-		case GREP_OPEN_PAREN:
-		case GREP_CLOSE_PAREN:
-		case GREP_NOT:
-		case GREP_OR:
-			argv_array_push(&submodule_options, pattern->pattern);
-			break;
-		/* BODY and HEAD are not used by git-grep */
-		case GREP_PATTERN_BODY:
-		case GREP_PATTERN_HEAD:
-			break;
-		}
-	}
-
-	/*
-	 * Limit number of threads for child process to use.
-	 * This is to prevent potential fork-bomb behavior of git-grep as each
-	 * submodule process has its own thread pool.
-	 */
-	argv_array_pushf(&submodule_options, "--threads=%d",
-			 (num_threads + 1) / 2);
-
-	/* Add Pathspecs */
-	argv_array_push(&submodule_options, "--");
-	for (; *argv; argv++)
-		argv_array_push(&submodule_options, *argv);
-}
+static int grep_cache(struct grep_opt *opt, struct repository *repo,
+		      const struct pathspec *pathspec, int cached);
+static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
+		     struct tree_desc *tree, struct strbuf *base, int tn_len,
+		     int check_attr, struct repository *repo);
 
-/*
- * Launch child process to grep contents of a submodule
- */
-static int grep_submodule_launch(struct grep_opt *opt,
-				 const struct grep_source *gs)
+static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
+			  const struct pathspec *pathspec,
+			  const struct object_id *oid,
+			  const char *filename, const char *path)
 {
-	struct child_process cp = CHILD_PROCESS_INIT;
-	int status, i;
-	const char *end_of_base;
-	const char *name;
-	struct strbuf child_output = STRBUF_INIT;
-
-	end_of_base = strchr(gs->name, ':');
-	if (gs->identifier && end_of_base)
-		name = end_of_base + 1;
-	else
-		name = gs->name;
+	struct repository submodule;
+	int hit;
 
-	prepare_submodule_repo_env(&cp.env_array);
-	argv_array_push(&cp.env_array, GIT_DIR_ENVIRONMENT);
+	if (!is_submodule_active(superproject, path))
+		return 0;
 
-	if (opt->relative && opt->prefix_length)
-		argv_array_pushf(&cp.env_array, "%s=%s",
-				 GIT_TOPLEVEL_PREFIX_ENVIRONMENT,
-				 opt->prefix);
+	if (repo_submodule_init(&submodule, superproject, path))
+		return 0;
 
-	/* Add super prefix */
-	argv_array_pushf(&cp.args, "--super-prefix=%s%s/",
-			 super_prefix ? super_prefix : "",
-			 name);
-	argv_array_push(&cp.args, "grep");
+	repo_read_gitmodules(&submodule);
 
 	/*
-	 * Add basename of parent project
-	 * When performing grep on a tree object the filename is prefixed
-	 * with the object's name: 'tree-name:filename'.  In order to
-	 * provide uniformity of output we want to pass the name of the
-	 * parent project's object name to the submodule so the submodule can
-	 * prefix its output with the parent's name and not its own OID.
+	 * NEEDSWORK: This adds the submodule's object directory to the list of
+	 * alternates for the single in-memory object store.  This has some bad
+	 * consequences for memory (processed objects will never be freed) and
+	 * performance (this increases the number of pack files git has to pay
+	 * attention to, to the sum of the number of pack files in all the
+	 * repositories processed so far).  This can be removed once the object
+	 * store is no longer global and instead is a member of the repository
+	 * object.
 	 */
-	if (gs->identifier && end_of_base)
-		argv_array_pushf(&cp.args, "--parent-basename=%.*s",
-				 (int) (end_of_base - gs->name),
-				 gs->name);
+	add_to_alternates_memory(submodule.objectdir);
 
-	/* Add options */
-	for (i = 0; i < submodule_options.argc; i++) {
-		/*
-		 * If there is a tree identifier for the submodule, add the
-		 * rev after adding the submodule options but before the
-		 * pathspecs.  To do this we listen for the '--' and insert the
-		 * oid before pushing the '--' onto the child process argv
-		 * array.
-		 */
-		if (gs->identifier &&
-		    !strcmp("--", submodule_options.argv[i])) {
-			argv_array_push(&cp.args, oid_to_hex(gs->identifier));
-		}
+	if (oid) {
+		struct object *object;
+		struct tree_desc tree;
+		void *data;
+		unsigned long size;
+		struct strbuf base = STRBUF_INIT;
 
-		argv_array_push(&cp.args, submodule_options.argv[i]);
-	}
+		object = parse_object_or_die(oid, oid_to_hex(oid));
 
-	cp.git_cmd = 1;
-	cp.dir = gs->path;
+		grep_read_lock();
+		data = read_object_with_reference(object->oid.hash, tree_type,
+						  &size, NULL);
+		grep_read_unlock();
 
-	/*
-	 * Capture output to output buffer and check the return code from the
-	 * child process.  A '0' indicates a hit, a '1' indicates no hit and
-	 * anything else is an error.
-	 */
-	status = capture_command(&cp, &child_output, 0);
-	if (status && (status != 1)) {
-		/* flush the buffer */
-		write_or_die(1, child_output.buf, child_output.len);
-		die("process for submodule '%s' failed with exit code: %d",
-		    gs->name, status);
-	}
+		if (!data)
+			die(_("unable to read tree (%s)"), oid_to_hex(&object->oid));
 
-	opt->output(opt, child_output.buf, child_output.len);
-	strbuf_release(&child_output);
-	/* invert the return code to make a hit equal to 1 */
-	return !status;
-}
+		strbuf_addstr(&base, filename);
+		strbuf_addch(&base, '/');
 
-/*
- * Prep grep structures for a submodule grep
- * oid: the oid of the submodule or NULL if using the working tree
- * filename: name of the submodule including tree name of parent
- * path: location of the submodule
- */
-static int grep_submodule(struct grep_opt *opt, const struct object_id *oid,
-			  const char *filename, const char *path)
-{
-	if (!is_submodule_active(the_repository, path))
-		return 0;
-	if (!is_submodule_populated_gently(path, NULL)) {
-		/*
-		 * If searching history, check for the presence of the
-		 * submodule's gitdir before skipping the submodule.
-		 */
-		if (oid) {
-			const struct submodule *sub =
-					submodule_from_path(null_sha1, path);
-			if (sub)
-				path = git_path("modules/%s", sub->name);
-
-			if (!(is_directory(path) && is_git_directory(path)))
-				return 0;
-		} else {
-			return 0;
-		}
+		init_tree_desc(&tree, data, size);
+		hit = grep_tree(opt, pathspec, &tree, &base, base.len,
+				object->type == OBJ_COMMIT, &submodule);
+		strbuf_release(&base);
+		free(data);
+	} else {
+		hit = grep_cache(opt, &submodule, pathspec, 1);
 	}
 
-#ifndef NO_PTHREADS
-	if (num_threads) {
-		add_work(opt, GREP_SOURCE_SUBMODULE, filename, path, oid);
-		return 0;
-	} else
-#endif
-	{
-		struct grep_source gs;
-		int hit;
-
-		grep_source_init(&gs, GREP_SOURCE_SUBMODULE,
-				 filename, path, oid);
-		hit = grep_submodule_launch(opt, &gs);
-
-		grep_source_clear(&gs);
-		return hit;
-	}
+	repo_clear(&submodule);
+	return hit;
 }
 
-static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
-		      int cached)
+static int grep_cache(struct grep_opt *opt, struct repository *repo,
+		      const struct pathspec *pathspec, int cached)
 {
 	int hit = 0;
 	int nr;
 	struct strbuf name = STRBUF_INIT;
 	int name_base_len = 0;
-	if (super_prefix) {
-		name_base_len = strlen(super_prefix);
-		strbuf_addstr(&name, super_prefix);
+	if (repo->submodule_prefix) {
+		name_base_len = strlen(repo->submodule_prefix);
+		strbuf_addstr(&name, repo->submodule_prefix);
 	}
 
-	read_cache();
+	repo_read_index(repo);
 
-	for (nr = 0; nr < active_nr; nr++) {
-		const struct cache_entry *ce = active_cache[nr];
+	for (nr = 0; nr < repo->index->cache_nr; nr++) {
+		const struct cache_entry *ce = repo->index->cache[nr];
 		strbuf_setlen(&name, name_base_len);
 		strbuf_addstr(&name, ce->name);
 
@@ -715,14 +498,14 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
 			    ce_skip_worktree(ce)) {
 				if (ce_stage(ce) || ce_intent_to_add(ce))
 					continue;
-				hit |= grep_oid(opt, &ce->oid, ce->name,
-						 0, ce->name);
+				hit |= grep_oid(opt, &ce->oid, name.buf,
+						 0, name.buf);
 			} else {
-				hit |= grep_file(opt, ce->name);
+				hit |= grep_file(opt, name.buf);
 			}
 		} else if (recurse_submodules && S_ISGITLINK(ce->ce_mode) &&
 			   submodule_path_match(pathspec, name.buf, NULL)) {
-			hit |= grep_submodule(opt, NULL, ce->name, ce->name);
+			hit |= grep_submodule(opt, repo, pathspec, NULL, ce->name, ce->name);
 		} else {
 			continue;
 		}
@@ -730,8 +513,8 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
 		if (ce_stage(ce)) {
 			do {
 				nr++;
-			} while (nr < active_nr &&
-				 !strcmp(ce->name, active_cache[nr]->name));
+			} while (nr < repo->index->cache_nr &&
+				 !strcmp(ce->name, repo->index->cache[nr]->name));
 			nr--; /* compensate for loop control */
 		}
 		if (hit && opt->status_only)
@@ -744,7 +527,7 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
 
 static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 		     struct tree_desc *tree, struct strbuf *base, int tn_len,
-		     int check_attr)
+		     int check_attr, struct repository *repo)
 {
 	int hit = 0;
 	enum interesting match = entry_not_interesting;
@@ -752,8 +535,8 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 	int old_baselen = base->len;
 	struct strbuf name = STRBUF_INIT;
 	int name_base_len = 0;
-	if (super_prefix) {
-		strbuf_addstr(&name, super_prefix);
+	if (repo->submodule_prefix) {
+		strbuf_addstr(&name, repo->submodule_prefix);
 		name_base_len = name.len;
 	}
 
@@ -791,11 +574,11 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 			strbuf_addch(base, '/');
 			init_tree_desc(&sub, data, size);
 			hit |= grep_tree(opt, pathspec, &sub, base, tn_len,
-					 check_attr);
+					 check_attr, repo);
 			free(data);
 		} else if (recurse_submodules && S_ISGITLINK(entry.mode)) {
-			hit |= grep_submodule(opt, entry.oid, base->buf,
-					      base->buf + tn_len);
+			hit |= grep_submodule(opt, repo, pathspec, entry.oid,
+					      base->buf, base->buf + tn_len);
 		}
 
 		strbuf_setlen(base, old_baselen);
@@ -809,7 +592,8 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 }
 
 static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
-		       struct object *obj, const char *name, const char *path)
+		       struct object *obj, const char *name, const char *path,
+		       struct repository *repo)
 {
 	if (obj->type == OBJ_BLOB)
 		return grep_oid(opt, &obj->oid, name, 0, path);
@@ -828,10 +612,6 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		if (!data)
 			die(_("unable to read tree (%s)"), oid_to_hex(&obj->oid));
 
-		/* Use parent's name as base when recursing submodules */
-		if (recurse_submodules && parent_basename)
-			name = parent_basename;
-
 		len = name ? strlen(name) : 0;
 		strbuf_init(&base, PATH_MAX + len + 1);
 		if (len) {
@@ -840,7 +620,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		}
 		init_tree_desc(&tree, data, size);
 		hit = grep_tree(opt, pathspec, &tree, &base, base.len,
-				obj->type == OBJ_COMMIT);
+				obj->type == OBJ_COMMIT, repo);
 		strbuf_release(&base);
 		free(data);
 		return hit;
@@ -849,6 +629,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 }
 
 static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
+			struct repository *repo,
 			const struct object_array *list)
 {
 	unsigned int i;
@@ -864,7 +645,8 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 			submodule_free();
 			gitmodules_config_sha1(real_obj->oid.hash);
 		}
-		if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].path)) {
+		if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].path,
+				repo)) {
 			hit = 1;
 			if (opt->status_only)
 				break;
@@ -1005,9 +787,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			    N_("ignore files specified via '.gitignore'"), 1),
 		OPT_BOOL(0, "recurse-submodules", &recurse_submodules,
 			 N_("recursively search in each submodule")),
-		OPT_STRING(0, "parent-basename", &parent_basename,
-			   N_("basename"),
-			   N_("prepend parent project's basename to output")),
 		OPT_GROUP(""),
 		OPT_BOOL('v', "invert-match", &opt.invert,
 			N_("show non-matching lines")),
@@ -1112,7 +891,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	init_grep_defaults();
 	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, prefix);
-	super_prefix = get_super_prefix();
 
 	/*
 	 * If there is no -- then the paths must exist in the working
@@ -1274,9 +1052,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 
 	if (recurse_submodules) {
 		gitmodules_config();
-		compile_submodule_options(&opt, argv + i, cached, untracked,
-					  opt_exclude, use_index,
-					  pattern_type_arg);
 	}
 
 	if (show_in_pager && (cached || list.nr))
@@ -1320,11 +1095,12 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		if (!cached)
 			setup_work_tree();
 
-		hit = grep_cache(&opt, &pathspec, cached);
+		hit = grep_cache(&opt, the_repository, &pathspec, cached);
 	} else {
 		if (cached)
 			die(_("both --cached and trees are given."));
-		hit = grep_objects(&opt, &pathspec, &list);
+
+		hit = grep_objects(&opt, &pathspec, the_repository, &list);
 	}
 
 	if (num_threads)
diff --git a/cache.h b/cache.h
index 71fe09264..71af91c43 100644
--- a/cache.h
+++ b/cache.h
@@ -417,7 +417,6 @@ static inline enum object_type object_type(unsigned int mode)
 #define GIT_WORK_TREE_ENVIRONMENT "GIT_WORK_TREE"
 #define GIT_PREFIX_ENVIRONMENT "GIT_PREFIX"
 #define GIT_SUPER_PREFIX_ENVIRONMENT "GIT_INTERNAL_SUPER_PREFIX"
-#define GIT_TOPLEVEL_PREFIX_ENVIRONMENT "GIT_INTERNAL_TOPLEVEL_PREFIX"
 #define DEFAULT_GIT_DIR_ENVIRONMENT ".git"
 #define DB_ENVIRONMENT "GIT_OBJECT_DIRECTORY"
 #define INDEX_ENVIRONMENT "GIT_INDEX_FILE"
diff --git a/git.c b/git.c
index 489aab4d8..9dd9aead6 100644
--- a/git.c
+++ b/git.c
@@ -392,7 +392,7 @@ static struct cmd_struct commands[] = {
 	{ "fsck-objects", cmd_fsck, RUN_SETUP },
 	{ "gc", cmd_gc, RUN_SETUP },
 	{ "get-tar-commit-id", cmd_get_tar_commit_id },
-	{ "grep", cmd_grep, RUN_SETUP_GENTLY | SUPPORT_SUPER_PREFIX },
+	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY },
diff --git a/grep.c b/grep.c
index 98733db62..78680da5c 100644
--- a/grep.c
+++ b/grep.c
@@ -1919,16 +1919,6 @@ void grep_source_init(struct grep_source *gs, enum grep_source_type type,
 	case GREP_SOURCE_FILE:
 		gs->identifier = xstrdup(identifier);
 		break;
-	case GREP_SOURCE_SUBMODULE:
-		if (!identifier) {
-			gs->identifier = NULL;
-			break;
-		}
-		/*
-		 * FALL THROUGH
-		 * If the identifier is non-NULL (in the submodule case) it
-		 * will be a SHA1 that needs to be copied.
-		 */
 	case GREP_SOURCE_OID:
 		gs->identifier = oiddup(identifier);
 		break;
@@ -1951,7 +1941,6 @@ void grep_source_clear_data(struct grep_source *gs)
 	switch (gs->type) {
 	case GREP_SOURCE_FILE:
 	case GREP_SOURCE_OID:
-	case GREP_SOURCE_SUBMODULE:
 		FREE_AND_NULL(gs->buf);
 		gs->size = 0;
 		break;
@@ -2022,8 +2011,6 @@ static int grep_source_load(struct grep_source *gs)
 		return grep_source_load_oid(gs);
 	case GREP_SOURCE_BUF:
 		return gs->buf ? 0 : -1;
-	case GREP_SOURCE_SUBMODULE:
-		break;
 	}
 	die("BUG: invalid grep_source type to load");
 }
diff --git a/grep.h b/grep.h
index b8f93bfc2..d405e568f 100644
--- a/grep.h
+++ b/grep.h
@@ -194,7 +194,6 @@ struct grep_source {
 		GREP_SOURCE_OID,
 		GREP_SOURCE_FILE,
 		GREP_SOURCE_BUF,
-		GREP_SOURCE_SUBMODULE,
 	} type;
 	void *identifier;
 
diff --git a/setup.c b/setup.c
index 860507e1f..23950173f 100644
--- a/setup.c
+++ b/setup.c
@@ -1027,7 +1027,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
 {
 	static struct strbuf cwd = STRBUF_INIT;
 	struct strbuf dir = STRBUF_INIT, gitdir = STRBUF_INIT;
-	const char *prefix, *env_prefix;
+	const char *prefix;
 
 	/*
 	 * We may have read an incomplete configuration before
@@ -1085,16 +1085,6 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		die("BUG: unhandled setup_git_directory_1() result");
 	}
 
-	/*
-	 * NEEDSWORK: This was a hack in order to get ls-files and grep to have
-	 * properly formated output when recursing submodules.  Once ls-files
-	 * and grep have been changed to perform this recursing in-process this
-	 * needs to be removed.
-	 */
-	env_prefix = getenv(GIT_TOPLEVEL_PREFIX_ENVIRONMENT);
-	if (env_prefix)
-		prefix = env_prefix;
-
 	if (prefix)
 		setenv(GIT_PREFIX_ENVIRONMENT, prefix, 1);
 	else
-- 
2.13.2.932.g7449e964c-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 00/10] Convert grep to recurse in-process
  2017-07-14 22:28 ` [PATCH v2 " Brandon Williams
                     ` (2 preceding siblings ...)
  2017-07-14 22:28   ` [PATCH v2 3/3] grep: recurse in-process using 'struct repository' Brandon Williams
@ 2017-07-18 19:05   ` Brandon Williams
  2017-07-18 19:05     ` [PATCH v3 01/10] repo_read_index: don't discard the index Brandon Williams
                       ` (11 more replies)
  3 siblings, 12 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-18 19:05 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Changes in v3:
 * Fixes a bug in repo_read_gitmodules() that could cause a segfault when a
   repository didn't have a worktree.
 * To fix the above bug, repo_read_gitmodules() and gitmodules_config() were
   merged so that there is no duplicated logic.  To make that merge possible,
   the parsing of submodule.fetchjobs and fetch.recursesubmodules was removed
   from the submodule-config parsing logic and moved into fetch and
   update-clone.  This also makes it easier to ensure that no additional
   non-submodule-specific configuration like this is added to .gitmodules in
   the future.

Brandon Williams (10):
  repo_read_index: don't discard the index
  repository: have the_repository use the_index
  cache.h: add GITMODULES_FILE macro
  config: add config_from_gitmodules
  submodule: remove submodule.fetchjobs from submodule-config parsing
  submodule: remove fetch.recursesubmodules from submodule-config
    parsing
  submodule: check for unstaged .gitmodules outside of config parsing
  submodule: check for unmerged .gitmodules outside of config parsing
  submodule: merge repo_read_gitmodules and gitmodules_config
  grep: recurse in-process using 'struct repository'

 Documentation/git-grep.txt  |   7 -
 builtin/fetch.c             |  26 ++-
 builtin/grep.c              | 396 ++++++++++----------------------------------
 builtin/mv.c                |   2 +-
 builtin/rm.c                |   2 +-
 builtin/submodule--helper.c |  17 +-
 cache.h                     |   2 +-
 config.c                    |  17 ++
 config.h                    |  10 ++
 git.c                       |   2 +-
 grep.c                      |  13 --
 grep.h                      |   1 -
 repository.c                |   6 +-
 repository.h                |   8 +
 setup.c                     |  12 +-
 submodule-config.c          |   8 +
 submodule-config.h          |   1 +
 submodule.c                 | 147 +++++++---------
 submodule.h                 |   6 +-
 19 files changed, 240 insertions(+), 443 deletions(-)

-- 
2.13.2.932.g7449e964c-goog


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v3 01/10] repo_read_index: don't discard the index
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
@ 2017-07-18 19:05     ` Brandon Williams
  2017-07-18 19:05     ` [PATCH v3 02/10] repository: have the_repository use the_index Brandon Williams
                       ` (10 subsequent siblings)
  11 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-18 19:05 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Have 'repo_read_index()' behave more like the other functions in the
read_index family: don't discard the index if it has already been
populated, and instead rely on the quick return in 'read_index_from()',
which has:

  /* istate->initialized covers both .git/index and .git/sharedindex.xxx */
  if (istate->initialized)
    return istate->cache_nr;
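
To illustrate the intended behaviour, here is a minimal, self-contained
sketch.  It does not use git's real types: 'toy_index', 'toy_repo' and the
'toy_*' functions are hypothetical stand-ins, and the "file" is reduced to
an entry count passed in by the caller.  The point is only that a second
read returns the already-populated entry count instead of discarding and
reloading.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for git's 'struct index_state'. */
struct toy_index {
	int initialized;	/* set once the index has been loaded */
	int cache_nr;		/* number of entries */
	int loads;		/* how many times the "file" was really parsed */
};

/* Hypothetical, simplified stand-in for 'struct repository'. */
struct toy_repo {
	struct toy_index *index;
};

/* Mimics the quick return quoted above: no work if already populated. */
static int toy_read_index_from(struct toy_index *istate, int entries_on_disk)
{
	if (istate->initialized)
		return istate->cache_nr;

	istate->cache_nr = entries_on_disk;
	istate->initialized = 1;
	istate->loads++;
	return istate->cache_nr;
}

/* Allocates the index if needed, but never discards an existing one. */
static int toy_repo_read_index(struct toy_repo *repo, int entries_on_disk)
{
	assert(repo != NULL);

	if (!repo->index) {
		repo->index = calloc(1, sizeof(*repo->index));
		if (!repo->index)
			return -1;
	}

	return toy_read_index_from(repo->index, entries_on_disk);
}
```

In this sketch, calling 'toy_repo_read_index()' twice on the same
'toy_repo' parses the "file" only once; the second call simply reports the
entry count that is already in memory.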

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 repository.c | 2 --
 repository.h | 8 ++++++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/repository.c b/repository.c
index edca90740..8e60af1d5 100644
--- a/repository.c
+++ b/repository.c
@@ -235,8 +235,6 @@ int repo_read_index(struct repository *repo)
 {
 	if (!repo->index)
 		repo->index = xcalloc(1, sizeof(*repo->index));
-	else
-		discard_index(repo->index);
 
 	return read_index_from(repo->index, repo->index_file);
 }
diff --git a/repository.h b/repository.h
index 417787f3e..7f5e24a0a 100644
--- a/repository.h
+++ b/repository.h
@@ -92,6 +92,14 @@ extern int repo_submodule_init(struct repository *submodule,
 			       const char *path);
 extern void repo_clear(struct repository *repo);
 
+/*
+ * Populates the repository's index from its index_file, an index struct will
+ * be allocated if needed.
+ *
+ * Return the number of index entries in the populated index or a value less
+ * than zero if an error occurred.  If the repository's index has already been
+ * populated then the number of entries will simply be returned.
+ */
 extern int repo_read_index(struct repository *repo);
 
 #endif /* REPOSITORY_H */
-- 
2.13.2.932.g7449e964c-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 02/10] repository: have the_repository use the_index
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
  2017-07-18 19:05     ` [PATCH v3 01/10] repo_read_index: don't discard the index Brandon Williams
@ 2017-07-18 19:05     ` Brandon Williams
  2017-07-18 19:05     ` [PATCH v3 03/10] cache.h: add GITMODULES_FILE macro Brandon Williams
                       ` (9 subsequent siblings)
  11 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-18 19:05 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Have the index state which is stored in 'the_repository' be a pointer to
the in-core index 'the_index'.  This makes it easier to begin
transitioning more parts of the code base to operate on a 'struct
repository'.
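
As a rough sketch of the idea (not the code in this patch), the following
uses hypothetical 'toy_*' types with far fewer fields than the real
'struct repository'.  It also uses a C99 designated initializer purely for
readability, whereas the patch below uses a positional initializer.  What
it shows is that code taking a repository pointer sees the same data as
code that touches the global index directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the global in-core index. */
struct toy_index {
	int cache_nr;
};

/* Hypothetical stand-in for 'struct repository', heavily trimmed. */
struct toy_repo {
	char *gitdir;
	char *worktree;
	struct toy_index *index;
};

/* The global index that legacy code reads and writes directly. */
struct toy_index toy_the_index;

/* The main repository points at the global index instead of owning one. */
static struct toy_repo toy_the_repo = {
	.index = &toy_the_index,
};
struct toy_repo *toy_the_repository = &toy_the_repo;

/* A function converted to operate on a repository pointer. */
static int toy_count_entries(const struct toy_repo *repo)
{
	assert(repo != NULL && repo->index != NULL);
	return repo->index->cache_nr;
}
```

With this arrangement, a change made through 'toy_the_index' is visible
through 'toy_the_repository->index' and vice versa, so callers can be
converted one at a time.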

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 repository.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/repository.c b/repository.c
index 8e60af1d5..c0e0e0e7e 100644
--- a/repository.c
+++ b/repository.c
@@ -4,7 +4,9 @@
 #include "submodule-config.h"
 
 /* The main repository */
-static struct repository the_repo;
+static struct repository the_repo = {
+	NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, &the_index, 0, 0
+};
 struct repository *the_repository = &the_repo;
 
 static char *git_path_from_env(const char *envvar, const char *git_dir,
-- 
2.13.2.932.g7449e964c-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 03/10] cache.h: add GITMODULES_FILE macro
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
  2017-07-18 19:05     ` [PATCH v3 01/10] repo_read_index: don't discard the index Brandon Williams
  2017-07-18 19:05     ` [PATCH v3 02/10] repository: have the_repository use the_index Brandon Williams
@ 2017-07-18 19:05     ` Brandon Williams
  2017-07-31 23:11       ` [PATCH] convert any hard coded .gitmodules file string to the MACRO Stefan Beller
  2017-07-18 19:05     ` [PATCH v3 04/10] config: add config_from_gitmodules Brandon Williams
                       ` (8 subsequent siblings)
  11 siblings, 1 reply; 68+ messages in thread
From: Brandon Williams @ 2017-07-18 19:05 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Add a macro to be used when specifying the '.gitmodules' file.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 cache.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/cache.h b/cache.h
index 71fe09264..d59f767e2 100644
--- a/cache.h
+++ b/cache.h
@@ -433,6 +433,7 @@ static inline enum object_type object_type(unsigned int mode)
 #define GITATTRIBUTES_FILE ".gitattributes"
 #define INFOATTRIBUTES_FILE "info/attributes"
 #define ATTRIBUTE_MACRO_PREFIX "[attr]"
+#define GITMODULES_FILE ".gitmodules"
 #define GIT_NOTES_REF_ENVIRONMENT "GIT_NOTES_REF"
 #define GIT_NOTES_DEFAULT_REF "refs/notes/commits"
 #define GIT_NOTES_DISPLAY_REF_ENVIRONMENT "GIT_NOTES_DISPLAY_REF"
-- 
2.13.2.932.g7449e964c-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 04/10] config: add config_from_gitmodules
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
                       ` (2 preceding siblings ...)
  2017-07-18 19:05     ` [PATCH v3 03/10] cache.h: add GITMODULES_FILE macro Brandon Williams
@ 2017-07-18 19:05     ` Brandon Williams
  2017-07-18 19:05     ` [PATCH v3 05/10] submodule: remove submodule.fetchjobs from submodule-config parsing Brandon Williams
                       ` (7 subsequent siblings)
  11 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-18 19:05 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Add a 'config_from_gitmodules()' function which can be used by 'fetch' and
'update_clone' in order to maintain backwards compatibility with
configuration being stored in '.gitmodules', since a future patch will
remove reading these values in the submodule-config.

This function should not be used anywhere other than in 'fetch' and
'update_clone'.
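
The calling pattern can be sketched as follows.  This is an illustration
under simplifying assumptions, not git's implementation: the 'toy_*' names
are hypothetical, the '.gitmodules' contents are modelled as an in-memory
array of key/value pairs rather than a parsed file, and the repository is
passed explicitly instead of using 'the_repository'.  It shows the two
properties that matter here: the callback is run once per entry, and
nothing is read when the repository has no worktree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type, shaped like git's config_fn_t. */
typedef int (*toy_config_fn)(const char *key, const char *value, void *data);

/* One already-parsed "key = value" pair from a pretend '.gitmodules'. */
struct toy_kv {
	const char *key;
	const char *value;
};

/* Hypothetical, trimmed-down repository. */
struct toy_repo {
	const char *worktree;		/* NULL for a repository without one */
	const struct toy_kv *gitmodules;
	size_t nr;
};

/* Runs 'fn' on every entry, but only if there is a worktree to read from. */
static void toy_config_from_gitmodules(const struct toy_repo *repo,
				       toy_config_fn fn, void *data)
{
	size_t i;

	assert(repo != NULL && fn != NULL);

	if (!repo->worktree)
		return;

	for (i = 0; i < repo->nr; i++)
		fn(repo->gitmodules[i].key, repo->gitmodules[i].value, data);
}

/* Example callback: counts how often 'submodule.fetchjobs' appears. */
static int toy_count_fetchjobs(const char *key, const char *value, void *data)
{
	(void)value;
	if (!strcmp(key, "submodule.fetchjobs"))
		(*(int *)data)++;
	return 0;
}
```

The worktree check is what avoids dereferencing a missing path for
repositories that have no working directory.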

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 config.c | 17 +++++++++++++++++
 config.h | 10 ++++++++++
 2 files changed, 27 insertions(+)

diff --git a/config.c b/config.c
index 231f9a750..06645a325 100644
--- a/config.c
+++ b/config.c
@@ -2053,6 +2053,23 @@ int git_config_get_pathname(const char *key, const char **dest)
 	return repo_config_get_pathname(the_repository, key, dest);
 }
 
+/*
+ * Note: This function exists solely to maintain backward compatibility with
+ * 'fetch' and 'update_clone' storing configuration in '.gitmodules' and should
+ * NOT be used anywhere else.
+ *
+ * Runs the provided config function on the '.gitmodules' file found in the
+ * working directory.
+ */
+void config_from_gitmodules(config_fn_t fn, void *data)
+{
+	if (the_repository->worktree) {
+		char *file = repo_worktree_path(the_repository, GITMODULES_FILE);
+		git_config_from_file(fn, file, data);
+		free(file);
+	}
+}
+
 int git_config_get_expiry(const char *key, const char **output)
 {
 	int ret = git_config_get_string_const(key, output);
diff --git a/config.h b/config.h
index 0352da117..6998e6645 100644
--- a/config.h
+++ b/config.h
@@ -187,6 +187,16 @@ extern int repo_config_get_maybe_bool(struct repository *repo,
 extern int repo_config_get_pathname(struct repository *repo,
 				    const char *key, const char **dest);
 
+/*
+ * Note: This function exists solely to maintain backward compatibility with
+ * 'fetch' and 'update_clone' storing configuration in '.gitmodules' and should
+ * NOT be used anywhere else.
+ *
+ * Runs the provided config function on the '.gitmodules' file found in the
+ * working directory.
+ */
+extern void config_from_gitmodules(config_fn_t fn, void *data);
+
 extern int git_config_get_value(const char *key, const char **value);
 extern const struct string_list *git_config_get_value_multi(const char *key);
 extern void git_config_clear(void);
-- 
2.13.2.932.g7449e964c-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v3 05/10] submodule: remove submodule.fetchjobs from submodule-config parsing
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
                       ` (3 preceding siblings ...)
  2017-07-18 19:05     ` [PATCH v3 04/10] config: add config_from_gitmodules Brandon Williams
@ 2017-07-18 19:05     ` Brandon Williams
  2017-07-18 19:05     ` [PATCH v3 06/10] submodule: remove fetch.recursesubmodules " Brandon Williams
                       ` (6 subsequent siblings)
  11 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-18 19:05 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

The '.gitmodules' file should only contain information pertinent to
configuring individual submodules (name to path mapping, URL where to
obtain the submodule, etc.) while other configuration like the number of
jobs to use when fetching submodules should be a part of the
repository's config.

Remove the 'submodule.fetchjobs' configuration option from the general
submodule-config parsing and instead rely on 'config_from_gitmodules()'
in order to maintain backwards compatibility with this config being
placed in the '.gitmodules' file.
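
The resulting precedence can be sketched like this.  The 'toy_*' helpers
are hypothetical and deliberately simpler than the real code: values are
parsed with strtol() (so unit suffixes are not understood), and an invalid
or negative value is ignored here, whereas the real parser calls die().
The sketch only demonstrates the ordering: start from a default of 1,
apply the value from '.gitmodules' first, then let the repository's own
config override it.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Returns the parsed job count, or -1 if the value is invalid or negative. */
static int toy_parse_fetchjobs(const char *value)
{
	char *end;
	long n;

	if (!value || !*value)
		return -1;

	n = strtol(value, &end, 10);
	if (*end != '\0' || n < 0 || n > INT_MAX)
		return -1;

	return (int)n;
}

/* Config callback: updates *data when it sees 'submodule.fetchjobs'. */
static int toy_fetchjobs_cb(const char *key, const char *value, void *data)
{
	int *max_jobs = data;

	assert(max_jobs != NULL);

	if (!strcmp(key, "submodule.fetchjobs")) {
		int n = toy_parse_fetchjobs(value);
		if (n >= 0)
			*max_jobs = n;
	}
	return 0;
}

/*
 * Applies the sources in order, so later ones win: the built-in default,
 * then '.gitmodules', then the repository config.  NULL means "not set".
 */
static int toy_resolve_max_jobs(const char *gitmodules_value,
				const char *repo_config_value)
{
	int max_jobs = 1;

	if (gitmodules_value)
		toy_fetchjobs_cb("submodule.fetchjobs", gitmodules_value,
				 &max_jobs);
	if (repo_config_value)
		toy_fetchjobs_cb("submodule.fetchjobs", repo_config_value,
				 &max_jobs);

	return max_jobs;
}
```

Because the '.gitmodules' value is applied before the repository config,
a setting in the repository config takes precedence whenever both are
present.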

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch.c             | 18 +++++++++++++++++-
 builtin/submodule--helper.c | 17 +++++++++++++----
 submodule-config.c          |  8 ++++++++
 submodule-config.h          |  1 +
 submodule.c                 | 16 +---------------
 submodule.h                 |  1 -
 6 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index c87e59f3b..ade092bf8 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -39,7 +39,7 @@ static int prune = -1; /* unspecified */
 static int all, append, dry_run, force, keep, multiple, update_head_ok, verbosity, deepen_relative;
 static int progress = -1;
 static int tags = TAGS_DEFAULT, unshallow, update_shallow, deepen;
-static int max_children = -1;
+static int max_children = 1;
 static enum transport_family family;
 static const char *depth;
 static const char *deepen_since;
@@ -68,9 +68,24 @@ static int git_fetch_config(const char *k, const char *v, void *cb)
 		recurse_submodules = r;
 	}
 
+	if (!strcmp(k, "submodule.fetchjobs")) {
+		max_children = parse_submodule_fetchjobs(k, v);
+		return 0;
+	}
+
 	return git_default_config(k, v, cb);
 }
 
+static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, "submodule.fetchjobs")) {
+		max_children = parse_submodule_fetchjobs(var, value);
+		return 0;
+	}
+
+	return 0;
+}
+
 static int parse_refmap_arg(const struct option *opt, const char *arg, int unset)
 {
 	ALLOC_GROW(refmap_array, refmap_nr + 1, refmap_alloc);
@@ -1311,6 +1326,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	for (i = 1; i < argc; i++)
 		strbuf_addf(&default_rla, " %s", argv[i]);
 
+	config_from_gitmodules(gitmodules_fetch_config, NULL);
 	git_config(git_fetch_config, NULL);
 
 	argc = parse_options(argc, argv, prefix,
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6abdad329..6d9600d4f 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -960,10 +960,19 @@ static int update_clone_task_finished(int result,
 	return 0;
 }
 
+static int gitmodules_update_clone_config(const char *var, const char *value,
+					  void *cb)
+{
+	int *max_jobs = cb;
+	if (!strcmp(var, "submodule.fetchjobs"))
+		*max_jobs = parse_submodule_fetchjobs(var, value);
+	return 0;
+}
+
 static int update_clone(int argc, const char **argv, const char *prefix)
 {
 	const char *update = NULL;
-	int max_jobs = -1;
+	int max_jobs = 1;
 	struct string_list_item *item;
 	struct pathspec pathspec;
 	struct submodule_update_clone suc = SUBMODULE_UPDATE_CLONE_INIT;
@@ -1000,6 +1009,9 @@ static int update_clone(int argc, const char **argv, const char *prefix)
 	};
 	suc.prefix = prefix;
 
+	config_from_gitmodules(gitmodules_update_clone_config, &max_jobs);
+	git_config(gitmodules_update_clone_config, &max_jobs);
+
 	argc = parse_options(argc, argv, prefix, module_update_clone_options,
 			     git_submodule_helper_usage, 0);
 
@@ -1017,9 +1029,6 @@ static int update_clone(int argc, const char **argv, const char *prefix)
 	gitmodules_config();
 	git_config(submodule_config, NULL);
 
-	if (max_jobs < 0)
-		max_jobs = parallel_submodules();
-
 	run_processes_parallel(max_jobs,
 			       update_clone_get_next_task,
 			       update_clone_start_failure,
diff --git a/submodule-config.c b/submodule-config.c
index 5fe2d0787..70400f553 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -248,6 +248,14 @@ static int parse_fetch_recurse(const char *opt, const char *arg,
 	}
 }
 
+int parse_submodule_fetchjobs(const char *var, const char *value)
+{
+	int fetchjobs = git_config_int(var, value);
+	if (fetchjobs < 0)
+		die(_("negative values not allowed for submodule.fetchjobs"));
+	return fetchjobs;
+}
+
 int parse_fetch_recurse_submodules_arg(const char *opt, const char *arg)
 {
 	return parse_fetch_recurse(opt, arg, 1);
diff --git a/submodule-config.h b/submodule-config.h
index 233bfcb7f..995d404f8 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -27,6 +27,7 @@ struct repository;
 
 extern void submodule_cache_free(struct submodule_cache *cache);
 
+extern int parse_submodule_fetchjobs(const char *var, const char *value);
 extern int parse_fetch_recurse_submodules_arg(const char *opt, const char *arg);
 struct option;
 extern int option_fetch_parse_recurse_submodules(const struct option *opt,
diff --git a/submodule.c b/submodule.c
index 6531c5d60..7293c28a5 100644
--- a/submodule.c
+++ b/submodule.c
@@ -22,7 +22,6 @@
 
 static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
 static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
-static int parallel_jobs = 1;
 static struct string_list changed_submodule_paths = STRING_LIST_INIT_DUP;
 static int initialized_fetch_ref_tips;
 static struct oid_array ref_tips_before_fetch;
@@ -159,12 +158,7 @@ void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt,
 /* For loading from the .gitmodules file. */
 static int git_modules_config(const char *var, const char *value, void *cb)
 {
-	if (!strcmp(var, "submodule.fetchjobs")) {
-		parallel_jobs = git_config_int(var, value);
-		if (parallel_jobs < 0)
-			die(_("negative values not allowed for submodule.fetchJobs"));
-		return 0;
-	} else if (starts_with(var, "submodule."))
+	if (starts_with(var, "submodule."))
 		return parse_submodule_config_option(var, value);
 	else if (!strcmp(var, "fetch.recursesubmodules")) {
 		config_fetch_recurse_submodules = parse_fetch_recurse_submodules_arg(var, value);
@@ -1303,9 +1297,6 @@ int fetch_populated_submodules(const struct argv_array *options,
 	argv_array_push(&spf.args, "--recurse-submodules-default");
 	/* default value, "--submodule-prefix" and its value are added later */
 
-	if (max_parallel_jobs < 0)
-		max_parallel_jobs = parallel_jobs;
-
 	calculate_changed_submodule_paths();
 	run_processes_parallel(max_parallel_jobs,
 			       get_next_submodule,
@@ -1825,11 +1816,6 @@ int merge_submodule(struct object_id *result, const char *path,
 	return 0;
 }
 
-int parallel_submodules(void)
-{
-	return parallel_jobs;
-}
-
 /*
  * Embeds a single submodules git directory into the superprojects git dir,
  * non recursively.
diff --git a/submodule.h b/submodule.h
index e85b14486..c8164a3b2 100644
--- a/submodule.h
+++ b/submodule.h
@@ -112,7 +112,6 @@ extern int push_unpushed_submodules(struct oid_array *commits,
 				    const struct string_list *push_options,
 				    int dry_run);
 extern void connect_work_tree_and_git_dir(const char *work_tree, const char *git_dir);
-extern int parallel_submodules(void);
 /*
  * Given a submodule path (as in the index), return the repository
  * path of that submodule in 'buf'. Return -1 on error or when the
-- 
2.13.2.932.g7449e964c-goog
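
As a rough illustration of the validation that 'parse_submodule_fetchjobs()'
performs in the patch above, here is a minimal standalone sketch.  It assumes
plain decimal input and uses strtol() rather than git's 'git_config_int()'
(which also accepts unit suffixes such as 'k'), and it returns an error code
instead of calling die().  The name 'parse_job_count()' is hypothetical and
not part of the series.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for parse_submodule_fetchjobs(): parse a decimal
 * job count and reject garbage and negative values.  Returns 0 on success
 * and stores the result in *out; returns -1 on any error so that the
 * caller decides how to report it.
 */
int parse_job_count(const char *value, int *out)
{
	char *end;
	long n;

	if (!value || !*value)
		return -1;	/* missing or empty value */

	errno = 0;
	n = strtol(value, &end, 10);
	if (errno || *end != '\0')
		return -1;	/* out of range for long, or trailing junk */
	if (n < 0 || n > INT_MAX)
		return -1;	/* negative values are not allowed */

	*out = (int)n;
	return 0;
}
```

A caller that wants the behaviour of the patch would treat a -1 return as
fatal, which is what the die() call in the real function does.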



* [PATCH v3 06/10] submodule: remove fetch.recursesubmodules from submodule-config parsing
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
                       ` (4 preceding siblings ...)
  2017-07-18 19:05     ` [PATCH v3 05/10] submodule: remove submodule.fetchjobs from submodule-config parsing Brandon Williams
@ 2017-07-18 19:05     ` Brandon Williams
  2017-07-18 19:05     ` [PATCH v3 07/10] submodule: check for unstaged .gitmodules outside of config parsing Brandon Williams
                       ` (5 subsequent siblings)
  11 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-18 19:05 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Remove the 'fetch.recursesubmodules' configuration option from the
general submodule-config parsing and instead rely on using
'config_from_gitmodules()' in order to maintain backwards compatibility
with this config being placed in the '.gitmodules' file.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch.c |  8 +++++++-
 submodule.c     | 19 ++++++-------------
 submodule.h     |  2 +-
 3 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ade092bf8..d84c26391 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -71,6 +71,9 @@ static int git_fetch_config(const char *k, const char *v, void *cb)
 	if (!strcmp(k, "submodule.fetchjobs")) {
 		max_children = parse_submodule_fetchjobs(k, v);
 		return 0;
+	} else if (!strcmp(k, "fetch.recursesubmodules")) {
+		recurse_submodules = parse_fetch_recurse_submodules_arg(k, v);
+		return 0;
 	}
 
 	return git_default_config(k, v, cb);
@@ -81,6 +84,9 @@ static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
 	if (!strcmp(var, "submodule.fetchjobs")) {
 		max_children = parse_submodule_fetchjobs(var, value);
 		return 0;
+	} else if (!strcmp(var, "fetch.recursesubmodules")) {
+		recurse_submodules = parse_fetch_recurse_submodules_arg(var, value);
+		return 0;
 	}
 
 	return 0;
@@ -1355,7 +1361,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		deepen = 1;
 
 	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
-		set_config_fetch_recurse_submodules(recurse_submodules_default);
 		gitmodules_config();
 		git_config(submodule_config, NULL);
 	}
@@ -1399,6 +1404,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		result = fetch_populated_submodules(&options,
 						    submodule_prefix,
 						    recurse_submodules,
+						    recurse_submodules_default,
 						    verbosity < 0,
 						    max_children);
 		argv_array_clear(&options);
diff --git a/submodule.c b/submodule.c
index 7293c28a5..b1965290f 100644
--- a/submodule.c
+++ b/submodule.c
@@ -20,7 +20,6 @@
 #include "worktree.h"
 #include "parse-options.h"
 
-static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
 static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
 static struct string_list changed_submodule_paths = STRING_LIST_INIT_DUP;
 static int initialized_fetch_ref_tips;
@@ -160,10 +159,6 @@ static int git_modules_config(const char *var, const char *value, void *cb)
 {
 	if (starts_with(var, "submodule."))
 		return parse_submodule_config_option(var, value);
-	else if (!strcmp(var, "fetch.recursesubmodules")) {
-		config_fetch_recurse_submodules = parse_fetch_recurse_submodules_arg(var, value);
-		return 0;
-	}
 	return 0;
 }
 
@@ -714,11 +709,6 @@ void show_submodule_inline_diff(FILE *f, const char *path,
 		clear_commit_marks(right, ~0);
 }
 
-void set_config_fetch_recurse_submodules(int value)
-{
-	config_fetch_recurse_submodules = value;
-}
-
 int should_update_submodules(void)
 {
 	return config_update_recurse_submodules == RECURSE_SUBMODULES_ON;
@@ -1164,10 +1154,11 @@ struct submodule_parallel_fetch {
 	const char *work_tree;
 	const char *prefix;
 	int command_line_option;
+	int default_option;
 	int quiet;
 	int result;
 };
-#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0}
+#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0}
 
 static int get_next_submodule(struct child_process *cp,
 			      struct strbuf *err, void *data, void **task_cb)
@@ -1205,10 +1196,10 @@ static int get_next_submodule(struct child_process *cp,
 					default_argv = "on-demand";
 				}
 			} else {
-				if ((config_fetch_recurse_submodules == RECURSE_SUBMODULES_OFF) ||
+				if ((spf->default_option == RECURSE_SUBMODULES_OFF) ||
 				    gitmodules_is_unmerged)
 					continue;
-				if (config_fetch_recurse_submodules == RECURSE_SUBMODULES_ON_DEMAND) {
+				if (spf->default_option == RECURSE_SUBMODULES_ON_DEMAND) {
 					if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
 						continue;
 					default_argv = "on-demand";
@@ -1275,6 +1266,7 @@ static int fetch_finish(int retvalue, struct strbuf *err,
 
 int fetch_populated_submodules(const struct argv_array *options,
 			       const char *prefix, int command_line_option,
+			       int default_option,
 			       int quiet, int max_parallel_jobs)
 {
 	int i;
@@ -1282,6 +1274,7 @@ int fetch_populated_submodules(const struct argv_array *options,
 
 	spf.work_tree = get_git_work_tree();
 	spf.command_line_option = command_line_option;
+	spf.default_option = default_option;
 	spf.quiet = quiet;
 	spf.prefix = prefix;
 
diff --git a/submodule.h b/submodule.h
index c8164a3b2..29a1ecd19 100644
--- a/submodule.h
+++ b/submodule.h
@@ -76,7 +76,6 @@ extern void show_submodule_inline_diff(FILE *f, const char *path,
 		unsigned dirty_submodule, const char *meta,
 		const char *del, const char *add, const char *reset,
 		const struct diff_options *opt);
-extern void set_config_fetch_recurse_submodules(int value);
 /* Check if we want to update any submodule.*/
 extern int should_update_submodules(void);
 /*
@@ -87,6 +86,7 @@ extern const struct submodule *submodule_from_ce(const struct cache_entry *ce);
 extern void check_for_new_submodule_commits(struct object_id *oid);
 extern int fetch_populated_submodules(const struct argv_array *options,
 			       const char *prefix, int command_line_option,
+			       int default_option,
 			       int quiet, int max_parallel_jobs);
 extern unsigned is_submodule_modified(const char *path, int ignore_untracked);
 extern int submodule_uses_gitfile(const char *path);
-- 
2.13.2.932.g7449e964c-goog
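
The patch above replaces a file-scope variable with a field carried in the
per-fetch state.  The fallback branch it touches in 'get_next_submodule()'
can be sketched in isolation as follows.  This is a simplified model only:
the enum values, 'struct fetch_state' and 'should_fetch_by_default()' are
hypothetical names, and the real function also consults the command-line
option and per-submodule configuration before reaching this branch.

```c
/* Hypothetical constants standing in for RECURSE_SUBMODULES_*. */
enum recurse_mode {
	RECURSE_OFF,
	RECURSE_ON_DEMAND,
	RECURSE_ON
};

/* Hypothetical, trimmed-down analogue of struct submodule_parallel_fetch. */
struct fetch_state {
	int default_option;	/* passed in by the caller, not a global */
};

/*
 * Decide whether a submodule with no explicit setting should be fetched.
 * 'submodule_changed' is non-zero when the superproject fetch brought in
 * new commits for this submodule.
 */
int should_fetch_by_default(const struct fetch_state *st, int submodule_changed)
{
	if (st->default_option == RECURSE_OFF)
		return 0;
	if (st->default_option == RECURSE_ON_DEMAND && !submodule_changed)
		return 0;
	return 1;
}
```

Because the default travels with the state struct, two fetches with
different defaults could in principle coexist in one process, which a
single static variable would not allow.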



* [PATCH v3 07/10] submodule: check for unstaged .gitmodules outside of config parsing
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
                       ` (5 preceding siblings ...)
  2017-07-18 19:05     ` [PATCH v3 06/10] submodule: remove fetch.recursesubmodules " Brandon Williams
@ 2017-07-18 19:05     ` Brandon Williams
  2017-07-31 23:41       ` Stefan Beller
  2017-07-18 19:05     ` [PATCH v3 08/10] submodule: check for unmerged " Brandon Williams
                       ` (4 subsequent siblings)
  11 siblings, 1 reply; 68+ messages in thread
From: Brandon Williams @ 2017-07-18 19:05 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Teach 'is_staging_gitmodules_ok()' to determine whether the
'.gitmodules' file has unstaged changes based on the passed-in index
instead of relying on a global variable which is set during the
submodule-config parsing.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/mv.c |  2 +-
 builtin/rm.c |  2 +-
 submodule.c  | 32 +++++++++++++++++---------------
 submodule.h  |  2 +-
 4 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index dcf6736b5..94fbaaa5d 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -81,7 +81,7 @@ static void prepare_move_submodule(const char *src, int first,
 	struct strbuf submodule_dotgit = STRBUF_INIT;
 	if (!S_ISGITLINK(active_cache[first]->ce_mode))
 		die(_("Directory %s is in index and no submodule?"), src);
-	if (!is_staging_gitmodules_ok())
+	if (!is_staging_gitmodules_ok(&the_index))
 		die(_("Please stage your changes to .gitmodules or stash them to proceed"));
 	strbuf_addf(&submodule_dotgit, "%s/.git", src);
 	*submodule_gitfile = read_gitfile(submodule_dotgit.buf);
diff --git a/builtin/rm.c b/builtin/rm.c
index 52826d137..4057e73fa 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -286,7 +286,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 		list.entry[list.nr].name = xstrdup(ce->name);
 		list.entry[list.nr].is_submodule = S_ISGITLINK(ce->ce_mode);
 		if (list.entry[list.nr++].is_submodule &&
-		    !is_staging_gitmodules_ok())
+		    !is_staging_gitmodules_ok(&the_index))
 			die (_("Please stage your changes to .gitmodules or stash them to proceed"));
 	}
 
diff --git a/submodule.c b/submodule.c
index b1965290f..46ec04d7c 100644
--- a/submodule.c
+++ b/submodule.c
@@ -37,18 +37,25 @@ static struct oid_array ref_tips_after_fetch;
 static int gitmodules_is_unmerged;
 
 /*
- * This flag is set if the .gitmodules file had unstaged modifications on
- * startup. This must be checked before allowing modifications to the
- * .gitmodules file with the intention to stage them later, because when
- * continuing we would stage the modifications the user didn't stage herself
- * too. That might change in a future version when we learn to stage the
- * changes we do ourselves without staging any previous modifications.
+ * Check if the .gitmodules file has unstaged modifications.  This must be
+ * checked before allowing modifications to the .gitmodules file with the
+ * intention to stage them later, because when continuing we would stage the
+ * modifications the user didn't stage herself too. That might change in a
+ * future version when we learn to stage the changes we do ourselves without
+ * staging any previous modifications.
  */
-static int gitmodules_is_modified;
-
-int is_staging_gitmodules_ok(void)
+int is_staging_gitmodules_ok(const struct index_state *istate)
 {
-	return !gitmodules_is_modified;
+	int pos = index_name_pos(istate, GITMODULES_FILE, strlen(GITMODULES_FILE));
+
+	if ((pos >= 0) && (pos < istate->cache_nr)) {
+		struct stat st;
+		if (lstat(GITMODULES_FILE, &st) == 0 &&
+		    ce_match_stat(istate->cache[pos], &st, 0) & DATA_CHANGED)
+			return 0;
+	}
+
+	return 1;
 }
 
 /*
@@ -231,11 +238,6 @@ void gitmodules_config(void)
 				    !memcmp(ce->name, ".gitmodules", 11))
 					gitmodules_is_unmerged = 1;
 			}
-		} else if (pos < active_nr) {
-			struct stat st;
-			if (lstat(".gitmodules", &st) == 0 &&
-			    ce_match_stat(active_cache[pos], &st, 0) & DATA_CHANGED)
-				gitmodules_is_modified = 1;
 		}
 
 		if (!gitmodules_is_unmerged)
diff --git a/submodule.h b/submodule.h
index 29a1ecd19..b14660585 100644
--- a/submodule.h
+++ b/submodule.h
@@ -33,7 +33,7 @@ struct submodule_update_strategy {
 };
 #define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}
 
-extern int is_staging_gitmodules_ok(void);
+extern int is_staging_gitmodules_ok(const struct index_state *istate);
 extern int update_path_in_gitmodules(const char *oldpath, const char *newpath);
 extern int remove_path_from_gitmodules(const char *path);
 extern void stage_updated_gitmodules(void);
-- 
2.13.2.932.g7449e964c-goog



* [PATCH v3 08/10] submodule: check for unmerged .gitmodules outside of config parsing
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
                       ` (6 preceding siblings ...)
  2017-07-18 19:05     ` [PATCH v3 07/10] submodule: check for unstaged .gitmodules outside of config parsing Brandon Williams
@ 2017-07-18 19:05     ` Brandon Williams
  2017-07-18 19:05     ` [PATCH v3 09/10] submodule: merge repo_read_gitmodules and gitmodules_config Brandon Williams
                       ` (3 subsequent siblings)
  11 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-18 19:05 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Add an 'is_gitmodules_unmerged()' function which can be used to determine
whether the '.gitmodules' file is unmerged based on the passed-in index
instead of relying on a global variable which is set during the
submodule-config parsing.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 submodule.c | 47 +++++++++++++++++++++++------------------------
 submodule.h |  1 +
 2 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/submodule.c b/submodule.c
index 46ec04d7c..a5f825e37 100644
--- a/submodule.c
+++ b/submodule.c
@@ -27,14 +27,25 @@ static struct oid_array ref_tips_before_fetch;
 static struct oid_array ref_tips_after_fetch;
 
 /*
- * The following flag is set if the .gitmodules file is unmerged. We then
- * disable recursion for all submodules where .git/config doesn't have a
- * matching config entry because we can't guess what might be configured in
- * .gitmodules unless the user resolves the conflict. When a command line
- * option is given (which always overrides configuration) this flag will be
- * ignored.
+ * Check if the .gitmodules file is unmerged. Parsing of the .gitmodules file
+ * will be disabled because we can't guess what might be configured in
+ * .gitmodules unless the user resolves the conflict.
  */
-static int gitmodules_is_unmerged;
+int is_gitmodules_unmerged(const struct index_state *istate)
+{
+	int pos = index_name_pos(istate, GITMODULES_FILE, strlen(GITMODULES_FILE));
+	if (pos < 0) { /* .gitmodules not found or isn't merged */
+		pos = -1 - pos;
+		if (istate->cache_nr > pos) {  /* there is a .gitmodules */
+			const struct cache_entry *ce = istate->cache[pos];
+			if (ce_namelen(ce) == strlen(GITMODULES_FILE) &&
+			    !strcmp(ce->name, GITMODULES_FILE))
+				return 1;
+		}
+	}
+
+	return 0;
+}
 
 /*
  * Check if the .gitmodules file has unstaged modifications.  This must be
@@ -71,7 +82,7 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
 	if (!file_exists(".gitmodules")) /* Do nothing without .gitmodules */
 		return -1;
 
-	if (gitmodules_is_unmerged)
+	if (is_gitmodules_unmerged(&the_index))
 		die(_("Cannot change unmerged .gitmodules, resolve merge conflicts first"));
 
 	submodule = submodule_from_path(null_sha1, oldpath);
@@ -105,7 +116,7 @@ int remove_path_from_gitmodules(const char *path)
 	if (!file_exists(".gitmodules")) /* Do nothing without .gitmodules */
 		return -1;
 
-	if (gitmodules_is_unmerged)
+	if (is_gitmodules_unmerged(&the_index))
 		die(_("Cannot change unmerged .gitmodules, resolve merge conflicts first"));
 
 	submodule = submodule_from_path(null_sha1, path);
@@ -156,7 +167,7 @@ void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt,
 	if (submodule) {
 		if (submodule->ignore)
 			handle_ignore_submodules_arg(diffopt, submodule->ignore);
-		else if (gitmodules_is_unmerged)
+		else if (is_gitmodules_unmerged(&the_index))
 			DIFF_OPT_SET(diffopt, IGNORE_SUBMODULES);
 	}
 }
@@ -224,23 +235,12 @@ void gitmodules_config(void)
 	const char *work_tree = get_git_work_tree();
 	if (work_tree) {
 		struct strbuf gitmodules_path = STRBUF_INIT;
-		int pos;
 		strbuf_addstr(&gitmodules_path, work_tree);
 		strbuf_addstr(&gitmodules_path, "/.gitmodules");
 		if (read_cache() < 0)
 			die("index file corrupt");
-		pos = cache_name_pos(".gitmodules", 11);
-		if (pos < 0) { /* .gitmodules not found or isn't merged */
-			pos = -1 - pos;
-			if (active_nr > pos) {  /* there is a .gitmodules */
-				const struct cache_entry *ce = active_cache[pos];
-				if (ce_namelen(ce) == 11 &&
-				    !memcmp(ce->name, ".gitmodules", 11))
-					gitmodules_is_unmerged = 1;
-			}
-		}
 
-		if (!gitmodules_is_unmerged)
+		if (!is_gitmodules_unmerged(&the_index))
 			git_config_from_file(git_modules_config,
 				gitmodules_path.buf, NULL);
 		strbuf_release(&gitmodules_path);
@@ -1198,8 +1198,7 @@ static int get_next_submodule(struct child_process *cp,
 					default_argv = "on-demand";
 				}
 			} else {
-				if ((spf->default_option == RECURSE_SUBMODULES_OFF) ||
-				    gitmodules_is_unmerged)
+				if (spf->default_option == RECURSE_SUBMODULES_OFF)
 					continue;
 				if (spf->default_option == RECURSE_SUBMODULES_ON_DEMAND) {
 					if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
diff --git a/submodule.h b/submodule.h
index b14660585..8022faa59 100644
--- a/submodule.h
+++ b/submodule.h
@@ -33,6 +33,7 @@ struct submodule_update_strategy {
 };
 #define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}
 
+extern int is_gitmodules_unmerged(const struct index_state *istate);
 extern int is_staging_gitmodules_ok(const struct index_state *istate);
 extern int update_path_in_gitmodules(const char *oldpath, const char *newpath);
 extern int remove_path_from_gitmodules(const char *path);
-- 
2.13.2.932.g7449e964c-goog
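
The 'pos = -1 - pos' step in 'is_gitmodules_unmerged()' relies on the lookup
returning a negative value that encodes the insertion position when there is
no stage-0 entry for the name.  The sketch below models that convention on a
toy sorted array.  'struct entry', 'name_pos()' and 'is_unmerged()' are
hypothetical simplifications and are not git's index API; they only assume
that entries are sorted by name and then by stage.

```c
#include <string.h>

/* Hypothetical model of an index entry: a path and a merge stage (0-3). */
struct entry {
	const char *name;
	int stage;
};

/*
 * Binary-search for 'name' at stage 0 in an array sorted by (name, stage).
 * Returns the index when found, otherwise -1 - (insertion position).
 */
int name_pos(const struct entry *e, int nr, const char *name)
{
	int lo = 0, hi = nr;

	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int cmp = strcmp(name, e[mid].name);

		if (!cmp)
			cmp = 0 - e[mid].stage;	/* we want stage 0 */
		if (!cmp)
			return mid;
		if (cmp < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -1 - lo;
}

/*
 * A path is unmerged when it has no stage-0 entry but the entry at the
 * insertion position still carries the same name (at a higher stage).
 */
int is_unmerged(const struct entry *e, int nr, const char *name)
{
	int pos = name_pos(e, nr, name);

	if (pos < 0) {
		pos = -1 - pos;
		if (pos < nr && !strcmp(e[pos].name, name))
			return 1;
	}
	return 0;
}
```

The bounds check before dereferencing the insertion position matters: for a
name that sorts after every entry, the insertion position equals the entry
count.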



* [PATCH v3 09/10] submodule: merge repo_read_gitmodules and gitmodules_config
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
                       ` (7 preceding siblings ...)
  2017-07-18 19:05     ` [PATCH v3 08/10] submodule: check for unmerged " Brandon Williams
@ 2017-07-18 19:05     ` Brandon Williams
  2017-07-18 19:05     ` [PATCH v3 10/10] grep: recurse in-process using 'struct repository' Brandon Williams
                       ` (2 subsequent siblings)
  11 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-18 19:05 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Since 69aba5329 (submodule: add repo_read_gitmodules) there have been
two ways to load a repository's .gitmodules file:
'repo_read_gitmodules()' is used if you have a repository object you are
working with or 'gitmodules_config()' if you are implicitly working with
'the_repository'.  Merge the logic of these two functions to remove
duplicate code.

In addition, 'repo_read_gitmodules()' can segfault by passing a NULL
pointer to 'git_config_from_file()' if a repository doesn't have a
worktree.  Instead, check for the existence of a worktree before
attempting to load the .gitmodules file.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 submodule.c | 37 +++++++++++++++++--------------------
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/submodule.c b/submodule.c
index a5f825e37..ab0b74926 100644
--- a/submodule.c
+++ b/submodule.c
@@ -230,23 +230,6 @@ void load_submodule_cache(void)
 	git_config(submodule_config, NULL);
 }
 
-void gitmodules_config(void)
-{
-	const char *work_tree = get_git_work_tree();
-	if (work_tree) {
-		struct strbuf gitmodules_path = STRBUF_INIT;
-		strbuf_addstr(&gitmodules_path, work_tree);
-		strbuf_addstr(&gitmodules_path, "/.gitmodules");
-		if (read_cache() < 0)
-			die("index file corrupt");
-
-		if (!is_gitmodules_unmerged(&the_index))
-			git_config_from_file(git_modules_config,
-				gitmodules_path.buf, NULL);
-		strbuf_release(&gitmodules_path);
-	}
-}
-
 static int gitmodules_cb(const char *var, const char *value, void *data)
 {
 	struct repository *repo = data;
@@ -255,10 +238,24 @@ static int gitmodules_cb(const char *var, const char *value, void *data)
 
 void repo_read_gitmodules(struct repository *repo)
 {
-	char *gitmodules_path = repo_worktree_path(repo, ".gitmodules");
+	if (repo->worktree) {
+		char *gitmodules;
+
+		if (repo_read_index(repo) < 0)
+			return;
 
-	git_config_from_file(gitmodules_cb, gitmodules_path, repo);
-	free(gitmodules_path);
+		gitmodules = repo_worktree_path(repo, GITMODULES_FILE);
+
+		if (!is_gitmodules_unmerged(repo->index))
+			git_config_from_file(gitmodules_cb, gitmodules, repo);
+
+		free(gitmodules);
+	}
+}
+
+void gitmodules_config(void)
+{
+	repo_read_gitmodules(the_repository);
 }
 
 void gitmodules_config_sha1(const unsigned char *commit_sha1)
-- 
2.13.2.932.g7449e964c-goog
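
The worktree check described in the commit message above can be sketched
outside of git as follows.  This is a minimal model under stated
assumptions: 'struct repo', 'worktree_path()' and 'would_load_gitmodules()'
are hypothetical names, and the sketch only decides whether a load would be
attempted; it does not parse any file.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical repository handle; worktree is NULL for a bare repository. */
struct repo {
	const char *worktree;
};

/*
 * Build "<worktree>/<file>" in freshly allocated memory.  Returns NULL when
 * the repository has no worktree (or on allocation failure), so callers
 * must check the result before handing it to a file-reading function.
 */
char *worktree_path(const struct repo *r, const char *file)
{
	size_t len;
	char *path;

	if (!r->worktree)
		return NULL;

	len = strlen(r->worktree) + 1 + strlen(file) + 1;
	path = malloc(len);
	if (!path)
		return NULL;
	snprintf(path, len, "%s/%s", r->worktree, file);
	return path;
}

/* Returns 1 if a .gitmodules load would be attempted, 0 if it is skipped. */
int would_load_gitmodules(const struct repo *r)
{
	char *path = worktree_path(r, ".gitmodules");

	if (!path)
		return 0;	/* no worktree: nothing to load */

	/* A real implementation would parse the file at 'path' here. */
	free(path);
	return 1;
}
```

The patch itself tests 'repo->worktree' before building the path, which has
the same effect as checking the returned pointer in this sketch.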



* [PATCH v3 10/10] grep: recurse in-process using 'struct repository'
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
                       ` (8 preceding siblings ...)
  2017-07-18 19:05     ` [PATCH v3 09/10] submodule: merge repo_read_gitmodules and gitmodules_config Brandon Williams
@ 2017-07-18 19:05     ` Brandon Williams
  2017-07-18 19:36     ` [PATCH v3 00/10] Convert grep to recurse in-process Junio C Hamano
  2017-08-02 19:49     ` [PATCH v4 " Brandon Williams
  11 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-18 19:05 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Convert grep to use 'struct repository', which allows recursion into
submodules to be handled in-process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/git-grep.txt |   7 -
 builtin/grep.c             | 396 ++++++++++-----------------------------------
 cache.h                    |   1 -
 git.c                      |   2 +-
 grep.c                     |  13 --
 grep.h                     |   1 -
 setup.c                    |  12 +-
 7 files changed, 88 insertions(+), 344 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 5033483db..720c7850e 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -95,13 +95,6 @@ OPTIONS
 	<tree> option the prefix of all submodule output will be the name of
 	the parent project's <tree> object.
 
---parent-basename <basename>::
-	For internal use only.  In order to produce uniform output with the
-	--recurse-submodules option, this option can be used to provide the
-	basename of a parent's <tree> object to a submodule so the submodule
-	can prefix its output with the parent's name rather than the SHA1 of
-	the submodule.
-
 -a::
 --text::
 	Process binary files as if they were text.
diff --git a/builtin/grep.c b/builtin/grep.c
index 7e79eb1a7..cd0e51f3c 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -28,13 +28,7 @@ static char const * const grep_usage[] = {
 	NULL
 };
 
-static const char *super_prefix;
 static int recurse_submodules;
-static struct argv_array submodule_options = ARGV_ARRAY_INIT;
-static const char *parent_basename;
-
-static int grep_submodule_launch(struct grep_opt *opt,
-				 const struct grep_source *gs);
 
 #define GREP_NUM_THREADS_DEFAULT 8
 static int num_threads;
@@ -186,10 +180,7 @@ static void *run(void *arg)
 			break;
 
 		opt->output_priv = w;
-		if (w->source.type == GREP_SOURCE_SUBMODULE)
-			hit |= grep_submodule_launch(opt, &w->source);
-		else
-			hit |= grep_source(opt, &w->source);
+		hit |= grep_source(opt, &w->source);
 		grep_source_clear_data(&w->source);
 		work_done(w);
 	}
@@ -327,21 +318,13 @@ static int grep_oid(struct grep_opt *opt, const struct object_id *oid,
 {
 	struct strbuf pathbuf = STRBUF_INIT;
 
-	if (super_prefix) {
-		strbuf_add(&pathbuf, filename, tree_name_len);
-		strbuf_addstr(&pathbuf, super_prefix);
-		strbuf_addstr(&pathbuf, filename + tree_name_len);
+	if (opt->relative && opt->prefix_length) {
+		quote_path_relative(filename + tree_name_len, opt->prefix, &pathbuf);
+		strbuf_insert(&pathbuf, 0, filename, tree_name_len);
 	} else {
 		strbuf_addstr(&pathbuf, filename);
 	}
 
-	if (opt->relative && opt->prefix_length) {
-		char *name = strbuf_detach(&pathbuf, NULL);
-		quote_path_relative(name + tree_name_len, opt->prefix, &pathbuf);
-		strbuf_insert(&pathbuf, 0, name, tree_name_len);
-		free(name);
-	}
-
 #ifndef NO_PTHREADS
 	if (num_threads) {
 		add_work(opt, GREP_SOURCE_OID, pathbuf.buf, path, oid);
@@ -366,15 +349,10 @@ static int grep_file(struct grep_opt *opt, const char *filename)
 {
 	struct strbuf buf = STRBUF_INIT;
 
-	if (super_prefix)
-		strbuf_addstr(&buf, super_prefix);
-	strbuf_addstr(&buf, filename);
-
-	if (opt->relative && opt->prefix_length) {
-		char *name = strbuf_detach(&buf, NULL);
-		quote_path_relative(name, opt->prefix, &buf);
-		free(name);
-	}
+	if (opt->relative && opt->prefix_length)
+		quote_path_relative(filename, opt->prefix, &buf);
+	else
+		strbuf_addstr(&buf, filename);
 
 #ifndef NO_PTHREADS
 	if (num_threads) {
@@ -421,284 +399,89 @@ static void run_pager(struct grep_opt *opt, const char *prefix)
 		exit(status);
 }
 
-static void compile_submodule_options(const struct grep_opt *opt,
-				      const char **argv,
-				      int cached, int untracked,
-				      int opt_exclude, int use_index,
-				      int pattern_type_arg)
-{
-	struct grep_pat *pattern;
-
-	if (recurse_submodules)
-		argv_array_push(&submodule_options, "--recurse-submodules");
-
-	if (cached)
-		argv_array_push(&submodule_options, "--cached");
-	if (!use_index)
-		argv_array_push(&submodule_options, "--no-index");
-	if (untracked)
-		argv_array_push(&submodule_options, "--untracked");
-	if (opt_exclude > 0)
-		argv_array_push(&submodule_options, "--exclude-standard");
-
-	if (opt->invert)
-		argv_array_push(&submodule_options, "-v");
-	if (opt->ignore_case)
-		argv_array_push(&submodule_options, "-i");
-	if (opt->word_regexp)
-		argv_array_push(&submodule_options, "-w");
-	switch (opt->binary) {
-	case GREP_BINARY_NOMATCH:
-		argv_array_push(&submodule_options, "-I");
-		break;
-	case GREP_BINARY_TEXT:
-		argv_array_push(&submodule_options, "-a");
-		break;
-	default:
-		break;
-	}
-	if (opt->allow_textconv)
-		argv_array_push(&submodule_options, "--textconv");
-	if (opt->max_depth != -1)
-		argv_array_pushf(&submodule_options, "--max-depth=%d",
-				 opt->max_depth);
-	if (opt->linenum)
-		argv_array_push(&submodule_options, "-n");
-	if (!opt->pathname)
-		argv_array_push(&submodule_options, "-h");
-	if (!opt->relative)
-		argv_array_push(&submodule_options, "--full-name");
-	if (opt->name_only)
-		argv_array_push(&submodule_options, "-l");
-	if (opt->unmatch_name_only)
-		argv_array_push(&submodule_options, "-L");
-	if (opt->null_following_name)
-		argv_array_push(&submodule_options, "-z");
-	if (opt->count)
-		argv_array_push(&submodule_options, "-c");
-	if (opt->file_break)
-		argv_array_push(&submodule_options, "--break");
-	if (opt->heading)
-		argv_array_push(&submodule_options, "--heading");
-	if (opt->pre_context)
-		argv_array_pushf(&submodule_options, "--before-context=%d",
-				 opt->pre_context);
-	if (opt->post_context)
-		argv_array_pushf(&submodule_options, "--after-context=%d",
-				 opt->post_context);
-	if (opt->funcname)
-		argv_array_push(&submodule_options, "-p");
-	if (opt->funcbody)
-		argv_array_push(&submodule_options, "-W");
-	if (opt->all_match)
-		argv_array_push(&submodule_options, "--all-match");
-	if (opt->debug)
-		argv_array_push(&submodule_options, "--debug");
-	if (opt->status_only)
-		argv_array_push(&submodule_options, "-q");
-
-	switch (pattern_type_arg) {
-	case GREP_PATTERN_TYPE_BRE:
-		argv_array_push(&submodule_options, "-G");
-		break;
-	case GREP_PATTERN_TYPE_ERE:
-		argv_array_push(&submodule_options, "-E");
-		break;
-	case GREP_PATTERN_TYPE_FIXED:
-		argv_array_push(&submodule_options, "-F");
-		break;
-	case GREP_PATTERN_TYPE_PCRE:
-		argv_array_push(&submodule_options, "-P");
-		break;
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		break;
-	default:
-		die("BUG: Added a new grep pattern type without updating switch statement");
-	}
-
-	for (pattern = opt->pattern_list; pattern != NULL;
-	     pattern = pattern->next) {
-		switch (pattern->token) {
-		case GREP_PATTERN:
-			argv_array_pushf(&submodule_options, "-e%s",
-					 pattern->pattern);
-			break;
-		case GREP_AND:
-		case GREP_OPEN_PAREN:
-		case GREP_CLOSE_PAREN:
-		case GREP_NOT:
-		case GREP_OR:
-			argv_array_push(&submodule_options, pattern->pattern);
-			break;
-		/* BODY and HEAD are not used by git-grep */
-		case GREP_PATTERN_BODY:
-		case GREP_PATTERN_HEAD:
-			break;
-		}
-	}
-
-	/*
-	 * Limit number of threads for child process to use.
-	 * This is to prevent potential fork-bomb behavior of git-grep as each
-	 * submodule process has its own thread pool.
-	 */
-	argv_array_pushf(&submodule_options, "--threads=%d",
-			 DIV_ROUND_UP(num_threads, 2));
-
-	/* Add Pathspecs */
-	argv_array_push(&submodule_options, "--");
-	for (; *argv; argv++)
-		argv_array_push(&submodule_options, *argv);
-}
+static int grep_cache(struct grep_opt *opt, struct repository *repo,
+		      const struct pathspec *pathspec, int cached);
+static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
+		     struct tree_desc *tree, struct strbuf *base, int tn_len,
+		     int check_attr, struct repository *repo);
 
-/*
- * Launch child process to grep contents of a submodule
- */
-static int grep_submodule_launch(struct grep_opt *opt,
-				 const struct grep_source *gs)
+static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
+			  const struct pathspec *pathspec,
+			  const struct object_id *oid,
+			  const char *filename, const char *path)
 {
-	struct child_process cp = CHILD_PROCESS_INIT;
-	int status, i;
-	const char *end_of_base;
-	const char *name;
-	struct strbuf child_output = STRBUF_INIT;
-
-	end_of_base = strchr(gs->name, ':');
-	if (gs->identifier && end_of_base)
-		name = end_of_base + 1;
-	else
-		name = gs->name;
+	struct repository submodule;
+	int hit;
 
-	prepare_submodule_repo_env(&cp.env_array);
-	argv_array_push(&cp.env_array, GIT_DIR_ENVIRONMENT);
+	if (!is_submodule_active(superproject, path))
+		return 0;
 
-	if (opt->relative && opt->prefix_length)
-		argv_array_pushf(&cp.env_array, "%s=%s",
-				 GIT_TOPLEVEL_PREFIX_ENVIRONMENT,
-				 opt->prefix);
+	if (repo_submodule_init(&submodule, superproject, path))
+		return 0;
 
-	/* Add super prefix */
-	argv_array_pushf(&cp.args, "--super-prefix=%s%s/",
-			 super_prefix ? super_prefix : "",
-			 name);
-	argv_array_push(&cp.args, "grep");
+	repo_read_gitmodules(&submodule);
 
 	/*
-	 * Add basename of parent project
-	 * When performing grep on a tree object the filename is prefixed
-	 * with the object's name: 'tree-name:filename'.  In order to
-	 * provide uniformity of output we want to pass the name of the
-	 * parent project's object name to the submodule so the submodule can
-	 * prefix its output with the parent's name and not its own OID.
+	 * NEEDSWORK: This adds the submodule's object directory to the list of
+	 * alternates for the single in-memory object store.  This has some bad
+	 * consequences for memory (processed objects will never be freed) and
+	 * performance (this increases the number of pack files git has to pay
+	 * attention to, to the sum of the number of pack files in all the
+	 * repositories processed so far).  This can be removed once the object
+	 * store is no longer global and instead is a member of the repository
+	 * object.
 	 */
-	if (gs->identifier && end_of_base)
-		argv_array_pushf(&cp.args, "--parent-basename=%.*s",
-				 (int) (end_of_base - gs->name),
-				 gs->name);
+	add_to_alternates_memory(submodule.objectdir);
 
-	/* Add options */
-	for (i = 0; i < submodule_options.argc; i++) {
-		/*
-		 * If there is a tree identifier for the submodule, add the
-		 * rev after adding the submodule options but before the
-		 * pathspecs.  To do this we listen for the '--' and insert the
-		 * oid before pushing the '--' onto the child process argv
-		 * array.
-		 */
-		if (gs->identifier &&
-		    !strcmp("--", submodule_options.argv[i])) {
-			argv_array_push(&cp.args, oid_to_hex(gs->identifier));
-		}
+	if (oid) {
+		struct object *object;
+		struct tree_desc tree;
+		void *data;
+		unsigned long size;
+		struct strbuf base = STRBUF_INIT;
 
-		argv_array_push(&cp.args, submodule_options.argv[i]);
-	}
+		object = parse_object_or_die(oid, oid_to_hex(oid));
 
-	cp.git_cmd = 1;
-	cp.dir = gs->path;
+		grep_read_lock();
+		data = read_object_with_reference(object->oid.hash, tree_type,
+						  &size, NULL);
+		grep_read_unlock();
 
-	/*
-	 * Capture output to output buffer and check the return code from the
-	 * child process.  A '0' indicates a hit, a '1' indicates no hit and
-	 * anything else is an error.
-	 */
-	status = capture_command(&cp, &child_output, 0);
-	if (status && (status != 1)) {
-		/* flush the buffer */
-		write_or_die(1, child_output.buf, child_output.len);
-		die("process for submodule '%s' failed with exit code: %d",
-		    gs->name, status);
-	}
+		if (!data)
+			die(_("unable to read tree (%s)"), oid_to_hex(&object->oid));
 
-	opt->output(opt, child_output.buf, child_output.len);
-	strbuf_release(&child_output);
-	/* invert the return code to make a hit equal to 1 */
-	return !status;
-}
+		strbuf_addstr(&base, filename);
+		strbuf_addch(&base, '/');
 
-/*
- * Prep grep structures for a submodule grep
- * oid: the oid of the submodule or NULL if using the working tree
- * filename: name of the submodule including tree name of parent
- * path: location of the submodule
- */
-static int grep_submodule(struct grep_opt *opt, const struct object_id *oid,
-			  const char *filename, const char *path)
-{
-	if (!is_submodule_active(the_repository, path))
-		return 0;
-	if (!is_submodule_populated_gently(path, NULL)) {
-		/*
-		 * If searching history, check for the presence of the
-		 * submodule's gitdir before skipping the submodule.
-		 */
-		if (oid) {
-			const struct submodule *sub =
-					submodule_from_path(null_sha1, path);
-			if (sub)
-				path = git_path("modules/%s", sub->name);
-
-			if (!(is_directory(path) && is_git_directory(path)))
-				return 0;
-		} else {
-			return 0;
-		}
+		init_tree_desc(&tree, data, size);
+		hit = grep_tree(opt, pathspec, &tree, &base, base.len,
+				object->type == OBJ_COMMIT, &submodule);
+		strbuf_release(&base);
+		free(data);
+	} else {
+		hit = grep_cache(opt, &submodule, pathspec, 1);
 	}
 
-#ifndef NO_PTHREADS
-	if (num_threads) {
-		add_work(opt, GREP_SOURCE_SUBMODULE, filename, path, oid);
-		return 0;
-	} else
-#endif
-	{
-		struct grep_source gs;
-		int hit;
-
-		grep_source_init(&gs, GREP_SOURCE_SUBMODULE,
-				 filename, path, oid);
-		hit = grep_submodule_launch(opt, &gs);
-
-		grep_source_clear(&gs);
-		return hit;
-	}
+	repo_clear(&submodule);
+	return hit;
 }
 
-static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
-		      int cached)
+static int grep_cache(struct grep_opt *opt, struct repository *repo,
+		      const struct pathspec *pathspec, int cached)
 {
 	int hit = 0;
 	int nr;
 	struct strbuf name = STRBUF_INIT;
 	int name_base_len = 0;
-	if (super_prefix) {
-		name_base_len = strlen(super_prefix);
-		strbuf_addstr(&name, super_prefix);
+	if (repo->submodule_prefix) {
+		name_base_len = strlen(repo->submodule_prefix);
+		strbuf_addstr(&name, repo->submodule_prefix);
 	}
 
-	read_cache();
+	repo_read_index(repo);
 
-	for (nr = 0; nr < active_nr; nr++) {
-		const struct cache_entry *ce = active_cache[nr];
+	for (nr = 0; nr < repo->index->cache_nr; nr++) {
+		const struct cache_entry *ce = repo->index->cache[nr];
 		strbuf_setlen(&name, name_base_len);
 		strbuf_addstr(&name, ce->name);
 
@@ -715,14 +498,14 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
 			    ce_skip_worktree(ce)) {
 				if (ce_stage(ce) || ce_intent_to_add(ce))
 					continue;
-				hit |= grep_oid(opt, &ce->oid, ce->name,
-						 0, ce->name);
+				hit |= grep_oid(opt, &ce->oid, name.buf,
+						 0, name.buf);
 			} else {
-				hit |= grep_file(opt, ce->name);
+				hit |= grep_file(opt, name.buf);
 			}
 		} else if (recurse_submodules && S_ISGITLINK(ce->ce_mode) &&
 			   submodule_path_match(pathspec, name.buf, NULL)) {
-			hit |= grep_submodule(opt, NULL, ce->name, ce->name);
+			hit |= grep_submodule(opt, repo, pathspec, NULL, ce->name, ce->name);
 		} else {
 			continue;
 		}
@@ -730,8 +513,8 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
 		if (ce_stage(ce)) {
 			do {
 				nr++;
-			} while (nr < active_nr &&
-				 !strcmp(ce->name, active_cache[nr]->name));
+			} while (nr < repo->index->cache_nr &&
+				 !strcmp(ce->name, repo->index->cache[nr]->name));
 			nr--; /* compensate for loop control */
 		}
 		if (hit && opt->status_only)
@@ -744,7 +527,7 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
 
 static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 		     struct tree_desc *tree, struct strbuf *base, int tn_len,
-		     int check_attr)
+		     int check_attr, struct repository *repo)
 {
 	int hit = 0;
 	enum interesting match = entry_not_interesting;
@@ -752,8 +535,8 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 	int old_baselen = base->len;
 	struct strbuf name = STRBUF_INIT;
 	int name_base_len = 0;
-	if (super_prefix) {
-		strbuf_addstr(&name, super_prefix);
+	if (repo->submodule_prefix) {
+		strbuf_addstr(&name, repo->submodule_prefix);
 		name_base_len = name.len;
 	}
 
@@ -791,11 +574,11 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 			strbuf_addch(base, '/');
 			init_tree_desc(&sub, data, size);
 			hit |= grep_tree(opt, pathspec, &sub, base, tn_len,
-					 check_attr);
+					 check_attr, repo);
 			free(data);
 		} else if (recurse_submodules && S_ISGITLINK(entry.mode)) {
-			hit |= grep_submodule(opt, entry.oid, base->buf,
-					      base->buf + tn_len);
+			hit |= grep_submodule(opt, repo, pathspec, entry.oid,
+					      base->buf, base->buf + tn_len);
 		}
 
 		strbuf_setlen(base, old_baselen);
@@ -809,7 +592,8 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 }
 
 static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
-		       struct object *obj, const char *name, const char *path)
+		       struct object *obj, const char *name, const char *path,
+		       struct repository *repo)
 {
 	if (obj->type == OBJ_BLOB)
 		return grep_oid(opt, &obj->oid, name, 0, path);
@@ -828,10 +612,6 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		if (!data)
 			die(_("unable to read tree (%s)"), oid_to_hex(&obj->oid));
 
-		/* Use parent's name as base when recursing submodules */
-		if (recurse_submodules && parent_basename)
-			name = parent_basename;
-
 		len = name ? strlen(name) : 0;
 		strbuf_init(&base, PATH_MAX + len + 1);
 		if (len) {
@@ -840,7 +620,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		}
 		init_tree_desc(&tree, data, size);
 		hit = grep_tree(opt, pathspec, &tree, &base, base.len,
-				obj->type == OBJ_COMMIT);
+				obj->type == OBJ_COMMIT, repo);
 		strbuf_release(&base);
 		free(data);
 		return hit;
@@ -849,6 +629,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 }
 
 static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
+			struct repository *repo,
 			const struct object_array *list)
 {
 	unsigned int i;
@@ -864,7 +645,8 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 			submodule_free();
 			gitmodules_config_sha1(real_obj->oid.hash);
 		}
-		if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].path)) {
+		if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].path,
+				repo)) {
 			hit = 1;
 			if (opt->status_only)
 				break;
@@ -1005,9 +787,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			    N_("ignore files specified via '.gitignore'"), 1),
 		OPT_BOOL(0, "recurse-submodules", &recurse_submodules,
 			 N_("recursively search in each submodule")),
-		OPT_STRING(0, "parent-basename", &parent_basename,
-			   N_("basename"),
-			   N_("prepend parent project's basename to output")),
 		OPT_GROUP(""),
 		OPT_BOOL('v', "invert-match", &opt.invert,
 			N_("show non-matching lines")),
@@ -1112,7 +891,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	init_grep_defaults();
 	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, prefix);
-	super_prefix = get_super_prefix();
 
 	/*
 	 * If there is no -- then the paths must exist in the working
@@ -1272,9 +1050,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 
 	if (recurse_submodules) {
 		gitmodules_config();
-		compile_submodule_options(&opt, argv + i, cached, untracked,
-					  opt_exclude, use_index,
-					  pattern_type_arg);
 	}
 
 	if (show_in_pager && (cached || list.nr))
@@ -1318,11 +1093,12 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		if (!cached)
 			setup_work_tree();
 
-		hit = grep_cache(&opt, &pathspec, cached);
+		hit = grep_cache(&opt, the_repository, &pathspec, cached);
 	} else {
 		if (cached)
 			die(_("both --cached and trees are given."));
-		hit = grep_objects(&opt, &pathspec, &list);
+
+		hit = grep_objects(&opt, &pathspec, the_repository, &list);
 	}
 
 	if (num_threads)
diff --git a/cache.h b/cache.h
index d59f767e2..c221434b2 100644
--- a/cache.h
+++ b/cache.h
@@ -417,7 +417,6 @@ static inline enum object_type object_type(unsigned int mode)
 #define GIT_WORK_TREE_ENVIRONMENT "GIT_WORK_TREE"
 #define GIT_PREFIX_ENVIRONMENT "GIT_PREFIX"
 #define GIT_SUPER_PREFIX_ENVIRONMENT "GIT_INTERNAL_SUPER_PREFIX"
-#define GIT_TOPLEVEL_PREFIX_ENVIRONMENT "GIT_INTERNAL_TOPLEVEL_PREFIX"
 #define DEFAULT_GIT_DIR_ENVIRONMENT ".git"
 #define DB_ENVIRONMENT "GIT_OBJECT_DIRECTORY"
 #define INDEX_ENVIRONMENT "GIT_INDEX_FILE"
diff --git a/git.c b/git.c
index 489aab4d8..9dd9aead6 100644
--- a/git.c
+++ b/git.c
@@ -392,7 +392,7 @@ static struct cmd_struct commands[] = {
 	{ "fsck-objects", cmd_fsck, RUN_SETUP },
 	{ "gc", cmd_gc, RUN_SETUP },
 	{ "get-tar-commit-id", cmd_get_tar_commit_id },
-	{ "grep", cmd_grep, RUN_SETUP_GENTLY | SUPPORT_SUPER_PREFIX },
+	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY },
diff --git a/grep.c b/grep.c
index 2efec0e18..45acd333b 100644
--- a/grep.c
+++ b/grep.c
@@ -1927,16 +1927,6 @@ void grep_source_init(struct grep_source *gs, enum grep_source_type type,
 	case GREP_SOURCE_FILE:
 		gs->identifier = xstrdup(identifier);
 		break;
-	case GREP_SOURCE_SUBMODULE:
-		if (!identifier) {
-			gs->identifier = NULL;
-			break;
-		}
-		/*
-		 * FALL THROUGH
-		 * If the identifier is non-NULL (in the submodule case) it
-		 * will be a SHA1 that needs to be copied.
-		 */
 	case GREP_SOURCE_OID:
 		gs->identifier = oiddup(identifier);
 		break;
@@ -1959,7 +1949,6 @@ void grep_source_clear_data(struct grep_source *gs)
 	switch (gs->type) {
 	case GREP_SOURCE_FILE:
 	case GREP_SOURCE_OID:
-	case GREP_SOURCE_SUBMODULE:
 		FREE_AND_NULL(gs->buf);
 		gs->size = 0;
 		break;
@@ -2030,8 +2019,6 @@ static int grep_source_load(struct grep_source *gs)
 		return grep_source_load_oid(gs);
 	case GREP_SOURCE_BUF:
 		return gs->buf ? 0 : -1;
-	case GREP_SOURCE_SUBMODULE:
-		break;
 	}
 	die("BUG: invalid grep_source type to load");
 }
diff --git a/grep.h b/grep.h
index 0c091e510..52aecfab6 100644
--- a/grep.h
+++ b/grep.h
@@ -193,7 +193,6 @@ struct grep_source {
 		GREP_SOURCE_OID,
 		GREP_SOURCE_FILE,
 		GREP_SOURCE_BUF,
-		GREP_SOURCE_SUBMODULE,
 	} type;
 	void *identifier;
 
diff --git a/setup.c b/setup.c
index 860507e1f..23950173f 100644
--- a/setup.c
+++ b/setup.c
@@ -1027,7 +1027,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
 {
 	static struct strbuf cwd = STRBUF_INIT;
 	struct strbuf dir = STRBUF_INIT, gitdir = STRBUF_INIT;
-	const char *prefix, *env_prefix;
+	const char *prefix;
 
 	/*
 	 * We may have read an incomplete configuration before
@@ -1085,16 +1085,6 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		die("BUG: unhandled setup_git_directory_1() result");
 	}
 
-	/*
-	 * NEEDSWORK: This was a hack in order to get ls-files and grep to have
-	 * properly formated output when recursing submodules.  Once ls-files
-	 * and grep have been changed to perform this recursing in-process this
-	 * needs to be removed.
-	 */
-	env_prefix = getenv(GIT_TOPLEVEL_PREFIX_ENVIRONMENT);
-	if (env_prefix)
-		prefix = env_prefix;
-
 	if (prefix)
 		setenv(GIT_PREFIX_ENVIRONMENT, prefix, 1);
 	else
-- 
2.13.2.932.g7449e964c-goog
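
For readers following the grep_cache() hunk above: the patch reuses one
buffer for every index entry.  The submodule prefix is written once, and
the buffer is truncated back to name_base_len before each ce->name is
appended.  Below is a minimal standalone sketch of that pattern.  It does
not use git's strbuf API; 'struct namebuf' and the helpers around it are
hypothetical stand-ins invented for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for git's strbuf: fixed storage plus a length. */
struct namebuf {
	char buf[256];
	size_t len;
};

/* Truncate (or set) the logical length and keep the buffer NUL-terminated. */
static void namebuf_setlen(struct namebuf *nb, size_t len)
{
	assert(len < sizeof(nb->buf));
	nb->len = len;
	nb->buf[len] = '\0';
}

/* Append a string, truncating rather than overflowing the fixed storage. */
static void namebuf_addstr(struct namebuf *nb, const char *s)
{
	size_t n = strlen(s);

	if (nb->len + n >= sizeof(nb->buf))
		n = sizeof(nb->buf) - 1 - nb->len;
	memcpy(nb->buf + nb->len, s, n);
	namebuf_setlen(nb, nb->len + n);
}

/*
 * Keep the first base_len bytes (the submodule prefix) and replace
 * whatever followed them with the name of the current entry.
 */
static const char *prefixed_name(struct namebuf *nb, size_t base_len,
				 const char *entry)
{
	namebuf_setlen(nb, base_len);
	namebuf_addstr(nb, entry);
	return nb->buf;
}
```

A caller would append the prefix once (for example "sub/"), remember the
resulting length, and then call prefixed_name() per entry, which is why
the patch can hand name.buf to grep_oid() and grep_file() instead of the
bare ce->name.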


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH v3 00/10] Convert grep to recurse in-process
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
                       ` (9 preceding siblings ...)
  2017-07-18 19:05     ` [PATCH v3 10/10] grep: recurse in-process using 'struct repository' Brandon Williams
@ 2017-07-18 19:36     ` Junio C Hamano
  2017-07-18 20:06       ` Brandon Williams
  2017-08-02 19:49     ` [PATCH v4 " Brandon Williams
  11 siblings, 1 reply; 68+ messages in thread
From: Junio C Hamano @ 2017-07-18 19:36 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, jrnieder

Brandon Williams <bmwill@google.com> writes:

> Changes in v3:
>  * Fixes a bug with repo_read_gitmodules() where it was possible to
>    segfault when a repository didn't have a worktree.  
>  * In order to fix the above bug repo_read_gitmodules() and gitmodules_config()
>    were merged so that there won't be any duplicate logic.  In order to merge
>    these functions the parsing of submodule.fetchjobs and
>    fetch.recursesubmodules were removed from the submodule-config parsing logic
>    and instead moved into fetch and update-clone.  This also makes it easier to
>    ensure that no additional non-submodule specific configuration like this will
>    be added to .gitmodules in the future.

Sounds good.  

Has this been rebased and if so on top of what?  It seems that I am
getting "am -3" conflicts at around 05/10---I think I can cope with,
but it is one unnecessary source of potential bugs, so...

> Brandon Williams (10):
>   repo_read_index: don't discard the index
>   repository: have the_repository use the_index
>   cache.h: add GITMODULES_FILE macro
>   config: add config_from_gitmodules
>   submodule: remove submodule.fetchjobs from submodule-config parsing
>   submodule: remove fetch.recursesubmodules from submodule-config
>     parsing
>   submodule: check for unstaged .gitmodules outside of config parsing
>   submodule: check for unmerged .gitmodules outside of config parsing
>   submodule: merge repo_read_gitmodules and gitmodules_config
>   grep: recurse in-process using 'struct repository'
>
>  Documentation/git-grep.txt  |   7 -
>  builtin/fetch.c             |  26 ++-
>  builtin/grep.c              | 396 ++++++++++----------------------------------
>  builtin/mv.c                |   2 +-
>  builtin/rm.c                |   2 +-
>  builtin/submodule--helper.c |  17 +-
>  cache.h                     |   2 +-
>  config.c                    |  17 ++
>  config.h                    |  10 ++
>  git.c                       |   2 +-
>  grep.c                      |  13 --
>  grep.h                      |   1 -
>  repository.c                |   6 +-
>  repository.h                |   8 +
>  setup.c                     |  12 +-
>  submodule-config.c          |   8 +
>  submodule-config.h          |   1 +
>  submodule.c                 | 147 +++++++---------
>  submodule.h                 |   6 +-
>  19 files changed, 240 insertions(+), 443 deletions(-)

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v3 00/10] Convert grep to recurse in-process
  2017-07-18 19:36     ` [PATCH v3 00/10] Convert grep to recurse in-process Junio C Hamano
@ 2017-07-18 20:06       ` Brandon Williams
  0 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-07-18 20:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, sbeller, jrnieder

On 07/18, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > Changes in v3:
> >  * Fixes a bug with repo_read_gitmodules() where it was possible to
> >    segfault when a repository didn't have a worktree.  
> >  * In order to fix the above bug repo_read_gitmodules() and gitmodules_config()
> >    were merged so that there won't be any duplicate logic.  In order to merge
> >    these functions the parsing of submodule.fetchjobs and
> >    fetch.recursesubmodules were removed from the submodule-config parsing logic
> >    and instead moved into fetch and update-clone.  This also makes it easier to
> >    ensure that no additional non-submodule specific configuration like this will
> >    be added to .gitmodules in the future.
> 
> Sounds good.  
> 
> Has this been rebased and if so on top of what?  It seems that I am
> getting "am -3" conflicts at around 05/10---I think I can cope with,
> but it is one unnecessary source of potential bugs, so...

Oh sorry I also forgot to mention that I rebased it on top of current
master.  The changes to remove the fetch.recursesubmodules and
submodule.fetchjobs from the config parsing had some weird conflicts
with a series from Stefan that recently hit master.

> 
> > Brandon Williams (10):
> >   repo_read_index: don't discard the index
> >   repository: have the_repository use the_index
> >   cache.h: add GITMODULES_FILE macro
> >   config: add config_from_gitmodules
> >   submodule: remove submodule.fetchjobs from submodule-config parsing
> >   submodule: remove fetch.recursesubmodules from submodule-config
> >     parsing
> >   submodule: check for unstaged .gitmodules outside of config parsing
> >   submodule: check for unmerged .gitmodules outside of config parsing
> >   submodule: merge repo_read_gitmodules and gitmodules_config
> >   grep: recurse in-process using 'struct repository'
> >
> >  Documentation/git-grep.txt  |   7 -
> >  builtin/fetch.c             |  26 ++-
> >  builtin/grep.c              | 396 ++++++++++----------------------------------
> >  builtin/mv.c                |   2 +-
> >  builtin/rm.c                |   2 +-
> >  builtin/submodule--helper.c |  17 +-
> >  cache.h                     |   2 +-
> >  config.c                    |  17 ++
> >  config.h                    |  10 ++
> >  git.c                       |   2 +-
> >  grep.c                      |  13 --
> >  grep.h                      |   1 -
> >  repository.c                |   6 +-
> >  repository.h                |   8 +
> >  setup.c                     |  12 +-
> >  submodule-config.c          |   8 +
> >  submodule-config.h          |   1 +
> >  submodule.c                 | 147 +++++++---------
> >  submodule.h                 |   6 +-
> >  19 files changed, 240 insertions(+), 443 deletions(-)

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH 2/3] setup: have the_repository use the_index
  2017-07-12 21:40           ` Junio C Hamano
@ 2017-07-18 21:34             ` Junio C Hamano
  0 siblings, 0 replies; 68+ messages in thread
From: Junio C Hamano @ 2017-07-18 21:34 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Brandon Williams, git, sbeller

Junio C Hamano <gitster@pobox.com> writes:

> Jonathan Nieder <jrnieder@gmail.com> writes:
>
>>
>> All that said, I don't have a strong opinion on this.  Both the 1-word
>> approach (a pointer) and 24-word approach (embedding) are tolerable
>> and there are reasons to prefer each.
>
> I do not care too much about 24-word wastage.  If this were not "a
> pointer pretending to be embedded object", the fix in 1/3 wouldn't
> have been necessary.  I am worried about this being an invitation
> for such unnecessary bugs.

Another thing I noticed that you already pointed out was this bit in
your review message:

> I wonder if this can be done sooner.  For example, does the following
> work?  This way, 'the_repository->index == &the_index' would be an
> invariant that always holds, even in the early setup stage before
> setup_git_directory_gently has run completely.
> 
> Thanks,
> Jonathan
> 
> diff --git i/repository.c w/repository.c
> index edca907404..bdc1f93282 100644
> --- i/repository.c
> +++ w/repository.c
> @@ -4,7 +4,7 @@
>  #include "submodule-config.h"
>  
>  /* The main repository */
> -static struct repository the_repo;
> +static struct repository the_repo = { .index = &the_index };
>  struct repository *the_repository = &the_repo;
>  
>  static char *git_path_from_env(const char *envvar, const char *git_dir,

With a pointer that can point at a random instance of index_state,
the current "struct repository" allows two or more instances of it
to share the same index_state.  I do not think that is a designed
or desirable "feature"; it is an invitation for mistakes.

Embedding the real instance in it would solve that, too.

So, after saying "I am not (yet) telling you to fix the design" and
then hearing what a potential advantage could be (and none of that
was a convincing one), I am inclined to say that this eventually
needs to be fixed, preferably before too much code starts relying
on it and making it more work to fix it later.
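
To make the sharing concern concrete, here is a small standalone sketch
of the two layouts being compared.  None of these types are git's:
'struct index_model', 'struct repo_with_pointer' and 'struct
repo_with_embedded' are hypothetical reductions of 'struct index_state'
and 'struct repository', used only to show the aliasing.

```c
#include <assert.h>

/* Hypothetical one-field reduction of an index. */
struct index_model {
	int cache_nr;
};

/* Layout discussed as the "1-word approach": the repo points at an index. */
struct repo_with_pointer {
	struct index_model *index;
};

/* Layout discussed as embedding: the repo owns its index by value. */
struct repo_with_embedded {
	struct index_model index;
};

/* Two pointer-style repos can end up sharing one index instance. */
static int pointer_repos_alias(void)
{
	struct index_model shared = { 0 };
	struct repo_with_pointer a = { &shared };
	struct repo_with_pointer b = { &shared };

	assert(a.index == b.index);
	a.index->cache_nr = 3;
	return b.index->cache_nr;	/* b observes the write made through a */
}

/* Embedded indexes are distinct objects, so a write to one stays local. */
static int embedded_repos_alias(void)
{
	struct repo_with_embedded a = { { 0 } };
	struct repo_with_embedded b = { { 0 } };

	a.index.cache_nr = 3;
	return b.index.cache_nr;	/* still 0 */
}
```

With the pointer layout nothing in the type prevents the aliasing shown
in pointer_repos_alias(); with the embedded layout it cannot be
expressed at all, which is the property argued for above.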

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH] convert any hard coded .gitmodules file string to the MACRO
  2017-07-18 19:05     ` [PATCH v3 03/10] cache.h: add GITMODULES_FILE macro Brandon Williams
@ 2017-07-31 23:11       ` Stefan Beller
  2017-08-01 13:14         ` Jeff Hostetler
  0 siblings, 1 reply; 68+ messages in thread
From: Stefan Beller @ 2017-07-31 23:11 UTC (permalink / raw)
  To: bmwill; +Cc: git, gitster, jrnieder, sbeller

I used these commands:
  $ cat sem.cocci
  @@
  @@
  - ".gitmodules"
  + GITMODULES_FILE

  $ spatch --in-place --sp-file sem.cocci builtin/*.c *.c *.h

Feel free to regenerate or squash it in or have it as a separate commit.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 submodule.c    | 18 +++++++++---------
 unpack-trees.c |  2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/submodule.c b/submodule.c
index 37f4a92872..b75d02ba7b 100644
--- a/submodule.c
+++ b/submodule.c
@@ -63,7 +63,7 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
 	struct strbuf entry = STRBUF_INIT;
 	const struct submodule *submodule;
 
-	if (!file_exists(".gitmodules")) /* Do nothing without .gitmodules */
+	if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
 		return -1;
 
 	if (gitmodules_is_unmerged)
@@ -77,7 +77,7 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
 	strbuf_addstr(&entry, "submodule.");
 	strbuf_addstr(&entry, submodule->name);
 	strbuf_addstr(&entry, ".path");
-	if (git_config_set_in_file_gently(".gitmodules", entry.buf, newpath) < 0) {
+	if (git_config_set_in_file_gently(GITMODULES_FILE, entry.buf, newpath) < 0) {
 		/* Maybe the user already did that, don't error out here */
 		warning(_("Could not update .gitmodules entry %s"), entry.buf);
 		strbuf_release(&entry);
@@ -97,7 +97,7 @@ int remove_path_from_gitmodules(const char *path)
 	struct strbuf sect = STRBUF_INIT;
 	const struct submodule *submodule;
 
-	if (!file_exists(".gitmodules")) /* Do nothing without .gitmodules */
+	if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
 		return -1;
 
 	if (gitmodules_is_unmerged)
@@ -110,7 +110,7 @@ int remove_path_from_gitmodules(const char *path)
 	}
 	strbuf_addstr(&sect, "submodule.");
 	strbuf_addstr(&sect, submodule->name);
-	if (git_config_rename_section_in_file(".gitmodules", sect.buf, NULL) < 0) {
+	if (git_config_rename_section_in_file(GITMODULES_FILE, sect.buf, NULL) < 0) {
 		/* Maybe the user already did that, don't error out here */
 		warning(_("Could not remove .gitmodules entry for %s"), path);
 		strbuf_release(&sect);
@@ -122,7 +122,7 @@ int remove_path_from_gitmodules(const char *path)
 
 void stage_updated_gitmodules(void)
 {
-	if (add_file_to_cache(".gitmodules", 0))
+	if (add_file_to_cache(GITMODULES_FILE, 0))
 		die(_("staging updated .gitmodules failed"));
 }
 
@@ -233,18 +233,18 @@ void gitmodules_config(void)
 		strbuf_addstr(&gitmodules_path, "/.gitmodules");
 		if (read_cache() < 0)
 			die("index file corrupt");
-		pos = cache_name_pos(".gitmodules", 11);
+		pos = cache_name_pos(GITMODULES_FILE, 11);
 		if (pos < 0) { /* .gitmodules not found or isn't merged */
 			pos = -1 - pos;
 			if (active_nr > pos) {  /* there is a .gitmodules */
 				const struct cache_entry *ce = active_cache[pos];
 				if (ce_namelen(ce) == 11 &&
-				    !memcmp(ce->name, ".gitmodules", 11))
+				    !memcmp(ce->name, GITMODULES_FILE, 11))
 					gitmodules_is_unmerged = 1;
 			}
 		} else if (pos < active_nr) {
 			struct stat st;
-			if (lstat(".gitmodules", &st) == 0 &&
+			if (lstat(GITMODULES_FILE, &st) == 0 &&
 			    ce_match_stat(active_cache[pos], &st, 0) & DATA_CHANGED)
 				gitmodules_is_modified = 1;
 		}
@@ -264,7 +264,7 @@ static int gitmodules_cb(const char *var, const char *value, void *data)
 
 void repo_read_gitmodules(struct repository *repo)
 {
-	char *gitmodules_path = repo_worktree_path(repo, ".gitmodules");
+	char *gitmodules_path = repo_worktree_path(repo, GITMODULES_FILE);
 
 	git_config_from_file(gitmodules_cb, gitmodules_path, repo);
 	free(gitmodules_path);
diff --git a/unpack-trees.c b/unpack-trees.c
index dd535bc849..05335fe5bf 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -286,7 +286,7 @@ static void reload_gitmodules_file(struct index_state *index,
 	for (i = 0; i < index->cache_nr; i++) {
 		struct cache_entry *ce = index->cache[i];
 		if (ce->ce_flags & CE_UPDATE) {
-			int r = strcmp(ce->name, ".gitmodules");
+			int r = strcmp(ce->name, GITMODULES_FILE);
 			if (r < 0)
 				continue;
 			else if (r == 0) {
-- 
2.14.0.rc0.3.g6c2e499285
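
One thing the mechanical substitution leaves behind is the literal
length 11 next to the macro, as in cache_name_pos(GITMODULES_FILE, 11)
and the memcmp() call.  A possible follow-up, sketched below as a
standalone example, derives the length from the macro instead.  The
macro value is assumed to be ".gitmodules" (which is what the
substitution implies), and the two helper functions are hypothetical,
not part of git.

```c
#include <assert.h>
#include <string.h>

/* Assumed definition; the series adds this macro to cache.h. */
#define GITMODULES_FILE ".gitmodules"

/* Length derived from the macro, so it cannot drift from the name. */
static size_t gitmodules_name_len(void)
{
	return strlen(GITMODULES_FILE);
}

/* Hypothetical helper: does a (name, length) pair denote .gitmodules? */
static int is_gitmodules_name(const char *name, size_t namelen)
{
	assert(name != NULL);
	return namelen == strlen(GITMODULES_FILE) &&
	       !memcmp(name, GITMODULES_FILE, namelen);
}
```

Since the argument is a string literal, compilers typically fold
strlen(GITMODULES_FILE) to a constant, so this should cost nothing
compared with the hard-coded 11.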


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* Re: [PATCH v3 07/10] submodule: check for unstaged .gitmodules outside of config parsing
  2017-07-18 19:05     ` [PATCH v3 07/10] submodule: check for unstaged .gitmodules outside of config parsing Brandon Williams
@ 2017-07-31 23:41       ` Stefan Beller
  2017-08-02 17:41         ` Brandon Williams
  0 siblings, 1 reply; 68+ messages in thread
From: Stefan Beller @ 2017-07-31 23:41 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git@vger.kernel.org, Jonathan Nieder, Junio C Hamano

On Tue, Jul 18, 2017 at 12:05 PM, Brandon Williams <bmwill@google.com> wrote:
> Teach 'is_staging_gitmodules_ok()' to be able to determine if the
> '.gitmodules' file has unstaged changes based on the passed in index
> instead of relying on a global varible which is set during the

variable

> submodule-config parsing.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  builtin/mv.c |  2 +-
>  builtin/rm.c |  2 +-
>  submodule.c  | 32 +++++++++++++++++---------------
>  submodule.h  |  2 +-
>  4 files changed, 20 insertions(+), 18 deletions(-)
>
> diff --git a/builtin/mv.c b/builtin/mv.c
> index dcf6736b5..94fbaaa5d 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -81,7 +81,7 @@ static void prepare_move_submodule(const char *src, int first,
>         struct strbuf submodule_dotgit = STRBUF_INIT;
>         if (!S_ISGITLINK(active_cache[first]->ce_mode))
>                 die(_("Directory %s is in index and no submodule?"), src);
> -       if (!is_staging_gitmodules_ok())
> +       if (!is_staging_gitmodules_ok(&the_index))
>                 die(_("Please stage your changes to .gitmodules or stash them to proceed"));
>         strbuf_addf(&submodule_dotgit, "%s/.git", src);
>         *submodule_gitfile = read_gitfile(submodule_dotgit.buf);
> diff --git a/builtin/rm.c b/builtin/rm.c
> index 52826d137..4057e73fa 100644
> --- a/builtin/rm.c
> +++ b/builtin/rm.c
> @@ -286,7 +286,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
>                 list.entry[list.nr].name = xstrdup(ce->name);
>                 list.entry[list.nr].is_submodule = S_ISGITLINK(ce->ce_mode);
>                 if (list.entry[list.nr++].is_submodule &&
> -                   !is_staging_gitmodules_ok())
> +                   !is_staging_gitmodules_ok(&the_index))
>                         die (_("Please stage your changes to .gitmodules or stash them to proceed"));
>         }
>
> diff --git a/submodule.c b/submodule.c
> index b1965290f..46ec04d7c 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -37,18 +37,25 @@ static struct oid_array ref_tips_after_fetch;
>  static int gitmodules_is_unmerged;
>
>  /*
> - * This flag is set if the .gitmodules file had unstaged modifications on
> - * startup. This must be checked before allowing modifications to the
> - * .gitmodules file with the intention to stage them later, because when
> - * continuing we would stage the modifications the user didn't stage herself
> - * too. That might change in a future version when we learn to stage the
> - * changes we do ourselves without staging any previous modifications.
> + * Check if the .gitmodules file has unstaged modifications.  This must be
> + * checked before allowing modifications to the .gitmodules file with the
> + * intention to stage them later, because when continuing we would stage the
> + * modifications the user didn't stage herself too. That might change in a
> + * future version when we learn to stage the changes we do ourselves without
> + * staging any previous modifications.
>   */
> -static int gitmodules_is_modified;
> -
> -int is_staging_gitmodules_ok(void)
> +int is_staging_gitmodules_ok(const struct index_state *istate)
>  {
> -       return !gitmodules_is_modified;
> +       int pos = index_name_pos(istate, GITMODULES_FILE, strlen(GITMODULES_FILE));
> +
> +       if ((pos >= 0) && (pos < istate->cache_nr)) {

Why do we need the second check (pos < istate->cache_nr) ?

I would have assumed the first one suffices,
it might read better if turned around:


    if (pos < 0)
        return 1;

    return (lstat(GITMODULES_FILE, &st) == 0 &&
        ce_match_stat(istate->cache[pos], &st, 0) & DATA_CHANGED);
  }

> @@ -231,11 +238,6 @@ void gitmodules_config(void)
>                                     !memcmp(ce->name, ".gitmodules", 11))
>                                         gitmodules_is_unmerged = 1;
>                         }
> -               } else if (pos < active_nr) {
> -                       struct stat st;
> -                       if (lstat(".gitmodules", &st) == 0 &&
> -                           ce_match_stat(active_cache[pos], &st, 0) & DATA_CHANGED)
> -                               gitmodules_is_modified = 1;
>                 }

So this is where the check "pos < active_nr" is coming from,
introduced in 5fee995244 (submodule.c: add .gitmodules staging
helper functions, 2013-07-30) as well as d4e98b581b (Submodules:
Don't parse .gitmodules when it contains, merge conflicts, 2011-05-14).

If I am reading the docs for cache_name_pos correctly, we would
not need to check for the index exceeding active_cache,
but checking for the index not being out of bounds seems
to be widespread.
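As a minimal sketch of the return convention being discussed (this is not
git's actual implementation): the hypothetical name_pos() below searches a
plain sorted array of names, ignoring merge stages, and returns either a
position >= 0 or -1 - <insertion position>.  Under that convention a
non-negative result is always a valid array index, so a separate upper-bound
check would be redundant.  insert_pos() is likewise a hypothetical helper that
undoes the encoding, the way "pos = -1 - pos" does in gitmodules_config().

```c
#include <string.h>

/*
 * Hypothetical stand-in for index_name_pos(): binary search over a sorted
 * array of names.  Returns the position (>= 0) when the name is found,
 * otherwise -1 - <position where the name would be inserted>.
 */
static int name_pos(const char **names, int nr, const char *name)
{
	int first = 0, last = nr;

	while (first < last) {
		int mid = first + (last - first) / 2;
		int cmp = strcmp(name, names[mid]);

		if (!cmp)
			return mid;	/* found: always < nr */
		if (cmp < 0)
			last = mid;
		else
			first = mid + 1;
	}
	return -1 - first;	/* not found: always negative */
}

/* Decode a "not found" result back into the insertion position. */
static int insert_pos(int pos)
{
	return -1 - pos;
}
```

Only the decoded insertion position of a miss can equal the entry count,
which is why the "active_nr > pos" test matters in the not-found branch but
not in the found branch.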

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] convert any hard coded .gitmodules file string to the MACRO
  2017-07-31 23:11       ` [PATCH] convert any hard coded .gitmodules file string to the MACRO Stefan Beller
@ 2017-08-01 13:14         ` Jeff Hostetler
  2017-08-01 17:35           ` Stefan Beller
  0 siblings, 1 reply; 68+ messages in thread
From: Jeff Hostetler @ 2017-08-01 13:14 UTC (permalink / raw)
  To: Stefan Beller, bmwill; +Cc: git, gitster, jrnieder



On 7/31/2017 7:11 PM, Stefan Beller wrote:
> I used these commands:
>    $ cat sem.cocci
>    @@
>    @@
>    - ".gitmodules"
>    + GITMODULES_FILE
> 
>    $ spatch --in-place --sp-file sem.cocci builtin/*.c *.c *.h
> 
> Feel free to regenerate or squash it in or have it as a separate commit.
> 
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>   submodule.c    | 18 +++++++++---------
>   unpack-trees.c |  2 +-
>   2 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/submodule.c b/submodule.c
> index 37f4a92872..b75d02ba7b 100644
> --- a/submodule.c
> +++ b/submodule.c
>   
> @@ -233,18 +233,18 @@ void gitmodules_config(void)
>   		strbuf_addstr(&gitmodules_path, "/.gitmodules");

Did you mean to also change "/.gitmodules" ??

>   		if (read_cache() < 0)
>   			die("index file corrupt");
> -		pos = cache_name_pos(".gitmodules", 11);
> +		pos = cache_name_pos(GITMODULES_FILE, 11);
>   		if (pos < 0) { /* .gitmodules not found or isn't merged */
>   			pos = -1 - pos;
>   			if (active_nr > pos) {  /* there is a .gitmodules */

It might also be nice to change the literals in the comments to
use the macro.

Jeff


^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] convert any hard coded .gitmodules file string to the MACRO
  2017-08-01 13:14         ` Jeff Hostetler
@ 2017-08-01 17:35           ` Stefan Beller
  2017-08-01 20:26             ` Junio C Hamano
  0 siblings, 1 reply; 68+ messages in thread
From: Stefan Beller @ 2017-08-01 17:35 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Brandon Williams, git@vger.kernel.org, Junio C Hamano,
	Jonathan Nieder

On Tue, Aug 1, 2017 at 6:14 AM, Jeff Hostetler <git@jeffhostetler.com> wrote:
>
>
> On 7/31/2017 7:11 PM, Stefan Beller wrote:
>>
>> I used these commands:
>>    $ cat sem.cocci
>>    @@
>>    @@
>>    - ".gitmodules"
>>    + GITMODULES_FILE
>>
>>    $ spatch --in-place --sp-file sem.cocci builtin/*.c *.c *.h
>>
>> Feel free to regenerate or squash it in or have it as a separate commit.
>>
>> Signed-off-by: Stefan Beller <sbeller@google.com>
>> ---
>>   submodule.c    | 18 +++++++++---------
>>   unpack-trees.c |  2 +-
>>   2 files changed, 10 insertions(+), 10 deletions(-)
>>
>> diff --git a/submodule.c b/submodule.c
>> index 37f4a92872..b75d02ba7b 100644
>> --- a/submodule.c
>> +++ b/submodule.c
>>   @@ -233,18 +233,18 @@ void gitmodules_config(void)
>>                 strbuf_addstr(&gitmodules_path, "/.gitmodules");
>
>
> Did you mean to also change "/.gitmodules" ??

Good point. We should pick that up as well. However, as we do not have
a macro for that, we'd need two calls to the strbuf API:

    strbuf_addch(&sb, '/');
    strbuf_addstr(&sb, GITMODULES_FILE);

>
>>                 if (read_cache() < 0)
>>                         die("index file corrupt");
>> -               pos = cache_name_pos(".gitmodules", 11);
>> +               pos = cache_name_pos(GITMODULES_FILE, 11);
>>                 if (pos < 0) { /* .gitmodules not found or isn't merged */
>>                         pos = -1 - pos;
>>                         if (active_nr > pos) {  /* there is a .gitmodules
>> */
>
>
> It might also be nice to change the literals in the comments to
> use the macro.

Yes, I wondered if sed would have been better for this job.

>
> Jeff
>

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] convert any hard coded .gitmodules file string to the MACRO
  2017-08-01 17:35           ` Stefan Beller
@ 2017-08-01 20:26             ` Junio C Hamano
  2017-08-02 17:26               ` Brandon Williams
  2017-08-02 17:46               ` Brandon Williams
  0 siblings, 2 replies; 68+ messages in thread
From: Junio C Hamano @ 2017-08-01 20:26 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Jeff Hostetler, Brandon Williams, git@vger.kernel.org,
	Jonathan Nieder

Stefan Beller <sbeller@google.com> writes:

>>>   @@ -233,18 +233,18 @@ void gitmodules_config(void)
>>>                 strbuf_addstr(&gitmodules_path, "/.gitmodules");
>>
>>
>> Did you mean to also change "/.gitmodules" ??
>
> Goog point. We should pick that up as well. However as we do not have
> a macro for that, we'd have to have 2 calls to strbuf API
>
>     strbuf_addch(&sb, '/');
>     strbuf_addstr(&sb, GITMODULES);

Ehh, doesn't string literal concatenation work here?  I.e. something
like:

    strbuf_addstr(&gitmodules_path, "/" GITMODULES_FILE);
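A small sketch of that concatenation, assuming GITMODULES_FILE is defined as
the string literal ".gitmodules"; gitmodules_suffix, gitmodules_suffix_len and
ends_with_gitmodules() are hypothetical names used only for illustration.
Adjacent string literals are joined by the compiler, so no second strbuf call
is needed:

```c
#include <string.h>

#define GITMODULES_FILE ".gitmodules"

/*
 * Adjacent string literals are merged at compile time, so this is the
 * single literal "/.gitmodules".
 */
static const char gitmodules_suffix[] = "/" GITMODULES_FILE;

/* Length without the trailing NUL that sizeof counts. */
static const size_t gitmodules_suffix_len = sizeof(gitmodules_suffix) - 1;

/* Return 1 if 'path' ends with "/.gitmodules", 0 otherwise. */
static int ends_with_gitmodules(const char *path)
{
	size_t len = strlen(path);

	return len >= gitmodules_suffix_len &&
	       !strcmp(path + len - gitmodules_suffix_len, gitmodules_suffix);
}
```

This only works because the macro expands to a string literal; it would not
work with a "const char *" variable.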


>>>                 if (pos < 0) { /* .gitmodules not found or isn't merged */
>>>                         pos = -1 - pos;
>>>                         if (active_nr > pos) {  /* there is a .gitmodules
>>> */
>>
>>
>> It might also be nice to change the literals in the comments to
>> use the macro.

The reason you want this patch is not like we want to make it easy
to rename the file to ".gitprojects" later, right?  The patch is
about avoiding misspelled string constant, like "/.gitmdoules",
without getting caught by the compiler, no?

Assuming that I am correctly guessing the intention, I think it is a
bad idea to rename these in the comments.



^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] convert any hard coded .gitmodules file string to the MACRO
  2017-08-01 20:26             ` Junio C Hamano
@ 2017-08-02 17:26               ` Brandon Williams
  2017-08-02 17:46               ` Brandon Williams
  1 sibling, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 17:26 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, Jeff Hostetler, git@vger.kernel.org,
	Jonathan Nieder

On 08/01, Junio C Hamano wrote:
> Stefan Beller <sbeller@google.com> writes:
> 
> >>>   @@ -233,18 +233,18 @@ void gitmodules_config(void)
> >>>                 strbuf_addstr(&gitmodules_path, "/.gitmodules");
> >>
> >>
> >> Did you mean to also change "/.gitmodules" ??
> >
> > Goog point. We should pick that up as well. However as we do not have
> > a macro for that, we'd have to have 2 calls to strbuf API
> >
> >     strbuf_addch(&sb, '/');
> >     strbuf_addstr(&sb, GITMODULES);
> 
> Ehh, doesn't string literal concatenation work here?  I.e. something
> like:
> 
>     strbuf_addstr(&gitmodules_path, "/" GITMODULES_FILE);
> 
> 
> >>>                 if (pos < 0) { /* .gitmodules not found or isn't merged */
> >>>                         pos = -1 - pos;
> >>>                         if (active_nr > pos) {  /* there is a .gitmodules
> >>> */
> >>
> >>
> >> It might also be nice to change the literals in the comments to
> >> use the macro.
> 
> The reason you want this patch is not like we want to make it easy
> to rename the file to ".gitprojects" later, right?  The patch is
> about avoiding misspelled string constant, like "/.gitmdoules",
> without getting caught by the compiler, no?

Yes, it was mostly about preventing mistakes and having the compiler
help you out a bit, so changing the comments isn't really needed.

> 
> Assuming that I am correctly guessing the intention, I think it is a
> bad idea to rename these in the comments.
> 
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v3 07/10] submodule: check for unstaged .gitmodules outside of config parsing
  2017-07-31 23:41       ` Stefan Beller
@ 2017-08-02 17:41         ` Brandon Williams
  2017-08-02 18:00           ` Brandon Williams
  0 siblings, 1 reply; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 17:41 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org, Jonathan Nieder, Junio C Hamano

On 07/31, Stefan Beller wrote:
> On Tue, Jul 18, 2017 at 12:05 PM, Brandon Williams <bmwill@google.com> wrote:
> > Teach 'is_staging_gitmodules_ok()' to be able to determine in the
> > '.gitmodules' file has unstaged changes based on the passed in index
> > instead of relying on a global varible which is set during the
> 
> variable
> 

Will change.

> > submodule-config parsing.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  builtin/mv.c |  2 +-
> >  builtin/rm.c |  2 +-
> >  submodule.c  | 32 +++++++++++++++++---------------
> >  submodule.h  |  2 +-
> >  4 files changed, 20 insertions(+), 18 deletions(-)
> >
> > diff --git a/builtin/mv.c b/builtin/mv.c
> > index dcf6736b5..94fbaaa5d 100644
> > --- a/builtin/mv.c
> > +++ b/builtin/mv.c
> > @@ -81,7 +81,7 @@ static void prepare_move_submodule(const char *src, int first,
> >         struct strbuf submodule_dotgit = STRBUF_INIT;
> >         if (!S_ISGITLINK(active_cache[first]->ce_mode))
> >                 die(_("Directory %s is in index and no submodule?"), src);
> > -       if (!is_staging_gitmodules_ok())
> > +       if (!is_staging_gitmodules_ok(&the_index))
> >                 die(_("Please stage your changes to .gitmodules or stash them to proceed"));
> >         strbuf_addf(&submodule_dotgit, "%s/.git", src);
> >         *submodule_gitfile = read_gitfile(submodule_dotgit.buf);
> > diff --git a/builtin/rm.c b/builtin/rm.c
> > index 52826d137..4057e73fa 100644
> > --- a/builtin/rm.c
> > +++ b/builtin/rm.c
> > @@ -286,7 +286,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
> >                 list.entry[list.nr].name = xstrdup(ce->name);
> >                 list.entry[list.nr].is_submodule = S_ISGITLINK(ce->ce_mode);
> >                 if (list.entry[list.nr++].is_submodule &&
> > -                   !is_staging_gitmodules_ok())
> > +                   !is_staging_gitmodules_ok(&the_index))
> >                         die (_("Please stage your changes to .gitmodules or stash them to proceed"));
> >         }
> >
> > diff --git a/submodule.c b/submodule.c
> > index b1965290f..46ec04d7c 100644
> > --- a/submodule.c
> > +++ b/submodule.c
> > @@ -37,18 +37,25 @@ static struct oid_array ref_tips_after_fetch;
> >  static int gitmodules_is_unmerged;
> >
> >  /*
> > - * This flag is set if the .gitmodules file had unstaged modifications on
> > - * startup. This must be checked before allowing modifications to the
> > - * .gitmodules file with the intention to stage them later, because when
> > - * continuing we would stage the modifications the user didn't stage herself
> > - * too. That might change in a future version when we learn to stage the
> > - * changes we do ourselves without staging any previous modifications.
> > + * Check if the .gitmodules file has unstaged modifications.  This must be
> > + * checked before allowing modifications to the .gitmodules file with the
> > + * intention to stage them later, because when continuing we would stage the
> > + * modifications the user didn't stage herself too. That might change in a
> > + * future version when we learn to stage the changes we do ourselves without
> > + * staging any previous modifications.
> >   */
> > -static int gitmodules_is_modified;
> > -
> > -int is_staging_gitmodules_ok(void)
> > +int is_staging_gitmodules_ok(const struct index_state *istate)
> >  {
> > -       return !gitmodules_is_modified;
> > +       int pos = index_name_pos(istate, GITMODULES_FILE, strlen(GITMODULES_FILE));
> > +
> > +       if ((pos >= 0) && (pos < istate->cache_nr)) {
> 
> Why do we need the second check (pos < istate->cache_nr) ?
> 
> I would have assumed the first one suffices,
> it might read better if turned around:
> 
> 
>     if (pos < 0)
>         return 1;
> 
>     return (lstat(GITMODULES_FILE, &st) == 0 &&
>         ce_match_stat(istate->cache[pos], &st, 0) & DATA_CHANGED);
>   }
> 
> > @@ -231,11 +238,6 @@ void gitmodules_config(void)
> >                                     !memcmp(ce->name, ".gitmodules", 11))
> >                                         gitmodules_is_unmerged = 1;
> >                         }
> > -               } else if (pos < active_nr) {
> > -                       struct stat st;
> > -                       if (lstat(".gitmodules", &st) == 0 &&
> > -                           ce_match_stat(active_cache[pos], &st, 0) & DATA_CHANGED)
> > -                               gitmodules_is_modified = 1;
> >                 }
> 
> So this is where the check "pos < active_nr" is coming from,
> introduced in 5fee995244 (submodule.c: add .gitmodules staging
> helper functions, 2013-07-30) as well as d4e98b581b (Submodules:
> Don't parse .gitmodules when it contains, merge conflicts, 2011-05-14).
> 
> If I am reading the docs for cache_name_pos correctly, we would
> not need to check for the index exceeding active_cache,
> but checking for the index not being out of bounds seems
> to be wide spread.

I can drop the pos < active_nr requirement.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH] convert any hard coded .gitmodules file string to the MACRO
  2017-08-01 20:26             ` Junio C Hamano
  2017-08-02 17:26               ` Brandon Williams
@ 2017-08-02 17:46               ` Brandon Williams
  1 sibling, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 17:46 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, Jeff Hostetler, git@vger.kernel.org,
	Jonathan Nieder

On 08/01, Junio C Hamano wrote:
> Stefan Beller <sbeller@google.com> writes:
> 
> >>>   @@ -233,18 +233,18 @@ void gitmodules_config(void)
> >>>                 strbuf_addstr(&gitmodules_path, "/.gitmodules");
> >>
> >>
> >> Did you mean to also change "/.gitmodules" ??
> >
> > Goog point. We should pick that up as well. However as we do not have
> > a macro for that, we'd have to have 2 calls to strbuf API
> >
> >     strbuf_addch(&sb, '/');
> >     strbuf_addstr(&sb, GITMODULES);
> 
> Ehh, doesn't string literal concatenation work here?  I.e. something
> like:
> 
>     strbuf_addstr(&gitmodules_path, "/" GITMODULES_FILE);

Also this doesn't really matter much since this line is removed later
on in the series, but I'll go with the string literal concatenation for
the intermediate state.

> 
> 
> >>>                 if (pos < 0) { /* .gitmodules not found or isn't merged */
> >>>                         pos = -1 - pos;
> >>>                         if (active_nr > pos) {  /* there is a .gitmodules
> >>> */
> >>
> >>
> >> It might also be nice to change the literals in the comments to
> >> use the macro.
> 
> The reason you want this patch is not like we want to make it easy
> to rename the file to ".gitprojects" later, right?  The patch is
> about avoiding misspelled string constant, like "/.gitmdoules",
> without getting caught by the compiler, no?
> 
> Assuming that I am correctly guessing the intention, I think it is a
> bad idea to rename these in the comments.
> 
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 68+ messages in thread

* Re: [PATCH v3 07/10] submodule: check for unstaged .gitmodules outside of config parsing
  2017-08-02 17:41         ` Brandon Williams
@ 2017-08-02 18:00           ` Brandon Williams
  0 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 18:00 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git@vger.kernel.org, Jonathan Nieder, Junio C Hamano

On 08/02, Brandon Williams wrote:
> On 07/31, Stefan Beller wrote:
> > 
> > So this is where the check "pos < active_nr" is coming from,
> > introduced in 5fee995244 (submodule.c: add .gitmodules staging
> > helper functions, 2013-07-30) as well as d4e98b581b (Submodules:
> > Don't parse .gitmodules when it contains, merge conflicts, 2011-05-14).
> > 
> > If I am reading the docs for cache_name_pos correctly, we would
> > not need to check for the index exceeding active_cache,
> > but checking for the index not being out of bounds seems
> > to be wide spread.
> 
> I can drop the pos < active_nr requirement.

From our conversation offline, I'll leave this in, just in case there is some
subtle reason why it exists.  It also makes this more of a 1:1 conversion.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v4 00/10] Convert grep to recurse in-process
  2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
                       ` (10 preceding siblings ...)
  2017-07-18 19:36     ` [PATCH v3 00/10] Convert grep to recurse in-process Junio C Hamano
@ 2017-08-02 19:49     ` Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 01/10] repo_read_index: don't discard the index Brandon Williams
                         ` (9 more replies)
  11 siblings, 10 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 19:49 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Changes in v4:
 * small typo fix in commit message.
 * convert all occurrences of '.gitmodules' to use the new macro.

Brandon Williams (10):
  repo_read_index: don't discard the index
  repository: have the_repository use the_index
  cache.h: add GITMODULES_FILE macro
  config: add config_from_gitmodules
  submodule: remove submodule.fetchjobs from submodule-config parsing
  submodule: remove fetch.recursesubmodules from submodule-config
    parsing
  submodule: check for unstaged .gitmodules outside of config parsing
  submodule: check for unmerged .gitmodules outside of config parsing
  submodule: merge repo_read_gitmodules and gitmodules_config
  grep: recurse in-process using 'struct repository'

 Documentation/git-grep.txt  |   7 -
 builtin/fetch.c             |  26 ++-
 builtin/grep.c              | 396 ++++++++++----------------------------------
 builtin/mv.c                |   2 +-
 builtin/rm.c                |   2 +-
 builtin/submodule--helper.c |  17 +-
 cache.h                     |   2 +-
 config.c                    |  17 ++
 config.h                    |  10 ++
 git.c                       |   2 +-
 grep.c                      |  13 --
 grep.h                      |   1 -
 repository.c                |   6 +-
 repository.h                |   8 +
 setup.c                     |  12 +-
 submodule-config.c          |   8 +
 submodule-config.h          |   1 +
 submodule.c                 | 157 ++++++++----------
 submodule.h                 |   6 +-
 unpack-trees.c              |   2 +-
 20 files changed, 246 insertions(+), 449 deletions(-)

-- 
2.14.0.rc1.383.gd1ce394fe2-goog


^ permalink raw reply	[flat|nested] 68+ messages in thread

* [PATCH v4 01/10] repo_read_index: don't discard the index
  2017-08-02 19:49     ` [PATCH v4 " Brandon Williams
@ 2017-08-02 19:49       ` Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 02/10] repository: have the_repository use the_index Brandon Williams
                         ` (8 subsequent siblings)
  9 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 19:49 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Have 'repo_read_index()' behave more like the other read_index family of
functions and don't discard the index if it has already been populated
and instead rely on the quick return of read_index_from which has:

  /* istate->initialized covers both .git/index and .git/sharedindex.xxx */
  if (istate->initialized)
    return istate->cache_nr;
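
To illustrate the quick return this relies on, here is a minimal sketch with
hypothetical names: 'struct toy_index' and toy_read_index() are stand-ins
invented for illustration, not git's 'struct index_state' or
read_index_from().  It assumes only that a populated index carries an
'initialized' flag, as the snippet above shows:

```c
/* Hypothetical, much-simplified stand-in for git's 'struct index_state'. */
struct toy_index {
	int initialized;
	int cache_nr;
	int loads;	/* how many times the "file" was actually parsed */
};

/*
 * Hypothetical stand-in for read_index_from(): parse the index only on the
 * first call; afterwards return the cached entry count.
 */
static int toy_read_index(struct toy_index *istate, int entries_on_disk)
{
	if (istate->initialized)
		return istate->cache_nr;

	istate->cache_nr = entries_on_disk;
	istate->loads++;
	istate->initialized = 1;
	return istate->cache_nr;
}
```

Calling toy_read_index() twice parses once; the second call just returns the
cached count, so an index that is already in core does not need to be
discarded and re-read.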

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 repository.c | 2 --
 repository.h | 8 ++++++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/repository.c b/repository.c
index edca90740..8e60af1d5 100644
--- a/repository.c
+++ b/repository.c
@@ -235,8 +235,6 @@ int repo_read_index(struct repository *repo)
 {
 	if (!repo->index)
 		repo->index = xcalloc(1, sizeof(*repo->index));
-	else
-		discard_index(repo->index);
 
 	return read_index_from(repo->index, repo->index_file);
 }
diff --git a/repository.h b/repository.h
index 417787f3e..7f5e24a0a 100644
--- a/repository.h
+++ b/repository.h
@@ -92,6 +92,14 @@ extern int repo_submodule_init(struct repository *submodule,
 			       const char *path);
 extern void repo_clear(struct repository *repo);
 
+/*
+ * Populates the repository's index from its index_file, an index struct will
+ * be allocated if needed.
+ *
+ * Return the number of index entries in the populated index or a value less
+ * than zero if an error occurred.  If the repository's index has already been
+ * populated then the number of entries will simply be returned.
+ */
 extern int repo_read_index(struct repository *repo);
 
 #endif /* REPOSITORY_H */
-- 
2.14.0.rc1.383.gd1ce394fe2-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v4 02/10] repository: have the_repository use the_index
  2017-08-02 19:49     ` [PATCH v4 " Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 01/10] repo_read_index: don't discard the index Brandon Williams
@ 2017-08-02 19:49       ` Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 03/10] cache.h: add GITMODULES_FILE macro Brandon Williams
                         ` (7 subsequent siblings)
  9 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 19:49 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Have the index state which is stored in 'the_repository' be a pointer to
the in-core index 'the_index'.  This makes it easier to begin
transitioning more parts of the code base to operate on a 'struct
repository'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 repository.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/repository.c b/repository.c
index 8e60af1d5..c0e0e0e7e 100644
--- a/repository.c
+++ b/repository.c
@@ -4,7 +4,9 @@
 #include "submodule-config.h"
 
 /* The main repository */
-static struct repository the_repo;
+static struct repository the_repo = {
+	NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, &the_index, 0, 0
+};
 struct repository *the_repository = &the_repo;
 
 static char *git_path_from_env(const char *envvar, const char *git_dir,
-- 
2.14.0.rc1.383.gd1ce394fe2-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v4 03/10] cache.h: add GITMODULES_FILE macro
  2017-08-02 19:49     ` [PATCH v4 " Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 01/10] repo_read_index: don't discard the index Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 02/10] repository: have the_repository use the_index Brandon Williams
@ 2017-08-02 19:49       ` Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 04/10] config: add config_from_gitmodules Brandon Williams
                         ` (6 subsequent siblings)
  9 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 19:49 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Add a macro to be used when specifying the '.gitmodules' file and
convert any existing hard coded '.gitmodules' file strings to use the
new macro.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---
 cache.h        |  1 +
 submodule.c    | 20 ++++++++++----------
 unpack-trees.c |  2 +-
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/cache.h b/cache.h
index 71fe09264..d59f767e2 100644
--- a/cache.h
+++ b/cache.h
@@ -433,6 +433,7 @@ static inline enum object_type object_type(unsigned int mode)
 #define GITATTRIBUTES_FILE ".gitattributes"
 #define INFOATTRIBUTES_FILE "info/attributes"
 #define ATTRIBUTE_MACRO_PREFIX "[attr]"
+#define GITMODULES_FILE ".gitmodules"
 #define GIT_NOTES_REF_ENVIRONMENT "GIT_NOTES_REF"
 #define GIT_NOTES_DEFAULT_REF "refs/notes/commits"
 #define GIT_NOTES_DISPLAY_REF_ENVIRONMENT "GIT_NOTES_DISPLAY_REF"
diff --git a/submodule.c b/submodule.c
index 6531c5d60..64ad5c12d 100644
--- a/submodule.c
+++ b/submodule.c
@@ -63,7 +63,7 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
 	struct strbuf entry = STRBUF_INIT;
 	const struct submodule *submodule;
 
-	if (!file_exists(".gitmodules")) /* Do nothing without .gitmodules */
+	if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
 		return -1;
 
 	if (gitmodules_is_unmerged)
@@ -77,7 +77,7 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
 	strbuf_addstr(&entry, "submodule.");
 	strbuf_addstr(&entry, submodule->name);
 	strbuf_addstr(&entry, ".path");
-	if (git_config_set_in_file_gently(".gitmodules", entry.buf, newpath) < 0) {
+	if (git_config_set_in_file_gently(GITMODULES_FILE, entry.buf, newpath) < 0) {
 		/* Maybe the user already did that, don't error out here */
 		warning(_("Could not update .gitmodules entry %s"), entry.buf);
 		strbuf_release(&entry);
@@ -97,7 +97,7 @@ int remove_path_from_gitmodules(const char *path)
 	struct strbuf sect = STRBUF_INIT;
 	const struct submodule *submodule;
 
-	if (!file_exists(".gitmodules")) /* Do nothing without .gitmodules */
+	if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
 		return -1;
 
 	if (gitmodules_is_unmerged)
@@ -110,7 +110,7 @@ int remove_path_from_gitmodules(const char *path)
 	}
 	strbuf_addstr(&sect, "submodule.");
 	strbuf_addstr(&sect, submodule->name);
-	if (git_config_rename_section_in_file(".gitmodules", sect.buf, NULL) < 0) {
+	if (git_config_rename_section_in_file(GITMODULES_FILE, sect.buf, NULL) < 0) {
 		/* Maybe the user already did that, don't error out here */
 		warning(_("Could not remove .gitmodules entry for %s"), path);
 		strbuf_release(&sect);
@@ -122,7 +122,7 @@ int remove_path_from_gitmodules(const char *path)
 
 void stage_updated_gitmodules(void)
 {
-	if (add_file_to_cache(".gitmodules", 0))
+	if (add_file_to_cache(GITMODULES_FILE, 0))
 		die(_("staging updated .gitmodules failed"));
 }
 
@@ -230,21 +230,21 @@ void gitmodules_config(void)
 		struct strbuf gitmodules_path = STRBUF_INIT;
 		int pos;
 		strbuf_addstr(&gitmodules_path, work_tree);
-		strbuf_addstr(&gitmodules_path, "/.gitmodules");
+		strbuf_addstr(&gitmodules_path, "/" GITMODULES_FILE);
 		if (read_cache() < 0)
 			die("index file corrupt");
-		pos = cache_name_pos(".gitmodules", 11);
+		pos = cache_name_pos(GITMODULES_FILE, 11);
 		if (pos < 0) { /* .gitmodules not found or isn't merged */
 			pos = -1 - pos;
 			if (active_nr > pos) {  /* there is a .gitmodules */
 				const struct cache_entry *ce = active_cache[pos];
 				if (ce_namelen(ce) == 11 &&
-				    !memcmp(ce->name, ".gitmodules", 11))
+				    !memcmp(ce->name, GITMODULES_FILE, 11))
 					gitmodules_is_unmerged = 1;
 			}
 		} else if (pos < active_nr) {
 			struct stat st;
-			if (lstat(".gitmodules", &st) == 0 &&
+			if (lstat(GITMODULES_FILE, &st) == 0 &&
 			    ce_match_stat(active_cache[pos], &st, 0) & DATA_CHANGED)
 				gitmodules_is_modified = 1;
 		}
@@ -264,7 +264,7 @@ static int gitmodules_cb(const char *var, const char *value, void *data)
 
 void repo_read_gitmodules(struct repository *repo)
 {
-	char *gitmodules_path = repo_worktree_path(repo, ".gitmodules");
+	char *gitmodules_path = repo_worktree_path(repo, GITMODULES_FILE);
 
 	git_config_from_file(gitmodules_cb, gitmodules_path, repo);
 	free(gitmodules_path);
diff --git a/unpack-trees.c b/unpack-trees.c
index dd535bc84..05335fe5b 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -286,7 +286,7 @@ static void reload_gitmodules_file(struct index_state *index,
 	for (i = 0; i < index->cache_nr; i++) {
 		struct cache_entry *ce = index->cache[i];
 		if (ce->ce_flags & CE_UPDATE) {
-			int r = strcmp(ce->name, ".gitmodules");
+			int r = strcmp(ce->name, GITMODULES_FILE);
 			if (r < 0)
 				continue;
 			else if (r == 0) {
-- 
2.14.0.rc1.383.gd1ce394fe2-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v4 04/10] config: add config_from_gitmodules
  2017-08-02 19:49     ` [PATCH v4 " Brandon Williams
                         ` (2 preceding siblings ...)
  2017-08-02 19:49       ` [PATCH v4 03/10] cache.h: add GITMODULES_FILE macro Brandon Williams
@ 2017-08-02 19:49       ` Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 05/10] submodule: remove submodule.fetchjobs from submodule-config parsing Brandon Williams
                         ` (5 subsequent siblings)
  9 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 19:49 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Add 'config_from_gitmodules()' function which can be used by 'fetch' and
'update_clone' in order to maintain backwards compatibility with
configuration being stored in '.gitmodules', since a future patch will
remove reading these values in the submodule-config.

This function should not be used anywhere other than in 'fetch' and
'update_clone'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 config.c | 17 +++++++++++++++++
 config.h | 10 ++++++++++
 2 files changed, 27 insertions(+)

diff --git a/config.c b/config.c
index 231f9a750..06645a325 100644
--- a/config.c
+++ b/config.c
@@ -2053,6 +2053,23 @@ int git_config_get_pathname(const char *key, const char **dest)
 	return repo_config_get_pathname(the_repository, key, dest);
 }
 
+/*
+ * Note: This function exists solely to maintain backward compatibility with
+ * 'fetch' and 'update_clone' storing configuration in '.gitmodules' and should
+ * NOT be used anywhere else.
+ *
+ * Runs the provided config function on the '.gitmodules' file found in the
+ * working directory.
+ */
+void config_from_gitmodules(config_fn_t fn, void *data)
+{
+	if (the_repository->worktree) {
+		char *file = repo_worktree_path(the_repository, GITMODULES_FILE);
+		git_config_from_file(fn, file, data);
+		free(file);
+	}
+}
+
 int git_config_get_expiry(const char *key, const char **output)
 {
 	int ret = git_config_get_string_const(key, output);
diff --git a/config.h b/config.h
index 0352da117..6998e6645 100644
--- a/config.h
+++ b/config.h
@@ -187,6 +187,16 @@ extern int repo_config_get_maybe_bool(struct repository *repo,
 extern int repo_config_get_pathname(struct repository *repo,
 				    const char *key, const char **dest);
 
+/*
+ * Note: This function exists solely to maintain backward compatibility with
+ * 'fetch' and 'update_clone' storing configuration in '.gitmodules' and should
+ * NOT be used anywhere else.
+ *
+ * Runs the provided config function on the '.gitmodules' file found in the
+ * working directory.
+ */
+extern void config_from_gitmodules(config_fn_t fn, void *data);
+
 extern int git_config_get_value(const char *key, const char **value);
 extern const struct string_list *git_config_get_value_multi(const char *key);
 extern void git_config_clear(void);
-- 
2.14.0.rc1.383.gd1ce394fe2-goog


^ permalink raw reply related	[flat|nested] 68+ messages in thread

* [PATCH v4 05/10] submodule: remove submodule.fetchjobs from submodule-config parsing
  2017-08-02 19:49     ` [PATCH v4 " Brandon Williams
                         ` (3 preceding siblings ...)
  2017-08-02 19:49       ` [PATCH v4 04/10] config: add config_from_gitmodules Brandon Williams
@ 2017-08-02 19:49       ` Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 06/10] submodule: remove fetch.recursesubmodules " Brandon Williams
                         ` (4 subsequent siblings)
  9 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 19:49 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

The '.gitmodules' file should only contain information pertinent to
configuring individual submodules (name to path mapping, URL where to
obtain the submodule, etc.) while other configuration like the number of
jobs to use when fetching submodules should be a part of the
repository's config.

Remove the 'submodule.fetchjobs' configuration option from the general
submodule-config parsing and instead rely on
'config_from_gitmodules()' to maintain backwards compatibility
with this config being placed in the '.gitmodules' file.
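
The resulting precedence can be sketched as follows: the value starts at
a default of 1, the '.gitmodules' file is consulted first, and the
repository's own config is read afterwards so that it wins.  This is a
minimal illustration, not the code from the patch; 'struct config_entry',
'parse_fetchjobs()', 'apply_source()' and 'resolve_fetchjobs()' are
hypothetical names, the parser returns -1 where the real helper calls
die(), and unit suffixes accepted by git's integer parsing are ignored.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one "key = value" config entry. */
struct config_entry {
	const char *key;
	const char *value;
};

/*
 * Parse a fetchjobs value.  Stores the result in *out and returns 0 on
 * success; returns -1 for a negative or non-numeric value.
 */
static int parse_fetchjobs(const char *value, int *out)
{
	char *end;
	long n;

	if (!value || !*value)
		return -1;
	n = strtol(value, &end, 10);
	if (*end != '\0' || n < 0 || n > INT_MAX)
		return -1;
	*out = (int)n;
	return 0;
}

/* Apply every "submodule.fetchjobs" entry of one config source in order. */
static int apply_source(const struct config_entry *entries, size_t nr,
			int *max_jobs)
{
	size_t i;

	assert(max_jobs != NULL);
	for (i = 0; i < nr; i++) {
		if (strcmp(entries[i].key, "submodule.fetchjobs"))
			continue;
		if (parse_fetchjobs(entries[i].value, max_jobs) < 0)
			return -1;
	}
	return 0;
}

/*
 * Read '.gitmodules' first and the repository config second, so the
 * repository config overrides '.gitmodules'.  Returns -1 on a bad value.
 */
int resolve_fetchjobs(const struct config_entry *gitmodules, size_t gitmodules_nr,
		      const struct config_entry *repo_config, size_t repo_nr)
{
	int max_jobs = 1; /* default when neither source sets the option */

	if (apply_source(gitmodules, gitmodules_nr, &max_jobs) < 0 ||
	    apply_source(repo_config, repo_nr, &max_jobs) < 0)
		return -1;
	return max_jobs;
}
```

Because the default is now a concrete 1 rather than a -1 sentinel, the
callers no longer need a "was it set?" check after parsing.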

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch.c             | 18 +++++++++++++++++-
 builtin/submodule--helper.c | 17 +++++++++++++----
 submodule-config.c          |  8 ++++++++
 submodule-config.h          |  1 +
 submodule.c                 | 16 +---------------
 submodule.h                 |  1 -
 6 files changed, 40 insertions(+), 21 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index c87e59f3b..ade092bf8 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -39,7 +39,7 @@ static int prune = -1; /* unspecified */
 static int all, append, dry_run, force, keep, multiple, update_head_ok, verbosity, deepen_relative;
 static int progress = -1;
 static int tags = TAGS_DEFAULT, unshallow, update_shallow, deepen;
-static int max_children = -1;
+static int max_children = 1;
 static enum transport_family family;
 static const char *depth;
 static const char *deepen_since;
@@ -68,9 +68,24 @@ static int git_fetch_config(const char *k, const char *v, void *cb)
 		recurse_submodules = r;
 	}
 
+	if (!strcmp(k, "submodule.fetchjobs")) {
+		max_children = parse_submodule_fetchjobs(k, v);
+		return 0;
+	}
+
 	return git_default_config(k, v, cb);
 }
 
+static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
+{
+	if (!strcmp(var, "submodule.fetchjobs")) {
+		max_children = parse_submodule_fetchjobs(var, value);
+		return 0;
+	}
+
+	return 0;
+}
+
 static int parse_refmap_arg(const struct option *opt, const char *arg, int unset)
 {
 	ALLOC_GROW(refmap_array, refmap_nr + 1, refmap_alloc);
@@ -1311,6 +1326,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	for (i = 1; i < argc; i++)
 		strbuf_addf(&default_rla, " %s", argv[i]);
 
+	config_from_gitmodules(gitmodules_fetch_config, NULL);
 	git_config(git_fetch_config, NULL);
 
 	argc = parse_options(argc, argv, prefix,
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 6abdad329..6d9600d4f 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -960,10 +960,19 @@ static int update_clone_task_finished(int result,
 	return 0;
 }
 
+static int gitmodules_update_clone_config(const char *var, const char *value,
+					  void *cb)
+{
+	int *max_jobs = cb;
+	if (!strcmp(var, "submodule.fetchjobs"))
+		*max_jobs = parse_submodule_fetchjobs(var, value);
+	return 0;
+}
+
 static int update_clone(int argc, const char **argv, const char *prefix)
 {
 	const char *update = NULL;
-	int max_jobs = -1;
+	int max_jobs = 1;
 	struct string_list_item *item;
 	struct pathspec pathspec;
 	struct submodule_update_clone suc = SUBMODULE_UPDATE_CLONE_INIT;
@@ -1000,6 +1009,9 @@ static int update_clone(int argc, const char **argv, const char *prefix)
 	};
 	suc.prefix = prefix;
 
+	config_from_gitmodules(gitmodules_update_clone_config, &max_jobs);
+	git_config(gitmodules_update_clone_config, &max_jobs);
+
 	argc = parse_options(argc, argv, prefix, module_update_clone_options,
 			     git_submodule_helper_usage, 0);
 
@@ -1017,9 +1029,6 @@ static int update_clone(int argc, const char **argv, const char *prefix)
 	gitmodules_config();
 	git_config(submodule_config, NULL);
 
-	if (max_jobs < 0)
-		max_jobs = parallel_submodules();
-
 	run_processes_parallel(max_jobs,
 			       update_clone_get_next_task,
 			       update_clone_start_failure,
diff --git a/submodule-config.c b/submodule-config.c
index 5fe2d0787..70400f553 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -248,6 +248,14 @@ static int parse_fetch_recurse(const char *opt, const char *arg,
 	}
 }
 
+int parse_submodule_fetchjobs(const char *var, const char *value)
+{
+	int fetchjobs = git_config_int(var, value);
+	if (fetchjobs < 0)
+		die(_("negative values not allowed for submodule.fetchjobs"));
+	return fetchjobs;
+}
+
 int parse_fetch_recurse_submodules_arg(const char *opt, const char *arg)
 {
 	return parse_fetch_recurse(opt, arg, 1);
diff --git a/submodule-config.h b/submodule-config.h
index 233bfcb7f..995d404f8 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -27,6 +27,7 @@ struct repository;
 
 extern void submodule_cache_free(struct submodule_cache *cache);
 
+extern int parse_submodule_fetchjobs(const char *var, const char *value);
 extern int parse_fetch_recurse_submodules_arg(const char *opt, const char *arg);
 struct option;
 extern int option_fetch_parse_recurse_submodules(const struct option *opt,
diff --git a/submodule.c b/submodule.c
index 64ad5c12d..aa4fb1eaa 100644
--- a/submodule.c
+++ b/submodule.c
@@ -22,7 +22,6 @@
 
 static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
 static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
-static int parallel_jobs = 1;
 static struct string_list changed_submodule_paths = STRING_LIST_INIT_DUP;
 static int initialized_fetch_ref_tips;
 static struct oid_array ref_tips_before_fetch;
@@ -159,12 +158,7 @@ void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt,
 /* For loading from the .gitmodules file. */
 static int git_modules_config(const char *var, const char *value, void *cb)
 {
-	if (!strcmp(var, "submodule.fetchjobs")) {
-		parallel_jobs = git_config_int(var, value);
-		if (parallel_jobs < 0)
-			die(_("negative values not allowed for submodule.fetchJobs"));
-		return 0;
-	} else if (starts_with(var, "submodule."))
+	if (starts_with(var, "submodule."))
 		return parse_submodule_config_option(var, value);
 	else if (!strcmp(var, "fetch.recursesubmodules")) {
 		config_fetch_recurse_submodules = parse_fetch_recurse_submodules_arg(var, value);
@@ -1303,9 +1297,6 @@ int fetch_populated_submodules(const struct argv_array *options,
 	argv_array_push(&spf.args, "--recurse-submodules-default");
 	/* default value, "--submodule-prefix" and its value are added later */
 
-	if (max_parallel_jobs < 0)
-		max_parallel_jobs = parallel_jobs;
-
 	calculate_changed_submodule_paths();
 	run_processes_parallel(max_parallel_jobs,
 			       get_next_submodule,
@@ -1825,11 +1816,6 @@ int merge_submodule(struct object_id *result, const char *path,
 	return 0;
 }
 
-int parallel_submodules(void)
-{
-	return parallel_jobs;
-}
-
 /*
  * Embeds a single submodules git directory into the superprojects git dir,
  * non recursively.
diff --git a/submodule.h b/submodule.h
index e85b14486..c8164a3b2 100644
--- a/submodule.h
+++ b/submodule.h
@@ -112,7 +112,6 @@ extern int push_unpushed_submodules(struct oid_array *commits,
 				    const struct string_list *push_options,
 				    int dry_run);
 extern void connect_work_tree_and_git_dir(const char *work_tree, const char *git_dir);
-extern int parallel_submodules(void);
 /*
  * Given a submodule path (as in the index), return the repository
  * path of that submodule in 'buf'. Return -1 on error or when the
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



* [PATCH v4 06/10] submodule: remove fetch.recursesubmodules from submodule-config parsing
  2017-08-02 19:49     ` [PATCH v4 " Brandon Williams
                         ` (4 preceding siblings ...)
  2017-08-02 19:49       ` [PATCH v4 05/10] submodule: remove submodule.fetchjobs from submodule-config parsing Brandon Williams
@ 2017-08-02 19:49       ` Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 07/10] submodule: check for unstaged .gitmodules outside of config parsing Brandon Williams
                         ` (3 subsequent siblings)
  9 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 19:49 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Remove the 'fetch.recursesubmodules' configuration option from the
general submodule-config parsing and instead rely on using
'config_from_gitmodules()' in order to maintain backwards compatibility
with this config being placed in the '.gitmodules' file.
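
The effect of carrying the default in the per-fetch state instead of a
file-scope variable can be sketched as below.  This is a simplified
illustration that covers only the default-option checks; the enum values,
'struct fetch_settings' and 'fetch_by_default()' are hypothetical and do
not match git's actual constants or function names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical recursion modes; git's real constants differ. */
enum recurse_mode { RECURSE_OFF = 0, RECURSE_ON_DEMAND, RECURSE_ON };

/* The default travels with the caller's state, not in a global. */
struct fetch_settings {
	enum recurse_mode default_option;
};

/*
 * Decide whether a submodule without its own setting should be fetched.
 * Returns 1 to fetch and 0 to skip.
 */
int fetch_by_default(const struct fetch_settings *s, int submodule_changed)
{
	assert(s != NULL);
	if (s->default_option == RECURSE_OFF)
		return 0;
	if (s->default_option == RECURSE_ON_DEMAND)
		return submodule_changed != 0;
	return 1;
}
```

With the value passed in explicitly, two fetches with different defaults
can coexist in one process without interfering with each other.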

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch.c |  8 +++++++-
 submodule.c     | 19 ++++++-------------
 submodule.h     |  2 +-
 3 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index ade092bf8..d84c26391 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -71,6 +71,9 @@ static int git_fetch_config(const char *k, const char *v, void *cb)
 	if (!strcmp(k, "submodule.fetchjobs")) {
 		max_children = parse_submodule_fetchjobs(k, v);
 		return 0;
+	} else if (!strcmp(k, "fetch.recursesubmodules")) {
+		recurse_submodules = parse_fetch_recurse_submodules_arg(k, v);
+		return 0;
 	}
 
 	return git_default_config(k, v, cb);
@@ -81,6 +84,9 @@ static int gitmodules_fetch_config(const char *var, const char *value, void *cb)
 	if (!strcmp(var, "submodule.fetchjobs")) {
 		max_children = parse_submodule_fetchjobs(var, value);
 		return 0;
+	} else if (!strcmp(var, "fetch.recursesubmodules")) {
+		recurse_submodules = parse_fetch_recurse_submodules_arg(var, value);
+		return 0;
 	}
 
 	return 0;
@@ -1355,7 +1361,6 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		deepen = 1;
 
 	if (recurse_submodules != RECURSE_SUBMODULES_OFF) {
-		set_config_fetch_recurse_submodules(recurse_submodules_default);
 		gitmodules_config();
 		git_config(submodule_config, NULL);
 	}
@@ -1399,6 +1404,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 		result = fetch_populated_submodules(&options,
 						    submodule_prefix,
 						    recurse_submodules,
+						    recurse_submodules_default,
 						    verbosity < 0,
 						    max_children);
 		argv_array_clear(&options);
diff --git a/submodule.c b/submodule.c
index aa4fb1eaa..1d9d2ce09 100644
--- a/submodule.c
+++ b/submodule.c
@@ -20,7 +20,6 @@
 #include "worktree.h"
 #include "parse-options.h"
 
-static int config_fetch_recurse_submodules = RECURSE_SUBMODULES_ON_DEMAND;
 static int config_update_recurse_submodules = RECURSE_SUBMODULES_OFF;
 static struct string_list changed_submodule_paths = STRING_LIST_INIT_DUP;
 static int initialized_fetch_ref_tips;
@@ -160,10 +159,6 @@ static int git_modules_config(const char *var, const char *value, void *cb)
 {
 	if (starts_with(var, "submodule."))
 		return parse_submodule_config_option(var, value);
-	else if (!strcmp(var, "fetch.recursesubmodules")) {
-		config_fetch_recurse_submodules = parse_fetch_recurse_submodules_arg(var, value);
-		return 0;
-	}
 	return 0;
 }
 
@@ -714,11 +709,6 @@ void show_submodule_inline_diff(FILE *f, const char *path,
 		clear_commit_marks(right, ~0);
 }
 
-void set_config_fetch_recurse_submodules(int value)
-{
-	config_fetch_recurse_submodules = value;
-}
-
 int should_update_submodules(void)
 {
 	return config_update_recurse_submodules == RECURSE_SUBMODULES_ON;
@@ -1164,10 +1154,11 @@ struct submodule_parallel_fetch {
 	const char *work_tree;
 	const char *prefix;
 	int command_line_option;
+	int default_option;
 	int quiet;
 	int result;
 };
-#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0}
+#define SPF_INIT {0, ARGV_ARRAY_INIT, NULL, NULL, 0, 0, 0, 0}
 
 static int get_next_submodule(struct child_process *cp,
 			      struct strbuf *err, void *data, void **task_cb)
@@ -1205,10 +1196,10 @@ static int get_next_submodule(struct child_process *cp,
 					default_argv = "on-demand";
 				}
 			} else {
-				if ((config_fetch_recurse_submodules == RECURSE_SUBMODULES_OFF) ||
+				if ((spf->default_option == RECURSE_SUBMODULES_OFF) ||
 				    gitmodules_is_unmerged)
 					continue;
-				if (config_fetch_recurse_submodules == RECURSE_SUBMODULES_ON_DEMAND) {
+				if (spf->default_option == RECURSE_SUBMODULES_ON_DEMAND) {
 					if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
 						continue;
 					default_argv = "on-demand";
@@ -1275,6 +1266,7 @@ static int fetch_finish(int retvalue, struct strbuf *err,
 
 int fetch_populated_submodules(const struct argv_array *options,
 			       const char *prefix, int command_line_option,
+			       int default_option,
 			       int quiet, int max_parallel_jobs)
 {
 	int i;
@@ -1282,6 +1274,7 @@ int fetch_populated_submodules(const struct argv_array *options,
 
 	spf.work_tree = get_git_work_tree();
 	spf.command_line_option = command_line_option;
+	spf.default_option = default_option;
 	spf.quiet = quiet;
 	spf.prefix = prefix;
 
diff --git a/submodule.h b/submodule.h
index c8164a3b2..29a1ecd19 100644
--- a/submodule.h
+++ b/submodule.h
@@ -76,7 +76,6 @@ extern void show_submodule_inline_diff(FILE *f, const char *path,
 		unsigned dirty_submodule, const char *meta,
 		const char *del, const char *add, const char *reset,
 		const struct diff_options *opt);
-extern void set_config_fetch_recurse_submodules(int value);
 /* Check if we want to update any submodule.*/
 extern int should_update_submodules(void);
 /*
@@ -87,6 +86,7 @@ extern const struct submodule *submodule_from_ce(const struct cache_entry *ce);
 extern void check_for_new_submodule_commits(struct object_id *oid);
 extern int fetch_populated_submodules(const struct argv_array *options,
 			       const char *prefix, int command_line_option,
+			       int default_option,
 			       int quiet, int max_parallel_jobs);
 extern unsigned is_submodule_modified(const char *path, int ignore_untracked);
 extern int submodule_uses_gitfile(const char *path);
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



* [PATCH v4 07/10] submodule: check for unstaged .gitmodules outside of config parsing
  2017-08-02 19:49     ` [PATCH v4 " Brandon Williams
                         ` (5 preceding siblings ...)
  2017-08-02 19:49       ` [PATCH v4 06/10] submodule: remove fetch.recursesubmodules " Brandon Williams
@ 2017-08-02 19:49       ` Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 08/10] submodule: check for unmerged " Brandon Williams
                         ` (2 subsequent siblings)
  9 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 19:49 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Teach 'is_staging_gitmodules_ok()' to determine whether the
'.gitmodules' file has unstaged changes based on the passed-in index
instead of relying on a global variable which is set during the
submodule-config parsing.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/mv.c |  2 +-
 builtin/rm.c |  2 +-
 submodule.c  | 32 +++++++++++++++++---------------
 submodule.h  |  2 +-
 4 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/builtin/mv.c b/builtin/mv.c
index dcf6736b5..94fbaaa5d 100644
--- a/builtin/mv.c
+++ b/builtin/mv.c
@@ -81,7 +81,7 @@ static void prepare_move_submodule(const char *src, int first,
 	struct strbuf submodule_dotgit = STRBUF_INIT;
 	if (!S_ISGITLINK(active_cache[first]->ce_mode))
 		die(_("Directory %s is in index and no submodule?"), src);
-	if (!is_staging_gitmodules_ok())
+	if (!is_staging_gitmodules_ok(&the_index))
 		die(_("Please stage your changes to .gitmodules or stash them to proceed"));
 	strbuf_addf(&submodule_dotgit, "%s/.git", src);
 	*submodule_gitfile = read_gitfile(submodule_dotgit.buf);
diff --git a/builtin/rm.c b/builtin/rm.c
index 52826d137..4057e73fa 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -286,7 +286,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 		list.entry[list.nr].name = xstrdup(ce->name);
 		list.entry[list.nr].is_submodule = S_ISGITLINK(ce->ce_mode);
 		if (list.entry[list.nr++].is_submodule &&
-		    !is_staging_gitmodules_ok())
+		    !is_staging_gitmodules_ok(&the_index))
 			die (_("Please stage your changes to .gitmodules or stash them to proceed"));
 	}
 
diff --git a/submodule.c b/submodule.c
index 1d9d2ce09..677b5c401 100644
--- a/submodule.c
+++ b/submodule.c
@@ -37,18 +37,25 @@ static struct oid_array ref_tips_after_fetch;
 static int gitmodules_is_unmerged;
 
 /*
- * This flag is set if the .gitmodules file had unstaged modifications on
- * startup. This must be checked before allowing modifications to the
- * .gitmodules file with the intention to stage them later, because when
- * continuing we would stage the modifications the user didn't stage herself
- * too. That might change in a future version when we learn to stage the
- * changes we do ourselves without staging any previous modifications.
+ * Check if the .gitmodules file has unstaged modifications.  This must be
+ * checked before allowing modifications to the .gitmodules file with the
+ * intention to stage them later, because when continuing we would stage the
+ * modifications the user didn't stage herself too. That might change in a
+ * future version when we learn to stage the changes we do ourselves without
+ * staging any previous modifications.
  */
-static int gitmodules_is_modified;
-
-int is_staging_gitmodules_ok(void)
+int is_staging_gitmodules_ok(const struct index_state *istate)
 {
-	return !gitmodules_is_modified;
+	int pos = index_name_pos(istate, GITMODULES_FILE, strlen(GITMODULES_FILE));
+
+	if ((pos >= 0) && (pos < istate->cache_nr)) {
+		struct stat st;
+		if (lstat(GITMODULES_FILE, &st) == 0 &&
+		    ce_match_stat(istate->cache[pos], &st, 0) & DATA_CHANGED)
+			return 0;
+	}
+
+	return 1;
 }
 
 /*
@@ -231,11 +238,6 @@ void gitmodules_config(void)
 				    !memcmp(ce->name, GITMODULES_FILE, 11))
 					gitmodules_is_unmerged = 1;
 			}
-		} else if (pos < active_nr) {
-			struct stat st;
-			if (lstat(GITMODULES_FILE, &st) == 0 &&
-			    ce_match_stat(active_cache[pos], &st, 0) & DATA_CHANGED)
-				gitmodules_is_modified = 1;
 		}
 
 		if (!gitmodules_is_unmerged)
diff --git a/submodule.h b/submodule.h
index 29a1ecd19..b14660585 100644
--- a/submodule.h
+++ b/submodule.h
@@ -33,7 +33,7 @@ struct submodule_update_strategy {
 };
 #define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}
 
-extern int is_staging_gitmodules_ok(void);
+extern int is_staging_gitmodules_ok(const struct index_state *istate);
 extern int update_path_in_gitmodules(const char *oldpath, const char *newpath);
 extern int remove_path_from_gitmodules(const char *path);
 extern void stage_updated_gitmodules(void);
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



* [PATCH v4 08/10] submodule: check for unmerged .gitmodules outside of config parsing
  2017-08-02 19:49     ` [PATCH v4 " Brandon Williams
                         ` (6 preceding siblings ...)
  2017-08-02 19:49       ` [PATCH v4 07/10] submodule: check for unstaged .gitmodules outside of config parsing Brandon Williams
@ 2017-08-02 19:49       ` Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 09/10] submodule: merge repo_read_gitmodules and gitmodules_config Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 10/10] grep: recurse in-process using 'struct repository' Brandon Williams
  9 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 19:49 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Add an 'is_gitmodules_unmerged()' function which can be used to determine
whether the '.gitmodules' file is unmerged based on the passed-in index
instead of relying on a global variable which is set during the
submodule-config parsing.
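
The lookup convention this check relies on can be sketched as follows: a
search for the merged (stage 0) entry returns its position when found and
-1 - insertion_position otherwise, and conflicted entries for the same
path sort exactly at that insertion position.  This is a minimal model
under stated assumptions; 'struct entry', 'name_pos()' and
'is_unmerged()' are hypothetical, and entries are ordered by strcmp() on
the name and then by stage, which simplifies git's real comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical index entry: stage 0 is merged, stages 1..3 are conflicts. */
struct entry {
	const char *name;
	int stage;
};

/* Order by name first, then by stage. */
static int entry_cmp(const char *name, int stage, const struct entry *e)
{
	int c = strcmp(name, e->name);
	return c ? c : stage - e->stage;
}

/*
 * Binary-search a sorted array for (name, stage 0).  Returns the
 * position when found, otherwise -1 - insertion_position.
 */
static int name_pos(const struct entry *entries, int nr, const char *name)
{
	int lo = 0, hi = nr;

	assert(nr >= 0);
	while (lo < hi) {
		int mid = lo + (hi - lo) / 2;
		int c = entry_cmp(name, 0, &entries[mid]);
		if (!c)
			return mid;
		if (c < 0)
			hi = mid;
		else
			lo = mid + 1;
	}
	return -1 - lo;
}

/* Returns 1 if 'name' exists only at a conflict stage, 0 otherwise. */
int is_unmerged(const struct entry *entries, int nr, const char *name)
{
	int pos = name_pos(entries, nr, name);

	if (pos >= 0)
		return 0; /* a stage-0 entry exists, so the path is merged */
	pos = -1 - pos; /* decode the insertion position */
	return pos < nr && !strcmp(entries[pos].name, name);
}
```

A path that is absent altogether also yields a negative position, which
is why the name at the decoded position has to be compared before
concluding that the file is unmerged.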

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 submodule.c | 47 +++++++++++++++++++++++------------------------
 submodule.h |  1 +
 2 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/submodule.c b/submodule.c
index 677b5c401..3b0e70c51 100644
--- a/submodule.c
+++ b/submodule.c
@@ -27,14 +27,25 @@ static struct oid_array ref_tips_before_fetch;
 static struct oid_array ref_tips_after_fetch;
 
 /*
- * The following flag is set if the .gitmodules file is unmerged. We then
- * disable recursion for all submodules where .git/config doesn't have a
- * matching config entry because we can't guess what might be configured in
- * .gitmodules unless the user resolves the conflict. When a command line
- * option is given (which always overrides configuration) this flag will be
- * ignored.
+ * Check if the .gitmodules file is unmerged. Parsing of the .gitmodules file
+ * will be disabled because we can't guess what might be configured in
+ * .gitmodules unless the user resolves the conflict.
  */
-static int gitmodules_is_unmerged;
+int is_gitmodules_unmerged(const struct index_state *istate)
+{
+	int pos = index_name_pos(istate, GITMODULES_FILE, strlen(GITMODULES_FILE));
+	if (pos < 0) { /* .gitmodules not found or isn't merged */
+		pos = -1 - pos;
+		if (istate->cache_nr > pos) {  /* there is a .gitmodules */
+			const struct cache_entry *ce = istate->cache[pos];
+			if (ce_namelen(ce) == strlen(GITMODULES_FILE) &&
+			    !strcmp(ce->name, GITMODULES_FILE))
+				return 1;
+		}
+	}
+
+	return 0;
+}
 
 /*
  * Check if the .gitmodules file has unstaged modifications.  This must be
@@ -71,7 +82,7 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
 	if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
 		return -1;
 
-	if (gitmodules_is_unmerged)
+	if (is_gitmodules_unmerged(&the_index))
 		die(_("Cannot change unmerged .gitmodules, resolve merge conflicts first"));
 
 	submodule = submodule_from_path(null_sha1, oldpath);
@@ -105,7 +116,7 @@ int remove_path_from_gitmodules(const char *path)
 	if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
 		return -1;
 
-	if (gitmodules_is_unmerged)
+	if (is_gitmodules_unmerged(&the_index))
 		die(_("Cannot change unmerged .gitmodules, resolve merge conflicts first"));
 
 	submodule = submodule_from_path(null_sha1, path);
@@ -156,7 +167,7 @@ void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt,
 	if (submodule) {
 		if (submodule->ignore)
 			handle_ignore_submodules_arg(diffopt, submodule->ignore);
-		else if (gitmodules_is_unmerged)
+		else if (is_gitmodules_unmerged(&the_index))
 			DIFF_OPT_SET(diffopt, IGNORE_SUBMODULES);
 	}
 }
@@ -224,23 +235,12 @@ void gitmodules_config(void)
 	const char *work_tree = get_git_work_tree();
 	if (work_tree) {
 		struct strbuf gitmodules_path = STRBUF_INIT;
-		int pos;
 		strbuf_addstr(&gitmodules_path, work_tree);
 		strbuf_addstr(&gitmodules_path, "/" GITMODULES_FILE);
 		if (read_cache() < 0)
 			die("index file corrupt");
-		pos = cache_name_pos(GITMODULES_FILE, 11);
-		if (pos < 0) { /* .gitmodules not found or isn't merged */
-			pos = -1 - pos;
-			if (active_nr > pos) {  /* there is a .gitmodules */
-				const struct cache_entry *ce = active_cache[pos];
-				if (ce_namelen(ce) == 11 &&
-				    !memcmp(ce->name, GITMODULES_FILE, 11))
-					gitmodules_is_unmerged = 1;
-			}
-		}
 
-		if (!gitmodules_is_unmerged)
+		if (!is_gitmodules_unmerged(&the_index))
 			git_config_from_file(git_modules_config,
 				gitmodules_path.buf, NULL);
 		strbuf_release(&gitmodules_path);
@@ -1198,8 +1198,7 @@ static int get_next_submodule(struct child_process *cp,
 					default_argv = "on-demand";
 				}
 			} else {
-				if ((spf->default_option == RECURSE_SUBMODULES_OFF) ||
-				    gitmodules_is_unmerged)
+				if (spf->default_option == RECURSE_SUBMODULES_OFF)
 					continue;
 				if (spf->default_option == RECURSE_SUBMODULES_ON_DEMAND) {
 					if (!unsorted_string_list_lookup(&changed_submodule_paths, ce->name))
diff --git a/submodule.h b/submodule.h
index b14660585..8022faa59 100644
--- a/submodule.h
+++ b/submodule.h
@@ -33,6 +33,7 @@ struct submodule_update_strategy {
 };
 #define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}
 
+extern int is_gitmodules_unmerged(const struct index_state *istate);
 extern int is_staging_gitmodules_ok(const struct index_state *istate);
 extern int update_path_in_gitmodules(const char *oldpath, const char *newpath);
 extern int remove_path_from_gitmodules(const char *path);
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



* [PATCH v4 09/10] submodule: merge repo_read_gitmodules and gitmodules_config
  2017-08-02 19:49     ` [PATCH v4 " Brandon Williams
                         ` (7 preceding siblings ...)
  2017-08-02 19:49       ` [PATCH v4 08/10] submodule: check for unmerged " Brandon Williams
@ 2017-08-02 19:49       ` Brandon Williams
  2017-08-02 19:49       ` [PATCH v4 10/10] grep: recurse in-process using 'struct repository' Brandon Williams
  9 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 19:49 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Since 69aba5329 (submodule: add repo_read_gitmodules) there have been
two ways to load a repository's .gitmodules file:
'repo_read_gitmodules()' is used if you have a repository object you are
working with or 'gitmodules_config()' if you are implicitly working with
'the_repository'.  Merge the logic of these two functions to remove
duplicate code.

In addition, 'repo_read_gitmodules()' can segfault by passing a NULL
pointer to 'git_config_from_file()' if a repository doesn't have a
worktree.  Instead, check for the existence of a worktree before
attempting to load the .gitmodules file.
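
The guard can be sketched in isolation as follows: the path is only built
when a worktree is present, so a caller never hands a NULL filename to
the config reader.  This is an illustration under stated assumptions;
'struct repo' and 'gitmodules_path()' are hypothetical stand-ins, not
git's 'struct repository' or 'repo_worktree_path()'.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical repository handle; worktree is NULL for a bare repository. */
struct repo {
	const char *worktree;
};

/*
 * Returns a malloc'd "<worktree>/.gitmodules", or NULL when there is no
 * worktree (or allocation fails).  The caller frees the result.
 */
char *gitmodules_path(const struct repo *r)
{
	static const char suffix[] = "/.gitmodules";
	size_t len;
	char *path;
	int written;

	if (!r || !r->worktree)
		return NULL; /* nothing to read without a worktree */

	len = strlen(r->worktree) + sizeof(suffix);
	path = malloc(len);
	if (!path)
		return NULL;
	written = snprintf(path, len, "%s%s", r->worktree, suffix);
	assert(written >= 0 && (size_t)written < len);
	return path;
}
```

A caller would skip the config read whenever this returns NULL, which
corresponds to checking for a worktree before loading the file.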

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 submodule.c | 37 +++++++++++++++++--------------------
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/submodule.c b/submodule.c
index 3b0e70c51..9d5eacaf9 100644
--- a/submodule.c
+++ b/submodule.c
@@ -230,23 +230,6 @@ void load_submodule_cache(void)
 	git_config(submodule_config, NULL);
 }
 
-void gitmodules_config(void)
-{
-	const char *work_tree = get_git_work_tree();
-	if (work_tree) {
-		struct strbuf gitmodules_path = STRBUF_INIT;
-		strbuf_addstr(&gitmodules_path, work_tree);
-		strbuf_addstr(&gitmodules_path, "/" GITMODULES_FILE);
-		if (read_cache() < 0)
-			die("index file corrupt");
-
-		if (!is_gitmodules_unmerged(&the_index))
-			git_config_from_file(git_modules_config,
-				gitmodules_path.buf, NULL);
-		strbuf_release(&gitmodules_path);
-	}
-}
-
 static int gitmodules_cb(const char *var, const char *value, void *data)
 {
 	struct repository *repo = data;
@@ -255,10 +238,24 @@ static int gitmodules_cb(const char *var, const char *value, void *data)
 
 void repo_read_gitmodules(struct repository *repo)
 {
-	char *gitmodules_path = repo_worktree_path(repo, GITMODULES_FILE);
+	if (repo->worktree) {
+		char *gitmodules;
+
+		if (repo_read_index(repo) < 0)
+			return;
 
-	git_config_from_file(gitmodules_cb, gitmodules_path, repo);
-	free(gitmodules_path);
+		gitmodules = repo_worktree_path(repo, GITMODULES_FILE);
+
+		if (!is_gitmodules_unmerged(repo->index))
+			git_config_from_file(gitmodules_cb, gitmodules, repo);
+
+		free(gitmodules);
+	}
+}
+
+void gitmodules_config(void)
+{
+	repo_read_gitmodules(the_repository);
 }
 
 void gitmodules_config_sha1(const unsigned char *commit_sha1)
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



* [PATCH v4 10/10] grep: recurse in-process using 'struct repository'
  2017-08-02 19:49     ` [PATCH v4 " Brandon Williams
                         ` (8 preceding siblings ...)
  2017-08-02 19:49       ` [PATCH v4 09/10] submodule: merge repo_read_gitmodules and gitmodules_config Brandon Williams
@ 2017-08-02 19:49       ` Brandon Williams
  9 siblings, 0 replies; 68+ messages in thread
From: Brandon Williams @ 2017-08-02 19:49 UTC (permalink / raw)
  To: git; +Cc: sbeller, jrnieder, gitster, Brandon Williams

Convert grep to use 'struct repository' which enables recursing into
submodules to be handled in-process.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/git-grep.txt |   7 -
 builtin/grep.c             | 396 ++++++++++-----------------------------------
 cache.h                    |   1 -
 git.c                      |   2 +-
 grep.c                     |  13 --
 grep.h                     |   1 -
 setup.c                    |  12 +-
 7 files changed, 88 insertions(+), 344 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 5033483db..720c7850e 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -95,13 +95,6 @@ OPTIONS
 	<tree> option the prefix of all submodule output will be the name of
 	the parent project's <tree> object.
 
---parent-basename <basename>::
-	For internal use only.  In order to produce uniform output with the
-	--recurse-submodules option, this option can be used to provide the
-	basename of a parent's <tree> object to a submodule so the submodule
-	can prefix its output with the parent's name rather than the SHA1 of
-	the submodule.
-
 -a::
 --text::
 	Process binary files as if they were text.
diff --git a/builtin/grep.c b/builtin/grep.c
index 7e79eb1a7..cd0e51f3c 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -28,13 +28,7 @@ static char const * const grep_usage[] = {
 	NULL
 };
 
-static const char *super_prefix;
 static int recurse_submodules;
-static struct argv_array submodule_options = ARGV_ARRAY_INIT;
-static const char *parent_basename;
-
-static int grep_submodule_launch(struct grep_opt *opt,
-				 const struct grep_source *gs);
 
 #define GREP_NUM_THREADS_DEFAULT 8
 static int num_threads;
@@ -186,10 +180,7 @@ static void *run(void *arg)
 			break;
 
 		opt->output_priv = w;
-		if (w->source.type == GREP_SOURCE_SUBMODULE)
-			hit |= grep_submodule_launch(opt, &w->source);
-		else
-			hit |= grep_source(opt, &w->source);
+		hit |= grep_source(opt, &w->source);
 		grep_source_clear_data(&w->source);
 		work_done(w);
 	}
@@ -327,21 +318,13 @@ static int grep_oid(struct grep_opt *opt, const struct object_id *oid,
 {
 	struct strbuf pathbuf = STRBUF_INIT;
 
-	if (super_prefix) {
-		strbuf_add(&pathbuf, filename, tree_name_len);
-		strbuf_addstr(&pathbuf, super_prefix);
-		strbuf_addstr(&pathbuf, filename + tree_name_len);
+	if (opt->relative && opt->prefix_length) {
+		quote_path_relative(filename + tree_name_len, opt->prefix, &pathbuf);
+		strbuf_insert(&pathbuf, 0, filename, tree_name_len);
 	} else {
 		strbuf_addstr(&pathbuf, filename);
 	}
 
-	if (opt->relative && opt->prefix_length) {
-		char *name = strbuf_detach(&pathbuf, NULL);
-		quote_path_relative(name + tree_name_len, opt->prefix, &pathbuf);
-		strbuf_insert(&pathbuf, 0, name, tree_name_len);
-		free(name);
-	}
-
 #ifndef NO_PTHREADS
 	if (num_threads) {
 		add_work(opt, GREP_SOURCE_OID, pathbuf.buf, path, oid);
@@ -366,15 +349,10 @@ static int grep_file(struct grep_opt *opt, const char *filename)
 {
 	struct strbuf buf = STRBUF_INIT;
 
-	if (super_prefix)
-		strbuf_addstr(&buf, super_prefix);
-	strbuf_addstr(&buf, filename);
-
-	if (opt->relative && opt->prefix_length) {
-		char *name = strbuf_detach(&buf, NULL);
-		quote_path_relative(name, opt->prefix, &buf);
-		free(name);
-	}
+	if (opt->relative && opt->prefix_length)
+		quote_path_relative(filename, opt->prefix, &buf);
+	else
+		strbuf_addstr(&buf, filename);
 
 #ifndef NO_PTHREADS
 	if (num_threads) {
@@ -421,284 +399,89 @@ static void run_pager(struct grep_opt *opt, const char *prefix)
 		exit(status);
 }
 
-static void compile_submodule_options(const struct grep_opt *opt,
-				      const char **argv,
-				      int cached, int untracked,
-				      int opt_exclude, int use_index,
-				      int pattern_type_arg)
-{
-	struct grep_pat *pattern;
-
-	if (recurse_submodules)
-		argv_array_push(&submodule_options, "--recurse-submodules");
-
-	if (cached)
-		argv_array_push(&submodule_options, "--cached");
-	if (!use_index)
-		argv_array_push(&submodule_options, "--no-index");
-	if (untracked)
-		argv_array_push(&submodule_options, "--untracked");
-	if (opt_exclude > 0)
-		argv_array_push(&submodule_options, "--exclude-standard");
-
-	if (opt->invert)
-		argv_array_push(&submodule_options, "-v");
-	if (opt->ignore_case)
-		argv_array_push(&submodule_options, "-i");
-	if (opt->word_regexp)
-		argv_array_push(&submodule_options, "-w");
-	switch (opt->binary) {
-	case GREP_BINARY_NOMATCH:
-		argv_array_push(&submodule_options, "-I");
-		break;
-	case GREP_BINARY_TEXT:
-		argv_array_push(&submodule_options, "-a");
-		break;
-	default:
-		break;
-	}
-	if (opt->allow_textconv)
-		argv_array_push(&submodule_options, "--textconv");
-	if (opt->max_depth != -1)
-		argv_array_pushf(&submodule_options, "--max-depth=%d",
-				 opt->max_depth);
-	if (opt->linenum)
-		argv_array_push(&submodule_options, "-n");
-	if (!opt->pathname)
-		argv_array_push(&submodule_options, "-h");
-	if (!opt->relative)
-		argv_array_push(&submodule_options, "--full-name");
-	if (opt->name_only)
-		argv_array_push(&submodule_options, "-l");
-	if (opt->unmatch_name_only)
-		argv_array_push(&submodule_options, "-L");
-	if (opt->null_following_name)
-		argv_array_push(&submodule_options, "-z");
-	if (opt->count)
-		argv_array_push(&submodule_options, "-c");
-	if (opt->file_break)
-		argv_array_push(&submodule_options, "--break");
-	if (opt->heading)
-		argv_array_push(&submodule_options, "--heading");
-	if (opt->pre_context)
-		argv_array_pushf(&submodule_options, "--before-context=%d",
-				 opt->pre_context);
-	if (opt->post_context)
-		argv_array_pushf(&submodule_options, "--after-context=%d",
-				 opt->post_context);
-	if (opt->funcname)
-		argv_array_push(&submodule_options, "-p");
-	if (opt->funcbody)
-		argv_array_push(&submodule_options, "-W");
-	if (opt->all_match)
-		argv_array_push(&submodule_options, "--all-match");
-	if (opt->debug)
-		argv_array_push(&submodule_options, "--debug");
-	if (opt->status_only)
-		argv_array_push(&submodule_options, "-q");
-
-	switch (pattern_type_arg) {
-	case GREP_PATTERN_TYPE_BRE:
-		argv_array_push(&submodule_options, "-G");
-		break;
-	case GREP_PATTERN_TYPE_ERE:
-		argv_array_push(&submodule_options, "-E");
-		break;
-	case GREP_PATTERN_TYPE_FIXED:
-		argv_array_push(&submodule_options, "-F");
-		break;
-	case GREP_PATTERN_TYPE_PCRE:
-		argv_array_push(&submodule_options, "-P");
-		break;
-	case GREP_PATTERN_TYPE_UNSPECIFIED:
-		break;
-	default:
-		die("BUG: Added a new grep pattern type without updating switch statement");
-	}
-
-	for (pattern = opt->pattern_list; pattern != NULL;
-	     pattern = pattern->next) {
-		switch (pattern->token) {
-		case GREP_PATTERN:
-			argv_array_pushf(&submodule_options, "-e%s",
-					 pattern->pattern);
-			break;
-		case GREP_AND:
-		case GREP_OPEN_PAREN:
-		case GREP_CLOSE_PAREN:
-		case GREP_NOT:
-		case GREP_OR:
-			argv_array_push(&submodule_options, pattern->pattern);
-			break;
-		/* BODY and HEAD are not used by git-grep */
-		case GREP_PATTERN_BODY:
-		case GREP_PATTERN_HEAD:
-			break;
-		}
-	}
-
-	/*
-	 * Limit number of threads for child process to use.
-	 * This is to prevent potential fork-bomb behavior of git-grep as each
-	 * submodule process has its own thread pool.
-	 */
-	argv_array_pushf(&submodule_options, "--threads=%d",
-			 DIV_ROUND_UP(num_threads, 2));
-
-	/* Add Pathspecs */
-	argv_array_push(&submodule_options, "--");
-	for (; *argv; argv++)
-		argv_array_push(&submodule_options, *argv);
-}
+static int grep_cache(struct grep_opt *opt, struct repository *repo,
+		      const struct pathspec *pathspec, int cached);
+static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
+		     struct tree_desc *tree, struct strbuf *base, int tn_len,
+		     int check_attr, struct repository *repo);
 
-/*
- * Launch child process to grep contents of a submodule
- */
-static int grep_submodule_launch(struct grep_opt *opt,
-				 const struct grep_source *gs)
+static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
+			  const struct pathspec *pathspec,
+			  const struct object_id *oid,
+			  const char *filename, const char *path)
 {
-	struct child_process cp = CHILD_PROCESS_INIT;
-	int status, i;
-	const char *end_of_base;
-	const char *name;
-	struct strbuf child_output = STRBUF_INIT;
-
-	end_of_base = strchr(gs->name, ':');
-	if (gs->identifier && end_of_base)
-		name = end_of_base + 1;
-	else
-		name = gs->name;
+	struct repository submodule;
+	int hit;
 
-	prepare_submodule_repo_env(&cp.env_array);
-	argv_array_push(&cp.env_array, GIT_DIR_ENVIRONMENT);
+	if (!is_submodule_active(superproject, path))
+		return 0;
 
-	if (opt->relative && opt->prefix_length)
-		argv_array_pushf(&cp.env_array, "%s=%s",
-				 GIT_TOPLEVEL_PREFIX_ENVIRONMENT,
-				 opt->prefix);
+	if (repo_submodule_init(&submodule, superproject, path))
+		return 0;
 
-	/* Add super prefix */
-	argv_array_pushf(&cp.args, "--super-prefix=%s%s/",
-			 super_prefix ? super_prefix : "",
-			 name);
-	argv_array_push(&cp.args, "grep");
+	repo_read_gitmodules(&submodule);
 
 	/*
-	 * Add basename of parent project
-	 * When performing grep on a tree object the filename is prefixed
-	 * with the object's name: 'tree-name:filename'.  In order to
-	 * provide uniformity of output we want to pass the name of the
-	 * parent project's object name to the submodule so the submodule can
-	 * prefix its output with the parent's name and not its own OID.
+	 * NEEDSWORK: This adds the submodule's object directory to the list of
+	 * alternates for the single in-memory object store.  This has some bad
+	 * consequences for memory (processed objects will never be freed) and
+	 * performance (this increases the number of pack files git has to pay
+	 * attention to, to the sum of the number of pack files in all the
+	 * repositories processed so far).  This can be removed once the object
+	 * store is no longer global and instead is a member of the repository
+	 * object.
 	 */
-	if (gs->identifier && end_of_base)
-		argv_array_pushf(&cp.args, "--parent-basename=%.*s",
-				 (int) (end_of_base - gs->name),
-				 gs->name);
+	add_to_alternates_memory(submodule.objectdir);
 
-	/* Add options */
-	for (i = 0; i < submodule_options.argc; i++) {
-		/*
-		 * If there is a tree identifier for the submodule, add the
-		 * rev after adding the submodule options but before the
-		 * pathspecs.  To do this we listen for the '--' and insert the
-		 * oid before pushing the '--' onto the child process argv
-		 * array.
-		 */
-		if (gs->identifier &&
-		    !strcmp("--", submodule_options.argv[i])) {
-			argv_array_push(&cp.args, oid_to_hex(gs->identifier));
-		}
+	if (oid) {
+		struct object *object;
+		struct tree_desc tree;
+		void *data;
+		unsigned long size;
+		struct strbuf base = STRBUF_INIT;
 
-		argv_array_push(&cp.args, submodule_options.argv[i]);
-	}
+		object = parse_object_or_die(oid, oid_to_hex(oid));
 
-	cp.git_cmd = 1;
-	cp.dir = gs->path;
+		grep_read_lock();
+		data = read_object_with_reference(object->oid.hash, tree_type,
+						  &size, NULL);
+		grep_read_unlock();
 
-	/*
-	 * Capture output to output buffer and check the return code from the
-	 * child process.  A '0' indicates a hit, a '1' indicates no hit and
-	 * anything else is an error.
-	 */
-	status = capture_command(&cp, &child_output, 0);
-	if (status && (status != 1)) {
-		/* flush the buffer */
-		write_or_die(1, child_output.buf, child_output.len);
-		die("process for submodule '%s' failed with exit code: %d",
-		    gs->name, status);
-	}
+		if (!data)
+			die(_("unable to read tree (%s)"), oid_to_hex(&object->oid));
 
-	opt->output(opt, child_output.buf, child_output.len);
-	strbuf_release(&child_output);
-	/* invert the return code to make a hit equal to 1 */
-	return !status;
-}
+		strbuf_addstr(&base, filename);
+		strbuf_addch(&base, '/');
 
-/*
- * Prep grep structures for a submodule grep
- * oid: the oid of the submodule or NULL if using the working tree
- * filename: name of the submodule including tree name of parent
- * path: location of the submodule
- */
-static int grep_submodule(struct grep_opt *opt, const struct object_id *oid,
-			  const char *filename, const char *path)
-{
-	if (!is_submodule_active(the_repository, path))
-		return 0;
-	if (!is_submodule_populated_gently(path, NULL)) {
-		/*
-		 * If searching history, check for the presence of the
-		 * submodule's gitdir before skipping the submodule.
-		 */
-		if (oid) {
-			const struct submodule *sub =
-					submodule_from_path(null_sha1, path);
-			if (sub)
-				path = git_path("modules/%s", sub->name);
-
-			if (!(is_directory(path) && is_git_directory(path)))
-				return 0;
-		} else {
-			return 0;
-		}
+		init_tree_desc(&tree, data, size);
+		hit = grep_tree(opt, pathspec, &tree, &base, base.len,
+				object->type == OBJ_COMMIT, &submodule);
+		strbuf_release(&base);
+		free(data);
+	} else {
+		hit = grep_cache(opt, &submodule, pathspec, 1);
 	}
 
-#ifndef NO_PTHREADS
-	if (num_threads) {
-		add_work(opt, GREP_SOURCE_SUBMODULE, filename, path, oid);
-		return 0;
-	} else
-#endif
-	{
-		struct grep_source gs;
-		int hit;
-
-		grep_source_init(&gs, GREP_SOURCE_SUBMODULE,
-				 filename, path, oid);
-		hit = grep_submodule_launch(opt, &gs);
-
-		grep_source_clear(&gs);
-		return hit;
-	}
+	repo_clear(&submodule);
+	return hit;
 }
 
-static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
-		      int cached)
+static int grep_cache(struct grep_opt *opt, struct repository *repo,
+		      const struct pathspec *pathspec, int cached)
 {
 	int hit = 0;
 	int nr;
 	struct strbuf name = STRBUF_INIT;
 	int name_base_len = 0;
-	if (super_prefix) {
-		name_base_len = strlen(super_prefix);
-		strbuf_addstr(&name, super_prefix);
+	if (repo->submodule_prefix) {
+		name_base_len = strlen(repo->submodule_prefix);
+		strbuf_addstr(&name, repo->submodule_prefix);
 	}
 
-	read_cache();
+	repo_read_index(repo);
 
-	for (nr = 0; nr < active_nr; nr++) {
-		const struct cache_entry *ce = active_cache[nr];
+	for (nr = 0; nr < repo->index->cache_nr; nr++) {
+		const struct cache_entry *ce = repo->index->cache[nr];
 		strbuf_setlen(&name, name_base_len);
 		strbuf_addstr(&name, ce->name);
 
@@ -715,14 +498,14 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
 			    ce_skip_worktree(ce)) {
 				if (ce_stage(ce) || ce_intent_to_add(ce))
 					continue;
-				hit |= grep_oid(opt, &ce->oid, ce->name,
-						 0, ce->name);
+				hit |= grep_oid(opt, &ce->oid, name.buf,
+						 0, name.buf);
 			} else {
-				hit |= grep_file(opt, ce->name);
+				hit |= grep_file(opt, name.buf);
 			}
 		} else if (recurse_submodules && S_ISGITLINK(ce->ce_mode) &&
 			   submodule_path_match(pathspec, name.buf, NULL)) {
-			hit |= grep_submodule(opt, NULL, ce->name, ce->name);
+			hit |= grep_submodule(opt, repo, pathspec, NULL, ce->name, ce->name);
 		} else {
 			continue;
 		}
@@ -730,8 +513,8 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
 		if (ce_stage(ce)) {
 			do {
 				nr++;
-			} while (nr < active_nr &&
-				 !strcmp(ce->name, active_cache[nr]->name));
+			} while (nr < repo->index->cache_nr &&
+				 !strcmp(ce->name, repo->index->cache[nr]->name));
 			nr--; /* compensate for loop control */
 		}
 		if (hit && opt->status_only)
@@ -744,7 +527,7 @@ static int grep_cache(struct grep_opt *opt, const struct pathspec *pathspec,
 
 static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 		     struct tree_desc *tree, struct strbuf *base, int tn_len,
-		     int check_attr)
+		     int check_attr, struct repository *repo)
 {
 	int hit = 0;
 	enum interesting match = entry_not_interesting;
@@ -752,8 +535,8 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 	int old_baselen = base->len;
 	struct strbuf name = STRBUF_INIT;
 	int name_base_len = 0;
-	if (super_prefix) {
-		strbuf_addstr(&name, super_prefix);
+	if (repo->submodule_prefix) {
+		strbuf_addstr(&name, repo->submodule_prefix);
 		name_base_len = name.len;
 	}
 
@@ -791,11 +574,11 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 			strbuf_addch(base, '/');
 			init_tree_desc(&sub, data, size);
 			hit |= grep_tree(opt, pathspec, &sub, base, tn_len,
-					 check_attr);
+					 check_attr, repo);
 			free(data);
 		} else if (recurse_submodules && S_ISGITLINK(entry.mode)) {
-			hit |= grep_submodule(opt, entry.oid, base->buf,
-					      base->buf + tn_len);
+			hit |= grep_submodule(opt, repo, pathspec, entry.oid,
+					      base->buf, base->buf + tn_len);
 		}
 
 		strbuf_setlen(base, old_baselen);
@@ -809,7 +592,8 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 }
 
 static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
-		       struct object *obj, const char *name, const char *path)
+		       struct object *obj, const char *name, const char *path,
+		       struct repository *repo)
 {
 	if (obj->type == OBJ_BLOB)
 		return grep_oid(opt, &obj->oid, name, 0, path);
@@ -828,10 +612,6 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		if (!data)
 			die(_("unable to read tree (%s)"), oid_to_hex(&obj->oid));
 
-		/* Use parent's name as base when recursing submodules */
-		if (recurse_submodules && parent_basename)
-			name = parent_basename;
-
 		len = name ? strlen(name) : 0;
 		strbuf_init(&base, PATH_MAX + len + 1);
 		if (len) {
@@ -840,7 +620,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		}
 		init_tree_desc(&tree, data, size);
 		hit = grep_tree(opt, pathspec, &tree, &base, base.len,
-				obj->type == OBJ_COMMIT);
+				obj->type == OBJ_COMMIT, repo);
 		strbuf_release(&base);
 		free(data);
 		return hit;
@@ -849,6 +629,7 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 }
 
 static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
+			struct repository *repo,
 			const struct object_array *list)
 {
 	unsigned int i;
@@ -864,7 +645,8 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 			submodule_free();
 			gitmodules_config_sha1(real_obj->oid.hash);
 		}
-		if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].path)) {
+		if (grep_object(opt, pathspec, real_obj, list->objects[i].name, list->objects[i].path,
+				repo)) {
 			hit = 1;
 			if (opt->status_only)
 				break;
@@ -1005,9 +787,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			    N_("ignore files specified via '.gitignore'"), 1),
 		OPT_BOOL(0, "recurse-submodules", &recurse_submodules,
 			 N_("recursively search in each submodule")),
-		OPT_STRING(0, "parent-basename", &parent_basename,
-			   N_("basename"),
-			   N_("prepend parent project's basename to output")),
 		OPT_GROUP(""),
 		OPT_BOOL('v', "invert-match", &opt.invert,
 			N_("show non-matching lines")),
@@ -1112,7 +891,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	init_grep_defaults();
 	git_config(grep_cmd_config, NULL);
 	grep_init(&opt, prefix);
-	super_prefix = get_super_prefix();
 
 	/*
 	 * If there is no -- then the paths must exist in the working
@@ -1272,9 +1050,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 
 	if (recurse_submodules) {
 		gitmodules_config();
-		compile_submodule_options(&opt, argv + i, cached, untracked,
-					  opt_exclude, use_index,
-					  pattern_type_arg);
 	}
 
 	if (show_in_pager && (cached || list.nr))
@@ -1318,11 +1093,12 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		if (!cached)
 			setup_work_tree();
 
-		hit = grep_cache(&opt, &pathspec, cached);
+		hit = grep_cache(&opt, the_repository, &pathspec, cached);
 	} else {
 		if (cached)
 			die(_("both --cached and trees are given."));
-		hit = grep_objects(&opt, &pathspec, &list);
+
+		hit = grep_objects(&opt, &pathspec, the_repository, &list);
 	}
 
 	if (num_threads)
diff --git a/cache.h b/cache.h
index d59f767e2..c221434b2 100644
--- a/cache.h
+++ b/cache.h
@@ -417,7 +417,6 @@ static inline enum object_type object_type(unsigned int mode)
 #define GIT_WORK_TREE_ENVIRONMENT "GIT_WORK_TREE"
 #define GIT_PREFIX_ENVIRONMENT "GIT_PREFIX"
 #define GIT_SUPER_PREFIX_ENVIRONMENT "GIT_INTERNAL_SUPER_PREFIX"
-#define GIT_TOPLEVEL_PREFIX_ENVIRONMENT "GIT_INTERNAL_TOPLEVEL_PREFIX"
 #define DEFAULT_GIT_DIR_ENVIRONMENT ".git"
 #define DB_ENVIRONMENT "GIT_OBJECT_DIRECTORY"
 #define INDEX_ENVIRONMENT "GIT_INDEX_FILE"
diff --git a/git.c b/git.c
index 489aab4d8..9dd9aead6 100644
--- a/git.c
+++ b/git.c
@@ -392,7 +392,7 @@ static struct cmd_struct commands[] = {
 	{ "fsck-objects", cmd_fsck, RUN_SETUP },
 	{ "gc", cmd_gc, RUN_SETUP },
 	{ "get-tar-commit-id", cmd_get_tar_commit_id },
-	{ "grep", cmd_grep, RUN_SETUP_GENTLY | SUPPORT_SUPER_PREFIX },
+	{ "grep", cmd_grep, RUN_SETUP_GENTLY },
 	{ "hash-object", cmd_hash_object },
 	{ "help", cmd_help },
 	{ "index-pack", cmd_index_pack, RUN_SETUP_GENTLY },
diff --git a/grep.c b/grep.c
index 2efec0e18..45acd333b 100644
--- a/grep.c
+++ b/grep.c
@@ -1927,16 +1927,6 @@ void grep_source_init(struct grep_source *gs, enum grep_source_type type,
 	case GREP_SOURCE_FILE:
 		gs->identifier = xstrdup(identifier);
 		break;
-	case GREP_SOURCE_SUBMODULE:
-		if (!identifier) {
-			gs->identifier = NULL;
-			break;
-		}
-		/*
-		 * FALL THROUGH
-		 * If the identifier is non-NULL (in the submodule case) it
-		 * will be a SHA1 that needs to be copied.
-		 */
 	case GREP_SOURCE_OID:
 		gs->identifier = oiddup(identifier);
 		break;
@@ -1959,7 +1949,6 @@ void grep_source_clear_data(struct grep_source *gs)
 	switch (gs->type) {
 	case GREP_SOURCE_FILE:
 	case GREP_SOURCE_OID:
-	case GREP_SOURCE_SUBMODULE:
 		FREE_AND_NULL(gs->buf);
 		gs->size = 0;
 		break;
@@ -2030,8 +2019,6 @@ static int grep_source_load(struct grep_source *gs)
 		return grep_source_load_oid(gs);
 	case GREP_SOURCE_BUF:
 		return gs->buf ? 0 : -1;
-	case GREP_SOURCE_SUBMODULE:
-		break;
 	}
 	die("BUG: invalid grep_source type to load");
 }
diff --git a/grep.h b/grep.h
index 0c091e510..52aecfab6 100644
--- a/grep.h
+++ b/grep.h
@@ -193,7 +193,6 @@ struct grep_source {
 		GREP_SOURCE_OID,
 		GREP_SOURCE_FILE,
 		GREP_SOURCE_BUF,
-		GREP_SOURCE_SUBMODULE,
 	} type;
 	void *identifier;
 
diff --git a/setup.c b/setup.c
index 860507e1f..23950173f 100644
--- a/setup.c
+++ b/setup.c
@@ -1027,7 +1027,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
 {
 	static struct strbuf cwd = STRBUF_INIT;
 	struct strbuf dir = STRBUF_INIT, gitdir = STRBUF_INIT;
-	const char *prefix, *env_prefix;
+	const char *prefix;
 
 	/*
 	 * We may have read an incomplete configuration before
@@ -1085,16 +1085,6 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		die("BUG: unhandled setup_git_directory_1() result");
 	}
 
-	/*
-	 * NEEDSWORK: This was a hack in order to get ls-files and grep to have
-	 * properly formated output when recursing submodules.  Once ls-files
-	 * and grep have been changed to perform this recursing in-process this
-	 * needs to be removed.
-	 */
-	env_prefix = getenv(GIT_TOPLEVEL_PREFIX_ENVIRONMENT);
-	if (env_prefix)
-		prefix = env_prefix;
-
 	if (prefix)
 		setenv(GIT_PREFIX_ENVIRONMENT, prefix, 1);
 	else
-- 
2.14.0.rc1.383.gd1ce394fe2-goog



Thread overview: 68+ messages (newest: 2017-08-02 19:49 UTC)
2017-07-11 22:04 [PATCH 0/3] Convert grep to recurse in-process Brandon Williams
2017-07-11 22:04 ` [PATCH 1/3] repo_read_index: don't discard the index Brandon Williams
2017-07-11 23:51   ` Jonathan Nieder
2017-07-12 17:27     ` Brandon Williams
2017-07-11 23:58   ` Stefan Beller
2017-07-12 17:23     ` Brandon Williams
2017-07-11 22:04 ` [PATCH 2/3] setup: have the_repository use the_index Brandon Williams
2017-07-12  0:00   ` Jonathan Nieder
2017-07-12  0:07     ` Stefan Beller
2017-07-12 17:30     ` Brandon Williams
2017-07-12  0:11   ` Junio C Hamano
2017-07-12 18:01     ` Brandon Williams
2017-07-12 20:38       ` Junio C Hamano
2017-07-12 21:33         ` Jonathan Nieder
2017-07-12 21:40           ` Junio C Hamano
2017-07-18 21:34             ` Junio C Hamano
2017-07-11 22:04 ` [PATCH 3/3] grep: recurse in-process using 'struct repository' Brandon Williams
2017-07-11 22:44   ` Jacob Keller
2017-07-12 18:54     ` Brandon Williams
2017-07-12  0:04   ` Stefan Beller
2017-07-12 18:56     ` Brandon Williams
2017-07-12  0:25   ` Jonathan Nieder
2017-07-12 18:49     ` Brandon Williams
2017-07-12  7:42 ` [PATCH 0/3] Convert grep to recurse in-process Jeff King
2017-07-12 18:06   ` Brandon Williams
2017-07-12 18:17     ` Jeff King
2017-07-12 18:24       ` Jonathan Nieder
2017-07-12 18:33         ` Jeff King
2017-07-12 18:09   ` Jonathan Nieder
2017-07-12 18:17     ` Stefan Beller
2017-07-12 18:27     ` Jeff King
2017-07-14 22:28 ` [PATCH v2 " Brandon Williams
2017-07-14 22:28   ` [PATCH v2 1/3] repo_read_index: don't discard the index Brandon Williams
2017-07-14 22:28   ` [PATCH v2 2/3] repository: have the_repository use the_index Brandon Williams
2017-07-14 22:28   ` [PATCH v2 3/3] grep: recurse in-process using 'struct repository' Brandon Williams
2017-07-18 19:05   ` [PATCH v3 00/10] Convert grep to recurse in-process Brandon Williams
2017-07-18 19:05     ` [PATCH v3 01/10] repo_read_index: don't discard the index Brandon Williams
2017-07-18 19:05     ` [PATCH v3 02/10] repository: have the_repository use the_index Brandon Williams
2017-07-18 19:05     ` [PATCH v3 03/10] cache.h: add GITMODULES_FILE macro Brandon Williams
2017-07-31 23:11       ` [PATCH] convert any hard coded .gitmodules file string to the MACRO Stefan Beller
2017-08-01 13:14         ` Jeff Hostetler
2017-08-01 17:35           ` Stefan Beller
2017-08-01 20:26             ` Junio C Hamano
2017-08-02 17:26               ` Brandon Williams
2017-08-02 17:46               ` Brandon Williams
2017-07-18 19:05     ` [PATCH v3 04/10] config: add config_from_gitmodules Brandon Williams
2017-07-18 19:05     ` [PATCH v3 05/10] submodule: remove submodule.fetchjobs from submodule-config parsing Brandon Williams
2017-07-18 19:05     ` [PATCH v3 06/10] submodule: remove fetch.recursesubmodules " Brandon Williams
2017-07-18 19:05     ` [PATCH v3 07/10] submodule: check for unstaged .gitmodules outside of config parsing Brandon Williams
2017-07-31 23:41       ` Stefan Beller
2017-08-02 17:41         ` Brandon Williams
2017-08-02 18:00           ` Brandon Williams
2017-07-18 19:05     ` [PATCH v3 08/10] submodule: check for unmerged " Brandon Williams
2017-07-18 19:05     ` [PATCH v3 09/10] submodule: merge repo_read_gitmodules and gitmodules_config Brandon Williams
2017-07-18 19:05     ` [PATCH v3 10/10] grep: recurse in-process using 'struct repository' Brandon Williams
2017-07-18 19:36     ` [PATCH v3 00/10] Convert grep to recurse in-process Junio C Hamano
2017-07-18 20:06       ` Brandon Williams
2017-08-02 19:49     ` [PATCH v4 " Brandon Williams
2017-08-02 19:49       ` [PATCH v4 01/10] repo_read_index: don't discard the index Brandon Williams
2017-08-02 19:49       ` [PATCH v4 02/10] repository: have the_repository use the_index Brandon Williams
2017-08-02 19:49       ` [PATCH v4 03/10] cache.h: add GITMODULES_FILE macro Brandon Williams
2017-08-02 19:49       ` [PATCH v4 04/10] config: add config_from_gitmodules Brandon Williams
2017-08-02 19:49       ` [PATCH v4 05/10] submodule: remove submodule.fetchjobs from submodule-config parsing Brandon Williams
2017-08-02 19:49       ` [PATCH v4 06/10] submodule: remove fetch.recursesubmodules " Brandon Williams
2017-08-02 19:49       ` [PATCH v4 07/10] submodule: check for unstaged .gitmodules outside of config parsing Brandon Williams
2017-08-02 19:49       ` [PATCH v4 08/10] submodule: check for unmerged " Brandon Williams
2017-08-02 19:49       ` [PATCH v4 09/10] submodule: merge repo_read_gitmodules and gitmodules_config Brandon Williams
2017-08-02 19:49       ` [PATCH v4 10/10] grep: recurse in-process using 'struct repository' Brandon Williams
