git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH v5 0/9] Make submodules work if .gitmodules is not checked out
@ 2018-09-17 14:09 Antonio Ospite
  2018-09-17 14:09 ` [PATCH v5 1/9] submodule: add a print_config_from_gitmodules() helper Antonio Ospite
                   ` (8 more replies)
  0 siblings, 9 replies; 23+ messages in thread
From: Antonio Ospite @ 2018-09-17 14:09 UTC (permalink / raw)
  To: gitster
  Cc: git, Brandon Williams, Daniel Graña, Jonathan Nieder,
	Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason, Antonio Ospite

Hi,

this series teaches git to try and read the .gitmodules file from the
index (:.gitmodules) or from the current branch (HEAD:.gitmodules) when
the file is not readily available in the working tree.

This can be used, together with sparse checkouts, to enable submodule
usage with programs like vcsh[1] which manage multiple repositories with
their working trees sharing the same path.

[1] https://github.com/RichiH/vcsh


v4 of the series is here:
https://public-inbox.org/git/20180824132951.8000-1-ao2@ao2.it/

v3 of the series is here:
https://public-inbox.org/git/20180814110525.17801-1-ao2@ao2.it/

v2 of the series is here:
https://public-inbox.org/git/20180802134634.10300-1-ao2@ao2.it/

v1 of the series, with some background info, is here:
https://public-inbox.org/git/20180514105823.8378-1-ao2@ao2.it/

v5 only contains some small cosmetic fixes suggested by Ævar Arnfjörð
Bjarmason, I ignored the comment about the memory leak in patch
1 because I think there is no leak in how git_config_parse_key is used:
https://public-inbox.org/git/87wosfesxl.fsf@evledraar.gmail.com/

Changes since v4:

  * Improve names of local variables in patch 1.

  * Move new functions definitions in patches 1 and 2 before
    a pre-existing comment instead of after it, as the comment does not
    apply to the new functions.

I am repeating the previous changelog below for your convenience, as v3
is what is currently in pu/ao/submodule-wo-gitmodules-checked-out.

Changes between v3 and v4:

  * Improve robustness of current tests in t7411-submodule-config.sh:
      - merge two tests that check for effects of the same commit.
      - reset to a well defined point in history when exiting the tests.

  * Fix style issues in new tests added to t7411-submodule-config.sh:
      - use test_when_finished in new tests.
      - name the output file 'expect' instead of 'expected'.

  * Add a new "submodule--helper config --check-writeable" command.

  * Use "s--h config --check-wrteable" in git-submodule.sh to share the
    code with the C implementation instead of duplicating the safety
    check in shell script.

  * Add the ability to read .gitmodules from the index and then
    fall-back to the current branch if the file is not in the index.
    Add also more tests to validate all the possible scenarios.

  * Fix style issues in t7416-submodule-sparse-gitmodules.sh:
      - name the output file 'expect' instead of 'expected'.
      - remove white space after the redirection operator.
      - indent the HEREDOC block.
      - use "git -C super" instead of a subshell when there is only one
        command in the test.

  * Remove a stale file named 'new' which erroneously slipped in
    a commit.

  * Update some comments and commit messages.


Thank you,
   Antonio


Antonio Ospite (9):
  submodule: add a print_config_from_gitmodules() helper
  submodule: factor out a config_set_in_gitmodules_file_gently function
  t7411: merge tests 5 and 6
  t7411: be nicer to future tests and really clean things up
  submodule--helper: add a new 'config' subcommand
  submodule: use the 'submodule--helper config' command
  t7506: clean up .gitmodules properly before setting up new scenario
  submodule: add a helper to check if it is safe to write to .gitmodules
  submodule: support reading .gitmodules when it's not in the working
    tree

 builtin/submodule--helper.c            |  40 +++++++++
 cache.h                                |   2 +
 git-submodule.sh                       |  13 ++-
 submodule-config.c                     |  55 ++++++++++++-
 submodule-config.h                     |   2 +
 submodule.c                            |  28 +++++--
 submodule.h                            |   1 +
 t/t7411-submodule-config.sh            | 107 +++++++++++++++++++++----
 t/t7416-submodule-sparse-gitmodules.sh |  78 ++++++++++++++++++
 t/t7506-status-submodule.sh            |   3 +-
 10 files changed, 300 insertions(+), 29 deletions(-)
 create mode 100755 t/t7416-submodule-sparse-gitmodules.sh

-- 
Antonio Ospite
https://ao2.it
https://twitter.com/ao2it

A: Because it messes up the order in which people normally read text.
   See http://en.wikipedia.org/wiki/Posting_style
Q: Why is top-posting such a bad thing?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v5 1/9] submodule: add a print_config_from_gitmodules() helper
  2018-09-17 14:09 [PATCH v5 0/9] Make submodules work if .gitmodules is not checked out Antonio Ospite
@ 2018-09-17 14:09 ` Antonio Ospite
  2018-09-24 10:25   ` Antonio Ospite
  2018-09-17 14:09 ` [PATCH v5 2/9] submodule: factor out a config_set_in_gitmodules_file_gently function Antonio Ospite
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 23+ messages in thread
From: Antonio Ospite @ 2018-09-17 14:09 UTC (permalink / raw)
  To: gitster
  Cc: git, Brandon Williams, Daniel Graña, Jonathan Nieder,
	Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason, Antonio Ospite

Add a new print_config_from_gitmodules() helper function to print values
from .gitmodules just like "git config -f .gitmodules" would.

This will be used by a new submodule--helper subcommand to be able to
access the .gitmodules file in a more controlled way.

Signed-off-by: Antonio Ospite <ao2@ao2.it>
---
 submodule-config.c | 25 +++++++++++++++++++++++++
 submodule-config.h |  1 +
 2 files changed, 26 insertions(+)

diff --git a/submodule-config.c b/submodule-config.c
index fc2c41b947..f70b7f1baf 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -682,6 +682,31 @@ void submodule_free(struct repository *r)
 		submodule_cache_clear(r->submodule_cache);
 }
 
+static int config_print_callback(const char *var, const char *value, void *cb_data)
+{
+	char *wanted_key = cb_data;
+
+	if (!strcmp(wanted_key, var))
+		printf("%s\n", value);
+
+	return 0;
+}
+
+int print_config_from_gitmodules(const char *key)
+{
+	int ret;
+	char *store_key;
+
+	ret = git_config_parse_key(key, &store_key, NULL);
+	if (ret < 0)
+		return CONFIG_INVALID_KEY;
+
+	config_from_gitmodules(config_print_callback, the_repository, store_key);
+
+	free(store_key);
+	return 0;
+}
+
 struct fetch_config {
 	int *max_children;
 	int *recurse_submodules;
diff --git a/submodule-config.h b/submodule-config.h
index dc7278eea4..dd7f1b9a46 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -48,6 +48,7 @@ const struct submodule *submodule_from_path(struct repository *r,
 					    const struct object_id *commit_or_tree,
 					    const char *path);
 void submodule_free(struct repository *r);
+int print_config_from_gitmodules(const char *key);
 
 /*
  * Returns 0 if the name is syntactically acceptable as a submodule "name"
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v5 2/9] submodule: factor out a config_set_in_gitmodules_file_gently function
  2018-09-17 14:09 [PATCH v5 0/9] Make submodules work if .gitmodules is not checked out Antonio Ospite
  2018-09-17 14:09 ` [PATCH v5 1/9] submodule: add a print_config_from_gitmodules() helper Antonio Ospite
@ 2018-09-17 14:09 ` Antonio Ospite
  2018-09-17 14:09 ` [PATCH v5 3/9] t7411: merge tests 5 and 6 Antonio Ospite
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Antonio Ospite @ 2018-09-17 14:09 UTC (permalink / raw)
  To: gitster
  Cc: git, Brandon Williams, Daniel Graña, Jonathan Nieder,
	Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason, Antonio Ospite

Introduce a new config_set_in_gitmodules_file_gently() function to write
config values to the .gitmodules file.

This is in preparation for a future change which will use the function
to write to the .gitmodules file in a more controlled way instead of
using "git config -f .gitmodules".

The purpose of the change is mainly to centralize the code that writes
to the .gitmodules file to avoid some duplication.

The naming follows git_config_set_in_file_gently() but the git_ prefix
is removed to communicate that this is not a generic git-config API.

Signed-off-by: Antonio Ospite <ao2@ao2.it>
---
 submodule-config.c | 12 ++++++++++++
 submodule-config.h |  1 +
 submodule.c        | 10 +++-------
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/submodule-config.c b/submodule-config.c
index f70b7f1baf..61a555e920 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -707,6 +707,18 @@ int print_config_from_gitmodules(const char *key)
 	return 0;
 }
 
+int config_set_in_gitmodules_file_gently(const char *key, const char *value)
+{
+	int ret;
+
+	ret = git_config_set_in_file_gently(GITMODULES_FILE, key, value);
+	if (ret < 0)
+		/* Maybe the user already did that, don't error out here */
+		warning(_("Could not update .gitmodules entry %s"), key);
+
+	return ret;
+}
+
 struct fetch_config {
 	int *max_children;
 	int *recurse_submodules;
diff --git a/submodule-config.h b/submodule-config.h
index dd7f1b9a46..3921927aa1 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -49,6 +49,7 @@ const struct submodule *submodule_from_path(struct repository *r,
 					    const char *path);
 void submodule_free(struct repository *r);
 int print_config_from_gitmodules(const char *key);
+int config_set_in_gitmodules_file_gently(const char *key, const char *value);
 
 /*
  * Returns 0 if the name is syntactically acceptable as a submodule "name"
diff --git a/submodule.c b/submodule.c
index a2b266fbfa..2e97032f86 100644
--- a/submodule.c
+++ b/submodule.c
@@ -89,6 +89,7 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
 {
 	struct strbuf entry = STRBUF_INIT;
 	const struct submodule *submodule;
+	int ret;
 
 	if (!file_exists(GITMODULES_FILE)) /* Do nothing without .gitmodules */
 		return -1;
@@ -104,14 +105,9 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
 	strbuf_addstr(&entry, "submodule.");
 	strbuf_addstr(&entry, submodule->name);
 	strbuf_addstr(&entry, ".path");
-	if (git_config_set_in_file_gently(GITMODULES_FILE, entry.buf, newpath) < 0) {
-		/* Maybe the user already did that, don't error out here */
-		warning(_("Could not update .gitmodules entry %s"), entry.buf);
-		strbuf_release(&entry);
-		return -1;
-	}
+	ret = config_set_in_gitmodules_file_gently(entry.buf, newpath);
 	strbuf_release(&entry);
-	return 0;
+	return ret;
 }
 
 /*
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v5 3/9] t7411: merge tests 5 and 6
  2018-09-17 14:09 [PATCH v5 0/9] Make submodules work if .gitmodules is not checked out Antonio Ospite
  2018-09-17 14:09 ` [PATCH v5 1/9] submodule: add a print_config_from_gitmodules() helper Antonio Ospite
  2018-09-17 14:09 ` [PATCH v5 2/9] submodule: factor out a config_set_in_gitmodules_file_gently function Antonio Ospite
@ 2018-09-17 14:09 ` Antonio Ospite
  2018-09-17 14:09 ` [PATCH v5 4/9] t7411: be nicer to future tests and really clean things up Antonio Ospite
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Antonio Ospite @ 2018-09-17 14:09 UTC (permalink / raw)
  To: gitster
  Cc: git, Brandon Williams, Daniel Graña, Jonathan Nieder,
	Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason, Antonio Ospite

Tests 5 and 6 check for the effects of the same commit, merge the two
tests to make it more straightforward to clean things up after the test
has finished.

The cleanup will be added in a future commit.

Signed-off-by: Antonio Ospite <ao2@ao2.it>
---
 t/t7411-submodule-config.sh | 18 +++++-------------
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
index 0bde5850ac..f2cd1f4a2c 100755
--- a/t/t7411-submodule-config.sh
+++ b/t/t7411-submodule-config.sh
@@ -82,29 +82,21 @@ Submodule name: 'a' for path 'b'
 Submodule name: 'submodule' for path 'submodule'
 EOF
 
-test_expect_success 'error in one submodule config lets continue' '
+test_expect_success 'error in history of one submodule config lets continue, stderr message contains blob ref' '
 	(cd super &&
 		cp .gitmodules .gitmodules.bak &&
 		echo "	value = \"" >>.gitmodules &&
 		git add .gitmodules &&
 		mv .gitmodules.bak .gitmodules &&
 		git commit -m "add error" &&
-		test-tool submodule-config \
-			HEAD b \
-			HEAD submodule \
-				>actual &&
-		test_cmp expect_error actual
-	)
-'
-
-test_expect_success 'error message contains blob reference' '
-	(cd super &&
 		sha1=$(git rev-parse HEAD) &&
 		test-tool submodule-config \
 			HEAD b \
 			HEAD submodule \
-				2>actual_err &&
-		test_i18ngrep "submodule-blob $sha1:.gitmodules" actual_err >/dev/null
+				>actual \
+				2>actual_stderr &&
+		test_cmp expect_error actual &&
+		test_i18ngrep "submodule-blob $sha1:.gitmodules" actual_stderr >/dev/null
 	)
 '
 
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v5 4/9] t7411: be nicer to future tests and really clean things up
  2018-09-17 14:09 [PATCH v5 0/9] Make submodules work if .gitmodules is not checked out Antonio Ospite
                   ` (2 preceding siblings ...)
  2018-09-17 14:09 ` [PATCH v5 3/9] t7411: merge tests 5 and 6 Antonio Ospite
@ 2018-09-17 14:09 ` Antonio Ospite
  2018-09-17 14:09 ` [PATCH v5 5/9] submodule--helper: add a new 'config' subcommand Antonio Ospite
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Antonio Ospite @ 2018-09-17 14:09 UTC (permalink / raw)
  To: gitster
  Cc: git, Brandon Williams, Daniel Graña, Jonathan Nieder,
	Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason, Antonio Ospite

Tests 5 and 7 in t/t7411-submodule-config.sh add two commits with
invalid lines in .gitmodules but then only the second commit is removed.

This may affect future subsequent tests if they assume that the
.gitmodules file has no errors.

Remove both the commits as soon as they are not needed anymore.

Signed-off-by: Antonio Ospite <ao2@ao2.it>
---
 t/t7411-submodule-config.sh | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
index f2cd1f4a2c..b1f3c6489b 100755
--- a/t/t7411-submodule-config.sh
+++ b/t/t7411-submodule-config.sh
@@ -83,6 +83,8 @@ Submodule name: 'submodule' for path 'submodule'
 EOF
 
 test_expect_success 'error in history of one submodule config lets continue, stderr message contains blob ref' '
+	ORIG=$(git -C super rev-parse HEAD) &&
+	test_when_finished "git -C super reset --hard $ORIG" &&
 	(cd super &&
 		cp .gitmodules .gitmodules.bak &&
 		echo "	value = \"" >>.gitmodules &&
@@ -115,6 +117,8 @@ test_expect_success 'using different treeishs works' '
 '
 
 test_expect_success 'error in history in fetchrecursesubmodule lets continue' '
+	ORIG=$(git -C super rev-parse HEAD) &&
+	test_when_finished "git -C super reset --hard $ORIG" &&
 	(cd super &&
 		git config -f .gitmodules \
 			submodule.submodule.fetchrecursesubmodules blabla &&
@@ -126,8 +130,7 @@ test_expect_success 'error in history in fetchrecursesubmodule lets continue' '
 			HEAD b \
 			HEAD submodule \
 				>actual &&
-		test_cmp expect_error actual  &&
-		git reset --hard HEAD^
+		test_cmp expect_error actual
 	)
 '
 
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v5 5/9] submodule--helper: add a new 'config' subcommand
  2018-09-17 14:09 [PATCH v5 0/9] Make submodules work if .gitmodules is not checked out Antonio Ospite
                   ` (3 preceding siblings ...)
  2018-09-17 14:09 ` [PATCH v5 4/9] t7411: be nicer to future tests and really clean things up Antonio Ospite
@ 2018-09-17 14:09 ` Antonio Ospite
  2018-09-17 14:09 ` [PATCH v5 6/9] submodule: use the 'submodule--helper config' command Antonio Ospite
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Antonio Ospite @ 2018-09-17 14:09 UTC (permalink / raw)
  To: gitster
  Cc: git, Brandon Williams, Daniel Graña, Jonathan Nieder,
	Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason, Antonio Ospite

Add a new 'config' subcommand to 'submodule--helper', this extra level
of indirection makes it possible to add some flexibility to how the
submodules configuration is handled.

Signed-off-by: Antonio Ospite <ao2@ao2.it>
---
 builtin/submodule--helper.c | 14 ++++++++++++++
 t/t7411-submodule-config.sh | 27 +++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f6fb8991f3..80f939cd9e 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2003,6 +2003,19 @@ static int check_name(int argc, const char **argv, const char *prefix)
 	return 0;
 }
 
+static int module_config(int argc, const char **argv, const char *prefix)
+{
+	/* Equivalent to ACTION_GET in builtin/config.c */
+	if (argc == 2)
+		return print_config_from_gitmodules(argv[1]);
+
+	/* Equivalent to ACTION_SET in builtin/config.c */
+	if (argc == 3)
+		return config_set_in_gitmodules_file_gently(argv[1], argv[2]);
+
+	die("submodule--helper config takes 1 or 2 arguments: name [value]");
+}
+
 #define SUPPORT_SUPER_PREFIX (1<<0)
 
 struct cmd_struct {
@@ -2030,6 +2043,7 @@ static struct cmd_struct commands[] = {
 	{"absorb-git-dirs", absorb_git_dirs, SUPPORT_SUPER_PREFIX},
 	{"is-active", is_active, 0},
 	{"check-name", check_name, 0},
+	{"config", module_config, 0},
 };
 
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
index b1f3c6489b..791245f18d 100755
--- a/t/t7411-submodule-config.sh
+++ b/t/t7411-submodule-config.sh
@@ -134,4 +134,31 @@ test_expect_success 'error in history in fetchrecursesubmodule lets continue' '
 	)
 '
 
+test_expect_success 'reading submodules config with "submodule--helper config"' '
+	(cd super &&
+		echo "../submodule" >expect &&
+		git submodule--helper config submodule.submodule.url >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'writing submodules config with "submodule--helper config"' '
+	(cd super &&
+		echo "new_url" >expect &&
+		git submodule--helper config submodule.submodule.url "new_url" &&
+		git submodule--helper config submodule.submodule.url >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'overwriting unstaged submodules config with "submodule--helper config"' '
+	test_when_finished "git -C super checkout .gitmodules" &&
+	(cd super &&
+		echo "newer_url" >expect &&
+		git submodule--helper config submodule.submodule.url "newer_url" &&
+		git submodule--helper config submodule.submodule.url >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v5 6/9] submodule: use the 'submodule--helper config' command
  2018-09-17 14:09 [PATCH v5 0/9] Make submodules work if .gitmodules is not checked out Antonio Ospite
                   ` (4 preceding siblings ...)
  2018-09-17 14:09 ` [PATCH v5 5/9] submodule--helper: add a new 'config' subcommand Antonio Ospite
@ 2018-09-17 14:09 ` Antonio Ospite
  2018-09-17 14:09 ` [PATCH v5 7/9] t7506: clean up .gitmodules properly before setting up new scenario Antonio Ospite
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 23+ messages in thread
From: Antonio Ospite @ 2018-09-17 14:09 UTC (permalink / raw)
  To: gitster
  Cc: git, Brandon Williams, Daniel Graña, Jonathan Nieder,
	Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason, Antonio Ospite

Use the 'submodule--helper config' command in git-submodules.sh to avoid
referring explicitly to .gitmodules by the hardcoded file path.

This makes it possible to access the submodules configuration in a more
controlled way.

Signed-off-by: Antonio Ospite <ao2@ao2.it>
---
 git-submodule.sh | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 1cb2c0a31b..25b9bc58cb 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -72,7 +72,7 @@ get_submodule_config () {
 	value=$(git config submodule."$name"."$option")
 	if test -z "$value"
 	then
-		value=$(git config -f .gitmodules submodule."$name"."$option")
+		value=$(git submodule--helper config submodule."$name"."$option")
 	fi
 	printf '%s' "${value:-$default}"
 }
@@ -283,11 +283,11 @@ or you are unsure what this means choose another name with the '--name' option."
 	git add --no-warn-embedded-repo $force "$sm_path" ||
 	die "$(eval_gettext "Failed to add submodule '\$sm_path'")"
 
-	git config -f .gitmodules submodule."$sm_name".path "$sm_path" &&
-	git config -f .gitmodules submodule."$sm_name".url "$repo" &&
+	git submodule--helper config submodule."$sm_name".path "$sm_path" &&
+	git submodule--helper config submodule."$sm_name".url "$repo" &&
 	if test -n "$branch"
 	then
-		git config -f .gitmodules submodule."$sm_name".branch "$branch"
+		git submodule--helper config submodule."$sm_name".branch "$branch"
 	fi &&
 	git add --force .gitmodules ||
 	die "$(eval_gettext "Failed to register submodule '\$sm_path'")"
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v5 7/9] t7506: clean up .gitmodules properly before setting up new scenario
  2018-09-17 14:09 [PATCH v5 0/9] Make submodules work if .gitmodules is not checked out Antonio Ospite
                   ` (5 preceding siblings ...)
  2018-09-17 14:09 ` [PATCH v5 6/9] submodule: use the 'submodule--helper config' command Antonio Ospite
@ 2018-09-17 14:09 ` Antonio Ospite
  2018-09-17 14:09 ` [PATCH v5 8/9] submodule: add a helper to check if it is safe to write to .gitmodules Antonio Ospite
  2018-09-17 14:09 ` [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree Antonio Ospite
  8 siblings, 0 replies; 23+ messages in thread
From: Antonio Ospite @ 2018-09-17 14:09 UTC (permalink / raw)
  To: gitster
  Cc: git, Brandon Williams, Daniel Graña, Jonathan Nieder,
	Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason, Antonio Ospite

In t/t7506-status-submodule.sh at some point a new scenario is set up to
test different things, in particular new submodules are added which are
meant to completely replace the previous ones.

However before calling the "git submodule add" commands for the new
layout, the .gitmodules file is removed only from the working tree still
leaving the previous content in current branch.

This can break if, in the future, "git submodule add" starts
differentiating between the following two cases:

  - .gitmodules is not in the working tree but it is in the current
    branch (it may not be safe to add new submodules in this case);

  - .gitmodules is neither in the working tree nor anywhere in the
    current branch (it is safe to add new submodules).

Since the test intends to get rid of .gitmodules anyways, let's
completely remove it from the current branch, to actually start afresh
in the new scenario.

This is more future-proof and does not break current tests.

Signed-off-by: Antonio Ospite <ao2@ao2.it>
---
 t/t7506-status-submodule.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/t/t7506-status-submodule.sh b/t/t7506-status-submodule.sh
index 943708fb04..08629a6e70 100755
--- a/t/t7506-status-submodule.sh
+++ b/t/t7506-status-submodule.sh
@@ -325,7 +325,8 @@ test_expect_success 'setup superproject with untracked file in nested submodule'
 	(
 		cd super &&
 		git clean -dfx &&
-		rm .gitmodules &&
+		git rm .gitmodules &&
+		git commit -m "remove .gitmodules" &&
 		git submodule add -f ./sub1 &&
 		git submodule add -f ./sub2 &&
 		git submodule add -f ./sub1 sub3 &&
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v5 8/9] submodule: add a helper to check if it is safe to write to .gitmodules
  2018-09-17 14:09 [PATCH v5 0/9] Make submodules work if .gitmodules is not checked out Antonio Ospite
                   ` (6 preceding siblings ...)
  2018-09-17 14:09 ` [PATCH v5 7/9] t7506: clean up .gitmodules properly before setting up new scenario Antonio Ospite
@ 2018-09-17 14:09 ` Antonio Ospite
  2018-09-17 14:09 ` [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree Antonio Ospite
  8 siblings, 0 replies; 23+ messages in thread
From: Antonio Ospite @ 2018-09-17 14:09 UTC (permalink / raw)
  To: gitster
  Cc: git, Brandon Williams, Daniel Graña, Jonathan Nieder,
	Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason, Antonio Ospite

Introduce a helper function named is_writing_gitmodules_ok() to verify
that the .gitmodules file is safe to write.

The function name follows the scheme of is_staging_gitmodules_ok().

The two symbolic constants GITMODULES_INDEX and GITMODULES_HEAD are used
to get help from the C preprocessor in preventing typos, especially for
future users.

This is in preparation for a future change which teaches git how to read
.gitmodules from the index or from the current branch if the file is not
available in the working tree.

The rationale behind the check is that writing to .gitmodules requires
the file to be present in the working tree, unless a brand new
.gitmodules is being created (in which case the .gitmodules file would
not exist at all: neither in the working tree nor in the index or in the
current branch).

Expose the functionality also via a "submodule-helper config
--check-writeable" command, as git scripts may want to perform the check
before modifying submodules configuration.

Signed-off-by: Antonio Ospite <ao2@ao2.it>
---
 builtin/submodule--helper.c | 24 +++++++++++++++++++++++-
 cache.h                     |  2 ++
 submodule.c                 | 18 ++++++++++++++++++
 submodule.h                 |  1 +
 t/t7411-submodule-config.sh | 31 +++++++++++++++++++++++++++++++
 5 files changed, 75 insertions(+), 1 deletion(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 80f939cd9e..bd14f57d00 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2005,6 +2005,28 @@ static int check_name(int argc, const char **argv, const char *prefix)
 
 static int module_config(int argc, const char **argv, const char *prefix)
 {
+	enum {
+		CHECK_WRITEABLE = 1
+	} command = 0;
+
+	struct option module_config_options[] = {
+		OPT_CMDMODE(0, "check-writeable", &command,
+			    N_("check if it is safe to write to the .gitmodules file"),
+			    CHECK_WRITEABLE),
+		OPT_END()
+	};
+	const char *const git_submodule_helper_usage[] = {
+		N_("git submodule--helper config name [value]"),
+		N_("git submodule--helper config --check-writeable"),
+		NULL
+	};
+
+	argc = parse_options(argc, argv, prefix, module_config_options,
+			     git_submodule_helper_usage, PARSE_OPT_KEEP_ARGV0);
+
+	if (argc == 1 && command == CHECK_WRITEABLE)
+		return is_writing_gitmodules_ok() ? 0 : -1;
+
 	/* Equivalent to ACTION_GET in builtin/config.c */
 	if (argc == 2)
 		return print_config_from_gitmodules(argv[1]);
@@ -2013,7 +2035,7 @@ static int module_config(int argc, const char **argv, const char *prefix)
 	if (argc == 3)
 		return config_set_in_gitmodules_file_gently(argv[1], argv[2]);
 
-	die("submodule--helper config takes 1 or 2 arguments: name [value]");
+	usage_with_options(git_submodule_helper_usage, module_config_options);
 }
 
 #define SUPPORT_SUPER_PREFIX (1<<0)
diff --git a/cache.h b/cache.h
index 4d014541ab..33723d2a32 100644
--- a/cache.h
+++ b/cache.h
@@ -486,6 +486,8 @@ static inline enum object_type object_type(unsigned int mode)
 #define INFOATTRIBUTES_FILE "info/attributes"
 #define ATTRIBUTE_MACRO_PREFIX "[attr]"
 #define GITMODULES_FILE ".gitmodules"
+#define GITMODULES_INDEX ":.gitmodules"
+#define GITMODULES_HEAD "HEAD:.gitmodules"
 #define GIT_NOTES_REF_ENVIRONMENT "GIT_NOTES_REF"
 #define GIT_NOTES_DEFAULT_REF "refs/notes/commits"
 #define GIT_NOTES_DISPLAY_REF_ENVIRONMENT "GIT_NOTES_DISPLAY_REF"
diff --git a/submodule.c b/submodule.c
index 2e97032f86..2b7082b2db 100644
--- a/submodule.c
+++ b/submodule.c
@@ -50,6 +50,24 @@ int is_gitmodules_unmerged(const struct index_state *istate)
 	return 0;
 }
 
+/*
+ * Check if the .gitmodules file is safe to write.
+ *
+ * Writing to the .gitmodules file requires that the file exists in the
+ * working tree or, if it doesn't, that a brand new .gitmodules file is going
+ * to be created (i.e. it's neither in the index nor in the current branch).
+ *
+ * It is not safe to write to .gitmodules if it's not in the working tree but
+ * it is in the index or in the current branch, because writing new values
+ * (and staging them) would blindly overwrite ALL the old content.
+ */
+int is_writing_gitmodules_ok(void)
+{
+	struct object_id oid;
+	return file_exists(GITMODULES_FILE) ||
+		(get_oid(GITMODULES_INDEX, &oid) < 0 && get_oid(GITMODULES_HEAD, &oid) < 0);
+}
+
 /*
  * Check if the .gitmodules file has unstaged modifications.  This must be
  * checked before allowing modifications to the .gitmodules file with the
diff --git a/submodule.h b/submodule.h
index e452919aa4..7a22f71cb9 100644
--- a/submodule.h
+++ b/submodule.h
@@ -40,6 +40,7 @@ struct submodule_update_strategy {
 #define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}
 
 int is_gitmodules_unmerged(const struct index_state *istate);
+int is_writing_gitmodules_ok(void);
 int is_staging_gitmodules_ok(struct index_state *istate);
 int update_path_in_gitmodules(const char *oldpath, const char *newpath);
 int remove_path_from_gitmodules(const char *path);
diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
index 791245f18d..45953f9300 100755
--- a/t/t7411-submodule-config.sh
+++ b/t/t7411-submodule-config.sh
@@ -161,4 +161,35 @@ test_expect_success 'overwriting unstaged submodules config with "submodule--hel
 	)
 '
 
+test_expect_success 'writeable .gitmodules when it is in the working tree' '
+	git -C super submodule--helper config --check-writeable
+'
+
+test_expect_success 'writeable .gitmodules when it is nowhere in the repository' '
+	ORIG=$(git -C super rev-parse HEAD) &&
+	test_when_finished "git -C super reset --hard $ORIG" &&
+	(cd super &&
+		git rm .gitmodules &&
+		git commit -m "remove .gitmodules from the current branch" &&
+		git submodule--helper config --check-writeable
+	)
+'
+
+test_expect_success 'non-writeable .gitmodules when it is in the index but not in the working tree' '
+	test_when_finished "git -C super checkout .gitmodules" &&
+	(cd super &&
+		rm -f .gitmodules &&
+		test_must_fail git submodule--helper config --check-writeable
+	)
+'
+
+test_expect_success 'non-writeable .gitmodules when it is in the current branch but not in the index' '
+	ORIG=$(git -C super rev-parse HEAD) &&
+	test_when_finished "git -C super reset --hard $ORIG" &&
+	(cd super &&
+		git rm .gitmodules &&
+		test_must_fail git submodule--helper config --check-writeable
+	)
+'
+
 test_done
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
  2018-09-17 14:09 [PATCH v5 0/9] Make submodules work if .gitmodules is not checked out Antonio Ospite
                   ` (7 preceding siblings ...)
  2018-09-17 14:09 ` [PATCH v5 8/9] submodule: add a helper to check if it is safe to write to .gitmodules Antonio Ospite
@ 2018-09-17 14:09 ` Antonio Ospite
  2018-09-18 17:12   ` SZEDER Gábor
  8 siblings, 1 reply; 23+ messages in thread
From: Antonio Ospite @ 2018-09-17 14:09 UTC (permalink / raw)
  To: gitster
  Cc: git, Brandon Williams, Daniel Graña, Jonathan Nieder,
	Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason, Antonio Ospite

When the .gitmodules file is not available in the working tree, try
using the content from the index and from the current branch. This
covers the case when the file is part of the repository but for some
reason it is not checked out, for example because of a sparse checkout.

This makes it possible to use at least the 'git submodule' commands
which *read* the gitmodules configuration file without fully populating
the working tree.

Writing to .gitmodules will still require that the file is checked out,
so check for that before calling config_set_in_gitmodules_file_gently.

Add a similar check also in git-submodule.sh::cmd_add() to anticipate
the eventual failure of the "git submodule add" command when .gitmodules
is not safely writeable; this prevents the command from leaving the
repository in a spurious state (e.g. the submodule repository was cloned
but .gitmodules was not updated because
config_set_in_gitmodules_file_gently failed).

Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading
from .gitmodules succeeds and that writing to it fails when the file is
not checked out.

Signed-off-by: Antonio Ospite <ao2@ao2.it>
---
 builtin/submodule--helper.c            |  6 +-
 git-submodule.sh                       |  5 ++
 submodule-config.c                     | 18 +++++-
 t/t7411-submodule-config.sh            | 26 ++++++++-
 t/t7416-submodule-sparse-gitmodules.sh | 78 ++++++++++++++++++++++++++
 5 files changed, 129 insertions(+), 4 deletions(-)
 create mode 100755 t/t7416-submodule-sparse-gitmodules.sh

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index bd14f57d00..f72f6ee58a 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -2032,8 +2032,12 @@ static int module_config(int argc, const char **argv, const char *prefix)
 		return print_config_from_gitmodules(argv[1]);
 
 	/* Equivalent to ACTION_SET in builtin/config.c */
-	if (argc == 3)
+	if (argc == 3) {
+		if (!is_writing_gitmodules_ok())
+			die(_("please make sure that the .gitmodules file is in the working tree"));
+
 		return config_set_in_gitmodules_file_gently(argv[1], argv[2]);
+	}
 
 	usage_with_options(git_submodule_helper_usage, module_config_options);
 }
diff --git a/git-submodule.sh b/git-submodule.sh
index 25b9bc58cb..bff855f54a 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -159,6 +159,11 @@ cmd_add()
 		shift
 	done
 
+	if ! git submodule--helper config --check-writeable >/dev/null 2>&1
+	then
+		 die "$(eval_gettext "please make sure that the .gitmodules file is in the working tree")"
+	fi
+
 	if test -n "$reference_path"
 	then
 		is_absolute_path "$reference_path" ||
diff --git a/submodule-config.c b/submodule-config.c
index 61a555e920..bdb1d0e2c9 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -1,4 +1,5 @@
 #include "cache.h"
+#include "dir.h"
 #include "repository.h"
 #include "config.h"
 #include "submodule-config.h"
@@ -603,8 +604,21 @@ static void submodule_cache_check_init(struct repository *repo)
 static void config_from_gitmodules(config_fn_t fn, struct repository *repo, void *data)
 {
 	if (repo->worktree) {
-		char *file = repo_worktree_path(repo, GITMODULES_FILE);
-		git_config_from_file(fn, file, data);
+		struct git_config_source config_source = { 0 };
+		const struct config_options opts = { 0 };
+		struct object_id oid;
+		char *file;
+
+		file = repo_worktree_path(repo, GITMODULES_FILE);
+		if (file_exists(file))
+			config_source.file = file;
+		else if (get_oid(GITMODULES_INDEX, &oid) >= 0)
+			config_source.blob = GITMODULES_INDEX;
+		else if (get_oid(GITMODULES_HEAD, &oid) >= 0)
+			config_source.blob = GITMODULES_HEAD;
+
+		config_with_options(fn, data, &config_source, &opts);
+
 		free(file);
 	}
 }
diff --git a/t/t7411-submodule-config.sh b/t/t7411-submodule-config.sh
index 45953f9300..2cfabb18bc 100755
--- a/t/t7411-submodule-config.sh
+++ b/t/t7411-submodule-config.sh
@@ -134,7 +134,7 @@ test_expect_success 'error in history in fetchrecursesubmodule lets continue' '
 	)
 '
 
-test_expect_success 'reading submodules config with "submodule--helper config"' '
+test_expect_success 'reading submodules config from the working tree with "submodule--helper config"' '
 	(cd super &&
 		echo "../submodule" >expect &&
 		git submodule--helper config submodule.submodule.url >actual &&
@@ -192,4 +192,28 @@ test_expect_success 'non-writeable .gitmodules when it is in the current branch
 	)
 '
 
+test_expect_success 'reading submodules config from the index when .gitmodules is not in the working tree' '
+	ORIG=$(git -C super rev-parse HEAD) &&
+	test_when_finished "git -C super reset --hard $ORIG" &&
+	(cd super &&
+		git submodule--helper config submodule.submodule.url "staged_url" &&
+		git add .gitmodules &&
+		rm -f .gitmodules &&
+		echo "staged_url" >expect &&
+		git submodule--helper config submodule.submodule.url >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'reading submodules config from the current branch when .gitmodules is not in the index' '
+	ORIG=$(git -C super rev-parse HEAD) &&
+	test_when_finished "git -C super reset --hard $ORIG" &&
+	(cd super &&
+		git rm .gitmodules &&
+		echo "../submodule" >expect &&
+		git submodule--helper config submodule.submodule.url >actual &&
+		test_cmp expect actual
+	)
+'
+
 test_done
diff --git a/t/t7416-submodule-sparse-gitmodules.sh b/t/t7416-submodule-sparse-gitmodules.sh
new file mode 100755
index 0000000000..908a4e6958
--- /dev/null
+++ b/t/t7416-submodule-sparse-gitmodules.sh
@@ -0,0 +1,78 @@
+#!/bin/sh
+#
+# Copyright (C) 2018  Antonio Ospite <ao2@ao2.it>
+#
+
+test_description='Test reading/writing .gitmodules when not in the working tree
+
+This test verifies that, when .gitmodules is in the current branch but is not
+in the working tree reading from it still works but writing to it does not.
+
+The test setup uses a sparse checkout, however the same scenario can be set up
+also by committing .gitmodules and then just removing it from the filesystem.
+'
+
+. ./test-lib.sh
+
+test_expect_success 'sparse checkout setup which hides .gitmodules' '
+	echo file >file &&
+	git add file &&
+	test_tick &&
+	git commit -m upstream &&
+	git clone . super &&
+	git clone super submodule &&
+	git clone super new_submodule &&
+	(cd super &&
+		git submodule add ../submodule &&
+		test_tick &&
+		git commit -m submodule &&
+		cat >.git/info/sparse-checkout <<-\EOF &&
+		/*
+		!/.gitmodules
+		EOF
+		git config core.sparsecheckout true &&
+		git read-tree -m -u HEAD &&
+		test_path_is_missing .gitmodules
+	)
+'
+
+test_expect_success 'reading gitmodules config file when it is not checked out' '
+	(cd super &&
+		echo "../submodule" >expect &&
+		git submodule--helper config submodule.submodule.url >actual &&
+		test_cmp expect actual
+	)
+'
+
+test_expect_success 'not writing gitmodules config file when it is not checked out' '
+	 test_must_fail git -C super submodule--helper config submodule.submodule.url newurl
+'
+
+test_expect_success 'initialising submodule when the gitmodules config is not checked out' '
+	git -C super submodule init
+'
+
+test_expect_success 'showing submodule summary when the gitmodules config is not checked out' '
+	git -C super submodule summary
+'
+
+test_expect_success 'updating submodule when the gitmodules config is not checked out' '
+	(cd submodule &&
+		echo file2 >file2 &&
+		git add file2 &&
+		git commit -m "add file2 to submodule"
+	) &&
+	git -C super submodule update
+'
+
+test_expect_success 'not adding submodules when the gitmodules config is not checked out' '
+	test_must_fail git -C super submodule add ../new_submodule
+'
+
+# This test checks that the previous "git submodule add" did not leave the
+# repository in a spurious state when it failed.
+test_expect_success 'init submodule still works even after the previous add failed' '
+	git -C super submodule init
+'
+
+test_done
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
  2018-09-17 14:09 ` [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree Antonio Ospite
@ 2018-09-18 17:12   ` SZEDER Gábor
  2018-09-19 19:24     ` Junio C Hamano
                       ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: SZEDER Gábor @ 2018-09-18 17:12 UTC (permalink / raw)
  To: Antonio Ospite
  Cc: gitster, git, Brandon Williams, Daniel Graña,
	Jonathan Nieder, Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason

Hi Antonio,

it appears that this patch (and its previous versions as well) is
responsible for triggering occasional test failures in
't7814-grep-recurse-submodules.sh', more frequently, about once in
every ten runs, on macOS on Travis CI, less frequently, about once in
a couple of hundred runs on Linux on my machine.

The reason for the failure is memory corruption manifesting in various
ways: segfault, malloc() or use after free() errors from libc, corrupt
loose object, invalid ref, bogus output, etc.

Applying the following patch makes t7814 fail almost every time,
though sometimes that loop has to iterate over 1000 times until that
'git grep' finally fails...  so good luck with debugging ;)

diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 7184113b9b..93ae2e8e7c 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -337,6 +337,10 @@ test_expect_success 'grep --recurse-submodules should pass the pattern type alon
 	test_must_fail git -c grep.patternType=fixed grep --recurse-submodules -e "(.|.)[\d]" &&
 
 	# Basic
+	for i in $(seq 0 2000)
+	do
+		git grep --recurse-submodules 1 >/dev/null || return 1
+	done &&
 	git grep -G --recurse-submodules -e "(.|.)[\d]" >actual &&
 	cat >expect <<-\EOF &&
 	a:(1|2)d(3|4)


On first look I didn't notice anything that is obviously wrong in this
patch and could be responsible for the memory corruption, but there is
one thing I found strange, though:


On Mon, Sep 17, 2018 at 04:09:40PM +0200, Antonio Ospite wrote:
> When the .gitmodules file is not available in the working tree, try
> using the content from the index and from the current branch.

"from the index and from the current branch" of which repository?

> This
> covers the case when the file is part of the repository but for some
> reason it is not checked out, for example because of a sparse checkout.
> 
> This makes it possible to use at least the 'git submodule' commands
> which *read* the gitmodules configuration file without fully populating
> the working tree.
> 
> Writing to .gitmodules will still require that the file is checked out,
> so check for that before calling config_set_in_gitmodules_file_gently.
> 
> Add a similar check also in git-submodule.sh::cmd_add() to anticipate
> the eventual failure of the "git submodule add" command when .gitmodules
> is not safely writeable; this prevents the command from leaving the
> repository in a spurious state (e.g. the submodule repository was cloned
> but .gitmodules was not updated because
> config_set_in_gitmodules_file_gently failed).
> 
> Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading
> from .gitmodules succeeds and that writing to it fails when the file is
> not checked out.
> 
> Signed-off-by: Antonio Ospite <ao2@ao2.it>
> ---

> diff --git a/submodule-config.c b/submodule-config.c
> index 61a555e920..bdb1d0e2c9 100644
> --- a/submodule-config.c
> +++ b/submodule-config.c

> @@ -603,8 +604,21 @@ static void submodule_cache_check_init(struct repository *repo)
>  static void config_from_gitmodules(config_fn_t fn, struct repository *repo, void *data)
>  {
>  	if (repo->worktree) {
> -		char *file = repo_worktree_path(repo, GITMODULES_FILE);
> -		git_config_from_file(fn, file, data);
> +		struct git_config_source config_source = { 0 };
> +		const struct config_options opts = { 0 };
> +		struct object_id oid;
> +		char *file;
> +
> +		file = repo_worktree_path(repo, GITMODULES_FILE);
> +		if (file_exists(file))
> +			config_source.file = file;
> +		else if (get_oid(GITMODULES_INDEX, &oid) >= 0)
> +			config_source.blob = GITMODULES_INDEX;

The repository used in t7814 contains nested submodules, which means
that config_from_gitmodules() is invoked three times.

Now, the first two of those calls look at the superproject and at
'submodule', and find the existing files '.../trash
directory.t7814-grep-recurse-submodules/.gitmodules' and '.../trash
directory.t7814-grep-recurse-submodules/submodule/.gitmodules',
respectively.  So far so good.

The third call, however, looks at the nested submodule at
'submodule/sub', which doesn't contain a '.gitmodules' file.  So this
function goes on with the second condition and calls
get_oid(GITMODULES_INDEX, &oid), which then appears to find the blob
in the _superproject's_ index.

I'm no expert on submodules, but my gut feeling says that this can't
be right.  But if it _is_ right, then I would say that the commit
message should explain in detail, why it is right.

Anyway, even if it is indeed wrong, I'm not sure whether this is the
root cause of the memory corruption.


> +		else if (get_oid(GITMODULES_HEAD, &oid) >= 0)
> +			config_source.blob = GITMODULES_HEAD;
> +
> +		config_with_options(fn, data, &config_source, &opts);
> +
>  		free(file);
>  	}
>  }

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
  2018-09-18 17:12   ` SZEDER Gábor
@ 2018-09-19 19:24     ` Junio C Hamano
  2018-09-20 15:35     ` Antonio Ospite
  2018-09-24 10:20     ` Antonio Ospite
  2 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2018-09-19 19:24 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Antonio Ospite, git, Daniel Graña, Jonathan Nieder,
	Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason

SZEDER Gábor <szeder.dev@gmail.com> writes:

> it appears that this patch (and its previous versions as well) is
> responsible for triggering occasional test failures in
> 't7814-grep-recurse-submodules.sh', more frequently, about once in
> every ten runs, on macOS on Travis CI, less frequently, about once in
> a couple of hundred runs on Linux on my machine.

I see that among Cc'ed are people who are more familiar with the
submodule code and where it wants to go.  Thanks for a report and
analysis.

> The reason for the failure is memory corruption manifesting in various
> ways: segfault, malloc() or use after free() errors from libc, corrupt
> loose object, invalid ref, bogus output, etc.
> 
> Applying the following patch makes t7814 fail almost every time,
> though sometimes that loop has to iterate over 1000 times until that
> 'git grep' finally fails...  so good luck with debugging ;)
>
> diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
> index 7184113b9b..93ae2e8e7c 100755
> --- a/t/t7814-grep-recurse-submodules.sh
> +++ b/t/t7814-grep-recurse-submodules.sh
> @@ -337,6 +337,10 @@ test_expect_success 'grep --recurse-submodules should pass the pattern type alon
>  	test_must_fail git -c grep.patternType=fixed grep --recurse-submodules -e "(.|.)[\d]" &&
>  
>  	# Basic
> +	for i in $(seq 0 2000)
> +	do
> +		git grep --recurse-submodules 1 >/dev/null || return 1
> +	done &&
>  	git grep -G --recurse-submodules -e "(.|.)[\d]" >actual &&
>  	cat >expect <<-\EOF &&
>  	a:(1|2)d(3|4)
>
> On first look I didn't notice anything that is obviously wrong in this
> patch and could be responsible for the memory corruption, but there is
> one thing I found strange, though:
>
>
> On Mon, Sep 17, 2018 at 04:09:40PM +0200, Antonio Ospite wrote:
>> When the .gitmodules file is not available in the working tree, try
>> using the content from the index and from the current branch.
>
> "from the index and from the current branch" of which repository?
>
>> This
>> covers the case when the file is part of the repository but for some
>> reason it is not checked out, for example because of a sparse checkout.
>> 
>> This makes it possible to use at least the 'git submodule' commands
>> which *read* the gitmodules configuration file without fully populating
>> the working tree.
>> 
>> Writing to .gitmodules will still require that the file is checked out,
>> so check for that before calling config_set_in_gitmodules_file_gently.
>> 
>> Add a similar check also in git-submodule.sh::cmd_add() to anticipate
>> the eventual failure of the "git submodule add" command when .gitmodules
>> is not safely writeable; this prevents the command from leaving the
>> repository in a spurious state (e.g. the submodule repository was cloned
>> but .gitmodules was not updated because
>> config_set_in_gitmodules_file_gently failed).
>> 
>> Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading
>> from .gitmodules succeeds and that writing to it fails when the file is
>> not checked out.
>> 
>> Signed-off-by: Antonio Ospite <ao2@ao2.it>
>> ---
>
>> diff --git a/submodule-config.c b/submodule-config.c
>> index 61a555e920..bdb1d0e2c9 100644
>> --- a/submodule-config.c
>> +++ b/submodule-config.c
>
>> @@ -603,8 +604,21 @@ static void submodule_cache_check_init(struct repository *repo)
>>  static void config_from_gitmodules(config_fn_t fn, struct repository *repo, void *data)
>>  {
>>  	if (repo->worktree) {
>> -		char *file = repo_worktree_path(repo, GITMODULES_FILE);
>> -		git_config_from_file(fn, file, data);
>> +		struct git_config_source config_source = { 0 };
>> +		const struct config_options opts = { 0 };
>> +		struct object_id oid;
>> +		char *file;
>> +
>> +		file = repo_worktree_path(repo, GITMODULES_FILE);
>> +		if (file_exists(file))
>> +			config_source.file = file;
>> +		else if (get_oid(GITMODULES_INDEX, &oid) >= 0)
>> +			config_source.blob = GITMODULES_INDEX;
>
> The repository used in t7814 contains nested submodules, which means
> that config_from_gitmodules() is invoked three times.
>
> Now, the first two of those calls look at the superproject and at
> 'submodule', and find the existing files '.../trash
> directory.t7814-grep-recurse-submodules/.gitmodules' and '.../trash
> directory.t7814-grep-recurse-submodules/submodule/.gitmodules',
> respectively.  So far so good.
>
> The third call, however, looks at the nested submodule at
> 'submodule/sub', which doesn't contain a '.gitmodules' file.  So this
> function goes on with the second condition and calls
> get_oid(GITMODULES_INDEX, &oid), which then appears to find the blob
> in the _superproject's_ index.
>
> I'm no expert on submodules, but my gut feeling says that this can't
> be right.  But if it _is_ right, then I would say that the commit
> message should explain in detail, why it is right.
>
> Anyway, even if it is indeed wrong, I'm not sure whether this is the
> root cause of the memory corruption.
>
>
>> +		else if (get_oid(GITMODULES_HEAD, &oid) >= 0)
>> +			config_source.blob = GITMODULES_HEAD;
>> +
>> +		config_with_options(fn, data, &config_source, &opts);
>> +
>>  		free(file);
>>  	}
>>  }

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
  2018-09-18 17:12   ` SZEDER Gábor
  2018-09-19 19:24     ` Junio C Hamano
@ 2018-09-20 15:35     ` Antonio Ospite
  2018-09-21 16:19       ` Junio C Hamano
  2018-09-24 10:20     ` Antonio Ospite
  2 siblings, 1 reply; 23+ messages in thread
From: Antonio Ospite @ 2018-09-20 15:35 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: gitster, git, Brandon Williams, Daniel Graña,
	Jonathan Nieder, Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason

On Tue, 18 Sep 2018 19:12:57 +0200
SZEDER Gábor <szeder.dev@gmail.com> wrote:

> Hi Antonio,
> 
> it appears that this patch (and its previous versions as well) is
> responsible for triggering occasional test failures in
> 't7814-grep-recurse-submodules.sh', more frequently, about once in
> every ten runs, on macOS on Travis CI, less frequently, about once in
> a couple of hundred runs on Linux on my machine.
>

Thanks a lot for testing Gábor, it's really appreciated.

> The reason for the failure is memory corruption manifesting in various
> ways: segfault, malloc() or use after free() errors from libc, corrupt
> loose object, invalid ref, bogus output, etc.
> 
> Applying the following patch makes t7814 fail almost every time,
> though sometimes that loop has to iterate over 1000 times until that
> 'git grep' finally fails...  so good luck with debugging ;)
[...]

I managed to capture some traces of the segfaults using this variation:

diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
index 7184113b9b..56e87c3f8a 100755
--- a/t/t7814-grep-recurse-submodules.sh
+++ b/t/t7814-grep-recurse-submodules.sh
@@ -337,6 +337,10 @@ test_expect_success 'grep --recurse-submodules should pass the pattern type alon
        test_must_fail git -c grep.patternType=fixed grep --recurse-submodules -e "(.|.)[\d]" &&

        # Basic
+       for i in $(test_seq 0 2000)
+       do
+               debug --debugger="gdb --silent -ex run -ex quit --return-child-result --args" git grep --recurse-submodules 1 >/dev/null || return 1
+       done &&
        git grep -G --recurse-submodules -e "(.|.)[\d]" >actual &&
        cat >expect <<-\EOF &&
        a:(1|2)d(3|4)


Running t7814 with --run="1,6,22" is enough to observe the issue.

FWICS these corruptions are caused by concurrent accesses to the object
store.

The issue is caused by these facts:
  1. git grep uses threads;
  2. git grep reads submodules config with repo_read_gitmodules;
  3. repo_read_gitmodules calls config_from_gitmodules
  4. the changes in patch 9 in this series make config_from_gitmodules
     use the object store, which apparently is not mt-safe, while the
     previous use of git_config_from_file() was.

> On first look I didn't notice anything that is obviously wrong in this
> patch and could be responsible for the memory corruption, but there is
> one thing I found strange, though:
> 
> 
> On Mon, Sep 17, 2018 at 04:09:40PM +0200, Antonio Ospite wrote:
> > When the .gitmodules file is not available in the working tree, try
> > using the content from the index and from the current branch.
> 
> "from the index and from the current branch" of which repository?
> 
[...]

> > diff --git a/submodule-config.c b/submodule-config.c
> > index 61a555e920..bdb1d0e2c9 100644
> > --- a/submodule-config.c
> > +++ b/submodule-config.c
> 
> > @@ -603,8 +604,21 @@ static void submodule_cache_check_init(struct repository *repo)
> >  static void config_from_gitmodules(config_fn_t fn, struct repository *repo, void *data)
> >  {
> >  	if (repo->worktree) {
> > -		char *file = repo_worktree_path(repo, GITMODULES_FILE);
> > -		git_config_from_file(fn, file, data);
> > +		struct git_config_source config_source = { 0 };
> > +		const struct config_options opts = { 0 };
> > +		struct object_id oid;
> > +		char *file;
> > +
> > +		file = repo_worktree_path(repo, GITMODULES_FILE);
> > +		if (file_exists(file))
> > +			config_source.file = file;
> > +		else if (get_oid(GITMODULES_INDEX, &oid) >= 0)
> > +			config_source.blob = GITMODULES_INDEX;
> 
> The repository used in t7814 contains nested submodules, which means
> that config_from_gitmodules() is invoked three times.
> 
> Now, the first two of those calls look at the superproject and at
> 'submodule', and find the existing files '.../trash
> directory.t7814-grep-recurse-submodules/.gitmodules' and '.../trash
> directory.t7814-grep-recurse-submodules/submodule/.gitmodules',
> respectively.  So far so good.
> 
> The third call, however, looks at the nested submodule at
> 'submodule/sub', which doesn't contain a '.gitmodules' file.  So this
> function goes on with the second condition and calls
> get_oid(GITMODULES_INDEX, &oid), which then appears to find the blob
> in the _superproject's_ index.
> 
> I'm no expert on submodules, but my gut feeling says that this can't
> be right.  But if it _is_ right, then I would say that the commit
> message should explain in detail, why it is right.
>

I'll think about that too.

> Anyway, even if it is indeed wrong, I'm not sure whether this is the
> root cause of the memory corruption.
> 

I think the immediate cause of the corruptions is multi-threading in
grep, I can prevent the issue from happening by using "git grep
--threads 1 ...".

Protecting the problematic submodules function could work for now, but
I'd like to have more comments, my proposal is:

diff --git a/builtin/grep.c b/builtin/grep.c
index 601f801158..52b45de749 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -427,6 +427,11 @@ static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
        if (repo_submodule_init(&submodule, superproject, path))
                return 0;

+       grep_read_lock();
+       /*
+        * NEEDSWORK: repo_read_gitmodules accesses the object store which is
+        * global, thus it needs to be protected.
+        */
        repo_read_gitmodules(&submodule);

        /*
@@ -439,7 +444,6 @@ static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
         * store is no longer global and instead is a member of the repository
         * object.
         */
-       grep_read_lock();
        add_to_alternates_memory(submodule.objects->objectdir);
        grep_read_unlock();


The pre-existing NEEDSWORK comment there also suggests that these
problems with the object store are known. I was not aware of them.

With the change from above I could not reproduce the problem anymore,
this should be the only location where
repo_read_gitmodules/config_from_gitmodules is called in a thread.

Thanks you,
   Antonio

-- 
Antonio Ospite
https://ao2.it
https://twitter.com/ao2it

A: Because it messes up the order in which people normally read text.
   See http://en.wikipedia.org/wiki/Posting_style
Q: Why is top-posting such a bad thing?

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
  2018-09-20 15:35     ` Antonio Ospite
@ 2018-09-21 16:19       ` Junio C Hamano
  2018-09-27 14:49         ` Antonio Ospite
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2018-09-21 16:19 UTC (permalink / raw)
  To: Antonio Ospite
  Cc: SZEDER Gábor, git, Brandon Williams, Daniel Graña,
	Jonathan Nieder, Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason

Antonio Ospite <ao2@ao2.it> writes:

> Protecting the problematic submodules function could work for now, but
> I'd like to have more comments, my proposal is:
>
> diff --git a/builtin/grep.c b/builtin/grep.c
> index 601f801158..52b45de749 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -427,6 +427,11 @@ static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
>         if (repo_submodule_init(&submodule, superproject, path))
>                 return 0;
>
> +       grep_read_lock();
> +       /*
> +        * NEEDSWORK: repo_read_gitmodules accesses the object store which is
> +        * global, thus it needs to be protected.
> +        */
>         repo_read_gitmodules(&submodule);
>
>         /*
> @@ -439,7 +444,6 @@ static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
>          * store is no longer global and instead is a member of the repository
>          * object.
>          */
> -       grep_read_lock();
>         add_to_alternates_memory(submodule.objects->objectdir);
>         grep_read_unlock();

I think this is in line with how the grep codepath protects itself
when doing anything that accesses the object store.

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
  2018-09-18 17:12   ` SZEDER Gábor
  2018-09-19 19:24     ` Junio C Hamano
  2018-09-20 15:35     ` Antonio Ospite
@ 2018-09-24 10:20     ` Antonio Ospite
  2018-09-24 21:00       ` Stefan Beller
  2 siblings, 1 reply; 23+ messages in thread
From: Antonio Ospite @ 2018-09-24 10:20 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: gitster, git, Brandon Williams, Daniel Graña,
	Jonathan Nieder, Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason

On Tue, 18 Sep 2018 19:12:57 +0200
SZEDER Gábor <szeder.dev@gmail.com> wrote:

[...]
> On Mon, Sep 17, 2018 at 04:09:40PM +0200, Antonio Ospite wrote:
> > When the .gitmodules file is not available in the working tree, try
> > using the content from the index and from the current branch.
> 
> "from the index and from the current branch" of which repository?
>

I took a look, some comments below.

> > diff --git a/submodule-config.c b/submodule-config.c
> > index 61a555e920..bdb1d0e2c9 100644
> > --- a/submodule-config.c
> > +++ b/submodule-config.c
> 
> > @@ -603,8 +604,21 @@ static void submodule_cache_check_init(struct repository *repo)
> >  static void config_from_gitmodules(config_fn_t fn, struct repository *repo, void *data)
> >  {
[...]
> > +		file = repo_worktree_path(repo, GITMODULES_FILE);
> > +		if (file_exists(file))
> > +			config_source.file = file;
> > +		else if (get_oid(GITMODULES_INDEX, &oid) >= 0)
> > +			config_source.blob = GITMODULES_INDEX;
> > +		else if (get_oid(GITMODULES_HEAD, &oid) >= 0)
> > +			config_source.blob = GITMODULES_HEAD;
> > +
> 
> The repository used in t7814 contains nested submodules, which means
> that config_from_gitmodules() is invoked three times.
> 
> Now, the first two of those calls look at the superproject and at
> 'submodule', and find the existing files '.../trash
> directory.t7814-grep-recurse-submodules/.gitmodules' and '.../trash
> directory.t7814-grep-recurse-submodules/submodule/.gitmodules',
> respectively.  So far so good.
> 
> The third call, however, looks at the nested submodule at
> 'submodule/sub', which doesn't contain a '.gitmodules' file.  So this
> function goes on with the second condition and calls
> get_oid(GITMODULES_INDEX, &oid), which then appears to find the blob
> in the _superproject's_ index.
>

You are correct.

This is a limitation of the object store in git, there is no equivalent
of get_oid() to get the oid from a specific repository and this affects
config_with_options too when the config source is a blob.

This does not affect commands called via "git -C submodule_dir cmd"
because in that case the chdir happens before the_repository is set up,
for instance "git-submodule $SOMETHING --recursive" commands seem to
change the working directory before the recursion.

> > +		config_with_options(fn, data, &config_source, &opts);
> > +
> >  		free(file);
> >  	}
> >  }
> I'm no expert on submodules, but my gut feeling says that this can't
> be right.  But if it _is_ right, then I would say that the commit
> message should explain in detail, why it is right.
> 

I agree it isn't right, I didn't consider the case of nested
submodules, but even if I did the current git design does not
allow to correctly solve the problem: "read config from blob of an
arbitrary repository".

So what to do for the time being?

The issue is there but in a "normal" scenario it is not causing any real
harm because:

  1. currently it is exposed "only" by git grep and nested submodules.

  2. the new mechanism works fine when reading the submodule config for
     the root repository.

  3. the new mechanism does not "usually" impact non-leaf nested
     submodules, because the .gitmodules file is "normally" there.

  4. git grep never *uses* the GITMODULES_INDEX erroneously read
     from the root project when scanning the _leaf_ nested submodule
     because there are no further submodules down the line, the
     following check fails in builtin/grep.c:

       if (recurse_submodules && S_ISGITLINK(entry.mode) ...
       ...

In fact, because of 4. the test suite passes even if the gitmodule
config is not correct for the leaf submodule.

Actually 4. makes me think that the repo_read_gitmodules() call in
builtin/grep.c might not be strictly necessary, its purpose seems to be
to *anticipate* reading the config of *child* submodules while still
processing the *current* submodule, the config for the *current*
submodule was already read from the superproject by the immediately
preceding repo_submodule_init(), via:

  repo_submodule_init()
    submodule_from_path()
      gitmodules_read_check()
        repo_read_gitmodules()

And this would happen anyway also for child submodules down the
recursion path if we removed repo_read_gitmodules() in builtin/grep.c,
the operation would be not protected by grep_read_lock() tho.

The test suite passes even after removing repo_read_gitmodules()
entirely from builtin/grep.c, but I am still not confident that I get
all the implication of why that call was originally added in commit
f9ee2fcdfa (grep: recurse in-process using 'struct repository',
2017-08-02).

Anyways, even if we removed the call we would prevent the problem from
happening in the test suite, but not in the real world, in case non-leaf
submodules without .gitmodules in their working tree.

To recap:
  - patch 9/9 exposes a problem with the object store but for now it's
    only a potential problem in the future case that someone wanted to
    use *nested* submodules without .gitmodules in the working tree.
  - the new code should not affect current users which
    assume .gitmodules to be in the working tree for nested submodules;
  - even if we removed the call to repo_read_gitmodules() call in
    builtin/grep.c we would not avoid the problem entirely, just avoid
    it for the the _leaf_ submodules in case of nested submodules.

Selfishly, I'd propose to still merge the changes (I can send a v6 with
the locking fix in) and I'll write a test_expect_failure snippet to
document the problem Gábor spotted so we remember about it and fix it
when the object store can be accessed per-repository.

I am afraid I cannot look into the core issue about the object store in
my free time, however if someone wanted to sponsor some time I might
consider taking a stab at it.

Ciao,
   Antonio

-- 
Antonio Ospite
https://ao2.it
https://twitter.com/ao2it

A: Because it messes up the order in which people normally read text.
   See http://en.wikipedia.org/wiki/Posting_style
Q: Why is top-posting such a bad thing?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 1/9] submodule: add a print_config_from_gitmodules() helper
  2018-09-17 14:09 ` [PATCH v5 1/9] submodule: add a print_config_from_gitmodules() helper Antonio Ospite
@ 2018-09-24 10:25   ` Antonio Ospite
  2018-09-24 23:06     ` Stefan Beller
  0 siblings, 1 reply; 23+ messages in thread
From: Antonio Ospite @ 2018-09-24 10:25 UTC (permalink / raw)
  To: gitster
  Cc: git, Brandon Williams, Daniel Graña, Jonathan Nieder,
	Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason, Antonio Ospite

On Mon, 17 Sep 2018 16:09:32 +0200
Antonio Ospite <ao2@ao2.it> wrote:

> Add a new print_config_from_gitmodules() helper function to print values
> from .gitmodules just like "git config -f .gitmodules" would.
> 
[...]

> +int print_config_from_gitmodules(const char *key)

I am thinking about adding  a "struct repository" argument to this
function

> +{
> +	int ret;
> +	char *store_key;
> +
> +	ret = git_config_parse_key(key, &store_key, NULL);
> +	if (ret < 0)
> +		return CONFIG_INVALID_KEY;
> +
> +	config_from_gitmodules(config_print_callback, the_repository, store_key);

And use it here, to avoid another usage of "the_repository" when it's
not strictly necessary.

Ciao,
   Antonio

-- 
Antonio Ospite
https://ao2.it
https://twitter.com/ao2it

A: Because it messes up the order in which people normally read text.
   See http://en.wikipedia.org/wiki/Posting_style
Q: Why is top-posting such a bad thing?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
  2018-09-24 10:20     ` Antonio Ospite
@ 2018-09-24 21:00       ` Stefan Beller
  2018-09-27 14:44         ` Antonio Ospite
  0 siblings, 1 reply; 23+ messages in thread
From: Stefan Beller @ 2018-09-24 21:00 UTC (permalink / raw)
  To: Antonio Ospite
  Cc: SZEDER Gábor, Junio C Hamano, git, Brandon Williams,
	Daniel Graña, Jonathan Nieder, Richard Hartmann,
	Ævar Arnfjörð Bjarmason

On Mon, Sep 24, 2018 at 3:20 AM Antonio Ospite <ao2@ao2.it> wrote:

> > The third call, however, looks at the nested submodule at
> > 'submodule/sub', which doesn't contain a '.gitmodules' file.  So this
> > function goes on with the second condition and calls
> > get_oid(GITMODULES_INDEX, &oid), which then appears to find the blob
> > in the _superproject's_ index.
> >
>
> You are correct.
>
> This is a limitation of the object store in git, there is no equivalent
> of get_oid() to get the oid from a specific repository and this affects
> config_with_options too when the config source is a blob.

Not yet, as there is a big push to pass-through an object-store object
or similar recently and rely less on global variables.
I am not sure I get to this code, though.

> This does not affect commands called via "git -C submodule_dir cmd"
> because in that case the chdir happens before the_repository is set up,
> for instance "git-submodule $SOMETHING --recursive" commands seem to
> change the working directory before the recursion.

For this it may be worth looking into the option
       --super-prefix=<path>
  Currently for internal use only. Set a prefix which gives a
  path from above a repository down to its root. One use is
  to give submodules context about the superproject that
  invoked it.

the whole motion of moving to in-process deprecates this clunky
API to pass around strings to subprocesses.

> The test suite passes even after removing repo_read_gitmodules()
> entirely from builtin/grep.c, but I am still not confident that I get
> all the implication of why that call was originally added in commit
> f9ee2fcdfa (grep: recurse in-process using 'struct repository',
> 2017-08-02).

If you checkout that commit and remove the call to repo_read_gitmodules
and then call git-grep in a superproject with nested submodules, you
get a segfault.

On master (and deleting out that line) you do not get the segfault,
I think praise goes to ff6f1f564c4 (submodule-config: lazy-load a
repository's .gitmodules file, 2017-08-03) which happened shortly
after f9ee2fcdfa.

It showcased that it worked by converting ls-files, but left out grep.

So I think based on ff6f1f564c4 it is safe to remove all calls to
repo_read_gitmodules.

> Anyways, even if we removed the call we would prevent the problem from
> happening in the test suite, but not in the real world, in case non-leaf
> submodules without .gitmodules in their working tree.

Quite frankly I think grep was just overlooked in review of
https://public-inbox.org/git/20170803182000.179328-14-bmwill@google.com/

Stefan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 1/9] submodule: add a print_config_from_gitmodules() helper
  2018-09-24 10:25   ` Antonio Ospite
@ 2018-09-24 23:06     ` Stefan Beller
  0 siblings, 0 replies; 23+ messages in thread
From: Stefan Beller @ 2018-09-24 23:06 UTC (permalink / raw)
  To: Antonio Ospite
  Cc: Junio C Hamano, git, Brandon Williams, Daniel Graña,
	Jonathan Nieder, Richard Hartmann,
	Ævar Arnfjörð Bjarmason

> > +int print_config_from_gitmodules(const char *key)
>
> I am thinking about adding  a "struct repository" argument to this
> function

Sounds like a good idea.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
  2018-09-24 21:00       ` Stefan Beller
@ 2018-09-27 14:44         ` Antonio Ospite
  2018-09-27 18:00           ` Stefan Beller
  0 siblings, 1 reply; 23+ messages in thread
From: Antonio Ospite @ 2018-09-27 14:44 UTC (permalink / raw)
  To: Stefan Beller
  Cc: SZEDER Gábor, Junio C Hamano, git, Brandon Williams,
	Daniel Graña, Jonathan Nieder, Richard Hartmann,
	Ævar Arnfjörð Bjarmason

Hi Stefan,

On Mon, 24 Sep 2018 14:00:50 -0700
Stefan Beller <sbeller@google.com> wrote:

> On Mon, Sep 24, 2018 at 3:20 AM Antonio Ospite <ao2@ao2.it> wrote:
> 
[...]
> > This is a limitation of the object store in git, there is no equivalent
> > of get_oid() to get the oid from a specific repository and this affects
> > config_with_options too when the config source is a blob.
> 
> Not yet, as there is a big push to pass-through an object-store object
> or similar recently and rely less on global variables.
> I am not sure I get to this code, though.
>

If you end up touching get_oid() please CC me.

> > This does not affect commands called via "git -C submodule_dir cmd"
> > because in that case the chdir happens before the_repository is set up,
> > for instance "git-submodule $SOMETHING --recursive" commands seem to
> > change the working directory before the recursion.
> 
> For this it may be worth looking into the option
>        --super-prefix=<path>
>   Currently for internal use only. Set a prefix which gives a
>   path from above a repository down to its root. One use is
>   to give submodules context about the superproject that
>   invoked it.
>

My comment wanted to highlight that there are NO problems in the
mentioned cases:

  - git -C submodule_dir cmd
  - git submodule cmd --recursive

Are you suggesting to look into super-prefix for any reason in
particular?

[...]
> > The test suite passes even after removing repo_read_gitmodules()
> > entirely from builtin/grep.c, but I am still not confident that I get
> > all the implication of why that call was originally added in commit
> > f9ee2fcdfa (grep: recurse in-process using 'struct repository',
> > 2017-08-02).
> 
> If you checkout that commit and remove the call to repo_read_gitmodules
> and then call git-grep in a superproject with nested submodules, you
> get a segfault.
> 
> On master (and deleting out that line) you do not get the segfault,
> I think praise goes to ff6f1f564c4 (submodule-config: lazy-load a
> repository's .gitmodules file, 2017-08-03) which happened shortly
> after f9ee2fcdfa.
> 
> It showcased that it worked by converting ls-files, but left out grep.
> 
> So I think based on ff6f1f564c4 it is safe to remove all calls to
> repo_read_gitmodules.
>

Thanks for confirming.

> > Anyways, even if we removed the call we would prevent the problem from
> > happening in the test suite, but not in the real world, in case non-leaf
> > submodules without .gitmodules in their working tree.
> 
> Quite frankly I think grep was just overlooked in review of
> https://public-inbox.org/git/20170803182000.179328-14-bmwill@google.com/
> 

OK, so the plan for v6 is:

  - avoid the corruption issues spotted by Gábor by removing the call
    to repo_read_gitmodules in builtin/grep.c (this still does not fix
    the potential problem with nested submodules).

  - add a new test-tool which better exercises the new
    config_from_gitmodules code,

  - add also a test_expect_failure test to document the use case that
    cannot be supported yet: nested submodules without .gitmodules in
    their working tree.

Thanks,
   Antonio

-- 
Antonio Ospite
https://ao2.it
https://twitter.com/ao2it

A: Because it messes up the order in which people normally read text.
   See http://en.wikipedia.org/wiki/Posting_style
Q: Why is top-posting such a bad thing?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
  2018-09-21 16:19       ` Junio C Hamano
@ 2018-09-27 14:49         ` Antonio Ospite
  0 siblings, 0 replies; 23+ messages in thread
From: Antonio Ospite @ 2018-09-27 14:49 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: SZEDER Gábor, git, Brandon Williams, Daniel Graña,
	Jonathan Nieder, Richard Hartmann, Stefan Beller,
	Ævar Arnfjörð Bjarmason

On Fri, 21 Sep 2018 09:19:45 -0700
Junio C Hamano <gitster@pobox.com> wrote:

> Antonio Ospite <ao2@ao2.it> writes:
> 
> > Protecting the problematic submodules function could work for now, but
> > I'd like to have more comments, my proposal is:
> >
> > diff --git a/builtin/grep.c b/builtin/grep.c
> > index 601f801158..52b45de749 100644
> > --- a/builtin/grep.c
> > +++ b/builtin/grep.c
> > @@ -427,6 +427,11 @@ static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
> >         if (repo_submodule_init(&submodule, superproject, path))
> >                 return 0;
> >
> > +       grep_read_lock();
> > +       /*
> > +        * NEEDSWORK: repo_read_gitmodules accesses the object store which is
> > +        * global, thus it needs to be protected.
> > +        */
> >         repo_read_gitmodules(&submodule);
> >
> >         /*
> > @@ -439,7 +444,6 @@ static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
> >          * store is no longer global and instead is a member of the repository
> >          * object.
> >          */
> > -       grep_read_lock();
> >         add_to_alternates_memory(submodule.objects->objectdir);
> >         grep_read_unlock();
> 
> I think this is in line with how the grep codepath protects itself
> when doing anything that accesses the object store.
> 

Thanks for the comment.

However, after confirming with Stefan Beller, I think we are going to
solve the corruption issue by removing this call to
repo_read_gitmodules(), which is not strictly necessary:

https://public-inbox.org/git/CAGZ79kZaomuE3p1puznM1x+hu-w4O+ZqeGUODBDj=-R3Z1hDzg@mail.gmail.com/

Thanks,
   Antonio

-- 
Antonio Ospite
https://ao2.it
https://twitter.com/ao2it

A: Because it messes up the order in which people normally read text.
   See http://en.wikipedia.org/wiki/Posting_style
Q: Why is top-posting such a bad thing?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
  2018-09-27 14:44         ` Antonio Ospite
@ 2018-09-27 18:00           ` Stefan Beller
  2018-10-01 15:45             ` Antonio Ospite
  0 siblings, 1 reply; 23+ messages in thread
From: Stefan Beller @ 2018-09-27 18:00 UTC (permalink / raw)
  To: Antonio Ospite
  Cc: SZEDER Gábor, Junio C Hamano, git, Brandon Williams,
	Daniel Graña, Jonathan Nieder, Richard Hartmann,
	Ævar Arnfjörð Bjarmason

On Thu, Sep 27, 2018 at 7:44 AM Antonio Ospite <ao2@ao2.it> wrote:
>
> If you end up touching get_oid() please CC me.

noted. I am not sure I'll touch it anytime soon, though.

>
> Are you suggesting to look into super-prefix for any reason in
> particular?

No, I misread the intent of that part of your message

> >
> > So I think based on ff6f1f564c4 it is safe to remove all calls to
> > repo_read_gitmodules.
> >
>
> Thanks for confirming.
>

> OK, so the plan for v6 is:
>
>   - avoid the corruption issues spotted by Gábor by removing the call
>     to repo_read_gitmodules in builtin/grep.c (this still does not fix
>     the potential problem with nested submodules).
>
>   - add a new test-tool which better exercises the new
>     config_from_gitmodules code,

Sounds good.

>
>   - add also a test_expect_failure test to document the use case that
>     cannot be supported yet: nested submodules without .gitmodules in
>     their working tree.

Personally I would want to live in a world where we don't *have* to nor
*want* to support submodules without .gitmodules in the respective
superproject.

We did support some use cases historically that I would make sure to
continue to support, but I am not sure how much effort we want to spend
on supporting further use cases of incomplete submodules.

Feel free to do so, as such tests help to document the boundaries.

Stefan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
  2018-09-27 18:00           ` Stefan Beller
@ 2018-10-01 15:45             ` Antonio Ospite
  2018-10-01 19:42               ` Stefan Beller
  0 siblings, 1 reply; 23+ messages in thread
From: Antonio Ospite @ 2018-10-01 15:45 UTC (permalink / raw)
  To: Stefan Beller
  Cc: SZEDER Gábor, Junio C Hamano, git, Brandon Williams,
	Daniel Graña, Jonathan Nieder, Richard Hartmann,
	Ævar Arnfjörð Bjarmason

On Thu, 27 Sep 2018 11:00:52 -0700
Stefan Beller <sbeller@google.com> wrote:

> On Thu, Sep 27, 2018 at 7:44 AM Antonio Ospite <ao2@ao2.it> wrote:
[...]
> > OK, so the plan for v6 is:
> >
> >   - avoid the corruption issues spotted by Gábor by removing the call
> >     to repo_read_gitmodules in builtin/grep.c (this still does not fix
> >     the potential problem with nested submodules).
> >

Actually that is not enough to fix the inconsistent access to the
object store: the functions is_submodule_active() and
repo_submodule_init() too end up calling config_from_gitmodules() and
need protecting as well, so I am going to put them under the git read
lock and leave repo_read_gitmodules() in place for now.

Removing unneeded code can go in a possible stand-alone patch.

> >   - add a new test-tool which better exercises the new
> >     config_from_gitmodules code,
> 
> Sounds good.
> 
> >
> >   - add also a test_expect_failure test to document the use case that
> >     cannot be supported yet: nested submodules without .gitmodules in
> >     their working tree.
> 
> Personally I would want to live in a world where we don't *have* to nor
> *want* to support submodules without .gitmodules in the respective
> superproject.
>

Just to double check: are you referring to *nested* submodules in the
sentence above?

I am asking because the whole point of this patchset is to *enable* the
use of submodules without .gitmodules in the working tree of the
superproject. :)

It's just that current limitations in git do not allow to support this
for *nested* submodules yet.

> We did support some use cases historically that I would make sure to
> continue to support, but I am not sure how much effort we want to spend
> on supporting further use cases of incomplete submodules.
>
> Feel free to do so, as such tests help to document the boundaries.
> 

Let's see how v6 turns out.

Thanks,
   Antonio

-- 
Antonio Ospite
https://ao2.it
https://twitter.com/ao2it

A: Because it messes up the order in which people normally read text.
   See http://en.wikipedia.org/wiki/Posting_style
Q: Why is top-posting such a bad thing?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
  2018-10-01 15:45             ` Antonio Ospite
@ 2018-10-01 19:42               ` Stefan Beller
  0 siblings, 0 replies; 23+ messages in thread
From: Stefan Beller @ 2018-10-01 19:42 UTC (permalink / raw)
  To: Antonio Ospite
  Cc: SZEDER Gábor, Junio C Hamano, git, Brandon Williams,
	Daniel Graña, Jonathan Nieder, Richard Hartmann,
	Ævar Arnfjörð Bjarmason

On Mon, Oct 1, 2018 at 8:45 AM Antonio Ospite <ao2@ao2.it> wrote:
>
> On Thu, 27 Sep 2018 11:00:52 -0700
> Stefan Beller <sbeller@google.com> wrote:
>
> > On Thu, Sep 27, 2018 at 7:44 AM Antonio Ospite <ao2@ao2.it> wrote:
> [...]
> > > OK, so the plan for v6 is:
> > >
> > >   - avoid the corruption issues spotted by Gábor by removing the call
> > >     to repo_read_gitmodules in builtin/grep.c (this still does not fix
> > >     the potential problem with nested submodules).
> > >
>
> Actually that is not enough to fix the inconsistent access to the
> object store: the functions is_submodule_active() and
> repo_submodule_init() too end up calling config_from_gitmodules() and
> need protecting as well, so I am going to put them under the git read
> lock and leave repo_read_gitmodules() in place for now.
>

>
> I am asking because the whole point of this patchset is to *enable* the
> use of submodules without .gitmodules in the working tree of the
> superproject. :)

I was imprecise and meant to
s/.gitmodules/mechanism to configure the name <-> path mapping/

In this series, the .gitmodules may not be present in the working tree,
but it is still there in the repo. Later we may want to rename that file
or put it into a magic branch, and I'd still find it a good idea.
What I find a bad idea is to have only a gitlink and a repo put
into that path and expect that it magically works as then it is not
a submodule, but some "halfway there thing". We need to have
an explicit "yes this is a submodule" statement, (which currently
comes from the .gitmodules file in the working tree), and I am not
attached to where it comes from, but that it exists.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2018-10-01 19:43 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-17 14:09 [PATCH v5 0/9] Make submodules work if .gitmodules is not checked out Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 1/9] submodule: add a print_config_from_gitmodules() helper Antonio Ospite
2018-09-24 10:25   ` Antonio Ospite
2018-09-24 23:06     ` Stefan Beller
2018-09-17 14:09 ` [PATCH v5 2/9] submodule: factor out a config_set_in_gitmodules_file_gently function Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 3/9] t7411: merge tests 5 and 6 Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 4/9] t7411: be nicer to future tests and really clean things up Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 5/9] submodule--helper: add a new 'config' subcommand Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 6/9] submodule: use the 'submodule--helper config' command Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 7/9] t7506: clean up .gitmodules properly before setting up new scenario Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 8/9] submodule: add a helper to check if it is safe to write to .gitmodules Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree Antonio Ospite
2018-09-18 17:12   ` SZEDER Gábor
2018-09-19 19:24     ` Junio C Hamano
2018-09-20 15:35     ` Antonio Ospite
2018-09-21 16:19       ` Junio C Hamano
2018-09-27 14:49         ` Antonio Ospite
2018-09-24 10:20     ` Antonio Ospite
2018-09-24 21:00       ` Stefan Beller
2018-09-27 14:44         ` Antonio Ospite
2018-09-27 18:00           ` Stefan Beller
2018-10-01 15:45             ` Antonio Ospite
2018-10-01 19:42               ` Stefan Beller

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).