git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [RFC][PATCH 0/2] rm: changes in the '.gitmodules' are staged after using '--cached'
@ 2021-02-18 18:49 Shourya Shukla
  2021-02-18 18:49 ` [PATCH 1/2] " Shourya Shukla
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Shourya Shukla @ 2021-02-18 18:49 UTC (permalink / raw)
  To: git; +Cc: gitster, christian.couder, levraiphilippeblain, Shourya Shukla

Hello all,

In the series of mail exchanges between me, Junio and Phillipe:
https://lore.kernel.org/git/20210207144144.GA42182@konoha/

The outcome of the BUG(?) reported by Javier Mora in the following mail:
https://lore.kernel.org/git/ea91c2ea29064079914f6a522db5115a@UUSALE0Z.utcmail.com/

Was not fully decided (i.e., whether it should be fixed or not) due to
the potential "pollution" of the 'git rm' command. Here is my patch
series attempting to fix the situation reported by Javier and make sure
that doing a 'git rm --cached <submodule>' deletes the entry of the
submodule in question from the '.gitmodules'. I have tried to keep the
changes as precise as possible without adding much extra stuff.
Reviews and comments are appreciated.

Thank you Phillipe, Junio and Christian for their comments on the patch.

Regards,
Shourya Shukla

Shourya Shukla (2):
  rm: changes in the '.gitmodules' are staged after using '--cached'
  t3600: amend test 46 to check for '.gitmodules' modification

 builtin/rm.c  | 48 +++++++++++++++++++++++++++---------------------
 t/t3600-rm.sh |  7 +++----
 2 files changed, 30 insertions(+), 25 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH 1/2] rm: changes in the '.gitmodules' are staged after using '--cached'
  2021-02-18 18:49 [RFC][PATCH 0/2] rm: changes in the '.gitmodules' are staged after using '--cached' Shourya Shukla
@ 2021-02-18 18:49 ` Shourya Shukla
  2021-02-18 20:14   ` Philippe Blain
  2021-02-18 22:03   ` Junio C Hamano
  2021-02-18 18:49 ` [PATCH 2/2] t3600: amend test 46 to check for '.gitmodules' modification Shourya Shukla
  2021-02-22 17:26 ` [PATCH v2 0/1] rm: stage submodule removal from '.gitmodules' Shourya Shukla
  2 siblings, 2 replies; 20+ messages in thread
From: Shourya Shukla @ 2021-02-18 18:49 UTC (permalink / raw)
  To: git
  Cc: gitster, christian.couder, levraiphilippeblain, Shourya Shukla,
	Javier Mora

Earlier, on doing a 'git rm --cached <submodule>' did not modify the
'.gitmodules' entry of the submodule in question hence the file was not
staged. Change this behaviour to remove the entry of the submodule from
the '.gitmodules', something which might be more expected of the
command.

Reported-by: Javier Mora <javier.moradesambricio@rtx.com>
Signed-off-by: Shourya Shukla <periperidip@gmail.com>
---
 builtin/rm.c | 48 +++++++++++++++++++++++++++---------------------
 1 file changed, 27 insertions(+), 21 deletions(-)

diff --git a/builtin/rm.c b/builtin/rm.c
index 4858631e0f..0b74f50bfe 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -254,7 +254,7 @@ static struct option builtin_rm_options[] = {
 int cmd_rm(int argc, const char **argv, const char *prefix)
 {
 	struct lock_file lock_file = LOCK_INIT;
-	int i;
+	int i, removed = 0, gitmodules_modified = 0;
 	struct pathspec pathspec;
 	char *seen;
 
@@ -365,30 +365,32 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	if (show_only)
 		return 0;
 
-	/*
-	 * Then, unless we used "--cached", remove the filenames from
-	 * the workspace. If we fail to remove the first one, we
-	 * abort the "git rm" (but once we've successfully removed
-	 * any file at all, we'll go ahead and commit to it all:
-	 * by then we've already committed ourselves and can't fail
-	 * in the middle)
-	 */
-	if (!index_only) {
-		int removed = 0, gitmodules_modified = 0;
-		struct strbuf buf = STRBUF_INIT;
-		for (i = 0; i < list.nr; i++) {
-			const char *path = list.entry[i].name;
-			if (list.entry[i].is_submodule) {
+	for (i = 0; i < list.nr; i++) {
+		const char *path = list.entry[i].name;
+		if (list.entry[i].is_submodule) {
+			/*
+			 * Then, unless we used "--cached", remove the filenames from
+			 * the workspace. If we fail to remove the first one, we
+			 * abort the "git rm" (but once we've successfully removed
+			 * any file at all, we'll go ahead and commit to it all:
+			 * by then we've already committed ourselves and can't fail
+			 * in the middle)
+			 */
+			if (!index_only) {
+				struct strbuf buf = STRBUF_INIT;
 				strbuf_reset(&buf);
 				strbuf_addstr(&buf, path);
 				if (remove_dir_recursively(&buf, 0))
 					die(_("could not remove '%s'"), path);
 
 				removed = 1;
-				if (!remove_path_from_gitmodules(path))
-					gitmodules_modified = 1;
-				continue;
+				strbuf_release(&buf);
 			}
+			if (!remove_path_from_gitmodules(path))
+				gitmodules_modified = 1;
+			continue;
+		}
+		if (!index_only) {
 			if (!remove_path(path)) {
 				removed = 1;
 				continue;
@@ -396,11 +398,15 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 			if (!removed)
 				die_errno("git rm: '%s'", path);
 		}
-		strbuf_release(&buf);
-		if (gitmodules_modified)
-			stage_updated_gitmodules(&the_index);
 	}
 
+	/*
+	 * Remove the entry of the submodule from the ".gitmodules" irrespective
+	 * whether "--cached" was passed or not.
+	 */
+	if (gitmodules_modified)
+		stage_updated_gitmodules(&the_index);
+
 	if (write_locked_index(&the_index, &lock_file,
 			       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 		die(_("Unable to write new index file"));
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/2] t3600: amend test 46 to check for '.gitmodules' modification
  2021-02-18 18:49 [RFC][PATCH 0/2] rm: changes in the '.gitmodules' are staged after using '--cached' Shourya Shukla
  2021-02-18 18:49 ` [PATCH 1/2] " Shourya Shukla
@ 2021-02-18 18:49 ` Shourya Shukla
  2021-02-18 20:21   ` Philippe Blain
  2021-02-22 17:26 ` [PATCH v2 0/1] rm: stage submodule removal from '.gitmodules' Shourya Shukla
  2 siblings, 1 reply; 20+ messages in thread
From: Shourya Shukla @ 2021-02-18 18:49 UTC (permalink / raw)
  To: git; +Cc: gitster, christian.couder, levraiphilippeblain, Shourya Shukla

Following commit e5a439dc71 (rm: changes in the '.gitmodules' are
staged after using '--cached', 2021-02-18), amend test 46 of the script
to ensure that the test also checks for '.gitmodules' modification after
a 'git rm --cached <submodule>' i.e., the entry of the submodule in
question is removed from the file.

Signed-off-by: Shourya Shukla <periperidip@gmail.com>
---
 t/t3600-rm.sh | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
index 7547f11a5c..45aff97b90 100755
--- a/t/t3600-rm.sh
+++ b/t/t3600-rm.sh
@@ -309,6 +309,7 @@ cat >expect.modified_untracked <<EOF
 EOF
 
 cat >expect.cached <<EOF
+M  .gitmodules
 D  submod
 EOF
 
@@ -390,16 +391,14 @@ test_expect_success 'rm of a populated submodule with different HEAD fails unles
 	test_must_fail git config -f .gitmodules submodule.sub.path
 '
 
-test_expect_success 'rm --cached leaves work tree of populated submodules and .gitmodules alone' '
+test_expect_success 'rm --cached leaves work tree of populated submodules alone' '
 	git reset --hard &&
 	git submodule update &&
 	git rm --cached submod &&
 	test_path_is_dir submod &&
 	test_path_is_file submod/.git &&
 	git status -s -uno >actual &&
-	test_cmp expect.cached actual &&
-	git config -f .gitmodules submodule.sub.url &&
-	git config -f .gitmodules submodule.sub.path
+	test_cmp expect.cached actual
 '
 
 test_expect_success 'rm --dry-run does not touch the submodule or .gitmodules' '
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] rm: changes in the '.gitmodules' are staged after using '--cached'
  2021-02-18 18:49 ` [PATCH 1/2] " Shourya Shukla
@ 2021-02-18 20:14   ` Philippe Blain
  2021-02-18 20:39     ` Philippe Blain
  2021-02-19 15:19     ` Shourya Shukla
  2021-02-18 22:03   ` Junio C Hamano
  1 sibling, 2 replies; 20+ messages in thread
From: Philippe Blain @ 2021-02-18 20:14 UTC (permalink / raw)
  To: Shourya Shukla, git; +Cc: gitster, christian.couder, Javier Mora

Hello Shourya,

Le 2021-02-18 à 13:49, Shourya Shukla a écrit :
> Earlier, on doing a 'git rm --cached <submodule>' did not modify the
> '.gitmodules' entry of the submodule in question hence the file was not
> staged. Change this behaviour to remove the entry of the submodule from
> the '.gitmodules', something which might be more expected of the
> command.

We prefer using the imperative mood for the commit message title,
the present tense for describing the actual state of the code,
and finally the imperative mood again to give order to the code base
to change its behaviour [1]. So something like the following would fit more
into the project's conventions:


     rm: stage submodule removal from '.gitmodules' when using '--cached'

     Currently, using 'git rm --cached <submodule>' removes submodule <submodule> from the index
     and leaves the submodule working tree intact in the superproject working tree,
     but does not stage any changes to the '.gitmodules' file, in contrast to
     'git rm <submodule>', which removes both the submodule and its configuration
     in '.gitmodules' from the worktree and index.
     
     Fix this inconsistency by also staging the removal of the configuration of the
     submodule from the '.gitmodules' file, leaving the worktree copy intact, a behaviour
     which is more in line with what might be expected when using '--cached'.


However, this is *not* what you patch does; it also removes the relevant
section from the '.gitmodules' file *in the worktree*, which is not acceptable
because it is exactly contrary to what '--cached' means.

This was verified by running Javier's demonstration script that I included in the
Gitgitgadget issue [2], which I copy here:


~~~
rm -rf some_submodule top_repo

mkdir some_submodule
cd some_submodule
git init
echo hello > hello.txt
git add hello.txt
git commit -m 'First commit of submodule'
cd ..
mkdir top_repo
cd top_repo
git init
echo world > world.txt
git add world.txt
git commit -m 'First commit of top repo'
git submodule add ../some_submodule
git status  # both some_submodule and .gitmodules staged
git commit -m 'Added submodule'
git rm --cached some_submodule
git status  # only some_submodule staged
~~~

With your changes, at the end '.gitmodules' is modified in both the
worktree and the index, whereas we would want it to be modified
*only* in the index.

And we would want it to be staged for deletion (and only deleting the config
entry and keeping an empty ".gitmodules' file in the index)
if the user is removing the only submodule in the superproject.


> ---
>   builtin/rm.c | 48 +++++++++++++++++++++++++++---------------------
>   1 file changed, 27 insertions(+), 21 deletions(-)
> 

Once implemeted correctly (leaving the worktree version of '.gitmodules'
intact), that patch should also change the documentation to stay up-to-date,
since the "Submodules" section of Documentation/git-rm.txt states [3]:

     If it exists the submodule.<name> section in the gitmodules[5] file will
     also be removed and that file will be staged (unless --cached or -n are used).


Cheers,
Philippe.

[1] https://git-scm.com/docs/SubmittingPatches#describe-changes
[2] https://github.com/gitgitgadget/git/issues/750
[3] https://git-scm.com/docs/git-rm#_submodules

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] t3600: amend test 46 to check for '.gitmodules' modification
  2021-02-18 18:49 ` [PATCH 2/2] t3600: amend test 46 to check for '.gitmodules' modification Shourya Shukla
@ 2021-02-18 20:21   ` Philippe Blain
  0 siblings, 0 replies; 20+ messages in thread
From: Philippe Blain @ 2021-02-18 20:21 UTC (permalink / raw)
  To: Shourya Shukla, git; +Cc: gitster, christian.couder



Le 2021-02-18 à 13:49, Shourya Shukla a écrit :
> Following commit e5a439dc71 (rm: changes in the '.gitmodules' are
> staged after using '--cached', 2021-02-18), amend test 46 of the script
> to ensure that the test also checks for '.gitmodules' modification after
> a 'git rm --cached <submodule>' i.e., the entry of the submodule in
> question is removed from the file.
> 

You can't reference your previous commit by hash, since it has not yet
made its way to Git's master branch. Usually what is done in that case is writing
"In the previous commit, we fixed *** so that *** now does ***. Change *** accordingly"
or something like this.

However, in the present case the changes to the test should be squashed into
the changes to the code, if not the tests are broken when they are run
on patch 1/2. In this project *all* commits of a topic branch should pass
the test suite before the topic is merged, not just the tip commit.

Regarding the changes themselves, they should be tweaked along with patch 1/2
to test the correct behaviour (not modifying the working tree copy of '.gitmodules'.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] rm: changes in the '.gitmodules' are staged after using '--cached'
  2021-02-18 20:14   ` Philippe Blain
@ 2021-02-18 20:39     ` Philippe Blain
  2021-02-19 15:19     ` Shourya Shukla
  1 sibling, 0 replies; 20+ messages in thread
From: Philippe Blain @ 2021-02-18 20:39 UTC (permalink / raw)
  To: Shourya Shukla, git; +Cc: gitster, christian.couder, cousteaulecommandant

Le 2021-02-18 à 15:14, Philippe Blain a écrit :
> 
> And we would want it to be staged for deletion (and only deleting the config
> entry and keeping an empty ".gitmodules' file in the index)
> if the user is removing the only submodule in the superproject.

Sorry for the typo, a "not" is missing:

And we would want it to be staged for deletion (and *not* only deleting the config
entry and keeping an empty ".gitmodules' file in the index)
if the user is removing the only submodule in the superproject.

P.S. I CC'ed "cousteaulecommandant" who was CC'ed on the original bug report
since Javier's @rtx.com address now bounces.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] rm: changes in the '.gitmodules' are staged after using '--cached'
  2021-02-18 18:49 ` [PATCH 1/2] " Shourya Shukla
  2021-02-18 20:14   ` Philippe Blain
@ 2021-02-18 22:03   ` Junio C Hamano
  2021-02-19 15:24     ` Shourya Shukla
  1 sibling, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2021-02-18 22:03 UTC (permalink / raw)
  To: Shourya Shukla; +Cc: git, christian.couder, levraiphilippeblain, Javier Mora

Shourya Shukla <periperidip@gmail.com> writes:

> +		if (list.entry[i].is_submodule) {
> +			/*
> +			 * Then, unless we used "--cached", remove the filenames from
> +			 * the workspace. If we fail to remove the first one, we
> +			 * abort the "git rm" (but once we've successfully removed
> +			 * any file at all, we'll go ahead and commit to it all:
> +			 * by then we've already committed ourselves and can't fail
> +			 * in the middle)
> +			 */
> +			if (!index_only) {
> +				struct strbuf buf = STRBUF_INIT;
>  				strbuf_reset(&buf);
>  				strbuf_addstr(&buf, path);
>  				if (remove_dir_recursively(&buf, 0))
>  					die(_("could not remove '%s'"), path);
>  
>  				removed = 1;
> -				if (!remove_path_from_gitmodules(path))
> -					gitmodules_modified = 1;
> -				continue;
> +				strbuf_release(&buf);

Since we won't come to this block when doing index_only, we are
allowed to touch the working tree contents and files.  We indeed do
"rm -rf" of the submodule working tree and touch .gitmodules file
that is in the working tree.

>  			}
> +			if (!remove_path_from_gitmodules(path))
> +				gitmodules_modified = 1;
> +			continue;

But this looks wrong.  It might be OK to remove from the .gitmodules
stored in the index, but I fail to see why it is justified to touch
the working tree file when "--cached" is given.

> +		}
> +		if (!index_only) {
>  			if (!remove_path(path)) {
>  				removed = 1;
>  				continue;
> @@ -396,11 +398,15 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
>  			if (!removed)
>  				die_errno("git rm: '%s'", path);
>  		}
> -		strbuf_release(&buf);
> -		if (gitmodules_modified)
> -			stage_updated_gitmodules(&the_index);

Assuming that it is somehow justifiable that removing the entry from
the .gitmodules in the index (again, I do not think it is
justifiable to remove from the working tree file), we no longer can
use stage_updated_gitmodules() helper as-is.

I think you'd need to

 - Add a variant of remove_path_from_gitmodules() that can remove
   the given submodule from the .gitmodules in the index entry
   without touching the working tree.  The change could be to update
   the function to take an extra "index_only" parameter and a
   pointer to an index_state instance, and

   (1) if !index_only, then edit the .gitmodules file in the working
       tree to remove the entry for path;

   (2) in both !index_only and index_only cases, read .gitmodules
       file from the index, edit to remove the entry for path, and
       add the result to the index.

   and return 0 for success (e.g. if path is not a submoudle or no
   entry for it is found in .gitmodules, it may return -1).

 - Since the previous point will maintain the correct contents in
   the index in all cases, get rid of gitmodules_modified and calls
   to stage_updated_gitmodules().  The call to write_locked_index()
   at the end will take care of the actual writing out of the index.

if we want to teach "rm --cached" to update only the index, and "rm"
to update both the index and the working tree, of ".gitmodules".

Having said that, I still do not think it is a good direction to go
to teach low level "rm", "mv" etc. to know about ".gitmodules" (yes,
yes, I know that some changes that I consider to be mistakes have
already happened---that does not mean we cannot correct our course
and it does not mean it is OK to make things even worse).  Such a
"how does a submodule get managed" policy decision belongs to the
"git submodule" subcommand, I would think.

Thanks.

> +	/*
> +	 * Remove the entry of the submodule from the ".gitmodules" irrespective
> +	 * whether "--cached" was passed or not.
> +	 */
> +	if (gitmodules_modified)
> +		stage_updated_gitmodules(&the_index);
> +
>  	if (write_locked_index(&the_index, &lock_file,
>  			       COMMIT_LOCK | SKIP_IF_UNCHANGED))
>  		die(_("Unable to write new index file"));

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] rm: changes in the '.gitmodules' are staged after using '--cached'
  2021-02-18 20:14   ` Philippe Blain
  2021-02-18 20:39     ` Philippe Blain
@ 2021-02-19 15:19     ` Shourya Shukla
  1 sibling, 0 replies; 20+ messages in thread
From: Shourya Shukla @ 2021-02-19 15:19 UTC (permalink / raw)
  To: Philippe Blain; +Cc: gitster, christian.couder, levraiphilippeblain, git

On 18/02 03:14, Philippe Blain wrote:
> Hello Shourya,
> 
> Le 2021-02-18 à 13:49, Shourya Shukla a écrit :
> > Earlier, on doing a 'git rm --cached <submodule>' did not modify the
> > '.gitmodules' entry of the submodule in question hence the file was not
> > staged. Change this behaviour to remove the entry of the submodule from
> > the '.gitmodules', something which might be more expected of the
> > command.
> 
> We prefer using the imperative mood for the commit message title,
> the present tense for describing the actual state of the code,
> and finally the imperative mood again to give order to the code base
> to change its behaviour [1]. So something like the following would fit more
> into the project's conventions:

I have no idea how I missed that one. Apologies, will make the change.

>     rm: stage submodule removal from '.gitmodules' when using '--cached'
> 
>     Currently, using 'git rm --cached <submodule>' removes submodule <submodule> from the index
>     and leaves the submodule working tree intact in the superproject working tree,
>     but does not stage any changes to the '.gitmodules' file, in contrast to
>     'git rm <submodule>', which removes both the submodule and its configuration
>     in '.gitmodules' from the worktree and index.
>     Fix this inconsistency by also staging the removal of the configuration of the
>     submodule from the '.gitmodules' file, leaving the worktree copy intact, a behaviour
>     which is more in line with what might be expected when using '--cached'.
> 

Okay. I will use the above message.

> However, this is *not* what you patch does; it also removes the relevant
> section from the '.gitmodules' file *in the worktree*, which is not acceptable
> because it is exactly contrary to what '--cached' means.
> 
> This was verified by running Javier's demonstration script that I included in the
> Gitgitgadget issue [2], which I copy here:
> 
> 
> ~~~
> rm -rf some_submodule top_repo
> 
> mkdir some_submodule
> cd some_submodule
> git init
> echo hello > hello.txt
> git add hello.txt
> git commit -m 'First commit of submodule'
> cd ..
> mkdir top_repo
> cd top_repo
> git init
> echo world > world.txt
> git add world.txt
> git commit -m 'First commit of top repo'
> git submodule add ../some_submodule
> git status  # both some_submodule and .gitmodules staged
> git commit -m 'Added submodule'
> git rm --cached some_submodule
> git status  # only some_submodule staged
> ~~~
> 
> With your changes, at the end '.gitmodules' is modified in both the
> worktree and the index, whereas we would want it to be modified
> *only* in the index.
> 
> And we would want it to be staged for deletion (and only deleting the config
> entry and keeping an empty ".gitmodules' file in the index)
> if the user is removing the only submodule in the superproject.

Correct.

> >   builtin/rm.c | 48 +++++++++++++++++++++++++++---------------------
> >   1 file changed, 27 insertions(+), 21 deletions(-)
> > 
> 
> Once implemeted correctly (leaving the worktree version of '.gitmodules'
> intact), that patch should also change the documentation to stay up-to-date,
> since the "Submodules" section of Documentation/git-rm.txt states [3]:
> 
>     If it exists the submodule.<name> section in the gitmodules[5] file will
>     also be removed and that file will be staged (unless --cached or -n are used).

Understood. I have to let the working tree '.gitmodules' be left as-is
and only change the copy in the index.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] rm: changes in the '.gitmodules' are staged after using '--cached'
  2021-02-18 22:03   ` Junio C Hamano
@ 2021-02-19 15:24     ` Shourya Shukla
  2021-02-20  3:31       ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Shourya Shukla @ 2021-02-19 15:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, christian.couder, levraiphilippeblain

On 18/02 02:03, Junio C Hamano wrote:
> Shourya Shukla <periperidip@gmail.com> writes:
> 
> > +		if (list.entry[i].is_submodule) {
> > +			/*
> > +			 * Then, unless we used "--cached", remove the filenames from
> > +			 * the workspace. If we fail to remove the first one, we
> > +			 * abort the "git rm" (but once we've successfully removed
> > +			 * any file at all, we'll go ahead and commit to it all:
> > +			 * by then we've already committed ourselves and can't fail
> > +			 * in the middle)
> > +			 */
> > +			if (!index_only) {
> > +				struct strbuf buf = STRBUF_INIT;
> >  				strbuf_reset(&buf);
> >  				strbuf_addstr(&buf, path);
> >  				if (remove_dir_recursively(&buf, 0))
> >  					die(_("could not remove '%s'"), path);
> >  
> >  				removed = 1;
> > -				if (!remove_path_from_gitmodules(path))
> > -					gitmodules_modified = 1;
> > -				continue;
> > +				strbuf_release(&buf);
> 
> Since we won't come to this block when doing index_only, we are
> allowed to touch the working tree contents and files.  We indeed do
> "rm -rf" of the submodule working tree and touch .gitmodules file
> that is in the working tree.
> 
> >  			}
> > +			if (!remove_path_from_gitmodules(path))
> > +				gitmodules_modified = 1;
> > +			continue;
> 
> But this looks wrong.  It might be OK to remove from the .gitmodules
> stored in the index, but I fail to see why it is justified to touch
> the working tree file when "--cached" is given.

No no, you are correct. Phillipe pointed out the same thing. I don't
know how I made this mistake.

> > +		}
> > +		if (!index_only) {
> >  			if (!remove_path(path)) {
> >  				removed = 1;
> >  				continue;
> > @@ -396,11 +398,15 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
> >  			if (!removed)
> >  				die_errno("git rm: '%s'", path);
> >  		}
> > -		strbuf_release(&buf);
> > -		if (gitmodules_modified)
> > -			stage_updated_gitmodules(&the_index);
> 
> Assuming that it is somehow justifiable that removing the entry from
> the .gitmodules in the index (again, I do not think it is
> justifiable to remove from the working tree file), we no longer can
> use stage_updated_gitmodules() helper as-is.
> 
> I think you'd need to
> 
>  - Add a variant of remove_path_from_gitmodules() that can remove
>    the given submodule from the .gitmodules in the index entry
>    without touching the working tree.  The change could be to update
>    the function to take an extra "index_only" parameter and a
>    pointer to an index_state instance, and
> 
>    (1) if !index_only, then edit the .gitmodules file in the working
>        tree to remove the entry for path;
> 
>    (2) in both !index_only and index_only cases, read .gitmodules
>        file from the index, edit to remove the entry for path, and
>        add the result to the index.
> 
>    and return 0 for success (e.g. if path is not a submoudle or no
>    entry for it is found in .gitmodules, it may return -1).
> 
>  - Since the previous point will maintain the correct contents in
>    the index in all cases, get rid of gitmodules_modified and calls
>    to stage_updated_gitmodules().  The call to write_locked_index()
>    at the end will take care of the actual writing out of the index.
> 
> if we want to teach "rm --cached" to update only the index, and "rm"
> to update both the index and the working tree, of ".gitmodules".

Yeah, this approach seems perfect. I will do it this way.

> Having said that, I still do not think it is a good direction to go
> to teach low level "rm", "mv" etc. to know about ".gitmodules" (yes,
> yes, I know that some changes that I consider to be mistakes have
> already happened---that does not mean we cannot correct our course
> and it does not mean it is OK to make things even worse).  Such a
> "how does a submodule get managed" policy decision belongs to the
> "git submodule" subcommand, I would think.


Let's do it this way. I will deliver a v2 of this patch, if we get
comments from anyone stating that this should not go forward, then we
will drop this patch or do what is suggested. Else, queue this patch for
now (given that this does not break anything, obviously) and maybe put
up a RFC for the method you suggested. I am saying this because we have
not received any conclusive evidence of whether this patch should carry
on or not (not trying to disregard your take).

What do you say?


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 1/2] rm: changes in the '.gitmodules' are staged after using '--cached'
  2021-02-19 15:24     ` Shourya Shukla
@ 2021-02-20  3:31       ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2021-02-20  3:31 UTC (permalink / raw)
  To: Shourya Shukla; +Cc: git, christian.couder, levraiphilippeblain

Shourya Shukla <periperidip@gmail.com> writes:

>> Since we won't come to this block when doing index_only, we are
>> allowed to touch the working tree contents and files.  We indeed do
>> "rm -rf" of the submodule working tree and touch .gitmodules file
>> that is in the working tree.
>> 
>> >  			}
>> > +			if (!remove_path_from_gitmodules(path))
>> > +				gitmodules_modified = 1;
>> > +			continue;
>> 
>> But this looks wrong.  It might be OK to remove from the .gitmodules
>> stored in the index, but I fail to see why it is justified to touch
>> the working tree file when "--cached" is given.
>
> No no, you are correct. Phillipe pointed out the same thing. I don't
> know how I made this mistake.
> ...
>> I think you'd need to
>> ...
>
> Yeah, this approach seems perfect. I will do it this way.

OK, then let's go that way.

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 0/1] rm: stage submodule removal from '.gitmodules'
  2021-02-18 18:49 [RFC][PATCH 0/2] rm: changes in the '.gitmodules' are staged after using '--cached' Shourya Shukla
  2021-02-18 18:49 ` [PATCH 1/2] " Shourya Shukla
  2021-02-18 18:49 ` [PATCH 2/2] t3600: amend test 46 to check for '.gitmodules' modification Shourya Shukla
@ 2021-02-22 17:26 ` Shourya Shukla
  2021-02-22 17:26   ` [PATCH v2 1/1] rm: stage submodule removal from '.gitmodules' when using '--cached' Shourya Shukla
  2 siblings, 1 reply; 20+ messages in thread
From: Shourya Shukla @ 2021-02-22 17:26 UTC (permalink / raw)
  To: periperidip; +Cc: christian.couder, git, gitster, levraiphilippeblain

Hello all,

This is the v2 of the patch with the same title. After suggestions from
Phillipe and Junio, I have improved the commit messages, squashed the
two commits and did the following:

	1. Change the definition and declaration of
	   'remove_path_from_gitmodules()' to account in for the
	   'index_only' variable denoting the presence of '--cached'
	   option in the 'git rm' command. In case of the variable being
	   1, remove the submodule entry from the index copy of the
	   '.gitmodules' else do the same for the working tree copy.

	2. Remove the 'gitmodules_modified' variable and instead call
	   'stage_updated_gitmodules()' just after the
	   'remove_path_from_gitmodules()' call.

	3. Account for the above changes in 't3600' and make changes in
	   the same.

I am facing some problem with point (2) in the sense that what Junio
suggested in his mail:
https://lore.kernel.org/git/xmqqblchdoej.fsf@gitster.g/

-----8<-----
 - Since the previous point will maintain the correct contents in
   the index in all cases, get rid of gitmodules_modified and calls
   to stage_updated_gitmodules().  The call to write_locked_index()
   at the end will take care of the actual writing out of the index.
----->8-----

I am not able to get rid of the 'stage_updated_gitmodules()' call
without failing tests in t3600 (t3600.4 is the first one to fail). What
am I doing wrong here?

Comments and reviews are appreciated. Thank you Phillipe and Junio for
the constructive feedback on the v1!

Regards,
Shourya Shukla

Shourya Shukla (1):
  rm: stage submodule removal from '.gitmodules' when using '--cached'

 builtin/rm.c  | 42 +++++++++++++++++++++---------------------
 submodule.c   |  5 +++--
 submodule.h   |  2 +-
 t/t3600-rm.sh |  6 ++----
 4 files changed, 27 insertions(+), 28 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 1/1] rm: stage submodule removal from '.gitmodules' when using '--cached'
  2021-02-22 17:26 ` [PATCH v2 0/1] rm: stage submodule removal from '.gitmodules' Shourya Shukla
@ 2021-02-22 17:26   ` Shourya Shukla
  2021-02-22 18:58     ` Junio C Hamano
  2021-02-22 19:29     ` Junio C Hamano
  0 siblings, 2 replies; 20+ messages in thread
From: Shourya Shukla @ 2021-02-22 17:26 UTC (permalink / raw)
  To: periperidip
  Cc: christian.couder, git, gitster, levraiphilippeblain, Javier Mora

Currently, using 'git rm --cached <submodule>' removes the submodule
<submodule> from the index and leaves the submodule working tree
intact in the superproject working tree, but does not stage any
changes to the '.gitmodules' file, in contrast to 'git rm <submodule>',
which removes both the submodule and its configuration in '.gitmodules'
from the worktree and index.

Fix this inconsistency by also staging the removal of the entry of the
submodule from the '.gitmodules' file, leaving the worktree copy intact,
a behaviour which is more in line with what might be expected when
using '--cached'.

Achieve this by modifying the function 'remove_path_from_gitmodules()'
to also take in the parameter 'index_only' denoting the presence of
the '--cached' option. If present, remove the submodule entry from the
copy of the '.gitmodules' in the index otherwise, do the same for the
working tree copy.

While at it, also change the test 46 of the test script 't3600-rm.sh' to
incorporate for the above changes.

Reported-by: Javier Mora <javier.moradesambricio@rtx.com>
Helped-by: Phillipe Blain <levraiphilippeblain@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Shourya Shukla <periperidip@gmail.com>
---
 builtin/rm.c  | 42 +++++++++++++++++++++---------------------
 submodule.c   |  5 +++--
 submodule.h   |  2 +-
 t/t3600-rm.sh |  6 ++----
 4 files changed, 27 insertions(+), 28 deletions(-)

diff --git a/builtin/rm.c b/builtin/rm.c
index 4858631e0f..5854ef0996 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -254,7 +254,7 @@ static struct option builtin_rm_options[] = {
 int cmd_rm(int argc, const char **argv, const char *prefix)
 {
 	struct lock_file lock_file = LOCK_INIT;
-	int i;
+	int i, removed = 0;
 	struct pathspec pathspec;
 	char *seen;
 
@@ -365,30 +365,33 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	if (show_only)
 		return 0;
 
-	/*
-	 * Then, unless we used "--cached", remove the filenames from
-	 * the workspace. If we fail to remove the first one, we
-	 * abort the "git rm" (but once we've successfully removed
-	 * any file at all, we'll go ahead and commit to it all:
-	 * by then we've already committed ourselves and can't fail
-	 * in the middle)
-	 */
-	if (!index_only) {
-		int removed = 0, gitmodules_modified = 0;
-		struct strbuf buf = STRBUF_INIT;
-		for (i = 0; i < list.nr; i++) {
-			const char *path = list.entry[i].name;
-			if (list.entry[i].is_submodule) {
+	for (i = 0; i < list.nr; i++) {
+		const char *path = list.entry[i].name;
+		if (list.entry[i].is_submodule) {
+			/*
+			 * Then, unless we used "--cached", remove the filenames from
+			 * the workspace. If we fail to remove the first one, we
+			 * abort the "git rm" (but once we've successfully removed
+			 * any file at all, we'll go ahead and commit to it all:
+			 * by then we've already committed ourselves and can't fail
+			 * in the middle)
+			 */
+			if (!index_only) {
+				struct strbuf buf = STRBUF_INIT;
 				strbuf_reset(&buf);
 				strbuf_addstr(&buf, path);
 				if (remove_dir_recursively(&buf, 0))
 					die(_("could not remove '%s'"), path);
 
 				removed = 1;
-				if (!remove_path_from_gitmodules(path))
-					gitmodules_modified = 1;
-				continue;
+				strbuf_release(&buf);
 			}
+			if (!remove_path_from_gitmodules(path, index_only))
+				stage_updated_gitmodules(&the_index);
+
+			continue;
+		}
+		if (!index_only) {
 			if (!remove_path(path)) {
 				removed = 1;
 				continue;
@@ -396,9 +399,6 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 			if (!removed)
 				die_errno("git rm: '%s'", path);
 		}
-		strbuf_release(&buf);
-		if (gitmodules_modified)
-			stage_updated_gitmodules(&the_index);
 	}
 
 	if (write_locked_index(&the_index, &lock_file,
diff --git a/submodule.c b/submodule.c
index 9767ba9893..6ce8c8d0d8 100644
--- a/submodule.c
+++ b/submodule.c
@@ -131,7 +131,7 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
  * path is configured. Return 0 only if a .gitmodules file was found, a section
  * with the correct path=<path> setting was found and we could remove it.
  */
-int remove_path_from_gitmodules(const char *path)
+int remove_path_from_gitmodules(const char *path, int index_only)
 {
 	struct strbuf sect = STRBUF_INIT;
 	const struct submodule *submodule;
@@ -149,7 +149,8 @@ int remove_path_from_gitmodules(const char *path)
 	}
 	strbuf_addstr(&sect, "submodule.");
 	strbuf_addstr(&sect, submodule->name);
-	if (git_config_rename_section_in_file(GITMODULES_FILE, sect.buf, NULL) < 0) {
+	if (git_config_rename_section_in_file(index_only ? GITMODULES_INDEX :
+					      GITMODULES_FILE, sect.buf, NULL) < 0) {
 		/* Maybe the user already did that, don't error out here */
 		warning(_("Could not remove .gitmodules entry for %s"), path);
 		strbuf_release(&sect);
diff --git a/submodule.h b/submodule.h
index 4ac6e31cf1..4d8707d911 100644
--- a/submodule.h
+++ b/submodule.h
@@ -43,7 +43,7 @@ int is_gitmodules_unmerged(const struct index_state *istate);
 int is_writing_gitmodules_ok(void);
 int is_staging_gitmodules_ok(struct index_state *istate);
 int update_path_in_gitmodules(const char *oldpath, const char *newpath);
-int remove_path_from_gitmodules(const char *path);
+int remove_path_from_gitmodules(const char *path, int index_only);
 void stage_updated_gitmodules(struct index_state *istate);
 void set_diffopt_flags_from_submodule_config(struct diff_options *,
 					     const char *path);
diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
index 7547f11a5c..c0ca4be5a1 100755
--- a/t/t3600-rm.sh
+++ b/t/t3600-rm.sh
@@ -390,16 +390,14 @@ test_expect_success 'rm of a populated submodule with different HEAD fails unles
 	test_must_fail git config -f .gitmodules submodule.sub.path
 '
 
-test_expect_success 'rm --cached leaves work tree of populated submodules and .gitmodules alone' '
+test_expect_success 'rm --cached leaves work tree of populated submodules alone' '
 	git reset --hard &&
 	git submodule update &&
 	git rm --cached submod &&
 	test_path_is_dir submod &&
 	test_path_is_file submod/.git &&
 	git status -s -uno >actual &&
-	test_cmp expect.cached actual &&
-	git config -f .gitmodules submodule.sub.url &&
-	git config -f .gitmodules submodule.sub.path
+	test_cmp expect.cached actual
 '
 
 test_expect_success 'rm --dry-run does not touch the submodule or .gitmodules' '
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/1] rm: stage submodule removal from '.gitmodules' when using '--cached'
  2021-02-22 17:26   ` [PATCH v2 1/1] rm: stage submodule removal from '.gitmodules' when using '--cached' Shourya Shukla
@ 2021-02-22 18:58     ` Junio C Hamano
  2021-03-05 17:58       ` Shourya Shukla
  2021-02-22 19:29     ` Junio C Hamano
  1 sibling, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2021-02-22 18:58 UTC (permalink / raw)
  To: Shourya Shukla; +Cc: christian.couder, git, levraiphilippeblain, Javier Mora

Shourya Shukla <periperidip@gmail.com> writes:

> Currently, using 'git rm --cached <submodule>' removes the submodule
> <submodule> from the index and leaves the submodule working tree
> intact in the superproject working tree, but does not stage any
> changes to the '.gitmodules' file, in contrast to 'git rm <submodule>',
> which removes both the submodule and its configuration in '.gitmodules'
> from the worktree and index.
>
> Fix this inconsistency by also staging the removal of the entry of the
> submodule from the '.gitmodules' file, leaving the worktree copy intact,

The "also" above felt a bit puzzling, as we would be removing the
entry only from the indexed copy without touching the working tree
(by the way, I try to be precise in terminology between worktree and
working tree, and please follow suit.  A working tree is what you
have in a non-bare repository that let's you "less" "gcc" etc. on
the files checked out.  A worktree refers to the mechanism that lets
you have separate working tree by borrowing from a repository, or
refers to an instance of a working tree plus .git file created by
the mechanism.  You mean "working tree" in the above sentence), but
it refers to "remove the submodules directory and also entry", so it
is OK.

> diff --git a/builtin/rm.c b/builtin/rm.c
> index 4858631e0f..5854ef0996 100644
> --- a/builtin/rm.c
> +++ b/builtin/rm.c
> @@ -254,7 +254,7 @@ static struct option builtin_rm_options[] = {
>  int cmd_rm(int argc, const char **argv, const char *prefix)
>  {
>  	struct lock_file lock_file = LOCK_INIT;
> -	int i;
> +	int i, removed = 0;
>  	struct pathspec pathspec;
>  	char *seen;
>  
> @@ -365,30 +365,33 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
>  	if (show_only)
>  		return 0;
>  


> +	for (i = 0; i < list.nr; i++) {
> +		const char *path = list.entry[i].name;
> +		if (list.entry[i].is_submodule) {
> +			/*
> +			 * Then, unless we used "--cached", remove the filenames from
> +			 * the workspace. If we fail to remove the first one, we
> +			 * abort the "git rm" (but once we've successfully removed
> +			 * any file at all, we'll go ahead and commit to it all:
> +			 * by then we've already committed ourselves and can't fail
> +			 * in the middle)
> +			 */
> +			if (!index_only) {
> +				struct strbuf buf = STRBUF_INIT;
>  				strbuf_reset(&buf);
>  				strbuf_addstr(&buf, path);
>  				if (remove_dir_recursively(&buf, 0))
>  					die(_("could not remove '%s'"), path);
>  
>  				removed = 1;
> +				strbuf_release(&buf);

OK, so this part now only deals with the submodule directory.

>  			}
> +			if (!remove_path_from_gitmodules(path, index_only))
> +				stage_updated_gitmodules(&the_index);

And the entry for it in .gitmodules is handled by the helper,
whether --cached or not.

This somehow feels wrong for the index_only case; doesn't the helper
take contents from the .gitmodules in the working tree and add it to
the index?

Unless you touched stage_updated_gitmodules() not to do that in the
index_only case, that is.

> +			continue;

And that is all for what is done to a submodule.

Makes sense so far.

> +		}
> +		if (!index_only) {
>  			if (!remove_path(path)) {
>  				removed = 1;
>  				continue;
> @@ -396,9 +399,6 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
>  			if (!removed)
>  				die_errno("git rm: '%s'", path);
>  		}
> -		strbuf_release(&buf);
> -		if (gitmodules_modified)
> -			stage_updated_gitmodules(&the_index);

OK, because this should have been done where we called
remove_path_from_gitmodules().

>  	}
>  
>  	if (write_locked_index(&the_index, &lock_file,
> diff --git a/submodule.c b/submodule.c
> index 9767ba9893..6ce8c8d0d8 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -131,7 +131,7 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
>   * path is configured. Return 0 only if a .gitmodules file was found, a section
>   * with the correct path=<path> setting was found and we could remove it.
>   */
> -int remove_path_from_gitmodules(const char *path)
> +int remove_path_from_gitmodules(const char *path, int index_only)
>  {
>  	struct strbuf sect = STRBUF_INIT;
>  	const struct submodule *submodule;
> @@ -149,7 +149,8 @@ int remove_path_from_gitmodules(const char *path)
>  	}
>  	strbuf_addstr(&sect, "submodule.");
>  	strbuf_addstr(&sect, submodule->name);
> -	if (git_config_rename_section_in_file(GITMODULES_FILE, sect.buf, NULL) < 0) {
> +	if (git_config_rename_section_in_file(index_only ? GITMODULES_INDEX :
> +					      GITMODULES_FILE, sect.buf, NULL) < 0) {
>  		/* Maybe the user already did that, don't error out here */
>  		warning(_("Could not remove .gitmodules entry for %s"), path);
>  		strbuf_release(&sect);

When !index_only, do we have any guarantee that .gitmodules in the
working tree and .gitmodules in the index are in sync?  I somehow
doubt it.  

I would have expected that the updated remove_path_from_gitmodules()
would look more like:

 - only if !index_only, nuke the section for the submodule in
   .gitmodules in the working tree.

 - nuke the section for the submodule in .gitmodules in the
   index.

IOW, there would be two git_config_rename_section_in_file() calls to
remove the section in !index_only case.

Doing so would also mean that you should not have the caller call
stage_updated_gitmodules() at all, even in !index_only case.
Imagine if the .gitmodules file in the working tree had local
changes (e.g. registered a few more submodules, or updated the url
field of a few submodules) that are not yet added to the index when
"git rm" removed a submodule.  The user does not want them to be in
the index yet and "git rm" should not add these unrelated local
changes to the index.

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/1] rm: stage submodule removal from '.gitmodules' when using '--cached'
  2021-02-22 17:26   ` [PATCH v2 1/1] rm: stage submodule removal from '.gitmodules' when using '--cached' Shourya Shukla
  2021-02-22 18:58     ` Junio C Hamano
@ 2021-02-22 19:29     ` Junio C Hamano
  2021-03-07 16:46       ` Shourya Shukla
  1 sibling, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2021-02-22 19:29 UTC (permalink / raw)
  To: Shourya Shukla; +Cc: christian.couder, git, levraiphilippeblain, Javier Mora

Shourya Shukla <periperidip@gmail.com> writes:

> +	if (git_config_rename_section_in_file(index_only ? GITMODULES_INDEX :
> +					      GITMODULES_FILE, sect.buf, NULL) < 0) {

Also, is it really sufficient to pass GITMODULES_INDEX as the first
argument to this function to tweak what is in the index?

git_config_copy_or_rename_section_in_file() which is the
implementation of that helper seems to always want to work with a
file that is on disk, by making unconditional calls to
hold_lock_file_for_update(), fopen(), fstat(), chmod(), etc.

So I suspect that there are much more work needed.  

It seems to me that the config editing API is one of the older and
hackier parts of the system and requires quite a lot of work to
teach it to work with anything but a on-disk file.  In the longer
term, it may be a good thing to clean it up, but I suspect that it
is way too much work for too little benefit to do so as a part of
this topic, so an easier way out for now would be to:

 - write out the .gitmodules in the index to a temporary file (learn
   how to correctly call entry.c::checkout_entry() by studying how
   builtin/checkout-index.c::checkout_file() calls it, especially to
   a temporary file with the --temp option).

 - use git_config_rename_section_in_file() on that temporary file to
   remove the section about the submodule.

 - read that temporary file back into memory and write it out as a
   blob object by calling sha1-file.c::write_object_file().

 - add that back to the index as .gitmodules (studying how
   builtin/update-index.c::add_cacheinfo() calls add_cache_entry()
   would be a good way to learn how to do this).

The working tree side can stay as is, but as I said in the earlier
message, I think you need to update the .gitmodules in the working
tree and .gitmodules in the index separately (and without doing any
equivalent of "git add .gitmodules").


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/1] rm: stage submodule removal from '.gitmodules' when using '--cached'
  2021-02-22 18:58     ` Junio C Hamano
@ 2021-03-05 17:58       ` Shourya Shukla
  2021-03-05 21:39         ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Shourya Shukla @ 2021-03-05 17:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: christian.couder, git, levraiphilippeblain

Hi Junio!

Really really sorry for the late reply. I was busy in some personal
engagements and was travelling for the past 8-9 days.

On 22/02 10:58, Junio C Hamano wrote:
> Shourya Shukla <periperidip@gmail.com> writes:
> 
> > Currently, using 'git rm --cached <submodule>' removes the submodule
> > <submodule> from the index and leaves the submodule working tree
> > intact in the superproject working tree, but does not stage any
> > changes to the '.gitmodules' file, in contrast to 'git rm <submodule>',
> > which removes both the submodule and its configuration in '.gitmodules'
> > from the worktree and index.
> >
> > Fix this inconsistency by also staging the removal of the entry of the
> > submodule from the '.gitmodules' file, leaving the worktree copy intact,
> 
> The "also" above felt a bit puzzling, as we would be removing the
> entry only from the indexed copy without touching the working tree
> (by the way, I try to be precise in terminology between worktree and
> working tree, and please follow suit.  A working tree is what you
> have in a non-bare repository that let's you "less" "gcc" etc. on
> the files checked out.  A worktree refers to the mechanism that lets
> you have separate working tree by borrowing from a repository, or
> refers to an instance of a working tree plus .git file created by
> the mechanism.  You mean "working tree" in the above sentence), but
> it refers to "remove the submodules directory and also entry", so it
> is OK.

Sure. Will make it more precise and rather technically connect.

> And that is all for what is done to a submodule.
> 
> Makes sense so far.
> 
> > +		}
> > +		if (!index_only) {
> >  			if (!remove_path(path)) {
> >  				removed = 1;
> >  				continue;
> > @@ -396,9 +399,6 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
> >  			if (!removed)
> >  				die_errno("git rm: '%s'", path);
> >  		}
> > -		strbuf_release(&buf);
> > -		if (gitmodules_modified)
> > -			stage_updated_gitmodules(&the_index);
> 
> OK, because this should have been done where we called
> remove_path_from_gitmodules().
> 
> >  	}
> >  
> >  	if (write_locked_index(&the_index, &lock_file,
> > diff --git a/submodule.c b/submodule.c
> > index 9767ba9893..6ce8c8d0d8 100644
> > --- a/submodule.c
> > +++ b/submodule.c
> > @@ -131,7 +131,7 @@ int update_path_in_gitmodules(const char *oldpath, const char *newpath)
> >   * path is configured. Return 0 only if a .gitmodules file was found, a section
> >   * with the correct path=<path> setting was found and we could remove it.
> >   */
> > -int remove_path_from_gitmodules(const char *path)
> > +int remove_path_from_gitmodules(const char *path, int index_only)
> >  {
> >  	struct strbuf sect = STRBUF_INIT;
> >  	const struct submodule *submodule;
> > @@ -149,7 +149,8 @@ int remove_path_from_gitmodules(const char *path)
> >  	}
> >  	strbuf_addstr(&sect, "submodule.");
> >  	strbuf_addstr(&sect, submodule->name);
> > -	if (git_config_rename_section_in_file(GITMODULES_FILE, sect.buf, NULL) < 0) {
> > +	if (git_config_rename_section_in_file(index_only ? GITMODULES_INDEX :
> > +					      GITMODULES_FILE, sect.buf, NULL) < 0) {
> >  		/* Maybe the user already did that, don't error out here */
> >  		warning(_("Could not remove .gitmodules entry for %s"), path);
> >  		strbuf_release(&sect);
> 
> When !index_only, do we have any guarantee that .gitmodules in the
> working tree and .gitmodules in the index are in sync?  I somehow
> doubt it.  
> 
> I would have expected that the updated remove_path_from_gitmodules()
> would look more like:
> 
>  - only if !index_only, nuke the section for the submodule in
>    .gitmodules in the working tree.
> 
>  - nuke the section for the submodule in .gitmodules in the
>    index.
> 
> IOW, there would be two git_config_rename_section_in_file() calls to
> remove the section in !index_only case.
> 
> Doing so would also mean that you should not have the caller call
> stage_updated_gitmodules() at all, even in !index_only case.
> Imagine if the .gitmodules file in the working tree had local
> changes (e.g. registered a few more submodules, or updated the url
> field of a few submodules) that are not yet added to the index when
> "git rm" removed a submodule.  The user does not want them to be in
> the index yet and "git rm" should not add these unrelated local
> changes to the index.

Won't this be deviating from the current behaviour of 'git rm'?
Currently, the above case won't process and the user will be asked to
stage or undo the mods they made before moving forward. If I am not
mistaken, won't we deviate from the case if we do the above? As of now,
I tried this:

	if (!index_only) {
		if (git_config_rename_section_in_file(GITMODULES_FILE, sect.buf, NULL) < 0) {
			/* Maybe the user already did that, don't error out here */
			warning(_("Could not remove .gitmodules entry for %s"), path);
			strbuf_release(&sect);
			return -1;
		}
	}
	if (git_config_rename_section_in_file(GITMODULES_INDEX , sect.buf, NULL) < 0) {
		/* Maybe the user already did that, don't error out here */
		warning(_("Could not remove .gitmodules entry for %s"), path);
		strbuf_release(&sect);
		return -1;
	}

Everything else being unchanged. Therefore, we still have the
'stage_updated_gitmodules()' call. If we don't call this function then
won't we be NOT adding the updated '.gitmodules' to the staging area,
something which is done as of now?

Or am I mising something here?

Regards,
Shourya Shukla


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/1] rm: stage submodule removal from '.gitmodules' when using '--cached'
  2021-03-05 17:58       ` Shourya Shukla
@ 2021-03-05 21:39         ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2021-03-05 21:39 UTC (permalink / raw)
  To: Shourya Shukla; +Cc: christian.couder, git, levraiphilippeblain

Shourya Shukla <periperidip@gmail.com> writes:

>> Doing so would also mean that you should not have the caller call
>> stage_updated_gitmodules() at all, even in !index_only case.
>> Imagine if the .gitmodules file in the working tree had local
>> changes (e.g. registered a few more submodules, or updated the url
>> field of a few submodules) that are not yet added to the index when
>> "git rm" removed a submodule.  The user does not want them to be in
>> the index yet and "git rm" should not add these unrelated local
>> changes to the index.
>
> Won't this be deviating from the current behaviour of 'git rm'?
> Currently, the above case won't process and the user will be asked to
> stage or undo the mods they made before moving forward.

Ah, adding such safety to ensure that "rm" without "--cached"
(i.e. update both the index and the working tree copies of
.gitmodules) would stop when .gitmodules has a local mod would be a
good idea, on top of the outline you are responding to, I think.

Thanks.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/1] rm: stage submodule removal from '.gitmodules' when using '--cached'
  2021-02-22 19:29     ` Junio C Hamano
@ 2021-03-07 16:46       ` Shourya Shukla
  2021-03-07 20:29         ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Shourya Shukla @ 2021-03-07 16:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: christian.couder, git, levraiphilippeblain

On 22/02 11:29, Junio C Hamano wrote:
> Shourya Shukla <periperidip@gmail.com> writes:
> 
> > +	if (git_config_rename_section_in_file(index_only ? GITMODULES_INDEX :
> > +					      GITMODULES_FILE, sect.buf, NULL) < 0) {
> 
> Also, is it really sufficient to pass GITMODULES_INDEX as the first
> argument to this function to tweak what is in the index?
> 
> git_config_copy_or_rename_section_in_file() which is the
> implementation of that helper seems to always want to work with a
> file that is on disk, by making unconditional calls to
> hold_lock_file_for_update(), fopen(), fstat(), chmod(), etc.
> 
> So I suspect that there are much more work needed.  

I am not able to comprehend _why_ we need so much more work. To me it
seems to work fine.

The flow now is something like:

1. If !index_only i.e., '--cached' is not passed then remove the entry
of the SM from the working tree copy of '.gitmodules' i.e.,
GITMODULES_FILE. If there are any unstaged mods in '.gitmodules', we do
not proceed with 'git rm'.

2. Now, delete the entry of the above SM from the index copy of the
'.gitmodules' i.e., GITMODULES_INDEX (irrespective of the value of
'index_only'). If there are any unstaged mods in '.gitmodules', we do
not proceed with 'git rm'.

3. Finally, after the deletion of the SM entry, we stage the changes
using 'stage_updated_gitmodules()'.

Also, before any of the above thing happens, we check if it is OK to
write the '.gitmodules' using 'is_staging_gitmodules_ok()'. All the
above behaviour is in line with the current behaviour of 'git rm'.

What exactly do we need to change then?

> It seems to me that the config editing API is one of the older and
> hackier parts of the system and requires quite a lot of work to
> teach it to work with anything but a on-disk file.  In the longer
> term, it may be a good thing to clean it up, but I suspect that it
> is way too much work for too little benefit to do so as a part of
> this topic, so an easier way out for now would be to:
> 
>  - write out the .gitmodules in the index to a temporary file (learn
>    how to correctly call entry.c::checkout_entry() by studying how
>    builtin/checkout-index.c::checkout_file() calls it, especially to
>    a temporary file with the --temp option).
> 
>  - use git_config_rename_section_in_file() on that temporary file to
>    remove the section about the submodule.
> 
>  - read that temporary file back into memory and write it out as a
>    blob object by calling sha1-file.c::write_object_file().
> 
>  - add that back to the index as .gitmodules (studying how
>    builtin/update-index.c::add_cacheinfo() calls add_cache_entry()
>    would be a good way to learn how to do this).
> 
> The working tree side can stay as is, but as I said in the earlier
> message, I think you need to update the .gitmodules in the working
> tree and .gitmodules in the index separately (and without doing any
> equivalent of "git add .gitmodules").

But 'git rm' itself used to stage the changes i.e., 'git add'-ing them.

Regards,
Shourya Shukla


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/1] rm: stage submodule removal from '.gitmodules' when using '--cached'
  2021-03-07 16:46       ` Shourya Shukla
@ 2021-03-07 20:29         ` Junio C Hamano
  2021-03-09  7:13           ` Shourya Shukla
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2021-03-07 20:29 UTC (permalink / raw)
  To: Shourya Shukla; +Cc: christian.couder, git, levraiphilippeblain

Shourya Shukla <periperidip@gmail.com> writes:

> On 22/02 11:29, Junio C Hamano wrote:
>> Shourya Shukla <periperidip@gmail.com> writes:
>> 
>> > +	if (git_config_rename_section_in_file(index_only ? GITMODULES_INDEX :
>> > +					      GITMODULES_FILE, sect.buf, NULL) < 0) {
>> 
>> Also, is it really sufficient to pass GITMODULES_INDEX as the first
>> argument to this function to tweak what is in the index?
>> 
>> git_config_copy_or_rename_section_in_file() which is the
>> implementation of that helper seems to always want to work with a
>> file that is on disk, by making unconditional calls to
>> hold_lock_file_for_update(), fopen(), fstat(), chmod(), etc.
>> 
>> So I suspect that there are much more work needed.  
>
> I am not able to comprehend _why_ we need so much more work. To me it
> seems to work fine.

> The flow now is something like:
>
> 1. If !index_only i.e., '--cached' is not passed then remove the entry
> of the SM from the working tree copy of '.gitmodules' i.e.,
> GITMODULES_FILE. If there are any unstaged mods in '.gitmodules', we do
> not proceed with 'git rm'.

That side is fine, especially if we are extending the "when doing
'git rm PATH' (without '--cached'), PATH must match between the
index and the working tree" to "when doing 'git rm SUBMODULE', not
just SUBMODULE but also '.gitmodules' must match between the index
and the working tree", then adjusting the entry for SUBMODULE in
'.gitmodules' in the working tree and adding the result to the index
would give the same result as editing '.gitmodules' both in the
index and in the working tree independently.

But the problem is that there is no way "--cached" case would work
with your code.

> What exactly do we need to change then?

Have you traced what happens when you make this call

>> > +	if (git_config_rename_section_in_file(index_only ? GITMODULES_INDEX :
>> > +					      GITMODULES_FILE, sect.buf, NULL) < 0) {

with index_only set?  i.e. GIT_MODULES_INDEX passed as the
config_filename argument?

The first parameter to the git_config_rename_section_in_file() names
a filename in the working tree to be edited.  Writing ':.gitmodules'
does not make the function magically work in-core without touching
the working tree.  It will make it update a file (likely not
tracked) whose name is ":.gitmodules" in the working tree, no?

Presumably you want to edit in-index .gitmodules without touching
the working tree file, but the call is not doing that---and it would
take much more work to teach it do so.

And a cheaper way out would be how I outlined in the message you are
responding to, i.e. write out the in-index .gitmodules to a
temporary file, let git_config_rename_section_in_file() tweak that
temporary file, and add it back into the index.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/1] rm: stage submodule removal from '.gitmodules' when using '--cached'
  2021-03-07 20:29         ` Junio C Hamano
@ 2021-03-09  7:13           ` Shourya Shukla
  2021-03-09 20:47             ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Shourya Shukla @ 2021-03-09  7:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: christian.couder, git, levraiphilippeblain

On 07/03 12:29, Junio C Hamano wrote:
> Shourya Shukla <periperidip@gmail.com> writes:
> 
> > On 22/02 11:29, Junio C Hamano wrote:
> >> Shourya Shukla <periperidip@gmail.com> writes:
> >> 
> >> > +	if (git_config_rename_section_in_file(index_only ? GITMODULES_INDEX :
> >> > +					      GITMODULES_FILE, sect.buf, NULL) < 0) {
> >> 
> >> Also, is it really sufficient to pass GITMODULES_INDEX as the first
> >> argument to this function to tweak what is in the index?
> >> 
> >> git_config_copy_or_rename_section_in_file() which is the
> >> implementation of that helper seems to always want to work with a
> >> file that is on disk, by making unconditional calls to
> >> hold_lock_file_for_update(), fopen(), fstat(), chmod(), etc.
> >> 
> >> So I suspect that there are much more work needed.  
> >
> > I am not able to comprehend _why_ we need so much more work. To me it
> > seems to work fine.
> 
> > The flow now is something like:
> >
> > 1. If !index_only i.e., '--cached' is not passed then remove the entry
> > of the SM from the working tree copy of '.gitmodules' i.e.,
> > GITMODULES_FILE. If there are any unstaged mods in '.gitmodules', we do
> > not proceed with 'git rm'.
> 
> That side is fine, especially if we are extending the "when doing
> 'git rm PATH' (without '--cached'), PATH must match between the
> index and the working tree" to "when doing 'git rm SUBMODULE', not
> just SUBMODULE but also '.gitmodules' must match between the index
> and the working tree", then adjusting the entry for SUBMODULE in
> '.gitmodules' in the working tree and adding the result to the index
> would give the same result as editing '.gitmodules' both in the
> index and in the working tree independently.
> 
> But the problem is that there is no way "--cached" case would work
> with your code.
> 
> > What exactly do we need to change then?
> 
> Have you traced what happens when you make this call
> 
> >> > +	if (git_config_rename_section_in_file(index_only ? GITMODULES_INDEX :
> >> > +					      GITMODULES_FILE, sect.buf, NULL) < 0) {
> 
> with index_only set?  i.e. GIT_MODULES_INDEX passed as the
> config_filename argument?
> 
> The first parameter to the git_config_rename_section_in_file() names
> a filename in the working tree to be edited.  Writing ':.gitmodules'
> does not make the function magically work in-core without touching
> the working tree.  It will make it update a file (likely not
> tracked) whose name is ":.gitmodules" in the working tree, no?
> 
> Presumably you want to edit in-index .gitmodules without touching
> the working tree file, but the call is not doing that---and it would
> take much more work to teach it do so.
> 
> And a cheaper way out would be how I outlined in the message you are
> responding to, i.e. write out the in-index .gitmodules to a
> temporary file, let git_config_rename_section_in_file() tweak that
> temporary file, and add it back into the index.

Ahhh. Understood and will work on it. BTW then when does
GITMODULES_INDEX even fulfill its purpose? Its name can confuse anyone
into thinking what it made me think: it is the index copy of the
gitmodules.

Is it something which is to be changed in the near future?


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 1/1] rm: stage submodule removal from '.gitmodules' when using '--cached'
  2021-03-09  7:13           ` Shourya Shukla
@ 2021-03-09 20:47             ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2021-03-09 20:47 UTC (permalink / raw)
  To: Shourya Shukla; +Cc: christian.couder, git, levraiphilippeblain

Shourya Shukla <periperidip@gmail.com> writes:

> Ahhh. Understood and will work on it. BTW then when does
> GITMODULES_INDEX even fulfill its purpose? Its name can confuse anyone
> into thinking what it made me think: it is the index copy of the
> gitmodules.

I do not offhand know where in our codebase we use it (I am not a
submodule person).

Perhaps get_sha1(":.gitmodules", sha1)?  Even then, I'd probably
prefer to see it spelled as

	get_sha1(":" GITMODULES_FILE, sha1)

with token concatenation.

> Is it something which is to be changed in the near future?

Sorry, I do not understand the question.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-03-09 20:48 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-18 18:49 [RFC][PATCH 0/2] rm: changes in the '.gitmodules' are staged after using '--cached' Shourya Shukla
2021-02-18 18:49 ` [PATCH 1/2] " Shourya Shukla
2021-02-18 20:14   ` Philippe Blain
2021-02-18 20:39     ` Philippe Blain
2021-02-19 15:19     ` Shourya Shukla
2021-02-18 22:03   ` Junio C Hamano
2021-02-19 15:24     ` Shourya Shukla
2021-02-20  3:31       ` Junio C Hamano
2021-02-18 18:49 ` [PATCH 2/2] t3600: amend test 46 to check for '.gitmodules' modification Shourya Shukla
2021-02-18 20:21   ` Philippe Blain
2021-02-22 17:26 ` [PATCH v2 0/1] rm: stage submodule removal from '.gitmodules' Shourya Shukla
2021-02-22 17:26   ` [PATCH v2 1/1] rm: stage submodule removal from '.gitmodules' when using '--cached' Shourya Shukla
2021-02-22 18:58     ` Junio C Hamano
2021-03-05 17:58       ` Shourya Shukla
2021-03-05 21:39         ` Junio C Hamano
2021-02-22 19:29     ` Junio C Hamano
2021-03-07 16:46       ` Shourya Shukla
2021-03-07 20:29         ` Junio C Hamano
2021-03-09  7:13           ` Shourya Shukla
2021-03-09 20:47             ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).