git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "Philippe Blain via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Jonathan Tan <jonathantanmy@google.com>,
	Stefan Beller <stefanbeller@gmail.com>,
	Damien Robert <damien.olivier.robert@gmail.com>,
	Philippe Blain <levraiphilippeblain@gmail.com>
Subject: Re: [PATCH 5/6] unpack-trees: check for missing submodule directory in merged_entry
Date: Wed, 19 Feb 2020 15:32:52 -0800	[thread overview]
Message-ID: <xmqqd0aacfvf.fsf@gitster-ct.c.googlers.com> (raw)
In-Reply-To: <d47740181bf02036a83f3573164c6c650ae75afb.1581915186.git.gitgitgadget@gmail.com> (Philippe Blain via GitGitGadget's message of "Mon, 17 Feb 2020 04:53:05 +0000")

"Philippe Blain via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Philippe Blain <levraiphilippeblain@gmail.com>
>
> Using `git checkout --recurse-submodules` to switch between a
> branch with no submodules and a branch with initialized nested
> submodules currently causes a fatal error:
>
>     $ git checkout --recurse-submodules branch-with-nested-submodules
>     fatal: exec '--super-prefix=submodule/nested/': cd to 'nested'
> failed: No such file or directory
>     error: Submodule 'nested' could not be updated.
>     error: Submodule 'submodule/nested' cannot checkout new HEAD.
>     error: Submodule 'submodule' could not be updated.
>     M	submodule
>     Switched to branch 'branch-with-nested-submodules'
>
> The checkout succeeds but the worktree and index of the first level
> submodule are left empty:
>
>     $ cd submodule
>     $ git -c status.submoduleSummary=1 status
>     HEAD detached at b3ce885
>     Changes to be committed:
>       (use "git restore --staged <file>..." to unstage)
>           deleted:    .gitmodules
>           deleted:    first.t
>           deleted:    nested
>
>     fatal: not a git repository: 'nested/.git'
>     Submodule changes to be committed:
>
>     * nested 1e96f59...0000000:
>
>     $ git ls-files -s
>     $ # empty
>     $ ls -A
>     .git
>
> The reason for the fatal error during the checkout is that a child git
> process tries to cd into the yet unexisting nested submodule directory.
> The sequence is the following:
>
> 1. The main git process (the one running in the superproject) eventually
> reaches write_entry() in entry.c, which creates the first level
> submodule directory and then calls submodule_move_head() in submodule.c,
> which spawns `git read-tree` in the submodule directory.
>
> 2. The first child git process (the one in the submodule of the
> superproject) eventually calls check_submodule_move_head() at
> unpack_trees.c:2021, which calls submodule_move_head in dry-run mode,
> which spawns `git read-tree` in the nested submodule directory.
>
> 3. The second child git process tries to chdir() in the yet unexisting
> nested submodule directory in start_command() at run-command.c:829 and
> dies before exec'ing.
>
> The reason why check_submodule_move_head() is reached in the first child
> and not in the main process is that it is inside an
> if(submodule_from_ce()) construct, and submodule_from_ce() returns a
> valid struct submodule pointer, whereas it returns a null pointer in the
> main git process.
>
> The reason why submodule_from_ce() returns a null pointer in the main
> git process is because the call to cache_lookup_path() in config_from()
> (called from submodule_from_path() in submodule_from_ce()) returns a
> null pointer since the hashmap "for_path" in the submodule_cache of
> the_repository is not yet populated. It is not populated because both
> repo_get_oid(repo, GITMODULES_INDEX, &oid) and repo_get_oid(repo,
> GITMODULES_HEAD, &oid) in config_from_gitmodules() at
> submodule-config.c:639-640 return -1, as at this stage of the operation,
> neither the HEAD of the superproject nor its index contain any
> .gitmodules file.
>
> In contrast, in the first child the hashmap is populated because
> repo_get_oid(repo, GITMODULES_HEAD, &oid) returns 0 as the HEAD of the
> first level submodule, i.e. .git/modules/submodule/HEAD, points to a
> commit where .gitmodules is present and records 'nested' as a submodule.
>
> Fix this bug by checking that the submodule directory exists before
> calling check_submodule_move_head() in merged_entry() in the `if(!old)`
> branch, i.e. if going from a commit with no submodule to a commit with a
> submodule present.
>
> Also protect the other call to check_submodule_move_head() in
> merged_entry() the same way as it is safer, even though the `else if
> (!(old->ce_flags & CE_CONFLICTED))` branch of the code is not at play in
> the present bug.
>
> The other calls to check_submodule_move_head() in other functions in
> unpack_trees.c are all already protected by calls to lstat() somewhere
> in
> the program flow so we don't need additional protection for them.
>
> All commands in the unpack_trees machinery are affected, i.e. checkout,
> reset and read-tree when called with the --recurse-submodules flag.

Greate to see a detailed write-up.  I'll read the surrounding
codepath again later before commenting further.

Thanks.

>
> This bug was first reported in [1].
>
> [1]
> https://lore.kernel.org/git/7437BB59-4605-48EC-B05E-E2BDB2D9DABC@gmail.com/
>
> Reported-by: Philippe Blain <levraiphilippeblain@gmail.com>
> Reported-by: Damien Robert <damien.olivier.robert@gmail.com>
> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
> ---
>  t/lib-submodule-update.sh | 14 ++++++++++++++
>  unpack-trees.c            |  4 ++--
>  2 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/t/lib-submodule-update.sh b/t/lib-submodule-update.sh
> index 417da3602ae..ab30b2da24f 100755
> --- a/t/lib-submodule-update.sh
> +++ b/t/lib-submodule-update.sh
> @@ -626,6 +626,7 @@ test_submodule_forced_switch () {
>  # New test cases
>  # - Removing a submodule with a git directory absorbs the submodules
>  #   git directory first into the superproject.
> +# - Switching from no submodule to nested submodules
>  
>  # Internal function; use test_submodule_switch_recursing_with_args() or
>  # test_submodule_forced_switch_recursing_with_args() instead.
> @@ -683,6 +684,19 @@ test_submodule_recursing_with_args_common() {
>  			test_submodule_content sub1 origin/replace_directory_with_sub1
>  		)
>  	'
> +	# Switching to a commit with nested submodules recursively checks them out
> +	test_expect_success "$command: nested submodules are checked out" '
> +		prolog &&
> +		reset_work_tree_to_interested no_submodule &&
> +		(
> +			cd submodule_update &&
> +			git branch -t modify_sub1_recursively origin/modify_sub1_recursively &&
> +			$command modify_sub1_recursively &&
> +			test_superproject_content origin/modify_sub1_recursively &&
> +			test_submodule_content sub1 origin/modify_sub1_recursively &&
> +			test_submodule_content -C sub1 sub2 origin/modify_sub1_recursively
> +		)
> +	'
>  
>  	######################## Disappearing submodule #######################
>  	# Removing a submodule removes its work tree ...
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 37eca3ede8b..fc6ba19486d 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -2064,7 +2064,7 @@ static int merged_entry(const struct cache_entry *ce,
>  		}
>  		invalidate_ce_path(merge, o);
>  
> -		if (submodule_from_ce(ce)) {
> +		if (submodule_from_ce(ce) && file_exists(ce->name)) {
>  			int ret = check_submodule_move_head(ce, NULL,
>  							    oid_to_hex(&ce->oid),
>  							    o);
> @@ -2093,7 +2093,7 @@ static int merged_entry(const struct cache_entry *ce,
>  			invalidate_ce_path(old, o);
>  		}
>  
> -		if (submodule_from_ce(ce)) {
> +		if (submodule_from_ce(ce) && file_exists(ce->name)) {
>  			int ret = check_submodule_move_head(ce, oid_to_hex(&old->oid),
>  							    oid_to_hex(&ce->oid),
>  							    o);

  reply	other threads:[~2020-02-19 23:33 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-17  4:53 [PATCH 0/6] unpack-trees: fix '--recurse-submodules' when switching from no submodules to nested submodules Philippe Blain via GitGitGadget
2020-02-17  4:53 ` [PATCH 1/6] t7112: remove mention of KNOWN_FAILURE_SUBMODULE_RECURSIVE_NESTED Philippe Blain via GitGitGadget
2020-02-19 23:26   ` Junio C Hamano
2020-02-17  4:53 ` [PATCH 2/6] t/lib-submodule-update: remove outdated test description Philippe Blain via GitGitGadget
2020-02-17  4:53 ` [PATCH 3/6] t/lib-submodule-update: move a test to the right section Philippe Blain via GitGitGadget
2020-02-17  4:53 ` [PATCH 4/6] unpack-trees: remove outdated description for verify_clean_submodule Philippe Blain via GitGitGadget
2020-02-17  4:53 ` [PATCH 5/6] unpack-trees: check for missing submodule directory in merged_entry Philippe Blain via GitGitGadget
2020-02-19 23:32   ` Junio C Hamano [this message]
2020-02-17  4:53 ` [PATCH 6/6] t/lib-submodule-update: add test removing nested submodules Philippe Blain via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqd0aacfvf.fsf@gitster-ct.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=damien.olivier.robert@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=jonathantanmy@google.com \
    --cc=levraiphilippeblain@gmail.com \
    --cc=stefanbeller@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).