git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Issue: "Could not access submodule" error when pulling recursively with Git 2.22.0
@ 2019-10-23  7:22 Aleksey Mikhaylov
  2019-10-23 10:04 ` SZEDER Gábor
  0 siblings, 1 reply; 4+ messages in thread
From: Aleksey Mikhaylov @ 2019-10-23  7:22 UTC (permalink / raw)
  To: git@vger.kernel.org

PROBLEM DESCRIPTION

"Could not access submodule" error when pulling recursively with Git 2.22.0.
This issue causes if there is submodule in submodule.

At first, we reported this problem for Git for Windows:
https://github.com/git-for-windows/git/issues/2361
But we received the answer that it was reproduced on Linux too.

Also, the same problem is described here: 
https://gitlab.com/gitlab-org/gitlab/issues/27287

STEPS TO REPRODUCE

Please use my simple test repository to reproduce the problem:
https://github.com/agmikhailov/example2361.git

It is just an empty repository with one submodule (https://github.com/agmikhailov/example2361M1.git) 
and one submodule of submodule (https://github.com/agmikhailov/example2361M2.git):

git clone https://github.com/agmikhailov/example2361.git
cd example2361/
git submodule init
git submodule update
git pull --recurse-submodules=true

ACTUAL RESULT

"git --recurse-submodules=true" command fails with message "Could not access submodule":

$ git --recurse-submodules=true
Fetching submodule example2361M1
Could not access submodule 'example2361M2'

EXPECTED RESULT

All submodules are successfully updated by "git --recurse-submodules=true" command.

ADDITIONAL INFORMATION

Git version 2.20.1 does not have this problem. 
So we had to downgrade Git to work with submodules.

Best regards,
--
Aleksey


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Issue: "Could not access submodule" error when pulling recursively with Git 2.22.0
  2019-10-23  7:22 Issue: "Could not access submodule" error when pulling recursively with Git 2.22.0 Aleksey Mikhaylov
@ 2019-10-23 10:04 ` SZEDER Gábor
  2019-10-25 12:41   ` Johannes Schindelin
  0 siblings, 1 reply; 4+ messages in thread
From: SZEDER Gábor @ 2019-10-23 10:04 UTC (permalink / raw)
  To: Aleksey Mikhaylov; +Cc: git@vger.kernel.org

On Wed, Oct 23, 2019 at 07:22:12AM +0000, Aleksey Mikhaylov wrote:
> "Could not access submodule" error when pulling recursively with Git 2.22.0.
> This issue causes if there is submodule in submodule.

> Please use my simple test repository to reproduce the problem:
> https://github.com/agmikhailov/example2361.git
> 
> It is just an empty repository with one submodule (https://github.com/agmikhailov/example2361M1.git) 
> and one submodule of submodule (https://github.com/agmikhailov/example2361M2.git):
> 
> git clone https://github.com/agmikhailov/example2361.git
> cd example2361/
> git submodule init

According to the docs of 'git submodule init', it "initializes the
submodules recorded in the index".  Therefore, it can only initialize
the submodule in 'example2361M1', because at this point we have no
idea that there is a nested submodule in there.

  $ git submodule init
  Submodule 'example2361M1' (https://github.com/agmikhailov/example2361M1.git) registered for path 'example2361M1'

> git submodule update

This command clones 'example2361M1':

  $ git submodule update --recursive
  Cloning into '/tmp/example2361/example2361M1'...
  Submodule path 'example2361M1': checked out '6a9be24a1c0ebd44d91ae4dcf1fd62580b936540'

Only at this point can we finally see that there is a nested
submodule, and can initialize and clone it with:

  $ cd example2361M1
  $ git submodule init
  Submodule 'example2361M2' (https://github.com/agmikhailov/example2361M2.git) registered for path 'example2361M2'
  $ git submodule update
  Cloning into '/tmp/example2361/example2361M1/example2361M2'...
  Submodule path 'example2361M2': checked out '9ed39cf1fe0a8cf34e72d2e7ebff1ea9d4a63ac1'

> git pull --recurse-submodules=true

And after that:

  $ cd ../..
  $ git pull --recurse-submodules=true
  Fetching submodule example2361M1
  Fetching submodule example2361M1/example2361M2
  Already up to date.


> ACTUAL RESULT
> 
> "git --recurse-submodules=true" command fails with message "Could not access submodule":
> 
> $ git --recurse-submodules=true
> Fetching submodule example2361M1
> Could not access submodule 'example2361M2'
> 
> EXPECTED RESULT
> 
> All submodules are successfully updated by "git --recurse-submodules=true" command.
> 
> ADDITIONAL INFORMATION
> 
> Git version 2.20.1 does not have this problem. 
> So we had to downgrade Git to work with submodules.

The behavior was indeed different with v2.20.1, but that version
didn't show the behavior you expected.  When running your commands
with v2.20.1 I get:

  $ ~/src/git/bin-wrappers/git pull --recurse-submodules=true
  Fetching submodule example2361M1
  Already up to date.
  $ find example2361M1/example2361M2/
  example2361M1/example2361M2/

So while that 'git pull' didn't error out, it didn't even look at the
nested submodule, which is still uninitialized and empty.

FWIW, bisecting shows that the behavior changed with commit
a62387b3fc, but, unfortunately, the commit message doesn't seem to be
very helpful to me, but perhaps others with more experience with
submodules can make something out of it.

commit a62387b3fc9f5aeeb04a2db278121d33a9caafa7
Author: Stefan Beller <sbeller@google.com>
Date:   Wed Nov 28 16:27:55 2018 -0800

    submodule.c: fetch in submodules git directory instead of in worktree
    
    Keep the properties introduced in 10f5c52656 (submodule: avoid
    auto-discovery in prepare_submodule_repo_env(), 2016-09-01), by fixating
    the git directory of the submodule.
    
    Signed-off-by: Stefan Beller <sbeller@google.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>

 submodule.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Issue: "Could not access submodule" error when pulling recursively with Git 2.22.0
  2019-10-23 10:04 ` SZEDER Gábor
@ 2019-10-25 12:41   ` Johannes Schindelin
  2021-01-01 15:42     ` Philippe Blain
  0 siblings, 1 reply; 4+ messages in thread
From: Johannes Schindelin @ 2019-10-25 12:41 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Aleksey Mikhaylov, git@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 9313 bytes --]

Hi Gábor,

On Wed, 23 Oct 2019, SZEDER Gábor wrote:

> On Wed, Oct 23, 2019 at 07:22:12AM +0000, Aleksey Mikhaylov wrote:
> > "Could not access submodule" error when pulling recursively with Git 2.22.0.
> > This issue causes if there is submodule in submodule.
>
> > Please use my simple test repository to reproduce the problem:
> > https://github.com/agmikhailov/example2361.git
> >
> > It is just an empty repository with one submodule (https://github.com/agmikhailov/example2361M1.git)
> > and one submodule of submodule (https://github.com/agmikhailov/example2361M2.git):
> >
> > git clone https://github.com/agmikhailov/example2361.git
> > cd example2361/
> > git submodule init
>
> According to the docs of 'git submodule init', it "initializes the
> submodules recorded in the index".  Therefore, it can only initialize
> the submodule in 'example2361M1', because at this point we have no
> idea that there is a nested submodule in there.
>
>   $ git submodule init
>   Submodule 'example2361M1' (https://github.com/agmikhailov/example2361M1.git) registered for path 'example2361M1'

Indeed, `git submodule init` is not recursive.

> > git submodule update
>
> This command clones 'example2361M1':
>
>   $ git submodule update --recursive
>   Cloning into '/tmp/example2361/example2361M1'...
>   Submodule path 'example2361M1': checked out '6a9be24a1c0ebd44d91ae4dcf1fd62580b936540'
>
> Only at this point can we finally see that there is a nested
> submodule, and can initialize and clone it with:
>
>   $ cd example2361M1
>   $ git submodule init
>   Submodule 'example2361M2' (https://github.com/agmikhailov/example2361M2.git) registered for path 'example2361M2'
>   $ git submodule update
>   Cloning into '/tmp/example2361/example2361M1/example2361M2'...
>   Submodule path 'example2361M2': checked out '9ed39cf1fe0a8cf34e72d2e7ebff1ea9d4a63ac1'

I concur.

> > git pull --recurse-submodules=true
>
> And after that:
>
>   $ cd ../..
>   $ git pull --recurse-submodules=true
>   Fetching submodule example2361M1
>   Fetching submodule example2361M1/example2361M2
>   Already up to date.

Yes, I agree that _probably_ what the user wanted is to initialize the
submodules recursively.

Having said that, I vaguely remember that e.g. Boost has this insane
forest of submodules, and I am almost certain that no sane person wants
to clone them all. _I_ wouldn't.

> > ACTUAL RESULT
> >
> > "git --recurse-submodules=true" command fails with message "Could not access submodule":
> >
> > $ git --recurse-submodules=true
> > Fetching submodule example2361M1
> > Could not access submodule 'example2361M2'
> >
> > EXPECTED RESULT
> >
> > All submodules are successfully updated by "git --recurse-submodules=true" command.
> >
> > ADDITIONAL INFORMATION
> >
> > Git version 2.20.1 does not have this problem.
> > So we had to downgrade Git to work with submodules.
>
> The behavior was indeed different with v2.20.1, but that version
> didn't show the behavior you expected.  When running your commands
> with v2.20.1 I get:
>
>   $ ~/src/git/bin-wrappers/git pull --recurse-submodules=true
>   Fetching submodule example2361M1
>   Already up to date.
>   $ find example2361M1/example2361M2/
>   example2361M1/example2361M2/
>
> So while that 'git pull' didn't error out, it didn't even look at the
> nested submodule, which is still uninitialized and empty.

I would actually argue that this is what is expected: the entire _point_
of submodules is that they can be inactive.

Coming back to the Boost example, what I would want Git to do when only
a fraction of the submodules is active is to skip the inactive ones
during a `git pull --recurse-submodules=true`.

Which v2.20.1 apparently did, and I would call the current behavior a
regression.

> FWIW, bisecting shows that the behavior changed with commit
> a62387b3fc,

Thanks for digging this out!

> but, unfortunately, the commit message doesn't seem to be very helpful
> to me, but perhaps others with more experience with submodules can
> make something out of it.
>
> commit a62387b3fc9f5aeeb04a2db278121d33a9caafa7
> Author: Stefan Beller <sbeller@google.com>
> Date:   Wed Nov 28 16:27:55 2018 -0800
>
>     submodule.c: fetch in submodules git directory instead of in worktree
>
>     Keep the properties introduced in 10f5c52656 (submodule: avoid
>     auto-discovery in prepare_submodule_repo_env(), 2016-09-01), by fixating
>     the git directory of the submodule.
>
>     Signed-off-by: Stefan Beller <sbeller@google.com>
>     Signed-off-by: Junio C Hamano <gitster@pobox.com>
>
>  submodule.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)

To me, this commit message suggests that the bahvior should not have
changed in the reported manner.

Let's look at the diff:

-- snip --
diff --git a/submodule.c b/submodule.c
index 77ace5e784a..d1b6646f42d 100644
--- a/submodule.c
+++ b/submodule.c
@@ -494,6 +494,12 @@ void prepare_submodule_repo_env(struct argv_array *out)
 			 DEFAULT_GIT_DIR_ENVIRONMENT);
 }

+static void prepare_submodule_repo_env_in_gitdir(struct argv_array *out)
+{
+	prepare_submodule_repo_env_no_git_dir(out);
+	argv_array_pushf(out, "%s=.", GIT_DIR_ENVIRONMENT);
+}
+
 /* Helper function to display the submodule header line prior to the full
  * summary output. If it can locate the submodule objects directory it will
  * attempt to lookup both the left and right commits and put them into the
@@ -1327,8 +1333,8 @@ static int get_next_submodule(struct child_process *cp,
 		repo = get_submodule_repo_for(spf->r, submodule);
 		if (repo) {
 			child_process_init(cp);
-			cp->dir = xstrdup(repo->worktree);
-			prepare_submodule_repo_env(&cp->env_array);
+			cp->dir = xstrdup(repo->gitdir);
+			prepare_submodule_repo_env_in_gitdir(&cp->env_array);
 			cp->git_cmd = 1;
 			if (!spf->quiet)
 				strbuf_addf(err, "Fetching submodule %s%s\n",
-- snap --

The obvious difference in behavior is that we no longer let Git
discover the `.git` file/directory, but we define it (because we can).

But actually, we cannot, not if the submodule is not checked out!
Because in this case, there is no `.git` file and Git then tries to
initialize the repository and the worktree, and fails rather miserably
in the reported manner.

Side note: I think there is something spooky going on where we try to
fetch submodules twice when the first time failed, and somehow slip into
the `else` arm of this code:

-- snip --
		task->repo = get_submodule_repo_for(spf->r, task->sub);
		if (task->repo) {
			struct strbuf submodule_prefix = STRBUF_INIT;
			child_process_init(cp);
			cp->dir = task->repo->gitdir;
			prepare_submodule_repo_env_in_gitdir(&cp->env_array);
			cp->git_cmd = 1;
			if (!spf->quiet)
				strbuf_addf(err, "Fetching submodule %s%s\n",
					    spf->prefix, ce->name);
			argv_array_init(&cp->args);
			argv_array_pushv(&cp->args, spf->args.argv);
			argv_array_push(&cp->args, default_argv);
			argv_array_push(&cp->args, "--submodule-prefix");

			strbuf_addf(&submodule_prefix, "%s%s/",
						       spf->prefix,
						       task->sub->path);
			argv_array_push(&cp->args, submodule_prefix.buf);

			spf->count++;
			*task_cb = task;

			strbuf_release(&submodule_prefix);
			return 1;
		} else {

			fetch_task_release(task);
			free(task);

			/*
			 * An empty directory is normal,
			 * the submodule is not initialized
			 */
			if (S_ISGITLINK(ce->ce_mode) &&
			    !is_empty_dir(ce->name)) {
				spf->result = 1;
				strbuf_addf(err,
					    _("Could not access submodule '%s'"),
					    ce->name);
			}
		}
-- snap --

As you can see, at that point, `is_empty_dir()` indicates that it is _no
longer empty_, which means that the recursive pull actually tried to
initialize it, and left things in a half-consistent state.

BTW I also think that Stefan's commit just made a bug very obvious that
had not mattered all that much before: it would seem that before that
commit, when the code tried to fetch a submodule, it would actually
fetch the super project (because it would discover the .git
file/directory, and as the inactive submodule does not have any, it
would have found the super project's).

In essence, I think that the report points out a very real bug, and this
bug should be fixed.

I _could_ imagine that it would be as easy as applying this patch (and
turning the provided reproducer into a regression test), then polishing
this with a nice commit message:

-- snip --
diff --git a/submodule.c b/submodule.c
index 0f199c51378..5694905610a 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1443,6 +1443,12 @@ static int get_next_submodule(struct child_process *cp,
 		task->repo = get_submodule_repo_for(spf->r, task->sub);
 		if (task->repo) {
 			struct strbuf submodule_prefix = STRBUF_INIT;
+
+			/* skip uninitialized submodule */
+			if (!file_exists(task->repo->gitdir) ||
+			    is_empty_dir(task->repo->gitdir))
+				continue;
+
 			child_process_init(cp);
 			cp->dir = task->repo->gitdir;
 			prepare_submodule_repo_env_in_gitdir(&cp->env_array);
-- snap --

Any takers?

Ciao,
Johannes

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: Issue: "Could not access submodule" error when pulling recursively with Git 2.22.0
  2019-10-25 12:41   ` Johannes Schindelin
@ 2021-01-01 15:42     ` Philippe Blain
  0 siblings, 0 replies; 4+ messages in thread
From: Philippe Blain @ 2021-01-01 15:42 UTC (permalink / raw)
  To: Johannes Schindelin, SZEDER Gábor
  Cc: Aleksey Mikhaylov, git@vger.kernel.org

Hi Johannes, Gábor, Aleksey,

Le 2019-10-25 à 08:41, Johannes Schindelin a écrit :
> Hi Gábor,
> 
> On Wed, 23 Oct 2019, SZEDER Gábor wrote:
> 
>> On Wed, Oct 23, 2019 at 07:22:12AM +0000, Aleksey Mikhaylov wrote:
>>> "Could not access submodule" error when pulling recursively with Git 2.22.0.
>>> This issue causes if there is submodule in submodule.
>>
>>> Please use my simple test repository to reproduce the problem:
>>> https://github.com/agmikhailov/example2361.git
>>>
>>> It is just an empty repository with one submodule (https://github.com/agmikhailov/example2361M1.git)
>>> and one submodule of submodule (https://github.com/agmikhailov/example2361M2.git):
>>>
>>> git clone https://github.com/agmikhailov/example2361.git
>>> cd example2361/
>>> git submodule init
>>
>> According to the docs of 'git submodule init', it "initializes the
>> submodules recorded in the index".  Therefore, it can only initialize
>> the submodule in 'example2361M1', because at this point we have no
>> idea that there is a nested submodule in there.
>>
>>    $ git submodule init
>>    Submodule 'example2361M1' (https://github.com/agmikhailov/example2361M1.git) registered for path 'example2361M1'
> 
> Indeed, `git submodule init` is not recursive.
> 
>>> git submodule update
>>
>> This command clones 'example2361M1':
>>
>>    $ git submodule update --recursive
>>    Cloning into '/tmp/example2361/example2361M1'...
>>    Submodule path 'example2361M1': checked out '6a9be24a1c0ebd44d91ae4dcf1fd62580b936540'
>>
>> Only at this point can we finally see that there is a nested
>> submodule, and can initialize and clone it with:
>>
>>    $ cd example2361M1
>>    $ git submodule init
>>    Submodule 'example2361M2' (https://github.com/agmikhailov/example2361M2.git) registered for path 'example2361M2'
>>    $ git submodule update
>>    Cloning into '/tmp/example2361/example2361M1/example2361M2'...
>>    Submodule path 'example2361M2': checked out '9ed39cf1fe0a8cf34e72d2e7ebff1ea9d4a63ac1'
> 
> I concur.
> 
>>> git pull --recurse-submodules=true
>>
>> And after that:
>>
>>    $ cd ../..
>>    $ git pull --recurse-submodules=true
>>    Fetching submodule example2361M1
>>    Fetching submodule example2361M1/example2361M2
>>    Already up to date.
> 
> Yes, I agree that _probably_ what the user wanted is to initialize the
> submodules recursively.
> 
> Having said that, I vaguely remember that e.g. Boost has this insane
> forest of submodules, and I am almost certain that no sane person wants
> to clone them all. _I_ wouldn't.
> 
>>> ACTUAL RESULT
>>>
>>> "git --recurse-submodules=true" command fails with message "Could not access submodule":
>>>
>>> $ git --recurse-submodules=true
>>> Fetching submodule example2361M1
>>> Could not access submodule 'example2361M2'
>>>
>>> EXPECTED RESULT
>>>
>>> All submodules are successfully updated by "git --recurse-submodules=true" command.
>>>
>>> ADDITIONAL INFORMATION
>>>
>>> Git version 2.20.1 does not have this problem.
>>> So we had to downgrade Git to work with submodules.
>>
>> The behavior was indeed different with v2.20.1, but that version
>> didn't show the behavior you expected.  When running your commands
>> with v2.20.1 I get:
>>
>>    $ ~/src/git/bin-wrappers/git pull --recurse-submodules=true
>>    Fetching submodule example2361M1
>>    Already up to date.
>>    $ find example2361M1/example2361M2/
>>    example2361M1/example2361M2/
>>
>> So while that 'git pull' didn't error out, it didn't even look at the
>> nested submodule, which is still uninitialized and empty.
> 
> I would actually argue that this is what is expected: the entire _point_
> of submodules is that they can be inactive.
> 
> Coming back to the Boost example, what I would want Git to do when only
> a fraction of the submodules is active is to skip the inactive ones
> during a `git pull --recurse-submodules=true`.
> 
> Which v2.20.1 apparently did, and I would call the current behavior a
> regression.

I agree, and this regression was fixed recently by Peter Kaestle.
This bug was in fact reported twice recently ([1], [2]) and fixed
earlier this month [3] (full threads: [4], [5]). The patch series is currently cooking
in the 'next' branch [6]. You can check out just the fix from [7].
It should be included in Git 2.31.

Cheers,

Philippe.

[1] https://lore.kernel.org/git/op.0vvmwohypvqxoc@damia/
[2] https://lore.kernel.org/git/1604413399-63090-1-git-send-email-peter.kaestle@nokia.com/
[3] https://lore.kernel.org/git/20201209105844.7019-1-peter.kaestle@nokia.com/
[4] https://lore.kernel.org/git/CAN0XMOLiS_8JZKF_wW70BvRRxkDHyUoa=Z3ODtB_Bd6f5Y=7JQ@mail.gmail.com/t/#u
[5] https://lore.kernel.org/git/1604413399-63090-1-git-send-email-peter.kaestle@nokia.com/t/#u
[6] https://github.com/git/git/blob/730f2a8a60960c30a79e00ebe034836f60befbf0/whats-cooking.txt#L765-L771
[7] https://github.com/gitster/git/tree/pk/subsub-fetch-fix-take-2

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-01-01 15:47 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-23  7:22 Issue: "Could not access submodule" error when pulling recursively with Git 2.22.0 Aleksey Mikhaylov
2019-10-23 10:04 ` SZEDER Gábor
2019-10-25 12:41   ` Johannes Schindelin
2021-01-01 15:42     ` Philippe Blain

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).