git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: "Paolo Pettinato (ppettina)" <ppettina@cisco.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Git fetch bug in git 2.21+ "Could not access submodule '%s'"
Date: Wed, 14 Aug 2019 11:36:07 -0400	[thread overview]
Message-ID: <20190814153607.GB12093@sigill.intra.peff.net> (raw)
In-Reply-To: <951a0ac4-592f-d71c-df6a-53a806249f7b@cisco.com>

On Wed, Aug 14, 2019 at 09:57:50AM +0000, Paolo Pettinato (ppettina) wrote:

> The issue happens when fetching an updated ref from a remote, and that 
> ref updates a submodule which is not checked out but whose folder is dirty.
> 
> Steps to reproduce (on *nix) with repositories on GitHub:
> [...]
> # Repo now contains a folder named "sm" which is bound to contain a 
> submodule checkout. But the submodule is not checked out yet.
> # Dirty that folder:
> $ touch sm/test
> 
> # Fetching another branch will fail
> $ git fetch origin branch_1

Thanks, I was able to reproduce here. Since this worked in v2.18.1, it
was an easy candidate for bisecting.  The resulting commit is 26f80ccfc1
(submodule: migrate get_next_submodule to use repository structs,
2018-11-28).

It looks like your case falls afoul of this logic added by that commit
in get_next_submodule():

  repo = get_submodule_repo_for(spf->r, submodule);
  if (repo) {
          ...
  } else {
          /*
	   * An empty directory is normal,
	   * the submodule is not initialized
	   */
	  if (S_ISGITLINK(ce->ce_mode) &&
	      !is_empty_dir(ce->name)) {
		  spf->result = 1;
		  strbuf_addf(err,
			      _("Could not access submodule '%s'"),
			      ce->name);
	  }
  }

Because you created a file in the uninitialized submodule directory, it
fools the is_empty_dir() check. It seems like there should be a more
robust way to check whether the submodule is initialized. Maybe:

diff --git a/submodule.c b/submodule.c
index 77ace5e784..748ebe5909 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1294,6 +1294,9 @@ static int get_next_submodule(struct child_process *cp,
 		if (!S_ISGITLINK(ce->ce_mode))
 			continue;
 
+		if (!is_submodule_active(spf->r, ce->name))
+			continue;
+
 		submodule = submodule_from_path(spf->r, &null_oid, ce->name);
 		if (!submodule) {
 			const char *name = default_name_or_path(ce->name);

but that seems to fail t5526's "on-demand works without .gitmodules
entry" test.

I think is_submodule_populated_gently() more exactly matches what the
current code is trying to do, like so:

diff --git a/submodule.c b/submodule.c
index 77ace5e784..4b26faee5d 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1347,8 +1347,7 @@ static int get_next_submodule(struct child_process *cp,
 			 * An empty directory is normal,
 			 * the submodule is not initialized
 			 */
-			if (S_ISGITLINK(ce->ce_mode) &&
-			    !is_empty_dir(ce->name)) {
+			if (is_submodule_populated_gently(ce->name, NULL)) {
 				spf->result = 1;
 				strbuf_addf(err,
 					    _("Could not access submodule '%s'"),

but it feels odd to me. Even if the submodule is not currently checked
out, we'd presumably still want to do the recursive fetch as long as we
have a repo under $GIT_DIR/modules? And anyway, it fails another test in
t5526 ("fetching submodule into a broken repository").

Maybe somebody with more submodule expertise can jump in.

> # Re-issuing the command succeeds
> $ git fetch origin branch_1
> [...]
> I'd expect the command not to fail, or to fail consistently.

I think that part is expected. After receiving new objects, `fetch`
tries to see if it found out about any new submodules that might need to
be recursively fetched. In your first command, we actually received the
new objects (but then erroneously complained because the submodule was
not initialized). In the second, we already had those objects and didn't
need to do that check.

-Peff

  reply	other threads:[~2019-08-14 15:36 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-14  9:57 Git fetch bug in git 2.21+ "Could not access submodule '%s'" Paolo Pettinato (ppettina)
2019-08-14 15:36 ` Jeff King [this message]
2019-08-14 16:40   ` Junio C Hamano
2019-08-14 17:03     ` Paolo Pettinato (ppettina)
2019-08-14 17:56       ` Junio C Hamano
2019-08-14 19:31     ` Jeff King
2019-08-14 15:59 ` [PATCH] get_next_submodule(): format error string as an error Jeff King
2019-08-14 16:15   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190814153607.GB12093@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=ppettina@cisco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).