git@vger.kernel.org mailing list mirror (one of many)
From: Jacob Keller <jacob.keller@gmail.com>
To: Glen Choo <chooglen@google.com>
Cc: Git mailing list <git@vger.kernel.org>,
	Emily Shaffer <emilyshaffer@google.com>,
	Philippe Blain <levraiphilippeblain@gmail.com>
Subject: Re: issue with submodules using origin remote unexpectadly
Date: Tue, 11 Oct 2022 17:13:50 -0700
Message-ID: <CA+P7+xqf_Q35C0VT8A-zCRf46zSbXHhH5EhTo2vLvTJ9B6jyow@mail.gmail.com>
In-Reply-To: <kl6lilkq0zpr.fsf@chooglen-macbookpro.roam.corp.google.com>

On Tue, Oct 11, 2022 at 3:20 PM Glen Choo <chooglen@google.com> wrote:
>
> Jacob Keller <jacob.keller@gmail.com> writes:
>
> > On Tue, Oct 4, 2022 at 11:12 AM Glen Choo <chooglen@google.com> wrote:
> >>
> >> Hi Jacob! Thanks for the report!
> >>
> >
> > Thanks for responding!
>
> :)
>
> >> Or, if you could include a reproduction script, that would be really
> >> helpful :)
> >>
> >
> > I'm not sure how to do this, because it is only an intermittent
> > failure. I suspect it has to do with when the submodule actually needs
> > to update.
> >
> > Perhaps I can come up with something though. If I can, I'll send it as
> > a new test.
>
> That would be greatly appreciated, thanks!
>
> If you find code pointers useful,
>
> - builtin/submodule--helper.c:fetch_in_submodule() contains the logic
>   for fetching during "git submodule update"
>
> - submodule.c:fetch_submodules() contains the logic for fetching during
>   "git fetch --recurse-submodules" (which is invoked by "git pull
>   --recurse-submodules").
>

I was able to write a test that highlights the failure. It shows that
fetching works with a single remote, but adding a second remote makes
it fail because it falls back to 'origin'.

>
> >> >
> >> > remote: Enumerating objects: 210, done.
> >> > remote: Counting objects: 100% (207/207), done.
> >> > remote: Compressing objects: 100% (54/54), done.
> >> > remote: Total 210 (delta 123), reused 197 (delta 119), pack-reused 3
> >> > Receiving objects: 100% (210/210), 107.20 KiB | 4.29 MiB/s, done.
> >> > Resolving deltas: 100% (123/123), completed with 48 local objects.
> >> > From <redacted>
> >> > ...
> >> > Fetching submodule submodule
> >> > From <redacted>
> >> >    85e0da7533d9..80cc886f1187  <redacted>
> >> > Fetching submodule submodule2
> >> > fatal: 'origin' does not appear to be a git repository
> >> > fatal: Could not read from remote repository.
> >> >
> >> > Please make sure you have the correct access rights
> >> > and the repository exists.
> >> > Errors during submodule fetch:
> >> >         submodule2
> >>
> >> I assume this is `git fetch` running in the superproject?
> >>
> >
> > It's git pull --rebase, but I suppose as part of this it will run
> > something equivalent to git fetch?
>
> Unfortunately, this doesn't narrow it down much because "git pull
> --recurse-submodules" runs _both_ "git fetch --recurse-submodules" _and_
> "git submodule update [--rebase]" ;) Without more context, it's not
> clear which of those is failing.
>

It's definitely "git fetch --recurse-submodules"; the new test should
show this.

> >> When fetching with `git fetch`, submodules are fetched without
> >> specifying the remote name, which means Git guesses which remote you
> >> want to fetch from, which is documented at
> >> https://git-scm.com/docs/git-fetch. I believe (I haven't reread this
> >> very closely) this is, in order:
> >>
> >> - The remote of your branch, i.e. the value of the config value
> >>   `branch.<name>.remote`
> >
> > So basically if it's checked out on a branch it will fetch from the
> > remote of that branch, but...
> >
> >> - origin
> >>
> >
> > It defaults to origin, so if you have the usual "checked out as a
> > detached head" style of submodule, it can't find the remote branch.
>
> Yes, this sounds about right. I was quite certain that we only default
> to "origin", but I observe that "git fetch" doesn't fail if there is
> only one remote and it is not named "origin". Perhaps I'm mistaken, or I
> simply couldn't track down that logic.
>

We definitely default to the lone remote when only one is configured.
I have two tests: one shows the single-remote case working, and the
other shows that adding a second remote causes the failure.
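
For illustration, the fallback order described in this thread can be
sketched as a small shell function. This is a hypothetical helper
(pick_default_remote is not a Git command), and it only models the
order discussed here under the assumption that it is: the branch's
configured remote, then the lone remote if exactly one exists, then
"origin". It is not Git's actual implementation.

```shell
#!/bin/sh
# Hypothetical helper, not part of Git: models the remote fallback
# order discussed in this thread.
# Usage: pick_default_remote <branch-remote-or-empty> [<remote>...]
pick_default_remote () {
	# Value of branch.<name>.remote; empty for a detached HEAD or a
	# branch with no configured remote.
	branch_remote=${1-}
	if test "$#" -gt 0
	then
		shift
	fi

	if test -n "$branch_remote"
	then
		# 1. The current branch has a remote configured: use it.
		printf '%s\n' "$branch_remote"
	elif test "$#" -eq 1
	then
		# 2. Exactly one remote exists: use it, whatever its name.
		printf '%s\n' "$1"
	else
		# 3. Otherwise fall back to the literal name "origin",
		#    whether or not such a remote exists.
		printf '%s\n' origin
	fi
}

# Detached HEAD with a single remote named "upstream".
pick_default_remote "" upstream        # prints "upstream"

# Detached HEAD with two remotes, neither named "origin".
pick_default_remote "" upstream fork   # prints "origin"
```

The second call corresponds to the failing case: the name "origin" is
chosen even though no such remote is configured in the submodule.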

> >> But... I'll mention another wrinkle for completeness' sake (though I
> >> don't think it applies to you). If you fetch using `git submodule
> >> update`, the submodule is fetched using a _named_ remote, specifically:
> >>
> >> - If the superproject has a branch checked out, it uses the name of the
> >>   superproject branch's remote.
> >
> > Right, so that explains why I can re-run git submodule update after a
> > git pull --rebase and it works.
> >
> > In theory wouldn't it make more sense to use the remote based on the
> > URL of the .gitmodules file?
>
> Ah, yes that's one possibility we (the folks working on an improved
> Submodules UX) have considered. Another would be to teach submodules to
> actually use branches correctly and to use the remotes of the branches.
>

Yes, if we can have it check out a branch and just rewind that
branch to match the expected commit instead of leaving it in a detached
state, things would be much easier. I recall work being done on this
years ago, but it is quite a thorny problem.

> In general, the project tries not to respect config coming directly from
> .gitmodules (cf. [1]), but I agree that there's a lot of room for
> improvement.
>

Right. I think I'd rather go with a config option inside the
.git/config [submodule] section. I don't think gitmodules itself needs
to know this, just that the parent project could be informed of what
remote to default to when fetching inside the submodule. That or
somehow unify the git submodule update code with the recursive
fetching?

> [1] https://lore.kernel.org/git/xmqq35bze3rr.fsf@gitster.g
>
> >> - If the superproject does not have a branch checked out, it uses
> >>   "origin".
> >>
> >
> > I suppose one option would be to make this configurable. I started
> > using "upstream" as the default remote name for most of my
> > repositories when I began working with forks a lot more.
>
> My hope is that the work I mentioned earlier makes this code obsolete
> and nobody ever has to configure this ;)
>

Yeah. I definitely like the idea of using branches instead of a
detached head state.

I think for now I can avoid this by just disabling recursive fetch in
my config, which at least gets around the problem well enough.
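
A minimal sketch of that workaround, assuming the setting meant here is
fetch.recurseSubmodules (the key is not named above), would be the
following in the superproject's .git/config:

```ini
[fetch]
	recurseSubmodules = false
```

With this set, a plain "git fetch" or "git pull" should no longer
recurse into submodules, and "git submodule update" can still be run
by hand afterwards.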

Another alternative I thought of: maybe "try to fetch every remote"
instead of fetching only a single remote?

> >
> >> >
> >> > Thanks,
> >> > Jake

Thread overview: 5+ messages
2022-10-04 17:43 issue with submodules using origin remote unexpectadly Jacob Keller
2022-10-04 18:12 ` Glen Choo
2022-10-11 21:15   ` Jacob Keller
2022-10-11 22:20     ` Glen Choo
2022-10-12  0:13       ` Jacob Keller [this message]
