git@vger.kernel.org mailing list mirror (one of many)
From: Philippe Blain <levraiphilippeblain@gmail.com>
To: Bryce Glover <randomdsdevel@gmail.com>, git@vger.kernel.org
Cc: Emily Shaffer <emilyshaffer@google.com>
Subject: Re: Automatically Handling Using/Checking Out Branches With One or More Different Contained Submodules?
Date: Fri, 27 Aug 2021 18:48:21 -0400
Message-ID: <5a70d535-47b0-a4ea-b4e4-572a1bcfe997@gmail.com>
In-Reply-To: <CALH-JHvKjK7KU+Z_R7kG291DQKyb3f=LwxcbP4fn-qL2eeosBQ@mail.gmail.com>

Hi Bryce,

On 2021-08-24 at 08:00, Bryce Glover wrote:
> (Note:  If this question would fit better on the git-users Google
> Group, I apologize, but I saw that, unlike there — unless I overlooked
> something? —, you could send messages here even if you weren't a list
> subscriber.)
> 
> To whom it may concern,
> 
>       Currently, the only method I've seen that you can reliably use to
> switch between different branches when they don't all have the same
> contained submodules comes from the Stack Overflow answer at
> <https://stackoverflow.com/a/64690495/3319611>.  I'll reproduce the
> Bash snippet it presents as a solution here for completeness's sake:
> 
> ```
> export TARGET_BRANCH="my-branch-name"
> export CURRENT_BRANCH=$(git rev-parse --abbrev-ref HEAD)
> if [ -f ".gitmodules" ]; then
>    git submodules deinit --all
>    mkdir -p .git/git-tools/modules
>    mv .git/modules .git/git-tools/modules/$CURRENT_BRANCH
> fi
> 
> git checkout $TARGET_BRANCH
> 
> if [ -f ".gitmodules" ]; then
>    if [ -f ".git/git-tools/modules/$TARGET_BRANCH" ]; then
>      git mv .git/git-tools/modules/$TARGET_BRANCH .git/modules
>    fi
> 
>    git submodule sync && git submodule update --init
> fi
> ```
> 
> This involves invoking some actions before '`git checkout`,' so I
> couldn't have a couple of Git hooks handle this since, per '`git help
> hooks`,' Git doesn't implement a 'pre-checkout' hook, only a
> post-checkout one.  That wouldn't be enough of a use case to motivate
> adding that, though, would it?  Alternatively, '`git checkout`'
> would, ideally, handle this automatically, perhaps when requested by
> flag if it wouldn't make sense for this behavior to be the default
> one.  I don't know if I'd personally be up to contributing either one
> or both of either of those approaches, at least not right away, but,
> hypothetically, how involved might that turn out to be?

The script above is, in my opinion, complicated and unneeded. I will copy below some content
from the original Stack Overflow post and reply inline. The bottom line is:

1. 'git checkout --recurse-submodules $ref' should always work in an ideal
world, but there are still some missing pieces as of today.
2. 'git checkout $ref && git submodule sync --recursive && git submodule update
--recursive' should always work.
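
Item 2 can be wrapped in a small shell function. This is only a sketch: the
function name 'checkout_with_submodules' is hypothetical, and '--init' is an
assumption added so that submodules that are not yet initialized also get
cloned.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): switch the superproject to a ref,
# then bring all submodules in line with what that ref records.
checkout_with_submodules() {
    # Switch the superproject first; stop here if the checkout fails.
    git checkout "$1" || return
    # Propagate the URLs from .gitmodules to the configuration.
    git submodule sync --recursive || return
    # Check out the recorded submodule commits.
    # Assumption: --init is added to also clone uninitialized submodules.
    git submodule update --init --recursive
}
```

Usage would be 'checkout_with_submodules submodule-test2' from the top level
of the superproject.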

Here goes:

> The problem I am facing is that I cannot figure out a set of git commands that
> will consistently work in all cases when switching between branches if the
> branch has submodules. In my test, I have one repo with 3 branches, one branch
> has no submodules (master), another module [sic, should be "branch"] has 1 submodule (submodule-test),
> and another branch has another submodule (submodule-test2) that points to a
> different repository (at the same path). 
[...]

> In example here is one set of commands and their failure in git 2.20.1
> 
> cd /tmp
> git clone git@github.com:simpleviewinc/git-tools-test.git ./checkout --recurse-submodules
> cd checkout
> git checkout submodule-test
> git submodule sync
> git submodule update
> # branch submodule-test fully checked out, all submodules downloaded, looking good!
> git submodule deinit --all
> git checkout submodule-test2
> git submodule sync
> git submodule update
> fatal: remote error: upload-pack: not our ref c1bba6e3969937125248ee46e308a8efec8ac654
> Fetched in submodule path 'submodule', but it did not contain c1bba6e3969937125248ee46e308a8efec8ac654. Direct fetching of that commit failed.
> It fails because it uses the wrong submodule remote, even though I thought that was the explicit purpose of submodule sync. 

It fails because 'git submodule sync' did *not* update the
'remote.origin.url' configuration in the Git config file of the submodule, i.e.
/tmp/checkout/.git/modules/submodule/config, despite (confusingly) outputting
"Synchronizing submodule url for 'submodule'".  It does change the
'submodule.$name.url' value in the config file *of the superproject*
(/tmp/checkout/.git/config), but 'git submodule update' only uses
'remote.origin.url' in the submodule's config file (if it exists).

The second 'git submodule sync' does not change 'remote.origin.url' because
the submodule is deinitialized when the command runs: the preceding command,
'git submodule deinit --all', deinitializes all submodules, and the step of
changing 'remote.origin.url' is skipped for deinitialized submodules (this
fact is missing from the doc).
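
To see which of the two settings 'git submodule sync' actually changed, both
can be printed side by side. A minimal sketch: the function name
'show_submodule_urls' is hypothetical, and it assumes the default layout in
which the submodule's Git directory lives under .git/modules/$name.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): print the URL the superproject
# records for a submodule, and the URL the submodule's own repository uses.
show_submodule_urls() {
    name=$1
    # Assumption: default layout, i.e. <superproject gitdir>/modules/<name>.
    sub_config=$(git rev-parse --git-path "modules/$name/config") || return
    printf 'superproject url: %s\n' \
        "$(git config --get "submodule.$name.url")"
    if [ -f "$sub_config" ]; then
        printf 'submodule remote url: %s\n' \
            "$(git config --file "$sub_config" --get remote.origin.url)"
    else
        printf 'submodule remote url: (submodule not cloned)\n'
    fi
}
```

Running 'show_submodule_urls submodule' from the top level of /tmp/checkout
after the failing 'git submodule sync' should show the two values disagreeing.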

The following sequence would be the correct one:

git clone git@github.com:simpleviewinc/git-tools-test.git ./checkout [--recurse-submodules] # --recurse-submodules is optional
cd checkout
git checkout submodule-test
git submodule update --init
git checkout submodule-test2
git submodule sync
git submodule update
# and we can switch back
git checkout submodule-test
git submodule sync
git submodule update

Note that the correct command to initialize all submodules is 'git submodule
init' or, to initialize, clone and check them out in a single step, 'git
submodule update --init'. The first 'git submodule sync' only appears to
initialize the submodule because 'git clone' was run with
'--recurse-submodules', which sets 'submodule.active' to the match-all
pathspec '.' in the superproject's config, and 'git submodule sync' recurses
into *active* submodules (this is also missing from the doc).
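
Whether a given clone has this setting can be checked directly. A small
sketch, with a hypothetical function name; it only prints the configured
pathspecs and does not reimplement Git's full "is this submodule active"
logic:

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): print the pathspecs stored in
# 'submodule.active', or a note when the setting is absent.
list_active_pathspecs() {
    git config --get-all submodule.active ||
        echo "(submodule.active is not set)"
}
```

In a repository cloned with '--recurse-submodules', this should print '.'.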

So for 'git checkout', what you want to achieve can be done with a
post-checkout hook:

```shell
#!/bin/sh

# If the checkout was a branch checkout [1], update the submodules
# to the commits recorded in the superproject
# [1] https://git-scm.com/docs/githooks#_post_checkout

previous_head=$1
new_head=$2
checkout_type_flag=$3

if [ "$checkout_type_flag" -eq 1 ] ; then
   git submodule sync --recursive
   git submodule update --init --recursive
fi
```

Here I add '--recursive' to both commands for extra safety in case any of your
submodules themselves contain submodules.
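
For Git to run a post-checkout hook, the script has to be an executable file
named 'post-checkout' in the repository's hooks directory. A sketch of an
installer follows; the function name and the source file name in the usage
line are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): copy a script into the current
# repository's hooks directory as 'post-checkout' and make it executable.
install_post_checkout_hook() {
    src=$1
    # Ask Git where the hooks directory of this repository is.
    hooks_dir=$(git rev-parse --git-path hooks) || return
    mkdir -p "$hooks_dir" || return
    cp "$src" "$hooks_dir/post-checkout" || return
    chmod +x "$hooks_dir/post-checkout"
}
```

Usage, from the top level of the superproject:
'install_post_checkout_hook ./update-submodules.sh'.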

Now, about 'git checkout --recurse-submodules':

> I tried --recurse-submodules but that fails too, but this time when checking
> from a branch without submodules to a branch with submodules.
> 
> cd /tmp
> git clone git@github.com:simpleviewinc/git-tools-test.git ./checkout --recurse-submodules
> cd checkout
> git checkout submodule-test --recurse-submodules
> fatal: not a git repository: ../.git/modules/submodule
> fatal: could not reset submodule index

Yeah, that one is bad. It fails because:

1. 'git clone --recurse-submodules' in fact runs a simple 'git submodule update
--init --recursive' at the end of the process, at the step where it's checking
out the working tree. This means that only submodules present *in the branch
being checked out* get initialized and cloned.

2. 'git clone --recurse-submodules' *always* writes 'submodule.active=.' to the
superproject's config.

3. 'git checkout --recurse-submodules' recurses into active submodules, and for
that it needs access to the Git repository of the submodule, which does not
exist yet since it was not cloned.

This is the same error you get when you try to recursively check out an older
revision that contains a submodule that has since been deleted [1], [2].
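
Until that is improved, one can check beforehand whether the Git repository of
a submodule is present. This is only a sketch: the function name is
hypothetical, it assumes the default .git/modules/$name location, and it uses
the presence of a HEAD file as a rough indicator that the repository was
really cloned.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): succeed if the submodule's Git
# directory looks like a cloned repository.
submodule_repo_is_cloned() {
    name=$1
    # Assumption: default layout, i.e. <superproject gitdir>/modules/<name>.
    gitdir=$(git rev-parse --git-path "modules/$name") || return
    # A directory holding only a 'config' file does not count as a clone.
    [ -f "$gitdir/HEAD" ]
}
```

For example: 'submodule_repo_is_cloned submodule || echo "not cloned yet"'.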

I suggested in those threads a few ways all this could be improved; here is my
up-to-date take on the subject:

1. 'git clone --recurse-submodules' should be able to at least clone *all*
submodules for *all* branches that are cloned, and put their Git directory in
.git/modules/.  This would allow your use case to "just work" with 'git
checkout --recurse-submodules'

2. 'git clone --recurse-submodules' could also be taught to clone all
submodules for *all* revisions of *all* branches that are cloned. This would
allow the "deleted submodule" cases I mentioned to work, but would not be
wanted in all situations, so it could be a supplementary flag to 'git clone'.

3. 'git fetch' should be taught to clone new submodules.

4. 'git checkout --recurse-submodules' could be taught to clone missing
submodules and fetch missing submodule commits.  This would cover both your
use case as well as the "deleted submodule" use case.

5. In any case, 'git checkout --recurse-submodules' should not abort midway
when it can't find the Git repository of the submodule, leaving '.gitmodules'
untracked in the working tree and a 'config' file in .git/modules/$name/ with
only 'core.worktree' set.

Hope that helps,

Philippe Blain.

[1] https://lore.kernel.org/git/20200501005432.h62dnpkx7feb7rto@glandium.org/T/#u
[2] https://lore.kernel.org/git/CAE5ih78zCR0ZdHAjoxguUb3Y6KFkZcoxJjhS7rkbtZpr+d1n=g@mail.gmail.com/t/#u
