git@vger.kernel.org mailing list mirror (one of many)
From: Johan Herland <johan@herland.net>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: RFC: Making submodules "track" branches
Date: Tue, 08 Jun 2010 09:12:31 +0200
Message-ID: <201006080912.31448.johan@herland.net>
In-Reply-To: <AANLkTilBQPHgkCLJ7ppNo5TwC9Bdmqo-OMRpaDFwbQPd@mail.gmail.com>

On Tuesday 08 June 2010, Ævar Arnfjörð Bjarmason wrote:
> On Fri, May 21, 2010 at 16:10, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> > Add a $toplevel variable accessible to `git submodule foreach`, it
> > contains the absolute path of the top level directory (where
> > .gitmodules is).
> > 
> > This makes it possible to e.g. read data in .gitmodules from within
> > foreach commands. I'm using this to configure the branch names I want
> > to track for each submodule:
> > 
> >    git submodule foreach 'git checkout $(git config --file $toplevel/.gitmodules submodule.$name.branch) && git pull'
> > 
> > For a little history: This patch is born out of my continuing fight
> > of trying to have Git track the branches of submodules, not just their
> > commits.
> > 
> > Obviously that's not how they work (they only track commits), but I'm
> > just interested in being able to do:
> > 
> >    git submodule foreach 'git pull'
> > 
> > Of course that won't work because the submodule is in a disconnected
> > head, so I first have to connect it, but connect it *to what*.
> > 
> > For a while I was happy with this because as fate had it, it just so
> > happened to do what I meant:
> > 
> >    git submodule foreach 'git checkout $(git describe --all --always) && git pull'
> > 
> > But then that broke down, if there's a tag and a branch the tag will
> > win out, and I can't git pull a branch:
> > 
> >    $ git branch -a
> >    * master
> >      remotes/origin/HEAD -> origin/master
> >      remotes/origin/master
> >    $ git tag -l
> >    release-0.0.6
> >    $ git describe --always --all
> >    release-0.0.6
> > 
> > So I figured that I might as well start tracking the branches I want
> > in .gitmodules itself:
> > 
> >    [submodule "yaml-mode"]
> >        path = yaml-mode
> >        url = git://github.com/yoshiki/yaml-mode.git
> >        branch = master
> > 
> > So now I can just do (as stated above):
> > 
> >    git submodule foreach 'git checkout $(git config --file $toplevel/.gitmodules submodule.$name.branch) && git pull'
> > 
> > Maybe there's a less painful way to do *that* (I'd love to hear about
> > it). But regardless of that I think it's a good idea to be able to
> > know what the top-level is from git submodule foreach.
> 
> This patch is getting merged to next as per the June 2 What's cooking
> in Git post.
> 
> But I wonder how evil it would be to expand this idea to allow
> the porcelain to track branches instead of commits at the porcelain
> level.
> 
> That /could/ work like this. The tree format would be exactly the
> same, i.e. bound to a specific commit:
> 
>     $ git ls-tree HEAD | grep subthing
>     160000 commit 37469ca3fae264e790e4daac0fa8f2ddf8039c93  subthing
> 
> *But*, the user could add some new submodule.*.* config key/values
> that specify what branch the module should track and whether 'git
> pull' on the master project should also pull new changes (from the
> 'newstuff' branch) into the submodule:
> 
>     [submodule "subthing"]
>         path = subthing
>         url = git://github.com/avar/subthing.git
>         branch = newstuff
>         update-on-pull = true

I certainly like the idea, and so far this is the best way I've seen for
associating submodules with branches. I don't like the last "update-on-pull"
option, though. It should probably be a somewhat more general set of
options/triggers with a richer set of values than true/false. What about
something like this?

    [submodule "subthing"]
        path = subthing
        url = git://github.com/avar/subthing.git
        branch = newstuff
        on-pull = checkout,pull
        on-checkout = checkout
        on-commit = ignore (or commit?)
        ...

See below for more discussion...
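
To make the trigger idea concrete, here is a minimal sketch of how a porcelain
wrapper might interpret such keys. It assumes the hypothetical "on-pull" style
keys from the example above (Git does not act on them), and the function names
split_actions and run_trigger are invented for illustration only.

```shell
#!/bin/sh
# Sketch only: "on-pull" and friends are the hypothetical keys proposed
# above; Git itself does not act on them. The function names are made up.

# Split a comma-separated action list into one action per line.
split_actions () {
    printf '%s\n' "$1" | tr ',' '\n'
}

# Run the actions configured for one trigger inside one submodule.
# $1 = path to .gitmodules, $2 = submodule name,
# $3 = trigger key (e.g. on-pull), $4 = submodule work tree
run_trigger () {
    # A tracked branch is required; give up if none is configured.
    branch=$(git config --file "$1" --get "submodule.$2.branch") || return 1
    # A missing trigger key means "nothing to do" for this submodule.
    actions=$(git config --file "$1" --get "submodule.$2.$3") || return 0
    for action in $(split_actions "$actions")
    do
        case "$action" in
        checkout) (cd "$4" && git checkout "$branch") || return 1 ;;
        pull)     (cd "$4" && git pull) || return 1 ;;
        ignore)   ;;
        *)        echo "unknown action: $action" >&2; return 1 ;;
        esac
    done
}
```

With the example config, `run_trigger .gitmodules subthing on-pull subthing`
would check out 'newstuff' in the submodule and then pull. Stopping at the
first failing action is one possible policy, not the only sensible one.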

> Coupled with .gitignore this would allow for SVN-like externals that
> always track the latest version of upstream, but it'd all be done on
> the porcelain side.
> 
> The checked out copy wouldn't match the commit in the tree, but the
> user could still git add && git commit it to record the new commit in
> the master repository history.
> 
> The lack of this ability seems to be a fairly common complaint about
> submodules in Git, that you always have to do something in the parent
> project to update the submodules, even if you don't care about
> specific revisions, or the ability to roll back.
> 
> I couldn't find a prior discussion of this on the list, maybe this has
> been beaten to death already.

There are a lot of non-trivial challenges when you want to aggregate several 
submodule operations into a single "toplevel" command. Here are some off the 
top of my head:

- When submodule pulls result in conflicts, these must be presented to the 
user in a way that's simple and straightforward for the user to resolve.

- When switching branches in the superrepo, you sometimes also want to
switch branches in the submodule. This is signalled by changing the
submodule.subthing.branch variable in .gitmodules between the two branches.
However, it means that the submodule's update/pull operation must also be
done on 'checkout' in the superrepo.

- How to handle local/uncommitted (staged or unstaged) modifications in a 
submodule when pulling or switching branches in the superrepo? The right 
answer here is probably to do the same as in the no-submodule case, i.e. to 
refuse if it would clobber/conflict with the local modifications.

- When you track submodule branches instead of commits, the actual commit 
referenced in the superrepo is no longer as important (provided it's part of 
the ancestry of the submodule branch you're tracking). However, diff/status 
will still list the submodule as changed because you checked out a different 
commit from what Git has recorded. This raises two concerns: (1) What 
_should_ be considered "changed" from the diff/status perspective when 
tracking submodule branches? and (2) When do you update the submodule commit
reference recorded in the superrepo? "Never" would work (since you're checking
out a different commit anyway). "Always" would also work (for the same reason),
but would litter the superrepo history with submodule updates. There may be a
better alternative somewhere in between.

- If you want to give the illusion of "one big repo" then maybe it should 
also be possible to trigger submodule commits from a superrepo commit? (i.e. 
having a single toplevel "git commit" also trigger commits in submodules). 
Some users will want to specify the commit message for each submodule 
separately (IMHO the better approach), while some will want to give only one 
commit message that is reused in every submodule commit.

- As always with submodules, keep the case of nested submodules in mind.
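
Two of the checks above can be sketched with existing plumbing. This is only
an illustration under assumptions: the helper names safe_to_update and
recorded_commit_is_ancestor are invented here, and "refuse when there are
staged or unstaged changes to tracked files" is a stricter policy than
refusing only on actual conflicts.

```shell
#!/bin/sh
# Sketch only: both helper names are made up for illustration.

# Succeed only if the work tree at $1 has no staged or unstaged changes
# to tracked files (untracked files are not considered).
safe_to_update () {
    (cd "$1" && git diff --quiet && git diff --cached --quiet)
}

# Succeed if commit $2 is part of the ancestry of $3 in the repository
# at $1, e.g. the commit recorded in the superrepo vs. the tracked branch.
recorded_commit_is_ancestor () {
    (cd "$1" && git merge-base --is-ancestor "$2" "$3")
}

# Possible use from the superrepo, before touching a submodule:
#   if safe_to_update subthing
#   then (cd subthing && git pull)
#   else echo "subthing has local modifications; not updating" >&2
#   fi
```

The ancestry check relies on `git merge-base --is-ancestor`, which newer Git
versions provide; how diff/status should treat a submodule that passes it is
still the open question raised above.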

There are probably more issues that escape me now...

Thanks for resurrecting the discussion.


Have fun! :)

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net


Thread overview: 21+ messages
2010-06-07 23:29 RFC: Making submodules "track" branches Ævar Arnfjörð Bjarmason
2010-06-08  7:12 ` Johan Herland [this message]
2010-06-08 15:34   ` Marc Branchaud
2010-06-08 16:09     ` Ævar Arnfjörð Bjarmason
2010-06-08 19:32       ` Marc Branchaud
2010-06-08 20:23         ` Ævar Arnfjörð Bjarmason
2010-06-09 14:36           ` Marc Branchaud
2010-06-08 16:06   ` Jens Lehmann
2010-06-08 21:52     ` Johan Herland
2010-06-09  7:23       ` Jens Lehmann
2010-06-09  8:22         ` Johan Herland
2010-06-09 12:47           ` Steven Michalske
2010-06-09 14:37             ` Johan Herland
2010-06-08 23:09     ` Junio C Hamano
2010-06-08 23:19       ` Ævar Arnfjörð Bjarmason
2010-06-09  7:09         ` Jens Lehmann
2010-06-09  7:15       ` Jens Lehmann
2010-06-09 15:36         ` Marc Branchaud
2010-06-09 18:54           ` Ævar Arnfjörð Bjarmason
2012-11-20 11:16             ` nottrobin
2012-11-20 12:04               ` W. Trevor King
