git@vger.kernel.org mailing list mirror (one of many)
* GIT submodules
       [not found]       ` <s5h7ihhknez.wl%tiwai@suse.de>
@ 2008-02-07 21:24         ` Rene Herman
  0 siblings, 0 replies; 36+ messages in thread
From: Rene Herman @ 2008-02-07 21:24 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: ALSA development, git

On 07-02-08 12:37, Takashi Iwai wrote:

(sorry, that's not git-devel@, but simply git@)

>> I believe the git submodule stuff would also nicely allow all of ALSA to be 
>> one giant repo basically, with kernel, lib, ..., as submodules.
> 
> Just out of curiosity, what could be a merit of submodules in the case
> of ALSA?

Given that they're used for larger projects, I can't say I've used them but
I read about them when Linus mentioned them in the context of KDE maybe
switching:

http://lwn.net/Articles/246381/

Basically, submodules are the actual git repositories with one organizing
superproject. This seems to be a fairly nice description of the submodule
support:

http://www.ishlif.org/blog/linux/git-submodules/

What they provide is stitching the parts together nicely into one coherent
release. In this case, you'd have alsa-driver, alsa-lib, alsa-utils and so
on repos, and an "alsa-project" superproject tying them together, where you
could do checkouts of a complete coherent release of all the modules for
example.
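
A minimal sketch of that layout, assuming two stand-in component
repositories created locally (the names alsa-driver, alsa-lib and
alsa-project are only illustrative here, and nothing touches the real ALSA
repositories). The protocol.file.allow setting is only there because recent
git versions refuse to clone submodules from local paths by default:

```shell
#!/bin/bash
# Sketch: an "alsa-project" superproject tying component repos together.
# All repository names and paths are illustrative assumptions.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Two stand-in component repositories, each with one commit.
for part in alsa-driver alsa-lib; do
	git init -q "$part"
	echo "$part sources" > "$part/README"
	git -C "$part" add README
	git -C "$part" commit -q -m "initial $part commit"
done

# The superproject records one commit of each component as a submodule.
git init -q alsa-project
cd alsa-project
for part in alsa-driver alsa-lib; do
	# Only needed because the components are local paths in this demo.
	git -c protocol.file.allow=always submodule --quiet add "$work/$part" "$part"
done
git commit -q -m "tie alsa-driver and alsa-lib together"

# Each submodule is listed with the exact commit the superproject pins.
git submodule status
```

Checking out a given superproject commit and running git submodule update
would then bring every component to the commit recorded for that release.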

As said, I haven't actually used them, so I've added the git list (*) to see
if anyone has something to add, correct or explain (please do!). Submodules
seem to be intended exactly for the kind of setup that ALSA is using with
the many semi-independent parts...

(*) git list: alsa-devel is moderated for non-subscribers but you'll be
whitelisted after landing in the queue once if you're not a subscriber

Rene.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* git submodules.
@ 2008-04-28 19:50 Victor Bogado da Silva Lins
  2008-04-28 21:01 ` Miklos Vajna
  0 siblings, 1 reply; 36+ messages in thread
From: Victor Bogado da Silva Lins @ 2008-04-28 19:50 UTC (permalink / raw)
  To: git

Is there any documentation about how those work?

What I need is this: I have an already existing git repository that has
a subdir that could be seen as a submodule (by this I mean that it is
related, but could have a different commit tree). The git repository
already exists and has many commits that apply to either the submodule
or the main module (I would say that there is no commit that touches
both). So is it possible to separate them easily? Would it keep my older
commits?

Below is a shell script that reproduces my setup:

=========================== cut here =================================
#!/bin/bash

gitdir=git_submodules_dir

if [[ -d $gitdir ]]; then 
	rm -rf $gitdir;
fi

mkdir $gitdir

cd $gitdir

git init

echo "testing 1 2 3" > file_a.txt
git add file_a.txt
git commit -m "initial setup"

mkdir submodule

echo "submodule file" > submodule/file_b.txt
git add submodule/file_b.txt
git commit -m "submodule file"

echo "updated main file" >> file_a.txt
git commit -a -m "updated file in main module"

echo "updated submodule file" >> submodule/file_b.txt
git commit -a -m "updated file in the submodule"
=========================== cut here =================================


* Re: git submodules.
  2008-04-28 19:50 Victor Bogado da Silva Lins
@ 2008-04-28 21:01 ` Miklos Vajna
  0 siblings, 0 replies; 36+ messages in thread
From: Miklos Vajna @ 2008-04-28 21:01 UTC (permalink / raw)
  To: Victor Bogado da Silva Lins; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 920 bytes --]

On Mon, Apr 28, 2008 at 04:50:20PM -0300, Victor Bogado da Silva Lins <victor@bogado.net> wrote:
> Is there any documentation about how those work?

Yes. There is a chapter in the user manual and there is the
git-submodule manpage.

> What I need is this, I have a already existing git repository, that have
> a subdir that could be seen as submodule (by this I mean that he is
> related, but could have a different commit tree). The git repository
> already exists and has many commits that apply to either the submodule
> or the main module (I would say that there is no commit that touch
> both). So is it possible to separate them easily?

Yes, see git-filter-branch.

> Would it keep my older commits? 

Not really, git-filter-branch will rewrite history, so the commit hashes
will change. Of course no code will be lost, but after such a migration,
people will not be able to easily just pull.
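
As a rough sketch of what that could look like on the sample setup from the
question (the directory names "combined" and "split" are made up for the
demo, and the split is done in a clone so the original repository is left
alone):

```shell
#!/bin/bash
# Sketch: split the "submodule" directory of the sample repository into
# its own history with git filter-branch --subdirectory-filter.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Newer git versions print a long warning before filter-branch runs.
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
cd "$work"

# Rebuild the four-commit layout from the original question.
git init -q combined
cd combined
echo "testing 1 2 3" > file_a.txt
git add file_a.txt
git commit -q -m "initial setup"
mkdir submodule
echo "submodule file" > submodule/file_b.txt
git add submodule/file_b.txt
git commit -q -m "submodule file"
echo "updated main file" >> file_a.txt
git commit -q -a -m "updated file in main module"
echo "updated submodule file" >> submodule/file_b.txt
git commit -q -a -m "updated file in the submodule"
cd ..

# Work on a clone so the original history stays untouched.
git clone -q combined split
cd split
git filter-branch --subdirectory-filter submodule HEAD > /dev/null

# Only the commits that touched submodule/ remain, with new hashes.
git log --format=%s
```

The rewritten clone could then be published as its own repository and added
back to the main project with git submodule add.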

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]


* git submodules
@ 2008-07-28 16:20 Pierre Habouzit
  2008-07-28 16:23 ` Pierre Habouzit
  2008-07-28 20:23 ` Nigel Magnay
  0 siblings, 2 replies; 36+ messages in thread
From: Pierre Habouzit @ 2008-07-28 16:20 UTC (permalink / raw)
  To: Git ML

[-- Attachment #1: Type: text/plain, Size: 2813 bytes --]


While trying to sum up some things I'd like submodules to do, and things
like that, I came to ask myself why the heck we were doing things the
way we currently do wrt submodules.

This question is related to the `.git` directories of submodules. I
wonder why we didn't choose to use a new reference namespace
(refs/submodules/$path/$remote/$branch).

This would have the net benefit that most of the plumbing tasks would be
easier if they have to deal with submodules, because they aren't in this
uncomfortable situation where they have to recurse into another git
directory to know what to do.

It also has the absolutely nice property to share objects, so that
projects that replaced a subdirectory with a submodule don't see their
checkouts grow too large.

We probably still want submodules to act like plain independent git
repositories, but one can still *fake* that this way: submodules have
only a .git/config file (also probably an index and a couple of things
like that, but that's almost a different issue for what I'm considering
now) that has the setting:

    [core]
        submodule = true

This could make all the builtins look for the real $GIT_DIR up, which in
turn gives the submodule "name". Then, for this submodule, every
reference, remote name, ... would be virtualized using the
"remote/$submodule_name" prefix. IOW, in a submodule "some/sub/module"
the branch "origin/my/topic/branch" is under:
  refs/submodules/some/sub/module/origin/my/topic/branch
  <-- submod. --><-- submod.  --><-- --><--  branch  -->
     namespace       path/name   remote
Note that this doesn't mean that we must rip out .gitmodules, because
it's needed to help splitting the previous reference name properly, and
for bootstrapping purposes.


Having that, one can probably extend most of the porcelains in _very_
straightforward ways. For example, a local topic branch `topic` would be
the union of the supermodule `topic` branch, and all the
`refs/submodules/$names/topic` ones.

Most importantly, it would help implementing something that tries to make
your submodules stay _on branch_. One irritating problem with submodules is
that when someone else has committed and you run git submodule update,
you're on a detached head. Absolutely horrible. If you see your current
branch (assume it's master), then when you do that, you would update
your `refs/submodules/$name/master` references instead and keep the
submodule HEADs `on branch`. Of course we can _probably_ hack something
together along those lines with the current setup, but it would be _so_
much more convenient this way...
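
The detached head complaint can be reproduced with a small throwaway setup.
This is only a sketch of the behaviour of the existing tooling, not of the
proposed refs/submodules scheme; the repository names lib, super and
checkout are assumptions made for the demo:

```shell
#!/bin/bash
# Sketch: after "git submodule update --init" in a fresh clone, the
# submodule sits on a detached HEAD at the commit the superproject pins.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# A stand-in library and a superproject that pins it.
git init -q lib
echo "v1" > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m "lib: initial commit"

git init -q super
# protocol.file.allow is only needed because lib is a local path here.
git -C super -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git -C super commit -q -m "add lib as a submodule"

# A second developer clones the superproject and populates the submodule.
git clone -q super checkout
cd checkout
git -c protocol.file.allow=always submodule --quiet update --init

# symbolic-ref fails when HEAD does not point at a branch.
if git -C lib symbolic-ref -q HEAD > /dev/null; then
	echo "lib is on a branch"
else
	echo "lib is on a detached HEAD"
fi
```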

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]


* Re: git submodules
  2008-07-28 16:20 git submodules Pierre Habouzit
@ 2008-07-28 16:23 ` Pierre Habouzit
  2008-07-28 20:23 ` Nigel Magnay
  1 sibling, 0 replies; 36+ messages in thread
From: Pierre Habouzit @ 2008-07-28 16:23 UTC (permalink / raw)
  To: Git ML

[-- Attachment #1: Type: text/plain, Size: 749 bytes --]

On Mon, Jul 28, 2008 at 04:20:03PM +0000, Pierre Habouzit wrote:
> It also has the absolutely nice property to share objects, so that
> projects that replaced a subdirectory with a submodule don't see their
> checkouts grow too large.

  Especially it "fixes" git-new-workdir, which becomes really
inefficient (storage and typing-wise) when submodules are in use, since
it doesn't share git_dir's for the submodules (this could also be hacked
together, but again, it's so much more convenient if we have only _one_
git_dir per repository that ... oh well)



-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]


* Re: git submodules
  2008-07-28 16:20 git submodules Pierre Habouzit
  2008-07-28 16:23 ` Pierre Habouzit
@ 2008-07-28 20:23 ` Nigel Magnay
  2008-07-28 20:55   ` Pierre Habouzit
  1 sibling, 1 reply; 36+ messages in thread
From: Nigel Magnay @ 2008-07-28 20:23 UTC (permalink / raw)
  To: Pierre Habouzit, Git ML

>
> While trying to sum up some things I'd like submodules to do, and things
> like that, I came to ask myself why the heck we were doing things the
> way we currently do wrt submodules.
>
> This question is related to the `.git` directories of submodules. I
> wonder why we didn't chose to use a new reference namespace
> (refs/submodules/$path/$remote/$branch).
>
I'm maybe being a bit slow - what would be the contents of (say)
refs/submodules/moduleA/remotes/origin/master ? The ref
that's currently in moduleA/.git/refs/remotes/origin/master ?

> This would have the net benefit that most of the plumbing tasks would be
> easier if they have to deal with submodules, because they aren't in this
> uncomfortable situation where they have to recurse into another git
> directory to know what to do.
>
> It also has the absolutely nice property to share objects, so that
> projects that replaced a subdirectory with a submodule don't see their
> checkouts grow too large.
>

Ah.. are you meaning that the top-level repository contains all the
commits in all the submodules?

> We probably still want submodules to act like plain independant git
> repositories, but one can still *fake* that this way: submodules have
> only a .git/config file (also probably an index and a couple of things
> like that, but that's almost a different issue for what I'm considering
> now) that has the setting:
>
>    [core]
>        submodule = true
>
> This could make all the builtins look for the real $GIT_DIR up, which in
> turn gives the submodule "name". Then, for this submodule, every
> reference, remote name, ... would be virtualized using the
> "remote/$submodule_name" prefix. IOW, in a submodule "some/sub/module"
> the branch "origin/my/topic/branch" is under:
>  refs/submodules/some/sub/module/origin/my/topic/branch
>  <-- submod. --><-- submod.  --><-- --><--  branch  -->
>     namespace       path/name   remote
> Note that this doesn't mean that we must rip out .gitmodules, because
> it's needed to help splitting the previous reference name properly, and
> for bootstrapping purposes.
>

I was thinking a bit about submodules (because of the earlier
discussions about submodule update only pulling from origin, and the
associated difficulties) and started wondering if the best place for
the git repository for (say) submoduleA was really
<...>/submoduleA/.git/<> and not (say) something like
.git/submodules/submoduleA/<>. This would be nicer for people trying
to pull revisions from you because they could easily find submodule
repositories regardless of whether they currently exist in your
WC.

I got as far as looking at discussions around .gitlink but ran out of
available time.

>
> Having that, one can probably extend most of the porcelains in _very_
> straightforward ways. For example, a local topic branch `topic` would be
> the union of the supermodule `topic` branch, and all the
> `refs/submodules/$names/topic` ones.
>
> Most importantly, it would help implementing that tries to make your
> submodules stay _on branch_. One irritating problem with submodules, is
> that when someone else commited, and that you git submodule update,
> you're on a detached head. Absolutely horrible. If you see your current
> branch (assume it's master), then when you do that, you would update
> your `refs/submodules/$name/master` references instead and keep the
> submodule HEADs `on branch`. Of course we can _probably_ hack something
> together along those lines with the current setup, but it would be _so_
> much more convenient this way...
>

For me, if I'm on heads/blah in the superproject, I probably want to
be on heads/blah in *all* submodules. But that's maybe just me.


* Re: git submodules
  2008-07-28 20:23 ` Nigel Magnay
@ 2008-07-28 20:55   ` Pierre Habouzit
  2008-07-28 20:59     ` Pierre Habouzit
  0 siblings, 1 reply; 36+ messages in thread
From: Pierre Habouzit @ 2008-07-28 20:55 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Git ML

[-- Attachment #1: Type: text/plain, Size: 4157 bytes --]

On Mon, Jul 28, 2008 at 08:23:39PM +0000, Nigel Magnay wrote:
> >
> > While trying to sum up some things I'd like submodules to do, and things
> > like that, I came to ask myself why the heck we were doing things the
> > way we currently do wrt submodules.
> >
> > This question is related to the `.git` directories of submodules. I
> > wonder why we didn't chose to use a new reference namespace
> > (refs/submodules/$path/$remote/$branch).
> >
> I'm maybe being a bit slow - what would be the contents of (say)
> refs/submodules/moduleA/remotes/origin/master ? The ref
> that's currently in moduleA/.git/refs/remotes/origin/master ?

  Yes.

> > It also has the absolutely nice property to share objects, so that
> > projects that replaced a subdirectory with a submodule don't see their
> > checkouts grow too large.
> >
> 
> Ah.. are you meaning that the top-level repository contains all the
> commits in all the submodules?

  Yes. My suggestion is to share all the references, prefixing the
submodules ones with a distinct prefix (namely
refs/submodules/$name-of-the-submodule) to avoid any conflict, and share
the object store. You get coherent reflogs and stuff like that for free
on top.

> I was thinking a bit about submodules (because of the earlier
> discussions about submodule update only pulling from origin, and the
> associated difficulties) and started wondering if the best place for
> the git repository for (say) submoduleA was really
> <...>/submoduleA/.git/<> and not (say) something like
> .git/submodules/submoduleA/<>. This would be nicer for people trying
> to pull revisions from you because they could easily find submodule
> repositories regardless or not of whether they currently exist in your
> WC.

  That too indeed (the "easier to clone" bit). OTOH, I don't like the
.git/submodules idea a lot, if you mean to put a usual $GIT_DIR layout
inside of it. With what I propose, you find objects for all your
super/sub-modules in the usual store, which eases many things.
Especially, I believe that when you replace a subdirectory of a project
with a submodule, git-blame could benefit quite a lot from this to be
able to glue history back through the submodule limits, without having
to refactor a _lot_ of code: it would merely have to dereference
so-called "gitlinks" to the commit and then the tree, hence twice, and just
do its usual work. With your proposal, we would still have to recurse into
subdirectories, which requires more boilerplate code.

> I got as far as looking at discussions around .gitlink but ran out of
> avaiable time.

  I must say I never followed them, as I was uninterested in such
subjects before (but now I am, as I use them at work). But I don't recall
such an idea having been discussed at all, so...

> > Having that, one can probably extend most of the porcelains in _very_
> > straightforward ways. For example, a local topic branch `topic` would be
> > the union of the supermodule `topic` branch, and all the
> > `refs/submodules/$names/topic` ones.
> >
> > Most importantly, it would help implementing that tries to make your
> > submodules stay _on branch_. One irritating problem with submodules, is
> > that when someone else commited, and that you git submodule update,
> > you're on a detached head. Absolutely horrible. If you see your current
> > branch (assume it's master), then when you do that, you would update
> > your `refs/submodules/$name/master` references instead and keep the
> > submodule HEADs `on branch`. Of course we can _probably_ hack something
> > together along those lines with the current setup, but it would be _so_
> > much more convenient this way...
> >
> 
> For me, if I'm on heads/blah in the superproject, I probably want to
> be on heads/blah in *all* submodules. But that's maybe just me.

  Yes, that's what I tried to say, so if it wasn't clear, it's exactly
what I would like to do/have.

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]


* Re: git submodules
  2008-07-28 20:55   ` Pierre Habouzit
@ 2008-07-28 20:59     ` Pierre Habouzit
  2008-07-28 21:40       ` Avery Pennarun
  0 siblings, 1 reply; 36+ messages in thread
From: Pierre Habouzit @ 2008-07-28 20:59 UTC (permalink / raw)
  To: Nigel Magnay, Git ML

[-- Attachment #1: Type: text/plain, Size: 1286 bytes --]

On Mon, Jul 28, 2008 at 08:55:45PM +0000, Pierre Habouzit wrote:
> On Mon, Jul 28, 2008 at 08:23:39PM +0000, Nigel Magnay wrote:
>   That too indeed (the "easier to clone" bit). OTOH, I don't like the
> .git/submodules idea a lot, if you mean to put a usual $GIT_DIR layout
> inside of it. With what I propose, you find objects for all your
> super/sub-modules in the usual store, which eases many things.
> Especially, I believe that when you replace a subdirectory of a project
> with a submodule, git-blame could benefit quite a lot from this to be
> able to glue history back through the submodule limits, without having
> to refactor a _lot_ of code: it would merely have to dereference so
> called "gitlinks" to the commit then tree, hence twice, and just do its
> usual work, with your proposal, we still rely on having to recurse in
> subdirectories which requires more boilerplate code.

  And of _course_ this is also true for git-log, which is like 10x as
important for me (like I don't remember if I used git-blame this year,
whereas I used git-log in the last 10 minutes ;p)


-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]


* Re: git submodules
  2008-07-28 20:59     ` Pierre Habouzit
@ 2008-07-28 21:40       ` Avery Pennarun
  2008-07-28 22:03         ` Pierre Habouzit
  2008-07-29  5:51         ` Benjamin Collins
  0 siblings, 2 replies; 36+ messages in thread
From: Avery Pennarun @ 2008-07-28 21:40 UTC (permalink / raw)
  To: Pierre Habouzit, Nigel Magnay, Git ML

On 7/28/08, Pierre Habouzit <madcoder@debian.org> wrote:
> On Mon, Jul 28, 2008 at 08:55:45PM +0000, Pierre Habouzit wrote:
> >   That too indeed (the "easier to clone" bit). OTOH, I don't like the
>  > .git/submodules idea a lot, if you mean to put a usual $GIT_DIR layout
>  > inside of it. With what I propose, you find objects for all your
>  > super/sub-modules in the usual store, which eases many things.
>  > Especially, I believe that when you replace a subdirectory of a project
>  > with a submodule, git-blame could benefit quite a lot from this to be
>  > able to glue history back through the submodule limits, without having
>  > to refactor a _lot_ of code: it would merely have to dereference so
>  > called "gitlinks" to the commit then tree, hence twice, and just do its
>  > usual work, with your proposal, we still rely on having to recurse in
>  > subdirectories which requires more boilerplate code.
>
>   And of _course_ this is also true for git-log, which is like 10x as
>  important for me (like I don't remember if I used git-blame this year,
>  whereas I used git-log in the last 10 minutes ;p)

I don't think you're going to get away with *not* having a separate
.git directory for each submodule.  You'll just plain lose almost all
the features of submodules if you try to do that.

Most importantly in my case, my submodules (libraries shared between
apps) have a very different branching structure than my supermodules.
It wouldn't be particularly meaningful to force them to use the same
branch names.

Further, if you don't have a separate .git directory for each
submodule, you can't *switch* branches on the submodule independently
of the supermodule in any obvious way.  This is also useful; I might
want to test updating to the latest master of my submodule, see if it
still works with my supermodule, and if so, commit the new gitlink in
the supermodule.  This is a very common workflow for me.

On the other hand, your thought about combining the "git log" messages
is quite interesting.  That *is* something I'd benefit from, along
with being able to git-bisect across submodules.  If I'm in the
supermodule, I want to see *all* the commits that might have changed
in my application, not just the ones in the supermodule itself.  I
suspect this isn't simple at all to implement, however, as you'd have
to look inside the file tree of a given commit in order to find
whether any submodule links have changed in that commit.  It's
unfortunate that submodules involve a commit->tree->commit link
structure.
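
For readers who have not looked at that link structure, here is a small
sketch of how it shows up in the object model. The repository names lib and
super are assumptions made for the demo:

```shell
#!/bin/bash
# Sketch: a submodule is recorded as a tree entry of mode 160000 and type
# "commit" (a gitlink), giving the commit->tree->commit chain.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

git init -q lib
echo "v1" > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m "lib: initial commit"

git init -q super
cd super
# protocol.file.allow is only needed because lib is a local path here.
git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git commit -q -m "add lib as a submodule"

# The superproject commit points at a tree ...
git cat-file -p 'HEAD^{tree}'
# ... and the tree entry for lib is a gitlink naming a submodule commit.
git ls-tree HEAD lib

# The pinned commit is looked up in the submodule's own repository.
pinned=$(git rev-parse HEAD:lib)
git -C lib cat-file -t "$pinned"
```

A combined log would have to repeat that tree lookup for every superproject
commit it walks, which is the cost being described above.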

> One irritating problem with submodules, is
> that when someone else commited, and that you git submodule update,
> you're on a detached head. Absolutely horrible.

I think that roughly everyone agrees with the above statement by now.
It would also be trivial to fix it, if only we knew what "fix" means.
So far, I haven't seen any good suggestions for what branch name to
use automatically in a submodule, and believe me, I've been looking
for one :)

Have fun,

Avery


* Re: git submodules
  2008-07-28 21:40       ` Avery Pennarun
@ 2008-07-28 22:03         ` Pierre Habouzit
  2008-07-28 22:26           ` Jakub Narebski
  2008-07-28 22:32           ` Avery Pennarun
  2008-07-29  5:51         ` Benjamin Collins
  1 sibling, 2 replies; 36+ messages in thread
From: Pierre Habouzit @ 2008-07-28 22:03 UTC (permalink / raw)
  To: Avery Pennarun; +Cc: Nigel Magnay, Git ML

[-- Attachment #1: Type: text/plain, Size: 5928 bytes --]

On Mon, Jul 28, 2008 at 09:40:22PM +0000, Avery Pennarun wrote:
> On 7/28/08, Pierre Habouzit <madcoder@debian.org> wrote:
> > On Mon, Jul 28, 2008 at 08:55:45PM +0000, Pierre Habouzit wrote:
> > >   That too indeed (the "easier to clone" bit). OTOH, I don't like the
> >  > .git/submodules idea a lot, if you mean to put a usual $GIT_DIR layout
> >  > inside of it. With what I propose, you find objects for all your
> >  > super/sub-modules in the usual store, which eases many things.
> >  > Especially, I believe that when you replace a subdirectory of a project
> >  > with a submodule, git-blame could benefit quite a lot from this to be
> >  > able to glue history back through the submodule limits, without having
> >  > to refactor a _lot_ of code: it would merely have to dereference so
> >  > called "gitlinks" to the commit then tree, hence twice, and just do its
> >  > usual work, with your proposal, we still rely on having to recurse in
> >  > subdirectories which requires more boilerplate code.
> >
> >   And of _course_ this is also true for git-log, which is like 10x as
> >  important for me (like I don't remember if I used git-blame this year,
> >  whereas I used git-log in the last 10 minutes ;p)
> 
> I don't think you're going to get away with *not* having a separate
> .git directory for each submodule.  You'll just plain lose almost all
> the features of submodules if you try to do that.
> 
> Most importantly in my case, my submodules (libraries shared between
> apps) have a very different branching structure than my supermodules.
> It wouldn't be particularly meaningful to force them to use the same
> branch names.

Why not ? We're talking local branches, that can track whatever you like
on the remote side. Of course, the globbing refspecs are probably going to
be too simple for you if your branching scheme is _that_ different, but
if you can deal with that by hand _now_, I can't see why writing the
adequate tracking maps by hand would be any harder.

> Further, if you don't have a separate .git directory for each
> submodule, you can't *switch* branches on the submodule independently
> of the supermodule in any obvious way.

Yes you can, in what I propose you have a dummy .git in each submodule,
with probably an index, a HEAD and a config file (maybe some other
things along) to allow that especially.

> This is also useful; I might want to test updating to the latest
> master of my submodule, see if it still works with my supermodule, and
> if so, commit the new gitlink in the supermodule.  This is a very
> common workflow for me.

I agree.

> It's unfortunate that submodules involve a commit->tree->commit link
> structure.

Actually it's not a big problem, you just have to "dereference" twice
instead of once, and be prepared for the fact that the second dereference
may fail (because you are missing some objects). I instead believe that
gitlinks are a good idea.

> > One irritating problem with submodules, is
> > that when someone else commited, and that you git submodule update,
> > you're on a detached head. Absolutely horrible.
> 
> I think that roughly everyone agrees with the above statement by now.
> It would also be trivial to fix it, if only we knew what "fix" means.
> So far, I haven't seen any good suggestions for what branch name to
> use automatically in a submodule, and believe me, I've been looking
> for one :)

Well, using the same as the supermodule is probably the least confusing
way. Of course, not being in the "same" branch as the supermodule would
clearly be a case of your tree being "dirty", and it would prevent a
"git checkout" to work in the very same way that git checkout doesn't
work if you have locally modified files.

If your submodule branching layout uses the same names as the
supermodule branches then yes, it's going to hurt, but I believe it to
be unlikely (else you would become insane just trying to remember what
you are doing ;p). So even if, say, you work in `master` in your
supermodule, but for some reason use what is on the remote
`release/10.2` branch for a given submodule, nothing prevents you to
have a local `master` branch in that submodule as well that tracks
`release/10.2`. It's actually a quite sensible thing to do I believe.

It doesn't prevent you from switching your submodule to the `devel`
branch to perform some tests, or even make it the new state of the
submodule inside your supermodule. This operation would just then move
what is `master` in your submodule to track `devel` instead of
`release/10.2`[0].
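
With today's tools, a local `master` that tracks the remote `release/10.2`
can be set up by hand roughly as follows. This is a sketch with a made-up
"upstream" repository standing in for the submodule's remote; it shows plain
branch tracking, not the proposed supermodule integration:

```shell
#!/bin/bash
# Sketch: make the local "master" of a repository track the remote branch
# release/10.2 instead of the remote master.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Hypothetical upstream with a release branch and a newer master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
echo "released" > upstream/lib.txt
git -C upstream add lib.txt
git -C upstream commit -q -m "release 10.2"
git -C upstream branch release/10.2
echo "experimental" >> upstream/lib.txt
git -C upstream commit -q -a -m "experimental work on master"

git clone -q upstream sub
cd sub

# Step off master so it can be re-pointed, then recreate it on top of
# origin/release/10.2 with that branch as its upstream.
git checkout -q --detach
git branch -q -f --track master origin/release/10.2
git checkout -q master

git rev-parse --abbrev-ref 'master@{upstream}'
```

Moving the same local branch over to `devel` later would be another
git branch -f --track call, subject to the fast-forward check in [0].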

I fail to see which current submodules features you would lose with this
scheme. In fact, said differently, I more or less propose that when you
commit your supermodule state (providing some checks hold see [0]) it
updates the 'associated' branch in the submodule. The goal here, is that
you're always on a branch, set up to track a given remote branch, that
we can check against so that we can:

  (1) avoid the usual caveats of git-submodules (one can check when
      pushing the supermodule that all submodules are indeed pushed in
      the remote branch they track);

  (2) user can commit and don't bother resetting branches and doing
      tricks to be "on branch again".

I don't think it prevents you from e.g. having topic branches in your
submodules, provided that before committing a new submodule change, you
somehow merge those in the "matching" branch that was set up for you.



  [0] of course we probably want to refuse such a thing if `devel` isn't
      a fast-forward from release/10.2. But that's not the point of the
      explanation so I skipped this bit for clarity of my point.
-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org

[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]


* Re: git submodules
  2008-07-28 22:03         ` Pierre Habouzit
@ 2008-07-28 22:26           ` Jakub Narebski
  2008-07-28 22:41             ` Junio C Hamano
  2008-07-28 22:32           ` Avery Pennarun
  1 sibling, 1 reply; 36+ messages in thread
From: Jakub Narebski @ 2008-07-28 22:26 UTC (permalink / raw)
  To: Pierre Habouzit; +Cc: Avery Pennarun, Nigel Magnay, Git ML

Pierre Habouzit <madcoder@debian.org> writes:
> On Mon, Jul 28, 2008 at 09:40:22PM +0000, Avery Pennarun wrote:

> > Further, if you don't have a separate .git directory for each
> > submodule, you can't *switch* branches on the submodule independently
> > of the supermodule in any obvious way.
> 
> Yes you can, in what I propose you have a dummy .git in each submodule,
> with probably an index, a HEAD and a config file (maybe some other
> things along) to allow that especially.

What you are (re)inventing here is something called gitlink (.git which
is a file, or .gitlink file); not to be confused with 'submodule'/'commit'
entry in a tree which is sometimes called gitlink.  Alternate idea was
'unionfs' like "shadowing" .git, with 'core.gitdir' in .git/config
(which would contain .git/HEAD and .git/index, and all missing files
and config would be taken from `core.gitdir').

There was even some preliminary implementation IIRC, but AFAIR it
was abandoned because of no "real usage".

See
  http://permalink.gmane.org/gmane.comp.version-control.msysgit/1868
  http://permalink.gmane.org/gmane.comp.version-control.git/72449
  http://permalink.gmane.org/gmane.comp.version-control.git/72457
  http://permalink.gmane.org/gmane.comp.version-control.git/72296
-- 
Jakub Narebski
Poland
ShadeHawk on #git


* Re: git submodules
  2008-07-28 22:03         ` Pierre Habouzit
  2008-07-28 22:26           ` Jakub Narebski
@ 2008-07-28 22:32           ` Avery Pennarun
  2008-07-28 23:12             ` Pierre Habouzit
  1 sibling, 1 reply; 36+ messages in thread
From: Avery Pennarun @ 2008-07-28 22:32 UTC (permalink / raw)
  To: Pierre Habouzit, Avery Pennarun, Nigel Magnay, Git ML

On 7/28/08, Pierre Habouzit <madcoder@debian.org> wrote:
>  > It's unfortunate that submodules involve a commit->tree->commit link
>  > structure.
>
> Actually it's not a big problem, you just have to "dereference" twice
>  instead of one, and be prepared to the fact that the second dereference
>  may fail (because you miss some objects). I instead believe that
>  gitlinks are a good idea.

It's actually complicated to generate the log, however.  To be 100%
accurate in creating a combined log of the supermodule and submodule,
you'd have to check *for each supermodule commit* whether there were
any changes in gitlinks.  And gitlinks might move around between
revisions, so you can't just look up a particular path in each
revision; you have to traverse the entire tree.  And you can't just
look at the start and end supermodule commits to see if the gitlinks
changed; they might have changed and then changed back, which is quite
relevant to log messages.
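
A rough sketch of that last point, assuming a throwaway repository,
made-up SHA-1s and a hypothetical gitlink path "sub" (none of this comes
from an actual patch): the gitlink moves from A to B and back to A, so
comparing the endpoints shows nothing, while walking every supermodule
commit finds both changes.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

# Two made-up 40-digit object names; gitlink targets need not exist locally.
A=$(printf '%040d' 1)
B=$(printf '%040d' 2)

cd "$(mktemp -d)"
git init -q super
cd super

# Record the gitlink (mode 160000) at A, then B, then A again.
for sha in "$A" "$B" "$A"; do
    git update-index --add --cacheinfo 160000 "$sha" sub
    git commit -q -m "point sub at $sha"
done

first=$(git rev-list --max-parents=0 HEAD)

# Endpoint comparison: the gitlink looks untouched.
endpoint_changes=$(git diff-tree -r "$first" HEAD | grep -c '^:160000' || true)

# Per-commit walk: every change shows up, including the change back.
per_commit_changes=$(git rev-list HEAD | while read -r c; do
    git diff-tree -r --no-commit-id "$c"
done | grep -c ' 160000 ' || true)

echo "endpoints: $endpoint_changes, per commit: $per_commit_changes"
```

Here the path stays fixed; a gitlink that also moves around the tree would
need the full recursive diff at every commit, which is where the cost
comes from.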

Probably it's more useful to just commit the git-shortlog of the
submodule whenever you update the gitlink.  It won't work with bisect,
exactly, but that's less important than generally having an idea of
what happened by reading the log.  ISTR someone submitted a
git-submodule patch for that already somewhere, but I've been known to
imagine things.
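
Done by hand, that could look roughly like the sketch below.  It assumes
two throwaway sibling repositories named "sub" and "super" and records the
gitlink directly with update-index instead of going through git-submodule,
so it only illustrates the idea and is not the patch mentioned above.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

cd "$(mktemp -d)"

# A stand-in submodule repository with a little history.
git init -q sub
git -C sub commit -q --allow-empty -m "sub: initial import"
old=$(git -C sub rev-parse HEAD)
git -C sub commit -q --allow-empty -m "sub: fix crash on empty input"
git -C sub commit -q --allow-empty -m "sub: add frobnicate option"
new=$(git -C sub rev-parse HEAD)

# A stand-in supermodule whose gitlink "sub" starts at $old.
git init -q super
git -C super update-index --add --cacheinfo 160000 "$old" sub
git -C super commit -q -m "add sub"

# Move the gitlink and paste the submodule shortlog into the message.
git -C super update-index --add --cacheinfo 160000 "$new" sub
git -C super commit -q -m "update sub

$(git -C sub shortlog "$old..$new")"

message=$(git -C super log -1 --format=%B)
echo "$message"
```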

> Well, using the same [branch] as the supermodule is probably the less confusing
>  way. Of course, not being in the "same" branch as the supermodule would
>  clearly be a case of your tree being "dirty", and it would prevent a
>  "git checkout" to work in the very same way that git checkout doesn't
>  work if you have locally modified files.
>
>  If your submodule branching layout uses the same names as the
>  supermodule branches then yes, it's going to hurt, but I believe it to
>  be unlikely (else you would become insane just trying to remember what
>  you are doing ;p).

I think this is much more common than you think.  An easy example is
that I'm developing a new version of my application in the
supermodule's "master", but it relies on a released version of my
submodule, definitely not the experimental "master" version.  Using
your logic, the local branch of the submodule would be called master,
but wouldn't correspond at all to the remote submodule's master.

I believe such a situation would be even worse than no branch at all.
It could lead to people pushing/pulling all sorts of bad things from
the wrong places.  At least right now, people become confused and ask
for help instead of becoming confused and making a mess.

Have fun,

Avery


* Re: git submodules
  2008-07-28 22:26           ` Jakub Narebski
@ 2008-07-28 22:41             ` Junio C Hamano
  2008-08-17 20:13               ` Pierre Habouzit
  0 siblings, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2008-07-28 22:41 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Pierre Habouzit, Avery Pennarun, Nigel Magnay, Git ML

Jakub Narebski <jnareb@gmail.com> writes:

> Pierre Habouzit <madcoder@debian.org> writes:
>> On Mon, Jul 28, 2008 at 09:40:22PM +0000, Avery Pennarun wrote:
>
>> > Further, if you don't have a separate .git directory for each
>> > submodule, you can't *switch* branches on the submodule independently
>> > of the supermodule in any obvious way.
>> 
>> Yes you can, in what I propose you have a dummy .git in each submodule,
>> with probably an index, a HEAD and a config file (maybe some other
>> things along) to allow that especially.
>
> What you are (re)inventing here is something called gitlink (.git which
> is a file, or .gitlink file); not to be confused with 'sumbodule'/'commit'
> entry in a tree which is sometimes called gitlink....
> ...
> There was even some preliminary implementation IIRC, but AFAIR it
> was abandoned because of no "real usage".

I am afraid you are confused.  I think you are talking about "gitfile",
not "gitlink".

It is not abandoned; see e.g. read_gitfile_gently() in setup.c.

I suspect the use of it may help the use case Pierre proposes, but its
main attractiveness as I understood it back when we discussed the facility
was that you could switch branches between 'maint' that did not have a
submodule at "path" back then, and 'master' that does have one now,
without losing the submodule repository.  When checking out 'master' (and
that would probably mean you would update the 'git-submodule init' and
'git-submodule update' implementation), you would instantiate subdirectory
"path", and create "path/.git" as a regular file that points at
somewhere inside the $GIT_DIR of the superproject (say
".git/submodules/foo").  Because the refs and object store are all stored
safely away in the superproject $GIT_DIR, you can now safely switch back to
'maint', which would involve
making sure there is no local change that will be lost and then removing
the "path" and everything underneath it.
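
A minimal sketch of that layout, assuming a throwaway superproject and the
".git/submodules/foo" location used above purely as an example; the
"gitdir: <path>" line is the content such a regular ".git" file carries.

```shell
set -e
cd "$(mktemp -d)"
git init -q super
cd super

# Start with an ordinary nested repository at "path" ...
git init -q path

# ... then move its repository data inside the superproject's $GIT_DIR ...
mkdir -p .git/submodules
mv path/.git .git/submodules/foo

# ... and leave a regular file behind that points at it.
echo "gitdir: ../.git/submodules/foo" > path/.git

# Git run from inside "path" follows the file to the real repository.
resolved=$(git -C path rev-parse --git-dir)
echo "$resolved"
```

With the refs and objects living under the superproject, removing "path"
from the working tree loses nothing but the one-line file.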


* Re: git submodules
  2008-07-28 22:32           ` Avery Pennarun
@ 2008-07-28 23:12             ` Pierre Habouzit
  0 siblings, 0 replies; 36+ messages in thread
From: Pierre Habouzit @ 2008-07-28 23:12 UTC (permalink / raw)
  To: Avery Pennarun; +Cc: Nigel Magnay, Git ML

On Mon, Jul 28, 2008 at 10:32:54PM +0000, Avery Pennarun wrote:
> On 7/28/08, Pierre Habouzit <madcoder@debian.org> wrote:
> >  > It's unfortunate that submodules involve a commit->tree->commit link
> >  > structure.
> >
> > Actually it's not a big problem, you just have to "dereference" twice
> >  instead of one, and be prepared to the fact that the second dereference
> >  may fail (because you miss some objects). I instead believe that
> >  gitlinks are a good idea.
> 
> It's actually complicated to generate the log, however.  To be 100%
> accurate in creating a combined log of the supermodule and submodule,
> you'd have to check *for each supermodule commit* whether there were
> any changes in gitlinks.  And gitlinks might move around between
> revisions, so you can't just look up a particular path in each
> revision; you have to traverse the entire tree.  And you can't just
> look at the start and end supermodule commits to see if the gitlinks
> changed; they might have changed and then changed back, which is quite
> relevant to log messages.

I'm pretty clueless about how git-log works, but I fail to see how this
is harder than following file moves e.g. Of course it's more expensive
than git log, but it shouldn't really be more expensive than
`git log -M -C -C` already is.

> > Well, using the same [branch] as the supermodule is probably the less confusing
> >  way. Of course, not being in the "same" branch as the supermodule would
> >  clearly be a case of your tree being "dirty", and it would prevent a
> >  "git checkout" to work in the very same way that git checkout doesn't
> >  work if you have locally modified files.
> >
> >  If your submodule branching layout uses the same names as the
> >  supermodule branches then yes, it's going to hurt, but I believe it to
> >  be unlikely (else you would become insane just trying to remember what
> >  you are doing ;p).
> 
> I think this is much more common than you think.  An easy example is
> that I'm developing a new version of my application in the
> supermodule's "master", but it relies on a released version of my
> submodule, definitely not the experimental "master" version.  Using
> your logic, the local branch of the submodule would be called master,
> but wouldn't correspond at all to the remote submodule's master.

Probably indeed, otoh the "remote" (assume it's origin) master state is
stored in "origin/master", not "master".

> I believe such a situation would be even worse than no branch at all.
> It could lead to people pushing/pulling all sorts of bad things from
> the wrong places.  At least right now, people become confused and ask
> for help instead of becoming confused and making a mess.

Indeed. But that's only a name issue, I'm sure we can come up with
something decent. What I (we ?) want is actually a way to make
git-checkout/git-reset work so that when you switch branches (or reset
--hard to a previous state) you remain on a branch, because human brains
usually don't remember the sha1s of those silly detached HEAD commits well ;)
The problem is, `the branch I'm on in my submodule when the supermodule
is on $foo` is quite local information. But that's really what we
would like to remember so that when you git checkout somewhere and then git
checkout master again, the submodules switch branches accordingly.

Saying that, I realize that we probably _really_ want to name submodule
branches the same as the supermodule ones, but should manage to find UIs
that don't confuse users wrt the fact that it may be disconnected from
the remote branch naming.

I reckon that for my use, I would not have those problems, because we
have this kind of layout:

lib-foo/ <-- submodule to share the foo library.
lib-bar/ <-- submodule to share the bar library.
app-frotz/ <-- the frotz product

In another repository we have app-quux and the same submodules, and so
on.

We always try that our `master`s (where devel happens unlike git.git ;p)
use the most recent `master` versions from the submodules. And the
stable branches (IOW the software that we sold and released and for
which we provide support), have branches named $product/$version. We
have $product/$version branches in those submodules as well. If we have
a bug that needs a patch in say, lib-foo, then we push the patch into a
topic branch, that we merge in all the $product/$version that need it,
and into master as well. For such a setup that I believe to be sane,
then well, corresponding names fit the job perfectly.

Of course if one of your submodules is git.git where the not too unstable
code lives in `next` and not `master` and another one is one of my
projects at work, where not too unstable code lives in `master` then
indeed you're somehow screwed because indeed the whole `master` concept
would be quite confusing. But honestly, I don't think it's less
confusing with the current way submodules work either.


-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org


* Re: git submodules
  2008-07-28 21:40       ` Avery Pennarun
  2008-07-28 22:03         ` Pierre Habouzit
@ 2008-07-29  5:51         ` Benjamin Collins
  2008-07-29  6:04           ` Shawn O. Pearce
  2008-07-29  8:21           ` Pierre Habouzit
  1 sibling, 2 replies; 36+ messages in thread
From: Benjamin Collins @ 2008-07-29  5:51 UTC (permalink / raw)
  To: Avery Pennarun; +Cc: Pierre Habouzit, Nigel Magnay, Git ML

On Mon, Jul 28, 2008 at 4:40 PM, Avery Pennarun <apenwarr@gmail.com> wrote:
> Most importantly in my case, my submodules (libraries shared between
> apps) have a very different branching structure than my supermodules.
> It wouldn't be particularly meaningful to force them to use the same
> branch names.
>
> Further, if you don't have a separate .git directory for each
> submodule, you can't *switch* branches on the submodule independently
> of the supermodule in any obvious way.  This is also useful; I might
> want to test updating to the latest master of my submodule, see if it
> still works with my supermodule, and if so, commit the new gitlink in
> the supermodule.  This is a very common workflow for me.

I second this sentiment.  I happen to very much *like* the fact that
the coupling between submodules and their super-projects is minimal.
The flexibility this allows is very useful.  Of course, it brings to
mind the comment Stroustrup once made about C++ blowing off your whole
leg.

> On the other hand, your thought about combining the "git log" messages
> is quite interesting.  That *is* something I'd benefit from, along
> with being able to git-bisect across submodules.  If I'm in the
> supermodule, I want to see *all* the commits that might have changed
> in my application, not just the ones in the supermodule itself.  I
> suspect this isn't simple at all to implement, however, as you'd have
> to look inside the file tree of a given commit in order to find
> whether any submodule links have changed in that commit.  It's
> unfortunate that submodules involve a commit->tree->commit link
> structure.

Let my contrariness begin...
I can see how someone might find such a feature in "git log" useful,
but I don't think I would.  I have 3 submodules in my project right
now, and I don't always want to see the changes.  Most of the time, I
don't care, actually.  When I do care, I can search the output of "git
log" for commits that touch the path where my submodule lives (through
Gitk, usually), and I can open another Gitk for details.

As for "git bisect": I haven't done this and I'm too busy to try to
contrive something for the purposes of this email, but wouldn't it
basically already do what you want?  Seems that you'd just run "git
submodule update" after each step of the bisect.

>
> > One irritating problem with submodules, is
> > that when someone else commited, and that you git submodule update,
> > you're on a detached head. Absolutely horrible.
>
> I think that roughly everyone agrees with the above statement by now.
> It would also be trivial to fix it, if only we knew what "fix" means.
> So far, I haven't seen any good suggestions for what branch name to
> use automatically in a submodule, and believe me, I've been looking
> for one :)

I disagree with this completely.  I think the detached head is
actually fantastic because it tells you all the right things:
a) the branch your submodule is on is ultimately irrelevant
b) it reminds you that this is not your project.  It's part of your
project managed in a special way by Git, but your project is in ..
c) if you want to do work in this part of your project that comes from
somewhere else, you need to be thoughtful about how you manage its
branches.

I try to keep all my submodules on (no branch) as much as possible.
In a way, I feel like that kind of relieves me of the chore of keeping a
mapping of superproject branches to submodule branches in my head.

I pretty much support submodules as they are, with the exception of
wanting "git submodule update" to be executed automatically at times.


* Re: git submodules
  2008-07-29  5:51         ` Benjamin Collins
@ 2008-07-29  6:04           ` Shawn O. Pearce
  2008-07-29  8:18             ` Nigel Magnay
  2008-07-29  8:21           ` Pierre Habouzit
  1 sibling, 1 reply; 36+ messages in thread
From: Shawn O. Pearce @ 2008-07-29  6:04 UTC (permalink / raw)
  To: Benjamin Collins; +Cc: Avery Pennarun, Pierre Habouzit, Nigel Magnay, Git ML

Benjamin Collins <aggieben@gmail.com> wrote:
> On Mon, Jul 28, 2008 at 4:40 PM, Avery Pennarun <apenwarr@gmail.com> wrote:
> >
> > > One irritating problem with submodules, is
> > > that when someone else commited, and that you git submodule update,
> > > you're on a detached head. Absolutely horrible.
> >
> > I think that roughly everyone agrees with the above statement by now.
> > It would also be trivial to fix it, if only we knew what "fix" means.
> > So far, I haven't seen any good suggestions for what branch name to
> > use automatically in a submodule, and believe me, I've been looking
> > for one :)
> 
> I disagree with this completely. I think the detached head is
> actually fantastic [...]

Ditto with Benjamin.  Detached head is a fantastic idea.

> [...] because it tells you all the right things:
> a) the branch your submodule is on is ultimately irrelevant
> b) it reminds you that this is not your project.  It's part of your
> project managed in a special way by Git, but your project is in ..
> c) if you want to do work in this part of your project that comes from
> somewhere else, you need to be thoughtful about how you manage its
> branches.
> 
> I try to keep all my submodules on (no branch) as much as possible.
> In a way, I feel like that kind of relieves me of the chore of keeping
> mapping superproject branches to submodule branches in my head.

At my former day-job we wrote our own "git submodule" in our
build system before gitlink was available in the core, let alone
git-submodule was a Porcelain command.

Many developers who were new to Git found having a sea of 11 Git
repositories+working directories in a single build area difficult to
manage.  They quickly found the detached HEAD feature in a submodule
to be a really handy way to know if they made changes there or not.

Most of our developers also modified __git_ps1() in their bash
completion to use `git name-rev HEAD` to try and pick up a remote
branch name when on a detached HEAD.  This slowed down their bash
prompts a little bit, but they found that "origin/foo" hint very
valuable to let them know they should start a new branch before
making changes.
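
A rough sketch of that kind of prompt helper, for illustration only.  The
function name describe_head is made up here and this is not the actual
__git_ps1() change; it simply falls back to `git name-rev` when HEAD is
detached.

```shell
set -e
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

# Hypothetical helper: print the branch name, or a name-rev hint in
# parentheses when HEAD is detached.
describe_head() {
    if branch=$(git symbolic-ref -q --short HEAD); then
        printf '%s\n' "$branch"
    else
        printf '(%s)\n' "$(git name-rev --name-only HEAD)"
    fi
}

cd "$(mktemp -d)"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/topic   # pick a known branch name
git commit -q --allow-empty -m "initial"

on_branch=$(describe_head)
git checkout -q --detach
detached=$(describe_head)
echo "$on_branch $detached"
```

In a submodule that was checked out from a remote-tracking ref, the hint
would name that ref instead of a local branch.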

So I'm just echoing what Benjamin said above, only we did it
independently, and came to the same conclusion.

-- 
Shawn.


* Re: git submodules
  2008-07-29  6:04           ` Shawn O. Pearce
@ 2008-07-29  8:18             ` Nigel Magnay
  2008-07-29  8:45               ` Pierre Habouzit
  0 siblings, 1 reply; 36+ messages in thread
From: Nigel Magnay @ 2008-07-29  8:18 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Benjamin Collins, Avery Pennarun, Pierre Habouzit, Git ML

>> I try to keep all my submodules on (no branch) as much as possible.
>> In a way, I feel like that kind of relieves me of the chore of keeping
>> mapping superproject branches to submodule branches in my head.
>
> At my former day-job we wrote our own "git submodule" in our
> build system before gitlink was available in the core, let alone
> git-submodule was a Porcelain command.
>
> Many developers who were new to Git found having a sea of 11 Git
> repositories+working directories in a single build area difficult to
> manage.  They quickly found the detached HEAD feature in a submodule
> to be a really handy way to know if they made changes there or not.
>
> Most of our developers also modified __git_ps1() in their bash
> completion to use `git name-rev HEAD` to try and pick up a remote
> branch name when on a detached HEAD.  This slowed down their bash
> prompts a little bit, but they found that "origin/foo" hint very
> valuable to let them know they should start a new branch before
> making changes.
>
> So I'm just echoing what Benjamin said above, only we did it
> independently, and came to the same conclusion.
>

Hm.
My developers are (mostly) on windows, so "altering PS1" or even
writing "shell scripts" is way beyond them. They want it to "just
work" (where their previous experience is SVN superprojects with
multiple svn:externals). I have a hard time justifying the experience
that if we're all working on master, then as soon as Joe Q developer
does 'submodule update' then poof - his heads are disconnected.

That said, I do also like the flexibility of having the superproject
on heads/foo and a submodule on heads/bar, as it allows you to
integration test divergent submodule branches (indeed our CI system
automatically picks them up and tries all possible combinations).


* Re: git submodules
  2008-07-29  5:51         ` Benjamin Collins
  2008-07-29  6:04           ` Shawn O. Pearce
@ 2008-07-29  8:21           ` Pierre Habouzit
  2008-07-29  8:37             ` Pierre Habouzit
  1 sibling, 1 reply; 36+ messages in thread
From: Pierre Habouzit @ 2008-07-29  8:21 UTC (permalink / raw)
  To: Benjamin Collins; +Cc: Avery Pennarun, Nigel Magnay, Git ML

On Tue, Jul 29, 2008 at 05:51:31AM +0000, Benjamin Collins wrote:
> I try to keep all my submodules on (no branch) as much as possible.
> In a way, I feel like that kind of relieves me of the chore of keeping
> mapping superproject branches to submodule branches in my head.

  Why would _you_ map them to superproject branches ? I mean it's pretty
much Git's matter. In fact, maybe calling them branches is not a
brilliant idea, what I would like would probably rather be some kind of
reflog like thing, but with one reflog per submodule and supermodule
branch.

  I agree with you that when you don't have to do any change in the
submodule, detached HEADs just work. But when you often have to push
fixes in, it's a nightmare. Instead of just having to:

  $EDITOR mysubmodule/file.c
  git commit mysubmodule/file.c # that would ideally do a commit in the
                                # submodule then update the submodule
                                # state in the supermodule

  git submodule mysubmodule push origin HEAD # push the submodule mysubmodule
                                             # changes to the appropriate branch
  git push origin HEAD

  You have to:

      cd submodule
      git branch -D master
      git checkout -b master
      git commit file.c
      cd ..
      git commit submodule
      cd submodule
      git push origin HEAD:remote/branch/we/want/to/push/to
      cd ..
      git push origin HEAD

      *phew*

  I'm sorry but this is nowhere near a good UI. Of course the detached
head *currently* prevents you from shooting yourself in the foot, because
submodules are _that_ dangerous. But they are also tedious to work
with, like a lot, which currently makes our answer to big projects "do
not have GB-big repositories, split them into submodules" a bad joke,
because their ergonomics are nowhere near what you have with a monolithic
repository yet.


  I'm trying to see what to do better. I believe we _need_ those things:

  * a way to name the successive states of the submodule, a branch looks
    like a good idea, but maybe we can "invent" some different idea so
    that it looks and tastes like a branch, but is more automagic in the
    sense that it's just a prettier name than a sha1.

    This would allow inspecting the submodule history using
    `gitk $this_name`. The $PS1 thing is nice, but you have to cd into
    the submodule to see where it currently lives. So you rather need
    something else.


  * a way to remember where you want to push changes you do in the
    submodule to. That's a bit like branch tracking, but not quite. This
    is required so that we can (and I strongly believe we want that in
    the end) make many porcelain commands act on the full
    (super+sub)modules in a unified way, somehow hiding the submodules
    boundaries.

    For example, git commit file1 file2 file3 ... would do the
    submodules commits if any, and then the supermodule one. Alternatively, if
    you have e.g.:

      $ git add mysubmodule/file1.c
      $ git add superfile.c
      $ git add mysubmodule     # tell the supermodule we want to commit what
                                # is in the submodule index at the same time
      $ git commit

    Then if you run:

      $ git push                # fails complaining that mysubmodule is
                                # not pushed

      $ git submodule mysubmodule push
      $ git push                # works


  * What you "track" must be a per supermodule branch thing, so that if
    you do things like that:

    # you are in master in the supermodule with non pushed commits in
    # the submodule

    <.. oh crap there is a bug in the supermodule that I need to fix in
        the production branch..>

    $ git checkout production # would checkout in the submodule what
                              # matches
    $ $EDITOR mysubmodule/something
    $ git commit !$
    $ ..push everything..

    <.. okay let's now go back to master ..>

    $ git checkout master

    <... hack hack hack to finish the current WIP ...>
    <... okay we're ready to merge production in ...>

    $ git merge production    # will DWYM with the submodules, IOW merge the
                              # `production` state into the current `master` one.
    $ git sm mysubmodule push
    $ git push


    Try to write the same workflow with the current submodules, you'll
    end up with a script at least 3 times as long, because you would
    need to do everything by hand, including switching submodules,
    naming temporary branches in them so that you can work decently and
    perform the merges and so on.



-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org


* Re: git submodules
  2008-07-29  8:21           ` Pierre Habouzit
@ 2008-07-29  8:37             ` Pierre Habouzit
  2008-07-29  8:51               ` Petr Baudis
  0 siblings, 1 reply; 36+ messages in thread
From: Pierre Habouzit @ 2008-07-29  8:37 UTC (permalink / raw)
  To: Benjamin Collins, Avery Pennarun, Nigel Magnay, Git ML

On Tue, Jul 29, 2008 at 08:21:35 +0000, Pierre Habouzit wrote:
>   * a way to remember where you want to push changes you do in the
>     submodule to. That's a bit like branch tracking, but not quite. This
>     is required so that we can (and I strongly believe we want that in
>     the end) make many porcelain commands act on the full
>     (super+sub)modules in a unified way, somehow hiding the submodules
>     boundaries.
<snip>
> 
> 
>   * What you "track" must be a per supermodule branch thing, so that if
>     you do things like that:
<snip>

  In fact, nowhere did I use the name of the current submodule branch in my
examples, so maybe we don't really need it. What we need though, is a
way to tell where the submodules are pushed to, IOW what they (try to)
track remotely, IOW of which remote reference they should always be a
parent.

  Such information should probably be put in .gitmodules; this way,
you have the per-supermodule-branch setting I would like to see. And
then one would not care about the submodules being on a detached HEAD
because I believe these scenarios work well:

  * If you do no changes in the submodules, all just works like it does
    now.

  * If your only work in the submodule is to refresh its state to the
    tip of what it currently tracks, then well, we probably want a git
    submodule command for that, and nothing more needs to be done.

  * If you just want a simple fix to go in the submodule, work from your
    supermodule, as if there was no submodule. git-commit your changes
    (which with a submodule aware git-commit would be transparent), then
    you can push your work. And in the worst case scenario where you
    cannot push because it's not a fast forward, you would fetch, merge
    and push again.

    You don't really need a name for the submodule, even if you want to
    reset to the state before the merge because you screwed it, as
    basically, git-reset would _also_ be submodule aware and DWYM
    without an explicit reference for the submodule.

  * If you have heavy work to do in the submodule, then you probably will
    set up many submodule topic branches, work inside of it like you
    already do, and the extra step of reattaching the HEAD somewhere is
    not as bad as it is with (3) as it's a tiny overhead compared to all
    you're going to do with your topic branches.


So okay, let's scratch this "automatic reference" thing, I see its
limits now, so what about having a .gitmodules entry look like:

    [submodule "$path"]
	path = "$path"
	url = git://somewhere/
	tracks = master
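
As a sketch of how a tool could read such a setting: "tracks" is only the
key proposed here and git-submodule does nothing with it, and the name
"lib-foo" is made up for the example.  `git config -f` will read any key
from a file in this format.

```shell
set -e
cd "$(mktemp -d)"

# A .gitmodules file carrying the proposed (hypothetical) "tracks" key.
cat > .gitmodules <<'EOF'
[submodule "lib-foo"]
	path = lib-foo
	url = git://somewhere/
	tracks = master
EOF

# Read the proposed key back for one submodule.
tracked=$(git config -f .gitmodules --get submodule.lib-foo.tracks)
echo "$tracked"
```

Since .gitmodules is versioned with the supermodule, each supermodule
branch can carry a different value.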


-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org


* Re: git submodules
  2008-07-29  8:18             ` Nigel Magnay
@ 2008-07-29  8:45               ` Pierre Habouzit
  0 siblings, 0 replies; 36+ messages in thread
From: Pierre Habouzit @ 2008-07-29  8:45 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Shawn O. Pearce, Benjamin Collins, Avery Pennarun, Git ML

On Tue, Jul 29, 2008 at 08:18:12AM +0000, Nigel Magnay wrote:
> >> I try to keep all my submodules on (no branch) as much as possible.
> >> In a way, I feel like that kind of relieves me of the chore of keeping
> >> mapping superproject branches to submodule branches in my head.
> >
> > At my former day-job we wrote our own "git submodule" in our
> > build system before gitlink was available in the core, let alone
> > git-submodule was a Porcelain command.
> >
> > Many developers who were new to Git found having a sea of 11 Git
> > repositories+working directories in a single build area difficult to
> > manage.  They quickly found the detached HEAD feature in a submodule
> > to be a really handy way to know if they made changes there or not.
> >
> > Most of our developers also modified __git_ps1() in their bash
> > completion to use `git name-rev HEAD` to try and pick up a remote
> > branch name when on a detached HEAD.  This slowed down their bash
> > prompts a little bit, but they found that "origin/foo" hint very
> > valuable to let them know they should start a new branch before
> > making changes.
> >
> > So I'm just echoing what Benjamin said above, only we did it
> > independently, and came to the same conclusion.
> >
> 
> Hm.
> My developers are (mostly) on windows, so "altering PS1" or even
> writing "shell scripts" is way beyond them.

  More importantly, you don't have all your submodule states in your PS1
so this argument is already moot for *nix users as well.

> They want it to "just work" (where their previous experience is SVN
> superprojects with multiple svn:externals). I have a hard time
> justifying the experience that if we're all working on master, then as
> soon as Joe Q developer does 'submodule update' then poof - his heads
> are disconnected.

  Well, maybe it's not as hard, maybe what we lack are just submodule
aware porcelains (I mean we lack those for sure, but maybe it's also
the _only_ thing we miss to have a better user experience, and I begin
to believe it).

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org


* Re: git submodules
  2008-07-29  8:37             ` Pierre Habouzit
@ 2008-07-29  8:51               ` Petr Baudis
  2008-07-29 12:15                 ` Johannes Schindelin
  0 siblings, 1 reply; 36+ messages in thread
From: Petr Baudis @ 2008-07-29  8:51 UTC (permalink / raw)
  To: Pierre Habouzit, Benjamin Collins, Avery Pennarun, Nigel Magnay,
	Git ML

On Tue, Jul 29, 2008 at 10:37:55AM +0200, Pierre Habouzit wrote:
> So okay, let's scratch this "automatic reference" thing, I see its
> limits now, so what about having a .gitmodule entry look like:
> 
>     [submodule "$path"]

This is not a "$path" but an arbitrary string. Please keep that in mind.

> 	path = "$path"
> 	url = git://somewhere/
> 	tracks = master

I do like this (well, I'd just name it "branch" instead of "tracks").
I use submodules very "traditionally" just to bind external projects of a
certain version to my project, but I have already been thinking about
implementing this merely as a hint for others to know which branch
the other developers should follow when updating the submodule to a newer
version.
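
A small sketch of both points, with made-up values: the section name
("foo-library") differs from the path ("lib/foo"), and the hint is stored
under a "branch" key.  Nothing below acts on the hint; the script only
looks it up by path.

```shell
set -e
cd "$(mktemp -d)"

# The quoted section name is an arbitrary string, not the path.
cat > .gitmodules <<'EOF'
[submodule "foo-library"]
	path = lib/foo
	url = git://somewhere/
	branch = master
EOF

# Find the submodule name whose path is lib/foo, then read its branch hint.
key=$(git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
      awk '$2 == "lib/foo" { print $1 }')
name=${key#submodule.}
name=${name%.path}
branch=$(git config -f .gitmodules --get "submodule.$name.branch")
echo "$name $branch"
```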

-- 
				Petr "Pasky" Baudis
As in certain cults it is possible to kill a process if you know
its true name.  -- Ken Thompson and Dennis M. Ritchie


* Re: git submodules
  2008-07-29  8:51               ` Petr Baudis
@ 2008-07-29 12:15                 ` Johannes Schindelin
  2008-07-29 13:07                   ` Pierre Habouzit
  0 siblings, 1 reply; 36+ messages in thread
From: Johannes Schindelin @ 2008-07-29 12:15 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Pierre Habouzit, Benjamin Collins, Avery Pennarun, Nigel Magnay,
	Git ML

Hi,

On Tue, 29 Jul 2008, Petr Baudis wrote:

> On Tue, Jul 29, 2008 at 10:37:55AM +0200, Pierre Habouzit wrote:
> 
> > 	path = "$path"
> > 	url = git://somewhere/
> > 	tracks = master
> 
> I do like this (well, I'd just name it "branch" instead of "tracks"). I 
> use submodules very "traditionally" just to bind external projects of 
> certain version to my project, but I have been already thinking about 
> implementing this merely as a hint for others to know what branch should 
> the other developers follow when updating the submodule to a newer 
> version.

As long as you only use it in "submodule status" to say what the relation 
of the current revision is with respect to the "tracks" branch...

But then, how does the relation to the currently _committed_ state get 
displayed?

Ciao,
Dscho


* Re: git submodules
  2008-07-29 12:15                 ` Johannes Schindelin
@ 2008-07-29 13:07                   ` Pierre Habouzit
  2008-07-29 13:15                     ` Johannes Schindelin
  0 siblings, 1 reply; 36+ messages in thread
From: Pierre Habouzit @ 2008-07-29 13:07 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Petr Baudis, Benjamin Collins, Avery Pennarun, Nigel Magnay,
	Git ML

On Tue, Jul 29, 2008 at 12:15:05PM +0000, Johannes Schindelin wrote:
> On Tue, 29 Jul 2008, Petr Baudis wrote:
> > On Tue, Jul 29, 2008 at 10:37:55AM +0200, Pierre Habouzit wrote:
> > 
> > > 	path = "$path"
> > > 	url = git://somewhere/
> > > 	tracks = master
[...]
> But then, how does the relation to the currently _committed_ state get 
> displayed?

Hmm, _that's_ why you need a name for it. Or you need the submodule to be
aware that it is one, and then one would have some kind of "magic" word to
name this sha1. And tools would find out in the supermodule what it
translates into. I don't have any brilliant idea for a proposal
(COMMITTED_HEAD is clearly too long ;p, BASE is not very explicit, ...)
but someone will have one.

Ideally gitk would show some little tag for this state too.

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org


* Re: git submodules
  2008-07-29 13:07                   ` Pierre Habouzit
@ 2008-07-29 13:15                     ` Johannes Schindelin
  2008-07-29 13:19                       ` Pierre Habouzit
  2008-07-29 13:31                       ` Nigel Magnay
  0 siblings, 2 replies; 36+ messages in thread
From: Johannes Schindelin @ 2008-07-29 13:15 UTC (permalink / raw)
  To: Pierre Habouzit
  Cc: Petr Baudis, Benjamin Collins, Avery Pennarun, Nigel Magnay,
	Git ML

Hi,

On Tue, 29 Jul 2008, Pierre Habouzit wrote:

> On Tue, Jul 29, 2008 at 12:15:05PM +0000, Johannes Schindelin wrote:
> > On Tue, 29 Jul 2008, Petr Baudis wrote:
> > > On Tue, Jul 29, 2008 at 10:37:55AM +0200, Pierre Habouzit wrote:
> > > 
> > > > 	path = "$path"
> > > > 	url = git://somewhere/
> > > > 	tracks = master
> [...]
> > But then, how does the relation to the currently _committed_ state get 
> > displayed?
> 
> Hmm _that's_ why you need a name for it.

I do not understand.  We are talking about three different things here:

1) the committed state of the submodule
2) the local state of the submodule
3) the state of the "tracks" branch

We always have 1) and we have 2) _iff_ the submodule was checked out.  We 
will only have 3) if "tracks" is set in .git/config (for consistency's 
sake, we should not read that information directly from the .gitmodules 
file, but let the user override it in .git/config after "submodule init").
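
To make that lookup order concrete, here is a minimal Python sketch. It
is only an illustration: "tracks" is a key proposed in this thread, and
the plain dictionaries stand in for already-parsed .gitmodules and
.git/config contents. The function names are made up for the example.

```python
# Hypothetical sketch: "tracks" is only a proposed key, and these dicts
# stand in for the parsed .gitmodules and .git/config files.

def submodule_init(gitmodules, local_config):
    """Copy submodule settings into the local config.

    Values the user already set locally are kept, so a local override
    survives a later re-run of the init step.
    """
    for name, settings in gitmodules.items():
        target = local_config.setdefault(name, {})
        for key, value in settings.items():
            target.setdefault(key, value)
    return local_config


def tracked_branch(name, local_config):
    """Read the tracked branch from the local config only, never .gitmodules."""
    return local_config.get(name, {}).get("tracks")


gitmodules = {
    "lib": {"path": "lib", "url": "git://somewhere/", "tracks": "master"},
}
local_config = {"lib": {"tracks": "stable"}}  # user override made earlier

submodule_init(gitmodules, local_config)
print(tracked_branch("lib", local_config))
```

With these inputs the local "stable" wins over the "master" shipped in
.gitmodules, while "url" and "path" are still copied over.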

> Or you need the submodule to be aware he's one, and then one would have 
> some kind of "magic" word to name this sha1. And tools would find out in 
> the supermodule what it translates into.

You lost me there.

Ciao,
Dscho


* Re: git submodules
  2008-07-29 13:15                     ` Johannes Schindelin
@ 2008-07-29 13:19                       ` Pierre Habouzit
  2008-07-29 13:31                       ` Nigel Magnay
  1 sibling, 0 replies; 36+ messages in thread
From: Pierre Habouzit @ 2008-07-29 13:19 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Petr Baudis, Benjamin Collins, Avery Pennarun, Nigel Magnay,
	Git ML

On Tue, Jul 29, 2008 at 01:15:10PM +0000, Johannes Schindelin wrote:
> Hi,
> 
> On Tue, 29 Jul 2008, Pierre Habouzit wrote:
> 
> > On Tue, Jul 29, 2008 at 12:15:05PM +0000, Johannes Schindelin wrote:
> > > On Tue, 29 Jul 2008, Petr Baudis wrote:
> > > > On Tue, Jul 29, 2008 at 10:37:55AM +0200, Pierre Habouzit wrote:
> > > > 
> > > > > 	path = "$path"
> > > > > 	url = git://somewhere/
> > > > > 	tracks = master
> > [...]
> > > But then, how does the relation to the currently _committed_ state get 
> > > displayed?

> > Or you need the submodule to be aware he's one, and then one would have 
> > some kind of "magic" word to name this sha1. And tools would find out in 
> > the supermodule what it translates into.
> 
> You lost me there.

  Then I didn't understand your question.

-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org


* Re: git submodules
  2008-07-29 13:15                     ` Johannes Schindelin
  2008-07-29 13:19                       ` Pierre Habouzit
@ 2008-07-29 13:31                       ` Nigel Magnay
  2008-07-29 14:49                         ` Pierre Habouzit
  2008-07-29 14:53                         ` Junio C Hamano
  1 sibling, 2 replies; 36+ messages in thread
From: Nigel Magnay @ 2008-07-29 13:31 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Pierre Habouzit, Petr Baudis, Benjamin Collins, Avery Pennarun,
	Git ML

> I do not understand.  We are talking about three different things here:
>
> 1) the committed state of the submodule
> 2) the local state of the submodule
> 3) the state of the "tracks" branch
>
> We always have 1) and we have 2) _iff_ the submodule was checked out.  We
> only will have 3) if "tracks" is set in .git/config (for consistency's
> sake, we should not read that information directly from the .gitmodules
> file, but let the user override it in .git/config after "submodule init".
>

I think the implication is that .gitmodules states "I'm expecting that
submodule X will always be tracking branch name 'Y'" and that you
wouldn't ever override it in .git/config. If you then switched
submodule X to branch Z, then committed the superproject, that commit
would contain a change to .gitmodules also (to say "I'm expecting to
track Z rather than Y")?

>> Or you need the submodule to be aware he's one, and then one would have
>> some kind of "magic" word to name this sha1. And tools would find out in
>> the supermodule what it translates into.
>
> You lost me there.
>

This sounds like it relates to the problem that what I call X and Z,
you might call Bibble and Bobble; you could use some kind of SHA1 in
lieu of a textual name to make sure everyone was talking about the
same thing ?

> Ciao,
> Dscho
>
>


* Re: git submodules
  2008-07-29 13:31                       ` Nigel Magnay
@ 2008-07-29 14:49                         ` Pierre Habouzit
  2008-07-29 14:53                         ` Junio C Hamano
  1 sibling, 0 replies; 36+ messages in thread
From: Pierre Habouzit @ 2008-07-29 14:49 UTC (permalink / raw)
  To: Nigel Magnay
  Cc: Johannes Schindelin, Petr Baudis, Benjamin Collins,
	Avery Pennarun, Git ML

On Tue, Jul 29, 2008 at 01:31:23PM +0000, Nigel Magnay wrote:
> > I do not understand.  We are talking about three different things here:
> >
> > 1) the committed state of the submodule
> > 2) the local state of the submodule
> > 3) the state of the "tracks" branch
> >
> > We always have 1) and we have 2) _iff_ the submodule was checked out.  We
> > only will have 3) if "tracks" is set in .git/config (for consistency's
> > sake, we should not read that information directly from the .gitmodules
> > file, but let the user override it in .git/config after "submodule init".
> >
> 
> I think the implication is that .gitconfig states "I'm expecting that
> submodule X will always be tracking branch name 'Y'" and that you
> wouldn't ever override it in .git/config. If you then switched
> submodule X to branch Z, then committed the superproject, that commit
> would contain a change to .gitconfig also (to say I'm expecting to
> track Z rather than X') ?

  Yes, a tracks branch in .git/config doesn't fly. Or you need a
branch.$supermodule_branch.$submodule_name.tracks setting (oh god!)


* Re: git submodules
  2008-07-29 13:31                       ` Nigel Magnay
  2008-07-29 14:49                         ` Pierre Habouzit
@ 2008-07-29 14:53                         ` Junio C Hamano
  1 sibling, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2008-07-29 14:53 UTC (permalink / raw)
  To: Nigel Magnay
  Cc: Johannes Schindelin, Pierre Habouzit, Petr Baudis,
	Benjamin Collins, Avery Pennarun, Git ML

"Nigel Magnay" <nigel.magnay@gmail.com> writes:

>> I do not understand.  We are talking about three different things here:
>>
>> 1) the committed state of the submodule
>> 2) the local state of the submodule
>> 3) the state of the "tracks" branch
>>
>> We always have 1) and we have 2) _iff_ the submodule was checked out.  We
>> only will have 3) if "tracks" is set in .git/config (for consistency's
>> sake, we should not read that information directly from the .gitmodules
>> file, but let the user override it in .git/config after "submodule init".
>
> I think the implication is that .gitconfig states "I'm expecting that
> submodule X will always be tracking branch name 'Y'" and that you
> wouldn't ever override it in .git/config. If you then switched
> submodule X to branch Z, then committed the superproject, that commit
> would contain a change to .gitconfig also (to say I'm expecting to
> track Z rather than X') ?

You are right.  I think letting the user override with .git/config is a
good idea, but it shouldn't be ".gitmodules may say X or whatever, but I
want to use Y".

Instead, it should be more like "On branches where .gitmodules says X, I
want to use Y."

This comment actually applies to the existing override of .gitmodules item
with .git/config (I think I've been saying it since the design phase).
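
A rough Python sketch of the difference between the two kinds of
override follows. The dictionary layout and the function name are
assumptions made for the illustration; nothing like this exists in
git's configuration today.

```python
# Hypothetical sketch: the override is keyed by the value found in
# .gitmodules, not by the submodule alone.

def resolve(gitmodules_value, overrides):
    """Return the local replacement for a .gitmodules value, if any.

    "On branches where .gitmodules says X, I want to use Y" becomes
    overrides = {X: Y}. A branch whose .gitmodules says something else
    is left alone.
    """
    return overrides.get(gitmodules_value, gitmodules_value)


overrides = {"git://somewhere/": "git://mirror.example/"}

# Branch A: .gitmodules says X, so the local preference Y applies.
print(resolve("git://somewhere/", overrides))
# Branch B: .gitmodules says something else, so it is used unchanged.
print(resolve("git://elsewhere/", overrides))
```

An unconditional override would return Y for both branches, which is
the behaviour argued against above.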


* Re: git submodules
  2008-07-28 22:41             ` Junio C Hamano
@ 2008-08-17 20:13               ` Pierre Habouzit
  2008-08-17 22:54                 ` Avery Pennarun
  2008-08-17 23:08                 ` Junio C Hamano
  0 siblings, 2 replies; 36+ messages in thread
From: Pierre Habouzit @ 2008-08-17 20:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jakub Narebski, Avery Pennarun, Nigel Magnay, Git ML

On Mon, Jul 28, 2008 at 10:41:17PM +0000, Junio C Hamano wrote:
> I suspect the use of it may help the use case Pierre proposes, but its
> main attractiveness as I understood it back when we discussed the facility
> was that you could switch branches between 'maint' that did not have a
> submodule at "path" back then, and 'master' that does have one now,
> without losing the submodule repository.  When checking out 'master' (and
> that would probably mean you would update 'git-submodule init' and
> 'git-submodule update' implementation), you would instanciate subdirectory
> "path", create "path/.git" that is such a regular file that that points at
> somewhere inside the $GIT_DIR of superproject (say ".git/submodules/foo").
> By storing refs and object store are all safely away in the superproject
> $GIT_DIR, you can now safely switch back to 'maint', which would involve
> making sure there is no local change that will be lost and then removing
> the "path" and everything underneath it.

gitfiles looks nifty for sure, though I've thought about it a bit, and
I'm not sure if we don't want something a bit more powerful, though
still in the same vein.

If we look at submodules, I quite believe that we would benefit a lot
from sharing the object directory across the supermodule and all its
submodules, for the following reasons:

  * It could make things like git-blame better: at work, it's common for
    us to move files across submodules: we have a stable library shared
    across projects, and we move into it C modules that have lived for
    quite some time in the applications and are stable enough. It's a
    pity to lose history then, whereas git could really guess about the
    move if it sees through GITLINKS in the same object repository.
    GITLINKS are not very different from trees actually if you can look
    through them, it's just a matter of dereferencing twice instead of
    once.

  * For people that have made a subdirectory become a submodule (and
    it's also something that can happen) it's likely that lots of blobs
    are shared. It would end up taking less disk space.

  * It helps people fixing situations where they pushed a supermodule
    with a substate that never existed without seeing it. Since the
    object store is shared, this commit that actually never existed will
    never ever be pruned, and at _least_ one person on earth will never
    lose it. With detached heads everywhere it's very easy to not name a
    detached head, and have it pruned at some point.

  * I _believe_ (just a hunch) that it helps knowing if it's possible to
    perform a "recursive" (wrt submodules) checkout/reset/$whatever,
    without having to spawn subcommands and quite unpleasant similar
    stuff.


However, we would not want submodules to suffer from reachability
issues after a prune in the supermodule. That means that all references
and reflogs of the submodules must be accessible from the supermodule,
so that anything that removes objects from the object store cannot
remove interesting ones (that should limit the affected code paths to
very few places, actually).


So what I've thought about is extending gitfiles so that they can also
define where to find not only the git_dir but also the object store,
moving the current "faked symlink" approach to a less terse file that
looks like a standard git-config one. E.g.:

    [gitfile]
	git_dir = some/path/.git/submodules/foo/
	objects = some/path/.git/objects
	# why not other settings in the future ?

This part is quite easy and straightforward (and it can be done while
keeping backward compatibility with the current way gitfiles work).
What I can't decide is how we deal with the reflogs and references. I
see two choices. Assuming the submodules git_dir's are under the
supermodule $GIT_DIR/submodules/$name_of_the_super_module/:

  (1) we do nothing more.

  (2) we melt the submodules reflogs and references into the supermodule
      ones with appropriate namespacing. For example, for a submodule
      named "foo/bar" we would have its reflogs live in the supermodule
      .git/logs/submodules/foo/bar/logs/* and its references under
      .git/refs/submodules/foo/bar/refs/*. For that we add 'logs =' and
      'refs =' to the gitfile.
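
As a rough illustration of the backward-compatible parsing, here is a
Python sketch that accepts both the existing one-line "gitdir: <path>"
form and the proposed [gitfile] section. The section name and its keys
(git_dir, objects, logs, refs) come from the proposal above and are not
an existing format; the parser is deliberately simplistic (a '#' inside
a path would be taken as a comment).

```python
# Sketch only: the "[gitfile]" section is the format proposed in this
# message, not something git reads today.

def parse_gitfile(text):
    """Return the gitfile settings as a dict.

    Accepts the current one-line "gitdir: <path>" form as well as the
    proposed config-like "[gitfile]" section.
    """
    stripped = text.strip()
    if stripped.startswith("gitdir:"):
        return {"git_dir": stripped[len("gitdir:"):].strip()}

    settings = {}
    in_section = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if line.startswith("["):
            in_section = (line == "[gitfile]")
            continue
        if in_section and "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


current = "gitdir: ../.git/submodules/foo\n"
proposed = """
[gitfile]
    git_dir = some/path/.git/submodules/foo/
    objects = some/path/.git/objects
    # why not other settings in the future ?
"""

print(parse_gitfile(current))
print(parse_gitfile(proposed))
```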

The first approach needs us to somehow recurse under .git/submodules to
work out what in there looks like a git_dir, and to teach reachability
commands to look at the refs inside them. It can be quite a lot of
work, especially since we can have submodules inside submodules at some
point.

The second approach has the net benefit that no pruning command has to
be modified to work. Many commands that we want to act on the global
repository will just work. Though, we have to fix a couple of issues
too:
  (1) be able to have a references directory that is not .git/refs. I
      looked at the source, I believe only 3 or 4 places in the C code
      have to be fixed for that to work, maybe a bit more in the shell
      commands, but that should be fairly easy.

  (2) it will break reference packing, because the submodules won't see
      the supermodule packed-refs file, and we will probably have to
      draft a new packed-refs thingy because of this issue. A simple
      possibility I see is to move packed-refs as refs/.packed-refs (as
      a starting dot cannot be a reference name). Then teach
      git-pack-refs to generate a .packed-refs each time it crosses a
      'refs/' directory name, and finally learn how to load those (and
      no it won't require to recurse into the whole refs/, we can mark
      in the toplevel refs/.packed-refs that it has submodules and that
      there is a .packed-refs under
      refs/submodules/foo/bar/refs/.packed-refs and avoid the costly
      recursion).

  (3) we will have to teach for_each_ref to skip "/submodules",
      which is I believe fairly easy.
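
A small Python sketch of the namespacing in the second approach, under
the layout assumed above (refs/submodules/<name>/refs/...). The helper
names are invented for the example and do not correspond to functions
in git's source.

```python
# Sketch of the proposed ref layout; helper names are hypothetical.

SUBMODULE_PREFIX = "refs/submodules/"


def namespaced_ref(submodule, ref):
    """Map a submodule's own ref name into the supermodule's namespace.

    "refs/heads/master" in submodule "foo/bar" becomes
    "refs/submodules/foo/bar/refs/heads/master".
    """
    if not ref.startswith("refs/"):
        raise ValueError("expected a full ref name, got %r" % ref)
    return SUBMODULE_PREFIX + submodule + "/" + ref


def supermodule_refs(all_refs):
    """Mimic a for_each_ref that skips everything under refs/submodules/."""
    return [r for r in all_refs if not r.startswith(SUBMODULE_PREFIX)]


all_refs = [
    "refs/heads/master",
    "refs/tags/v1.0",
    namespaced_ref("foo/bar", "refs/heads/master"),
]

print(all_refs[2])
print(supermodule_refs(all_refs))
```

A pruning command would walk all_refs and therefore keep the
submodule's objects alive, while ordinary ref listing in the
supermodule would use the filtered view.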


I personally like the second approach better because I believe it will
scale better when people put submodules into submodules into
submodules. But I'm unsure whether it's too disruptive.

So... comments, thoughts, and remarks are welcome :)



Note: with enhanced gitfiles, and making workdirs use gitfiles, with any
      of those approaches, it's easy to make workdirs that won't have
      the "if we repack we may lose things referenced from other
      workdir's reflogs" problem anymore. Which is kind of a nifty side
      effect ;)
-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org


* Re: git submodules
  2008-08-17 20:13               ` Pierre Habouzit
@ 2008-08-17 22:54                 ` Avery Pennarun
  2008-08-17 23:08                 ` Junio C Hamano
  1 sibling, 0 replies; 36+ messages in thread
From: Avery Pennarun @ 2008-08-17 22:54 UTC (permalink / raw)
  To: Pierre Habouzit; +Cc: Junio C Hamano, Jakub Narebski, Nigel Magnay, Git ML

On Sun, Aug 17, 2008 at 4:13 PM, Pierre Habouzit <madcoder@debian.org> wrote:
>  * It could make things like git-blame better: at work, it's common for
>    us to move files across submodules: we have a stable library shared
>    accross projects, and move there C modules that have staged for
>    quite some time in the applications and are stable enough, and it's
>    pity to loose history then, whereas git could really guess about the
>    move if it sees through GITLINKS in the same object repository.
>    GITLINKS are not very different from trees actually if you can look
>    through them, it's just a matter of dereferencing twice instead of
>    once.

That would be cool.  I expect you could implement it independently of
everything else by simply *trying* to dereference gitlinks in the
local object repository if they exist, and not erroring out if they
don't.

The other reasons for combining the repos seem fine, but they mostly
seem to come down to saving disk space.  I like saving disk space, but
it's not really that important to me.

Have fun,

Avery


* Re: git submodules
  2008-08-17 20:13               ` Pierre Habouzit
  2008-08-17 22:54                 ` Avery Pennarun
@ 2008-08-17 23:08                 ` Junio C Hamano
  2008-08-18  0:46                   ` Pierre Habouzit
  1 sibling, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2008-08-17 23:08 UTC (permalink / raw)
  To: Pierre Habouzit; +Cc: Jakub Narebski, Avery Pennarun, Nigel Magnay, Git ML

Pierre Habouzit <madcoder@debian.org> writes:

> On Mon, Jul 28, 2008 at 10:41:17PM +0000, Junio C Hamano wrote:
>
>> I suspect the use of it may help the use case Pierre proposes, but its
>> main attractiveness as I understood it back when we discussed the facility
>> was that you could switch branches between 'maint' that did not have a
>> submodule at "path" back then, and 'master' that does have one now,
>> without losing the submodule repository.  When checking out 'master' (and
>> that would probably mean you would update 'git-submodule init' and
>> 'git-submodule update' implementation), you would instanciate subdirectory
>> "path", create "path/.git" that is such a regular file that that points at
>> somewhere inside the $GIT_DIR of superproject (say ".git/submodules/foo").
>> By storing refs and object store are all safely away in the superproject
>> $GIT_DIR, you can now safely switch back to 'maint', which would involve
>> making sure there is no local change that will be lost and then removing
>> the "path" and everything underneath it.
>
> gitfiles looks nifty for sure, though I've thought about it a bit, and
> I'm not sure if we don't want something a bit more powerful, though
> still in the same vein.
>
> If we look at submodules, I quite believe that we would benefit a lot
> from sharing the object directory accross the supermodule and all its
> submodules, because of the following reasons:

I know there are cases where sharing object store is useful.  Being able
to share is one thing.  Always having to share, without any other option,
is another.

Using gitlink to keep the true repository data out of submodule checkout
area so that branch switching can safely be done is orthogonal to the
issue of how repositories of submodules and the superproject share their
object store.  IOW, you would always use gitlink to solve the "branch
switching may make the submodule checkout disappear" issue, while you
could use alternates mechanism (or direct symlinking of $GIT_DIR/objects)
across these repositories *if* you want to share their object store.
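
For reference, the alternates mechanism is just a text file,
$GIT_DIR/objects/info/alternates, listing other object directories one
per line. A minimal Python sketch of opting a submodule repository into
sharing is shown below; the helper name and the directory layout in the
demo are assumptions for the illustration, and the sketch does nothing
about the pruning problem discussed in this thread.

```python
import os
import tempfile


def add_alternate(git_dir, other_objects_dir):
    """Append other_objects_dir to git_dir/objects/info/alternates.

    The entry is stored as an absolute path and is not duplicated if it
    is already listed. Returns the path of the alternates file.
    """
    info_dir = os.path.join(git_dir, "objects", "info")
    os.makedirs(info_dir, exist_ok=True)
    path = os.path.join(info_dir, "alternates")

    existing = []
    if os.path.exists(path):
        with open(path) as fh:
            existing = [line.strip() for line in fh if line.strip()]

    target = os.path.abspath(other_objects_dir)
    if target not in existing:
        existing.append(target)
        with open(path, "w") as fh:
            fh.write("\n".join(existing) + "\n")
    return path


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical layout: the submodule's repository lives inside the
    # superproject's $GIT_DIR and borrows the superproject's objects.
    sub_git_dir = os.path.join(tmp, "super", ".git", "submodules", "foo")
    super_objects = os.path.join(tmp, "super", ".git", "objects")
    os.makedirs(super_objects)

    alternates = add_alternate(sub_git_dir, super_objects)
    add_alternate(sub_git_dir, super_objects)  # second call changes nothing
    with open(alternates) as fh:
        print(fh.read().splitlines())
```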

> Though we would not like to have submodules suffer from reachability
> issues after a prune in the supermodule. That means that all references
> and reflogs of the submodules shall be accessible from the supermodule
> so that everything that could mess with the object store by removing
> objects cannot remove interesting objects (that should limit the code
> paths to really seldom places actually).

I do not think this issue is limited to use of submodules.  I'd imagine
that if you build this reachability protection into the alternates
mechanism, you would automatically solve both "multiple checkout of the
same project, via git-new-workdir" issue as well as "submodules that share
its objects with the superproject" issue.

Which leads me to conclude, at least for now, that it would not be a good
idea to make this related to gitfile in any way.  Object sharing between
equal repositories (aka new-workdir) does not use gitfile, but it still
needs to have the same kind of reachability protection.


* Re: git submodules
  2008-08-17 23:08                 ` Junio C Hamano
@ 2008-08-18  0:46                   ` Pierre Habouzit
  0 siblings, 0 replies; 36+ messages in thread
From: Pierre Habouzit @ 2008-08-18  0:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jakub Narebski, Avery Pennarun, Nigel Magnay, Git ML

On Sun, Aug 17, 2008 at 11:08:39PM +0000, Junio C Hamano wrote:
> I know there are cases where sharing object store is useful.  Being able
> to share is one thing.  Always having to share, without any other option,
> is another.
> 
> Using gitlink to keep the true repository data out of submodule checkout
> area so that branch switching can safely be done is orthogonal to the
> issue of how repositories of submodules and the superproject share their
> object store.  IOW, you would always use gitlink to solve the "branch
> switching may make the submodule checkout disappear" issue, while you
> could use alternates mechanism (or direct symlinking of $GIT_DIR/objects)
> across these repositories *if* you want to share their object store.

  Fair enough. Though I'm not only interested in the branch switching
issue. I'm looking a bit further, at having many commands work as
if there were no submodule involved. And having the same object store for
all {sub,super}modules helps a lot. For example, there is probably quite
some plumbing to write if we have separate object stores and we expect
(and frankly I do) git-commit to work across submodule boundaries
(doing what it should, IOW commit in the submodules, and then commit in
the supermodule).

  But maybe it's not as hard as it looks.

> > Though we would not like to have submodules suffer from reachability
> > issues after a prune in the supermodule. That means that all references
> > and reflogs of the submodules shall be accessible from the supermodule
> > so that everything that could mess with the object store by removing
> > objects cannot remove interesting objects (that should limit the code
> > paths to really seldom places actually).
> 
> I do not think this issue is limited to use of submodules.  I'd imagine
> that if you build this reachability protection into the alternates
> mechanism, you would automatically solve both "multiple checkout of the
> same project, via git-new-workdir" issue as well as "submodules that share
> its objects with the superproject" issue.
> 
> Which leads me to conclude, at least for now, that it would not be a good
> idea to make this related to gitfile in any way.  Object sharing between
> equal repositories (aka new-workdir) does not use gitfile, but it still
> needs to have the same kind of reachability protection.

  Well, somehow the repository that is the alternate (or the symlink, but
the latter isn't very Windows friendly, not to mention vfat) has to know
about the other repository's:
  * index;
  * references;
  * reflogs.

  Which means that alternates users have to register with the provider,
which _usually_ seems to be brittle. I mean, with the way
git-new-workdir currently works, if you register workdirs in the real
repository and then rename a workdir at some point, or move it to
some other place, you're screwed, silently.

  If instead you force this workdir to use a gitfile, you _don't need_
to register your workdir in the "real" repository, because all the data
belongs to the "real" repository. The workdir is just a "detached"
workdir, with only the checkout stuff: no index, no references, no
nothing. And if you move this workdir to a new place, it still works.
Only the central repository should not budge, which is already a
limitation of current workdirs and alternates anyway.

  Of course with submodules it's less of an issue, since those aren't as
loosely coupled to the supermodule as workdirs are to the main
repository, and it's unlikely that a submodule will move very often ;)


-- 
·O·  Pierre Habouzit
··O                                                madcoder@debian.org
OOO                                                http://www.madism.org


* git submodules
@ 2009-10-17 17:15 Steven Noonan
  2009-10-17 17:27 ` Jakub Narebski
  2009-10-21 19:38 ` Avery Pennarun
  0 siblings, 2 replies; 36+ messages in thread
From: Steven Noonan @ 2009-10-17 17:15 UTC (permalink / raw)
  To: Git Mailing List; +Cc: crawl-ref-discuss

One of the open source projects I work on (CC'd) recently moved to
git, but we're having some slight problems, and I believe that it's
the fault of git's UI in this case.

We're using git submodules for the contributing libraries. When I
commit changes to those contribs, it correctly shows in the parent
repository that those folders have different revisions than what's
currently committed. However, if someone pulls those changes, it
doesn't automatically update the contribs to match the committed
version. But doing a pull or merge _should_ update the working tree to
match the committed versions. It does with file data, so why not
update the submodules? Especially if the submodule revision matched
the committed version -before- the pull. Why are we forced into using
'git submodule update'?

- Steven


* Re: git submodules
  2009-10-17 17:15 Steven Noonan
@ 2009-10-17 17:27 ` Jakub Narebski
  2009-10-17 22:30   ` Nanako Shiraishi
  2009-10-21 19:38 ` Avery Pennarun
  1 sibling, 1 reply; 36+ messages in thread
From: Jakub Narebski @ 2009-10-17 17:27 UTC (permalink / raw)
  To: Steven Noonan; +Cc: Git Mailing List, crawl-ref-discuss

Steven Noonan <steven@uplinklabs.net> writes:

> We're using git submodules for the contributing libraries. When I
> commit changes to those contribs, it correctly shows in the parent
> repository that those folders have different revisions than what's
> currently committed. However, if someone pulls those changes, it
> doesn't automatically update the contribs to match the committed
> version. But doing a pull or merge _should_ update the working tree to
> match the committed versions. It does with file data, so why not
> update the submodules? Especially if the submodule revision matched
> the committed version -before- the pull. Why are we forced into using
> 'git submodule update'?

Because you might not want to use the most current version of a submodule,
so git-pull shouldn't update submodules by default.  And because
git-pull hasn't learned a --recursive option yet.

-- 
Jakub Narebski
Poland
ShadeHawk on #git


* Re: git submodules
  2009-10-17 17:27 ` Jakub Narebski
@ 2009-10-17 22:30   ` Nanako Shiraishi
  0 siblings, 0 replies; 36+ messages in thread
From: Nanako Shiraishi @ 2009-10-17 22:30 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Git Mailing List, Steven Noonan, crawl-ref-discuss

Quoting Jakub Narebski <jnareb@gmail.com>

> Steven Noonan <steven@uplinklabs.net> writes:
>
>> We're using git submodules for the contributing libraries. When I
>> commit changes to those contribs, it correctly shows in the parent
>> repository that those folders have different revisions than what's
>> currently committed. However, if someone pulls those changes, it
>> doesn't automatically update the contribs to match the committed
>> version. But doing a pull or merge _should_ update the working tree to
>> match the committed versions. It does with file data, so why not
>> update the submodules? Especially if the submodule revision matched
>> the committed version -before- the pull. Why are we forced into using
>> 'git submodule update'?
>
> Because you might want not to use most current version of submodule,
> so git-pull shouldn't update submodules by default.  And because
> git-pull didn't learn --recursive option yet.

I don't think your description is correct. Steven is talking about what
the command should do by default. If you check out the current
superproject, by default you should get the submodule that matches. If
you don't want the most current version, you can check out an older
submodule yourself.

You may want to follow this discussion:

  http://thread.gmane.org/gmane.comp.version-control.git/130155/focus=130330

After stating that he isn't against the idea of making it automatic,
Junio describes what needs to be done for it to happen and which corner
cases need to be treated with care.
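
One of the simpler rules suggested in this thread (update only when the
submodule matched the committed version before the pull) can be written
down as a small Python sketch. This is an illustration of the proposed
default, not a description of what git does; the function and parameter
names are made up, and the corner cases mentioned above are not handled.

```python
# Sketch of a possible default policy; not git's actual behaviour.

def should_auto_update(committed_before, committed_after, checked_out):
    """Decide whether a pull may move the submodule checkout.

    committed_before: commit recorded in the superproject before the pull
    committed_after:  commit recorded in the superproject after the pull
    checked_out:      commit currently checked out in the submodule
    """
    if committed_before == committed_after:
        return False  # the pull did not touch this submodule
    # Only follow along if the user had not moved the submodule away
    # from what the superproject recorded.
    return checked_out == committed_before


print(should_auto_update("aaa111", "bbb222", "aaa111"))  # submodule was in sync
print(should_auto_update("aaa111", "bbb222", "ccc333"))  # user chose another commit
```

In the second case the user deliberately sits on a different commit, so
the sketch leaves the submodule alone and an explicit
'git submodule update' would still be needed.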

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/


* Re: git submodules
  2009-10-17 17:15 Steven Noonan
  2009-10-17 17:27 ` Jakub Narebski
@ 2009-10-21 19:38 ` Avery Pennarun
  1 sibling, 0 replies; 36+ messages in thread
From: Avery Pennarun @ 2009-10-21 19:38 UTC (permalink / raw)
  To: Steven Noonan; +Cc: Git Mailing List, crawl-ref-discuss

On Sat, Oct 17, 2009 at 1:15 PM, Steven Noonan <steven@uplinklabs.net> wrote:
> We're using git submodules for the contributing libraries. When I
> commit changes to those contribs, it correctly shows in the parent
> repository that those folders have different revisions than what's
> currently committed. However, if someone pulls those changes, it
> doesn't automatically update the contribs to match the committed
> version. But doing a pull or merge _should_ update the working tree to
> match the committed versions. It does with file data, so why not
> update the submodules? Especially if the submodule revision matched
> the committed version -before- the pull. Why are we forced into using
> 'git submodule update'?

<advertisement>
git-subtree (http://github.com/apenwarr/git-subtree) is an alternative
to submodules that doesn't have this problem.
</advertisement>

But it probably has other problems. :)  Works great for my purposes,
though, and quite a few people have contacted me to say they're using
it happily.

Have fun,

Avery


Thread overview: 36+ messages
2008-07-28 16:20 git submodules Pierre Habouzit
2008-07-28 16:23 ` Pierre Habouzit
2008-07-28 20:23 ` Nigel Magnay
2008-07-28 20:55   ` Pierre Habouzit
2008-07-28 20:59     ` Pierre Habouzit
2008-07-28 21:40       ` Avery Pennarun
2008-07-28 22:03         ` Pierre Habouzit
2008-07-28 22:26           ` Jakub Narebski
2008-07-28 22:41             ` Junio C Hamano
2008-08-17 20:13               ` Pierre Habouzit
2008-08-17 22:54                 ` Avery Pennarun
2008-08-17 23:08                 ` Junio C Hamano
2008-08-18  0:46                   ` Pierre Habouzit
2008-07-28 22:32           ` Avery Pennarun
2008-07-28 23:12             ` Pierre Habouzit
2008-07-29  5:51         ` Benjamin Collins
2008-07-29  6:04           ` Shawn O. Pearce
2008-07-29  8:18             ` Nigel Magnay
2008-07-29  8:45               ` Pierre Habouzit
2008-07-29  8:21           ` Pierre Habouzit
2008-07-29  8:37             ` Pierre Habouzit
2008-07-29  8:51               ` Petr Baudis
2008-07-29 12:15                 ` Johannes Schindelin
2008-07-29 13:07                   ` Pierre Habouzit
2008-07-29 13:15                     ` Johannes Schindelin
2008-07-29 13:19                       ` Pierre Habouzit
2008-07-29 13:31                       ` Nigel Magnay
2008-07-29 14:49                         ` Pierre Habouzit
2008-07-29 14:53                         ` Junio C Hamano
  -- strict thread matches above, loose matches on Subject: below --
2009-10-17 17:15 Steven Noonan
2009-10-17 17:27 ` Jakub Narebski
2009-10-17 22:30   ` Nanako Shiraishi
2009-10-21 19:38 ` Avery Pennarun
2008-04-28 19:50 Victor Bogado da Silva Lins
2008-04-28 21:01 ` Miklos Vajna
     [not found] <s5hwspjzbt0.wl%tiwai@suse.de>
     [not found] ` <Pine.LNX.4.61.0802061437190.8113@tm8103.perex-int.cz>
     [not found]   ` <Pine.LNX.4.61.0802061505470.8113@tm8103.perex-int.cz>
     [not found]     ` <47AA1361.7070201@keyaccess.nl>
     [not found]       ` <s5h7ihhknez.wl%tiwai@suse.de>
2008-02-07 21:24         ` GIT submodules Rene Herman
