git@vger.kernel.org mailing list mirror (one of many)
* Feature Request: Branch-Aware Submodules
@ 2016-08-25  9:00 Hedges  Alexander
  2016-08-25 17:45 ` Stefan Beller
  2016-08-25 17:46 ` Junio C Hamano
  0 siblings, 2 replies; 9+ messages in thread
From: Hedges  Alexander @ 2016-08-25  9:00 UTC
  To: git@vger.kernel.org

Dear Git Developers,

First of all, thanks for this great project, it has made my life a lot easier
as a developer!

I'm writing this email to suggest some improvements to git submodules. In my
eyes how git handles submodules could be improved to be more intuitive to a
novice and require less manual management.


Right now updating a submodule in a topic branch and merging it into master
will not change the submodule index in master leading to at least two commits
for the same change (one in any active branch). This happened to me quite a few
times. To a newcomer this behavior is confusing and it leads to unnecessary
commits.


The proposed change would be to have a submodule either ignored or tracked by
the .gitmodules file.
If it is ignored, as for instance after a clone of the superproject, git simply
ignores all files in the submodule directory. The content of the gitmodules
file is then also not updated by git.
If it is not ignored, the .gitmodules is updated every time a commit happens in
the submodule. On branch switches the revision shown in the gitmodules from
that branch is checked out.
This change would have submodules conceptually behave more like files to the
superproject.


Like current behavior, git status would display whether the submodule has
uncommitted changes or is at a new commit. A repository is in a dirty state if
there are changes to the gitmodules file or any tracked submodule is in a dirty
state. Every time a commit happens in a submodule, the parent's gitmodules file
is updated. Uncommitted changes are not reflected in the parent's gitmodules file.

When the user manually edits the .gitmodules, git switches to that revision
after commit. But the user would have to stash or commit all uncommitted
changes in the submodule first.

When checking out a commit in a submodule, if there is currently a branch
pointing to that commit, HEAD could point to that branch instead (Is there a
case where that doesn't make sense? What about multiple branches pointing to
the commit?). It could also support branch names as references where the branch
(or tag) would be checked out instead.

With git submodule init you could have the submodule tracked. Using deinit
would put the submodule into the ignored state.

And while we're at it, it is quite some work to completely delete a submodule.
You have to manually remove all the associated files in the git repository
(StackOverflow lists 7 steps). Obviously it's not encouraged, as with anything
that removes data without a recovery method, but it should be possible.
git submodule rm --force could remove the repository and the associated nested
.git tree. git submodule rm could keep the .git directory but move it to another
location.
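
For comparison, the manual removal can be condensed into a few commands with
current git. This is only a sketch, not the proposed `git submodule rm`: the
function name `remove_submodule` is hypothetical, and it assumes the
submodule's name equals its path (the default for `git submodule add`) and
that its repository is stored under the superproject's .git/modules.

```shell
#!/bin/sh
# Remove a submodule from the work tree, the index, .gitmodules and
# .git/config, then delete its stored repository.
# The last step is irreversible.
remove_submodule() {
    path=$1
    git submodule deinit -f -- "$path" &&   # drop the entry from .git/config
    git rm -f -- "$path" &&                 # drop the gitlink and .gitmodules section
    rm -rf "$(git rev-parse --git-dir)/modules/$path"
}
```

Run from the top level of the superproject as `remove_submodule <submodule-path>`;
the removal still has to be committed afterwards.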

The behavior of git submodule sync and git submodule update would stay the same.


Migrating existing repositories to the new behavior should be quite
straightforward. Submodules that are not init'ed yet would be ignored. All others
would behave according to the new rules. Maybe a message with a note about the
changes could be displayed by the appropriate git-submodule commands or even by
git status.


An alternative I considered was to decouple submodules more strongly from the
superproject. That would mean having the .gitmodules only tracked by master and
leaving the other behaviors unchanged. For consistency one could do the same
thing for the .gitignore.

The obvious drawback of this option is that there would be no per-branch
submodules: if you want to experiment with external libraries, topic branches
would not be the place to go. Also, there would be a lot of intricacies that
would have to be worked out.


I couldn't find any discussions on the initial implementation of git-submodule
or any previous proposals of this nature, because gmane is down right now and
the mailing list archives on the other sites are not great for searching. So
please excuse me if I'm bringing up already discussed stuff.

Until now I only worked on projects with few submodules. I expect the
proposed changes to have a larger effect on projects containing lots of
submodules. So it would be nice if maybe somebody with experience working on
projects with lots of submodules could weigh in on the discussion.


Best Regards,
Alexander Hedges





* Re: Feature Request: Branch-Aware Submodules
  2016-08-25  9:00 Feature Request: Branch-Aware Submodules Hedges  Alexander
@ 2016-08-25 17:45 ` Stefan Beller
  2016-08-25 20:50   ` Junio C Hamano
  2016-08-25 17:46 ` Junio C Hamano
  1 sibling, 1 reply; 9+ messages in thread
From: Stefan Beller @ 2016-08-25 17:45 UTC
  To: Hedges Alexander; +Cc: git@vger.kernel.org, Jacob Keller, Lars Schneider

+cc Jacob and Lars who work with submodules as well.

On Thu, Aug 25, 2016 at 2:00 AM, Hedges  Alexander
<ahedges@student.ethz.ch> wrote:
>
> Right now updating a submodule in a topic branch and merging it into master
> will not change the submodule index in master leading to at least two commit
> for the same change (one in any active branch). This happened to me quite a few
> times. To a newcomer this behavior is confusing and it leads to unnecessary
> commits.

So you roughly do

    git checkout -b new-topic
    # change the submodule to point at the latest upstream version:
    git submodule update --remote <submodule-path>
    git commit -a -m "update submodule"
    git checkout master
    git merge new-topic
    # here seems to be your point of criticism?
    # now the submodule pointer would still point to the latest
    # upstream version?

>
>
> The proposed change would be to have a submodule either ignored or tracked by
> the .gitmodules file.
> If it is ignored, as for instance after a clone of the superproject, git simply
> ignores all files in the submodule directory. The content of the gitmodules
> file is then also not updated by git.
> If it is not ignored, the .gitmodules is updated every time a commit happens in
> the submodule.

So

    git -C <submodule-path> commit

should trigger a commit in the superproject as well, that changes the gitmodules
file? What do you record in the gitmodules file that needs updating?
As the version is tracked via the gitlink entry, I do not see the
information that needs tracking here.

> On branch switches the revision shown in the gitmodules from
> that branch is checked out.

So you are proposing to put the revision into the gitmodules file?
That would be redundant with the actual gitlink entry in your tree.
(as shown via `git submodule status`)
What would happen if the recorded revision in the gitmodules file and the
gitlink are out of sync?
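
For reference, the gitlink entry can be inspected directly in a tree. The
sketch below assumes a superproject with a submodule at the path passed in;
`show_gitlink` is a hypothetical helper name, not a git command.

```shell
#!/bin/sh
# Print the tree entry recorded for a submodule path in a given revision.
# A submodule shows up with mode 160000 and type "commit"; the object name
# is the submodule commit that the superproject has pinned.
show_gitlink() {
    path=$1
    rev=${2:-HEAD}
    git ls-tree "$rev" -- "$path"
}
```

Comparing `show_gitlink <submodule-path> master` with
`show_gitlink <submodule-path> new-topic` shows what each branch records,
independent of what is currently checked out in the submodule directory.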

Oh, are you just proposing to actually make `git checkout` aware of the
submodules? See[1]. I would welcome such a change and be happy th

[1] https://github.com/jlehmann/git-submod-enhancements
which has some attempts for checkout including the submodules.
I also tried writing some patches which integrate checking out submodules
via checkout as well. A quicker `solution` would be a config option that
just runs `git submodule update` after each checkout/pull etc.
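
Until such an option exists, one way to approximate it is with hooks. The
sketch below is an assumption about how one might set this up, not something
taken from the patches mentioned above: it installs `post-checkout` and
`post-merge` hooks that run `git submodule update`. The helper name
`install_submodule_hooks` is hypothetical.

```shell
#!/bin/sh
# Install hooks that re-sync submodules after checkout and merge/pull.
# "repo" is the top-level directory of the superproject.
# Note: this overwrites any existing post-checkout/post-merge hooks.
install_submodule_hooks() {
    repo=${1:-.}
    hooks=$(cd "$repo" && git rev-parse --git-path hooks) || return 1
    case "$hooks" in
        /*) ;;                      # already absolute
        *) hooks="$repo/$hooks" ;;  # relative to the top-level directory
    esac
    mkdir -p "$hooks"
    for hook in post-checkout post-merge; do
        printf '%s\n' '#!/bin/sh' 'git submodule update --init --recursive' \
            > "$hooks/$hook"
        chmod +x "$hooks/$hook"
    done
}
```

Hooks are local to each clone, so every contributor would have to install
them separately; a config option would not have that limitation.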


> This change would have submodules conceptually behave more like files to the
> superproject.
>
>
> Like current behavior, git status would display whether the submodule has
> uncommitted changes or is at a new commit.

See config options diff.submodule and status.submoduleSummary.
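
Both options can be set per repository. A minimal sketch, where the values
chosen (`true` and `log`) are just one reasonable choice and the function
name is made up for the example:

```shell
#!/bin/sh
# Make "git status" and "git diff" show which submodule commits changed.
enable_submodule_summaries() {
    git config status.submoduleSummary true   # summarize submodule commits in status
    git config diff.submodule log             # list commit subjects in diffs
}
```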


>
> I couldn't find any discussions on the initial implementation of git-submodule
> or any previous proposals related to this in nature due to gmane being down
> right now and the mailing list archives on the other sites are not great for
> searching. So please excuse me if I'm bringing up already discussed stuff.

https://public-inbox.org/git for reading on the web, or

    git clone https://public-inbox.org/git

for reading offline.

>
> Until now I only worked on projects with few submodules. I expect the
> proposed changes to have a larger effect on projects containing lots of
> submodules. So it would be nice if maybe somebody with experience working on
> projects with lots of submodules could weigh into the discussion.
>


* Re: Feature Request: Branch-Aware Submodules
  2016-08-25  9:00 Feature Request: Branch-Aware Submodules Hedges  Alexander
  2016-08-25 17:45 ` Stefan Beller
@ 2016-08-25 17:46 ` Junio C Hamano
  1 sibling, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2016-08-25 17:46 UTC
  To: Hedges Alexander; +Cc: git@vger.kernel.org

"Hedges  Alexander" <ahedges@student.ethz.ch> writes:

> Right now updating a submodule in a topic branch and merging it into master
> will not change the submodule index in master leading to at least two commit
> for the same change (one in any active branch).

I stopped reading here because I am not getting this.

I guess I am confused because I do not understand what you mean by
"the submodule index in master".  The concept of "index" does not
belong to each branch (or even a commit), so by "index" you are
trying to point at something else, but I cannot guess what it is.

You have a top-level superproject that has another project as its
submodule.  The superproject has topic and master branches (or it
may only have master).  The project that is used as its submodule
also has topic and master branches (it may have more).  You do your
development in the submodule, e.g.

    cd submoduledir
    git checkout topic
    hack hack hack
    git commit
    git checkout master
    git merge topic

and merge the topic branch into its master when the topic is
polished enough.

And then?  The 'master' in the submodule is good enough, so you'd
go back to the top-level superproject and bind that merged result 
in its place?  e.g.

    cd ..
    git add submoduledir
    git commit -m "Updated submoduledir with the topic"

That is only one commit each in the superproject and the submodule
project.



* Re: Feature Request: Branch-Aware Submodules
  2016-08-25 17:45 ` Stefan Beller
@ 2016-08-25 20:50   ` Junio C Hamano
  2016-08-25 20:55     ` Stefan Beller
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2016-08-25 20:50 UTC
  To: Stefan Beller
  Cc: Hedges Alexander, git@vger.kernel.org, Jacob Keller,
	Lars Schneider

Stefan Beller <sbeller@google.com> writes:

> +cc Jacob and Lars who work with submodules as well.
>
> On Thu, Aug 25, 2016 at 2:00 AM, Hedges  Alexander
> <ahedges@student.ethz.ch> wrote:
>>
>> Right now updating a submodule in a topic branch and merging it into master
>> will not change the submodule index in master leading to at least two commit
>> for the same change (one in any active branch). This happened to me quite a few
>> times. To a newcomer this behavior is confusing and it leads to unnecessary
>> commits.
>
> So you roughly do
>
>     git checkout -b new-topic
>     # change the submodule to point at the latest upstream version:
>     git submodule update --remote <submodule-path>
>     git commit -a -m "update submodule"
>     git checkout master
>     git merge new-topic
>     # here seems to be your point of critic?
>     # now the submodule pointer would still point to the latest
> upstream version?

Isn't <submodule-path> subject to the usual 3-way merge when the
last step (i.e. a merge of new-topic branch into master in the
superproject) is made?  If 'master' hasn't changed <submodule-path>
since 'new-topic' forked from it, because 'new-topic' updated the
commit bound at <submodule-path>, doesn't "git merge new-topic" just
take that change as the normal "One side updated, the other did not
touch; take the update" merge?
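
The rule described above can be sketched for a single path as a small shell
function. This only illustrates the trivial three-way cases and is not git's
actual merge code; the function name `merge_entry` and the placeholder commit
names are made up for the example.

```shell
#!/bin/sh
# Illustrative only: resolve one path (e.g. the gitlink at <submodule-path>)
# from the merge base, our side and their side.
# Prints the resulting value, or "conflict" when both sides changed differently.
merge_entry() {
    base=$1
    ours=$2
    theirs=$3
    if [ "$ours" = "$theirs" ]; then
        echo "$ours"        # both sides agree (or neither changed)
    elif [ "$ours" = "$base" ]; then
        echo "$theirs"      # only their side updated: take the update
    elif [ "$theirs" = "$base" ]; then
        echo "$ours"        # only our side updated: keep ours
    else
        echo "conflict"     # both sides moved to different commits
    fi
}

# master left the submodule at "aaa"; new-topic moved it to "bbb".
merge_entry aaa aaa bbb
```

In the scenario from the thread, 'master' still has the base value, so the
result is the commit recorded by 'new-topic'.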


* Re: Feature Request: Branch-Aware Submodules
  2016-08-25 20:50   ` Junio C Hamano
@ 2016-08-25 20:55     ` Stefan Beller
  2016-08-25 21:25       ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Stefan Beller @ 2016-08-25 20:55 UTC
  To: Junio C Hamano
  Cc: Hedges Alexander, git@vger.kernel.org, Jacob Keller,
	Lars Schneider

On Thu, Aug 25, 2016 at 1:50 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> +cc Jacob and Lars who work with submodules as well.
>>
>> On Thu, Aug 25, 2016 at 2:00 AM, Hedges  Alexander
>> <ahedges@student.ethz.ch> wrote:
>>>
>>> Right now updating a submodule in a topic branch and merging it into master
>>> will not change the submodule index in master leading to at least two commit
>>> for the same change (one in any active branch). This happened to me quite a few
>>> times. To a newcomer this behavior is confusing and it leads to unnecessary
>>> commits.
>>
>> So you roughly do
>>
>>     git checkout -b new-topic
>>     # change the submodule to point at the latest upstream version:
>>     git submodule update --remote <submodule-path>
>>     git commit -a -m "update submodule"
>>     git checkout master
>>     git merge new-topic
>>     # here seems to be your point of critic?
>>     # now the submodule pointer would still point to the latest
>> upstream version?
>
> Isn't <submodule-path> subject to the usual 3-way merge when the
> last step (i.e. a merge of new-topic branch into master in the
> superproject) is made?  If 'master' hasn't changed <submodule-path>
> since 'new-topic' forked from it, because 'new-topic' updated the
> commit bound at <submodule-path>, doesn't "git merge new-topic" just
> take that change as the normal "One side updated, the other did not
> touch; take the update" merge?

Yes. I was unclear here.
By "latest upstream version" I meant the version you pulled in in the new-topic
branch via the "submodule update --remote" and that is preserved as is.


* Re: Feature Request: Branch-Aware Submodules
  2016-08-25 20:55     ` Stefan Beller
@ 2016-08-25 21:25       ` Junio C Hamano
  0 siblings, 0 replies; 9+ messages in thread
From: Junio C Hamano @ 2016-08-25 21:25 UTC
  To: Stefan Beller
  Cc: Hedges Alexander, git@vger.kernel.org, Jacob Keller,
	Lars Schneider

Stefan Beller <sbeller@google.com> writes:

>>> So you roughly do
>>>
>>>     git checkout -b new-topic
>>>     # change the submodule to point at the latest upstream version:
>>>     git submodule update --remote <submodule-path>
>>>     git commit -a -m "update submodule"
>>>     git checkout master
>>>     git merge new-topic
>>>     # here seems to be your point of critic?
>>>     # now the submodule pointer would still point to the latest
>>> upstream version?
>>
>> Isn't <submodule-path> subject to the usual 3-way merge when the
>> last step (i.e. a merge of new-topic branch into master in the
>> superproject) is made?  If 'master' hasn't changed <submodule-path>
>> since 'new-topic' forked from it, because 'new-topic' updated the
>> commit bound at <submodule-path>, doesn't "git merge new-topic" just
>> take that change as the normal "One side updated, the other did not
>> touch; take the update" merge?
>
> Yes. I was unclear here.
> By "latest upstream version" I meant the version you pulled in in the new-topic
> branch via the "submodule update --remote" and that is preserved as is.

I do not think you were unclear at all.

What else is desired?  "git merge new-topic" leaves a result that is
not a merge of the changes made on that new-topic branch, by leaving
a stale <submodule-path> that was in 'master' as-is?

After all, the new-topic branch committed that "update submodule",
showing its desire that the latest-from-upstream commit it just
obtained must be at <submodule-path> from then on in the top-level
project.  If that change is not propagated (or at least "taken into
account") when merging it to 'master', the result is not a proper
"merge".  If new-topic didn't want the updated commit from the
submodule, it shouldn't have recorded that in its commit in the
first place.



* Re: Feature Request: Branch-Aware Submodules
       [not found] <5EA7D232-5D41-4653-9E35-21C502C79C92@student.ethz.ch>
@ 2016-08-26 15:12 ` Hedges  Alexander
  2016-08-29  2:17   ` Jacob Keller
  0 siblings, 1 reply; 9+ messages in thread
From: Hedges  Alexander @ 2016-08-26 15:12 UTC
  To: git@vger.kernel.org


> On 25 Aug 2016, at 19:45, Stefan Beller <sbeller@google.com> wrote:
> 
> +cc Jacob and Lars who work with submodules as well.
> 
> On Thu, Aug 25, 2016 at 2:00 AM, Hedges  Alexander
> <ahedges@student.ethz.ch> wrote:
>> 
>> Right now updating a submodule in a topic branch and merging it into master
>> will not change the submodule index in master leading to at least two commit
>> for the same change (one in any active branch). This happened to me quite a few
>> times. To a newcomer this behavior is confusing and it leads to unnecessary
>> commits.
> 
> So you roughly do
> 
>   git checkout -b new-topic
>   # change the submodule to point at the latest upstream version:
>   git submodule update --remote <submodule-path>
>   git commit -a -m "update submodule"
>   git checkout master
>   git merge new-topic
>   # here seems to be your point of critic?
>   # now the submodule pointer would still point to the latest
> upstream version?
> 

Excuse my poor wording above. The problem is the following:

# assume a repo with a few branches and one submodule
git checkout -b new_feature
git commit -am "some new commits"
cd submodule/path
git commit -am "dirty hacking on a library"
cd ../..
git commit -am "changes and update library"
git status
# all is well
git checkout master
git status
# it says new submodule commits ??
git commit -am "update library again…"
git merge new_feature
git checkout old_feature_that_never_made_it
git status
# still ???
git commit -am …

Now reading the comments below, I overlooked git submodule update. I used update
only the first time after a clone with the init flag. As a remedy I could just
run git submodule update after every merge, but then I always get a detached
head which is also not ideal.
The second thing I overlooked is just merging without worrying about the git
status telling me the repository is dirty. But here my muscle memory does a
commit when the repository is dirty, before running any other git commands.
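
The step missing from the sequence above appears to be a `git submodule
update` after each branch switch, in place of the extra commit. A minimal
sketch of that combination follows; the wrapper name `switch_branch` is made
up, and it still leaves the submodule on a detached HEAD.

```shell
#!/bin/sh
# Switch the superproject to another branch, then move every initialized
# submodule to the commit that branch records.
switch_branch() {
    git checkout "$1" &&
    git submodule update --recursive
}
```

After `switch_branch master`, `git status` should no longer report new
submodule commits, provided nothing was committed inside the submodule in
the meantime.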

Obviously, it's confusing to people without a certain amount of experience.

>> The proposed change would be to have a submodule either ignored or tracked by
>> the .gitmodules file.
>> If it is ignored, as for instance after a clone of the superproject, git simply
>> ignores all files in the submodule directory. The content of the gitmodules
>> file is then also not updated by git.
>> If it is not ignored, the .gitmodules is updated every time a commit happens in
>> the submodule.
> 
> So
> 
>   git -C <submodule-path> commit
> 
> should trigger a commit in the superproject as well, that changes the gitmodules
> file? What do you record in the git modules file that needs updating?
> As the version is tracked via the gitlink entry, I do not see the
> information that
> needs tracking here?

I guess nothing has to be done here. I mistakenly thought the .gitmodules stores
the SHA.

> 
>> On branch switches the revision shown in the gitmodules from
>> that branch is checked out.
> 
> So you are proposing to put the revision into the gitmodules file?
> That would be redundant with the actual gitlink entry in your tree.
> (as shown via `git submodule status`)
> What would happen if the recorded revision in the gitmodules file and the
> gitlink are out of sync?
> 
> Oh, are you just proposing to actually make `git checkout` aware of the
> submodules? See[1]. I would welcome such a change and be happy th
> 
> [1] https://github.com/jlehmann/git-submod-enhancements
> which has some attempts for checkout including the submodules.
> I also tried writing some patches which integrate checking out submodules
> via checkout as well. A quicker `solution` would be a config option that
> just runs `git submodule update` after each checkout/pull etc.
> 

I see. The quick fix is almost what I’m looking for, except that it leaves
the repo in a detached head state. Could submodule update be made to
automatically and intelligently pick the branch?

> 
>> This change would have submodules conceptually behave more like files to the
>> superproject.
>> 
>> 
>> Like current behavior, git status would display whether the submodule has
>> uncommitted changes or is at a new commit.
> 
> See config options diff.submodule and status.submoduleSummary.
> 

I meant that git status works fine the way it is implemented right now.

> 
>> 
>> I couldn't find any discussions on the initial implementation of git-submodule
>> or any previous proposals related to this in nature due to gmane being down
>> right now and the mailing list archives on the other sites are not great for
>> searching. So please excuse me if I'm bringing up already discussed stuff.
> 
> https://public-inbox.org/git for reading on the web, or
> 
>   git clone https://public-inbox.org/git
> 
> for reading offline.
> 

Thanks.

Best Regards,
Alexander Hedges



* Re: Feature Request: Branch-Aware Submodules
  2016-08-26 15:12 ` Hedges  Alexander
@ 2016-08-29  2:17   ` Jacob Keller
  2016-09-01 11:34     ` Hedges  Alexander
  0 siblings, 1 reply; 9+ messages in thread
From: Jacob Keller @ 2016-08-29  2:17 UTC
  To: Hedges Alexander; +Cc: git@vger.kernel.org

On Fri, Aug 26, 2016 at 8:12 AM, Hedges  Alexander
<ahedges@student.ethz.ch> wrote:
>> On 25 Aug 2016, at 19:45, Stefan Beller <sbeller@google.com> wrote:
>> [1] https://github.com/jlehmann/git-submod-enhancements
>> which has some attempts for checkout including the submodules.
>> I also tried writing some patches which integrate checking out submodules
>> via checkout as well. A quicker `solution` would be a config option that
>> just runs `git submodule update` after each checkout/pull etc.
>>
>
> I see. The quick fix is almost what I’m looking for, except that it leaves
> the repo in a detached head state. Could the submodule update be made
> automatically and intelligently pick the branch?
>

You probably want "git submodule update --rebase" or "git submodule
update --merge". See "git help submodule" under the update section. There is
even a custom command variant where you can write your own bit of
shell that does what your project expects.

Thanks,
Jake


* Re: Feature Request: Branch-Aware Submodules
  2016-08-29  2:17   ` Jacob Keller
@ 2016-09-01 11:34     ` Hedges  Alexander
  0 siblings, 0 replies; 9+ messages in thread
From: Hedges  Alexander @ 2016-09-01 11:34 UTC
  To: Jacob Keller; +Cc: git@vger.kernel.org

Since I don’t want to change any history in the subproject, to me the most 
expected behavior would be:

git submodule update --recursive

with submodule.*.update set to the command:

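```
#!/bin/bash
# Custom update command: git passes the commit recorded in the
# superproject as "$1".

# Local branches pointing at that commit, one short name per line.
# (for-each-ref avoids the "* " marker that "git branch" prints.)
branches=$(git for-each-ref --points-at "$1" --format='%(refname:short)' refs/heads/)

if [ -z "$branches" ] ; then
    # No branch points at the commit: fall back to a detached checkout.
    git checkout "$1"
    echo "do normal checkout"
else
    points_to_master=
    other_branch=
    for b in $branches ; do
        if [ "$b" = "master" ] ; then
            points_to_master="true"
        else
            other_branch="$b"
        fi
    done
    # Prefer master; otherwise use the last other branch found.
    if [ -n "$points_to_master" ] ; then
        git checkout master
    else
        git checkout "$other_branch"
    fi
fi
```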
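
To have `git submodule update` run a script like this, the update mode of a
submodule can be set to a `!command` entry in the superproject's local
configuration. A sketch, where the helper name `use_custom_update`, the
submodule name `lib` and the script path are placeholders:

```shell
#!/bin/sh
# Register a custom update command for one submodule. git runs the command
# with the commit recorded in the superproject appended as its argument.
use_custom_update() {
    name=$1
    script=$2
    git config "submodule.$name.update" "!$script"
}

# Example (placeholder name and path):
#   use_custom_update lib /path/to/checkout-branch.sh
#   git submodule update --recursive
```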

Now, this is not perfect and I’m sure I’ll refine it whenever I find it doesn’t
suit my needs, but I’m sure you can see the intentions here. I’m also not quite
sure whether to prioritize tags over branches or the other way around.

Thanks for the suggestion. I hope this or a similar behavior could sometime
become the default in git. Until then, the suggested quick fix will do for me.

Best Regards,
Alexander Hedges


