git@vger.kernel.org mailing list mirror (one of many)
* [RFC] Branches with --recurse-submodules
@ 2021-11-08 22:33 Glen Choo
  2021-11-10 18:21 ` Glen Choo
  2021-11-12  3:19 ` Philippe Blain
  0 siblings, 2 replies; 11+ messages in thread
From: Glen Choo @ 2021-11-08 22:33 UTC (permalink / raw)
  To: git


Original Submodule UX RFC/Discussion:
https://lore.kernel.org/git/YHofmWcIAidkvJiD@google.com/

Contributor Summit submodules Notes:
https://lore.kernel.org/git/nycvar.QRO.7.76.6.2110211148060.56@tvgsbejvaqbjf.bet/

Submodule UX overhaul updates:
https://lore.kernel.org/git/?q=Submodule+UX+overhaul+update

Hi all! Building on Emily’s original RFC, here is a more fleshed out
vision of how `git {switch,checkout,branch}` will work with
submodule-native branches.

The "Background" section reframes the justification and mental model
behind our proposed workflow in more explicit terms (see "Submodule UX
RFC:Overview"). The "Design" section presents the rules we are using to
implement "Submodule UX RFC:Detailed Design", and how certain corner
cases should be handled.

I’d appreciate any and all feedback :) In particular, readers may be
interested in the "dirty worktree" approach behind `git switch`. If
anything stands out as good, bad or missing, do let us know. Thanks!

== Background

The purpose of this effort is to bring the benefits of branches to
superprojects. In Git, branches are used to name and track progress;
submodules are used to incorporate other repos. However, because of how
submodules are tracked by superprojects, submodules usually operate in
detached HEAD and the benefits of branches are lost. For users
uncomfortable with detached HEAD, this workflow seems risky and
unintuitive. Other users may still prefer branches because they can have
branch reflog and they can be confident that submodule work is being
tracked by some branch and won’t be gc-ed.

The main ideas are:

* there is a single set of branch names that are used throughout the
  repo tree
* progress can be made on submodules and/or the superproject without
  requiring a gitlink update on the superproject
* the user can switch between branches like they would for a
  non-submodule-using repo.

We do not require the branches to move in lockstep, thus this UX may be
suboptimal for logical monorepos that are implemented as submodules.

== Design

This design uses the same branch name in the superproject and
submodules; a user who sees the branch `topic` in the superproject and
submodules knows that they are the same logical thing. Commands with
--recurse-submodules maintain the invariant that branches in the
superproject and submodules are {read,created,modified,deleted}
together.

e.g.

* `git branch --recurse-submodules topic` should create the branch
  `topic` in each of the repositories.
* `git switch --recurse-submodules topic` should check out the branch
  `topic` in each of the repositories.

In a superproject-submodule relationship there is some ambiguity in what
‘checkout the branch `topic`’ should mean (does the submodule use its
topic branch, or the version recorded in the superproject’s gitlink?).
Our approach is to preserve existing semantics where reasonable - the
ref name refers to the superproject’s ref, just as it does without
--recurse-submodules.

One wrinkle is that a user can act on submodules _without_ going through
the superproject (e.g. by cd-ing into the submodule), thus the branch
tips may not match the expected commits in the superproject or the set
of submodules branches may not match the set of superproject branches.
As such, submodule branch names are resolved on a best-effort basis:

* If the submodule branch commit matches the one in the superproject, we
  can safely use the submodule branch.
* If the branch is in an unexpected state, we either:
** Fall back to the version that the user would expect (if it is safe to
    do so).
** Reject the operation (if it is not safe).

As we expand submodule branches to other commands (merge, rebase,
reset), the notions of ‘unexpected state’ and ‘safety’ become
increasingly nebulous and difficult to define because they depend on the
command being run. To manage this, we will start by supporting submodule
branching under a limited set of circumstances and try to loosen them in
the future. We will manage the user’s expectations by warning them if
Git detects an unexpected state.

The proposed rules for submodule branching are as follows:

=== Switching _from_ a branch `topic`, i.e. `git {switch,checkout}`

Check `topic` if each submodule’s worktree is clean (except for
gitlinks), and has one of the following checked out:

* `topic`
* the commit id in the superproject gitlink

This allows the user to switch with a dirty worktree (with respect to
the superproject). We consider this acceptable because the submodule
commits are tracked by the submodule branch. This is helpful when a user
needs to switch branches before they are ready to commit to the
superproject.
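
As a rough illustration of the check described above, here is a minimal
Python sketch. The `Submodule` record, its field names and both helper
functions are hypothetical and exist only to model the rule; they are not
part of Git. The sketch also assumes that "has the gitlink commit checked
out" means HEAD resolves to that commit, whether or not HEAD is detached.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Submodule:
    path: str
    head_commit: str            # commit currently checked out in the submodule
    head_branch: Optional[str]  # None models a detached HEAD
    worktree_clean: bool        # no uncommitted changes inside the submodule
    gitlink: str                # commit id recorded by the superproject


def blocks_switch_away(sub: Submodule, topic: str) -> Optional[str]:
    """Return a reason to refuse the switch, or None if this submodule allows it."""
    if not sub.worktree_clean:
        return f"{sub.path}: uncommitted changes"
    if sub.head_branch == topic:
        return None  # the work stays reachable from the submodule's `topic`
    if sub.head_commit == sub.gitlink:
        return None  # nothing beyond what the superproject already records
    return f"{sub.path}: unexpected HEAD {sub.head_commit}"


def can_switch_away(subs: List[Submodule], topic: str) -> Tuple[bool, List[str]]:
    """All submodules must pass; collect every reason so the user sees them all."""
    reasons = [r for r in (blocks_switch_away(s, topic) for s in subs) if r]
    return (not reasons, reasons)
```

Note that a submodule whose `topic` is ahead of the gitlink still passes:
the superproject is dirty with respect to that gitlink, but the commits are
kept alive by the submodule branch.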

=== Switching _to_ a branch `topic`, i.e. `git {switch,checkout} topic`

Switch to `topic` in the superproject. Then in each submodule, switch to:

* `topic`, if it exists
* Otherwise, the commit id in the superproject gitlink (and warn the
  user that HEAD is detached)

If the submodule `topic` points to a different commit from the
superproject gitlink, this will leave the superproject with a dirty
worktree with respect to the gitlinks. This allows a user to recover
work if they had previously switched _away from_ "topic".

If a dirty worktree is unacceptable, we may need an option that is
guaranteed to check out the superproject’s `topic`.
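
The per-submodule decision when switching _to_ `topic` can be sketched as
follows. This is only a model under stated assumptions: the function name,
the `branches` mapping (branch name to tip commit) and the returned tuple
are hypothetical, not Git internals.

```python
from typing import Dict, Tuple


def resolve_switch_target(
    branches: Dict[str, str], gitlink: str, topic: str
) -> Tuple[str, str, bool]:
    """Return (mode, commit, dirty_gitlink) for one submodule.

    mode is "branch" when the submodule's own `topic` is checked out and
    "detached" when we fall back to the superproject's gitlink.
    dirty_gitlink is True when the superproject will see a modified gitlink.
    """
    tip = branches.get(topic)
    if tip is None:
        # No submodule `topic`: use the recorded commit and warn about detached HEAD.
        return ("detached", gitlink, False)
    # Submodule `topic` exists: use it, even if it differs from the gitlink.
    return ("branch", tip, tip != gitlink)
```

For example, `resolve_switch_target({"topic": "c2"}, "c1", "topic")` yields
`("branch", "c2", True)`: the user gets their submodule work back, at the
cost of a dirty superproject worktree.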

=== Creating a branch `topic`, i.e. `git branch topic start-point`

Check each submodule at the superproject’s `start-point` (not the
submodule’s `start-point`) for the following:

* The submodule is initialized (in .git/modules)
* `topic` is a valid branch name

If so, create `topic` in the superproject and submodules based on the
superproject’s `start-point`. Else, do not create any `topic` branches
and guide the user towards a possible fix:

* A --force option that will move the branch tip to the commit in the
  superproject. This will let the user overwrite the history of `topic`.
* An --ignore option that ignores the existing `topic` branch. If used,
  `git switch topic` would result in a dirty worktree.
* (If needed) An --adopt option that creates a new superproject commit
  that points to the existing submodule `topic` branch. This will let
  the user check out `topic` without ending up with a dirty worktree.
* For uninitialized submodules, prompt the user to initialize them via
  `git checkout start-point && git submodule update` (we are working to
  eliminate manual initialization in the long run, so this will become
  obsolete eventually).
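
The all-or-nothing nature of branch creation can be sketched like this. The
`Submodule` record and `plan_branch_creation` are hypothetical names, and
the sketch assumes that "`topic` is a valid branch name" includes "`topic`
does not already exist in the submodule", which is what the --force and
--ignore options above appear to address.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Submodule:
    path: str
    initialized: bool
    branches: Set[str] = field(default_factory=set)


def plan_branch_creation(
    topic: str,
    gitlinks_at_start_point: Dict[str, str],
    subs: Dict[str, Submodule],
) -> Tuple[Dict[str, str], List[str]]:
    """Check every submodule recorded by the superproject's start-point.

    gitlinks_at_start_point maps submodule path -> commit in that gitlink.
    Returns (plan, problems). plan maps path -> commit where `topic` would
    be created, and is empty whenever any problem was found.
    """
    problems: List[str] = []
    for path in sorted(gitlinks_at_start_point):
        sub = subs.get(path)
        if sub is None or not sub.initialized:
            problems.append(f"{path}: not initialized")
        elif topic in sub.branches:
            problems.append(f"{path}: branch '{topic}' already exists")
    if problems:
        return {}, problems  # create nothing; report everything to fix
    return dict(gitlinks_at_start_point), []
```

Validating every submodule before creating anything means a failure never
leaves `topic` present in some repositories and missing in others.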


* Re: [RFC] Branches with --recurse-submodules
  2021-11-08 22:33 [RFC] Branches with --recurse-submodules Glen Choo
@ 2021-11-10 18:21 ` Glen Choo
  2021-11-10 18:35   ` rsbecker
  2021-11-12  3:22   ` Philippe Blain
  2021-11-12  3:19 ` Philippe Blain
  1 sibling, 2 replies; 11+ messages in thread
From: Glen Choo @ 2021-11-10 18:21 UTC (permalink / raw)
  To: git


I found some points that I should have given more attention to in the
RFC. I'd appreciate any and all feedback :)

Glen Choo <chooglen@google.com> writes:

> In a superproject-submodule relationship there is some ambiguity in what
> ‘checkout the branch `topic`’ should mean (does the submodule use its
> topic branch, or the version recorded in the superproject’s gitlink?).
> Our approach is to preserve existing semantics where reasonable - the
> ref name refers to the superproject’s ref, just as it does without
> --recurse-submodules.

Because a gitlink only contains a commit id, the submodule branch will
use a plain commit id as the branch point. This gives the correct ref,
but it gives no hints as to what the submodule branch should track.

The current thought process is to set up tracking using the ref name and
the submodule's config. Thus, a more complete description of

  git branch --recurse-submodules topic origin/main

is something like:

* for each repository, create the 'topic' branch where each 'topic'
  branch points to the version recorded in the superproject's
  'origin/main'
* for each repository, set up tracking for the 'topic' branch using the
  repository's own 'origin/main' as the branch point

Note that there is no guarantee that a submodule's 'origin/main' points
to the same commit as the superproject's 'origin/main', or that the
submodule's 'origin/main' even exists.

If tracking information cannot be set up, we will still create the
branch; we will only warn users when they run a command that requires
tracking information, e.g. fetch or push.
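
To illustrate how the branch point and the tracking ref come from different
places, here is a small Python sketch. `NewBranch` and
`plan_submodule_branch` are hypothetical names used only for this model;
`submodule_refs` stands in for the refs that exist in the submodule.

```python
from typing import Dict, NamedTuple, Optional


class NewBranch(NamedTuple):
    start_commit: str        # where the submodule's 'topic' will point
    upstream: Optional[str]  # ref to track, or None if tracking cannot be set up


def plan_submodule_branch(
    start_point: str,
    gitlink_at_start_point: str,
    submodule_refs: Dict[str, str],
) -> NewBranch:
    """Branch point: the commit recorded by the superproject's start-point.
    Tracking: the submodule's own ref of the same name, if it exists."""
    upstream = start_point if start_point in submodule_refs else None
    # The branch is created either way; a missing upstream only matters later,
    # when a command such as fetch or push needs tracking information.
    return NewBranch(gitlink_at_start_point, upstream)
```

With `submodule_refs = {"origin/main": "c7"}` and a gitlink of `c1`, the new
branch starts at `c1` but tracks `origin/main`, which points at `c7`; the
two commits are not required to agree.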

> === Switching _from_ a branch `topic`, i.e. `git {switch,checkout}`
>
> Check `topic` if each submodule’s worktree is clean (except for
> gitlinks), and has one of the following checked out:
>
> * `topic`
> * the commit id in the superproject gitlink
>
> This allows the user to switch with a dirty worktree (with respect to
> the superproject). We consider this acceptable because the submodule
> commits are tracked by the submodule branch. This is helpful when a user
> needs to switch branches before they are ready to commit to the
> superproject.

Note that this is how git switch with submodules already works - users
can switch away from a dirty superproject worktree as long as the
submodule worktrees are not dirty. However, without branches, this is
perilous because a user could unintentionally switch away from their
submodule WIP and have no easy way of recovering their work.

The proposed UX solves this by making the WIP tracked by a branch by
default. If a user switches _away_ from their WIP 'topic' branch...

> === Switching _to_ a branch `topic`, i.e. `git {switch,checkout} topic`
>
> Switch to `topic` in the superproject. Then in each submodule, switch to:
>
> * `topic`, if it exists
> * Otherwise, the commit id in the superproject gitlink (and warn the
>   user that HEAD is detached)
>
> If the submodule `topic` points to a different commit from the
> superproject gitlink, this will leave the superproject with a dirty
> worktree with respect to the gitlinks. This allows a user to recover
> work if they had previously switched _away from_ "topic".

they can still recover their WIP state by switching _back_ to their WIP
'topic' branch.


* RE: [RFC] Branches with --recurse-submodules
  2021-11-10 18:21 ` Glen Choo
@ 2021-11-10 18:35   ` rsbecker
  2021-11-10 19:35     ` Glen Choo
  2021-11-12  3:22   ` Philippe Blain
  1 sibling, 1 reply; 11+ messages in thread
From: rsbecker @ 2021-11-10 18:35 UTC (permalink / raw)
  To: 'Glen Choo', git

On November 10, 2021 1:21 PM, Glen Choo wrote:
> I found some points that I should have given more attention to in the RFC. I'd
> appreciate any and all feedback :)
> 
> Glen Choo <chooglen@google.com> writes:
> 
> > In a superproject-submodule relationship there is some ambiguity in
> > what ‘checkout the branch `topic`’ should mean (does the submodule use
> > its topic branch, or the version recorded in the superproject’s gitlink?).
> > Our approach is to preserve existing semantics where reasonable - the
> > ref name refers to the superproject’s ref, just as it does without
> > --recurse-submodules.
> 
> Because a gitlink only contains a commit id, the submodule branch will use a
> plain commit id as the branch point. This gives the correct ref, but it gives no
> hints as to what the submodule branch should track.
> 
> The current thought process is to set up tracking using the ref name and the
> submodule's config. Thus, a more complete description of
> 
>   git branch --recurse-submodules topic origin/main
> 
> is something like:
> 
> * for each repository, create the 'topic' branch where each 'topic'
>   branch points to the version recorded in the superproject's
>   'origin/main'
> * for each repository, setup tracking for the 'topic' branch using the
>   repository's own 'origin/main' as the branch point
> 
> Note that there is no guarantee that a submodule's 'origin/main' points to
> the same commit as the superproject's 'origin/main', or if the submodule's
> 'origin/main' even exists.
> 
> If tracking information cannot be setup, we will still create the branch; we will
> only warn users when they run a command that requires tracking
> information e.g. fetch or push.
> 
> > === Switching _from_ a branch `topic`, i.e. `git {switch,checkout}`
> >
> > Check `topic` if each submodule’s worktree is clean (except for
> > gitlinks), and has one of the following checked out:
> >
> > * `topic`
> > * the commit id in the superproject gitlink
> >
> > This allows the user to switch with a dirty worktree (with respect to
> > the superproject). We consider this acceptable because the submodule
> > commits are tracked by the submodule branch. This is helpful when a
> > user needs to switch branches before they are ready to commit to the
> > superproject.
> 
> Note that this is how git switch with submodules already works - users can
> switch away from a dirty superproject worktree as long as the submodule
> worktrees are not dirty. However, without branches, this is perilous because
> a user could unintentionally switch away from their submodule WIP and have
> no easy way of recovering their work.
> 
> The proposed UX solves this by making the WIP tracked by a branch by
> default. If a user switches _away_ from their WIP 'topic' branch...
> 
> > === Switching _to_ a branch `topic`, i.e. `git {switch,checkout}
> > topic`
> >
> > Switch to `topic` in the superproject. Then in each submodule, switch to:
> >
> > * `topic`, if it exists
> > * Otherwise, the commit id in the superproject gitlink (and warn the
> >   user that HEAD is detached)
> >
> > If the submodule `topic` points to a different commit from the
> > superproject gitlink, this will leave the superproject with a dirty
> > worktree with respect to the gitlinks. This allows a user to recover
> > work if they had previously switched _away from_ "topic".
> 
> they can still recover their WIP state by switching _back_ to their WIP 'topic'
> branch.

While not mandatory, we use a practice as follows:
1. Clone the superproject
2. Update the submodules - checks out the commit referenced by the superproject and fetches all parent commits.
3. Fetch the main branch of each submodule.
4. If working on the submodule, use a branch, not a commit - typically off main.
5. The branches in the submodule "keep alive" any commits not referenced by the superproject.

We see HEAD moving in the submodule based on what is referenced in the superproject, but work is not lost because of a detached HEAD.

What I could see as a possible improvement is to add the branch ref to the submodule ref file - not replacing the commit but adding to it. I do worry that there are unintended (unforeseen) side-effects that will result from this, however, including potential merge conflicts. Two people working on the same commit but different branches may mess up the ref file, so not really a good idea.

So far, we have not lost any commits this way and it has worked for a very long time.

Just my musings.
-Randall



* RE: [RFC] Branches with --recurse-submodules
  2021-11-10 18:35   ` rsbecker
@ 2021-11-10 19:35     ` Glen Choo
  2021-11-10 20:25       ` rsbecker
  0 siblings, 1 reply; 11+ messages in thread
From: Glen Choo @ 2021-11-10 19:35 UTC (permalink / raw)
  To: rsbecker, git

Overall, I think your workflow is not too dissimilar to the UX we are
proposing :)

<rsbecker@nexbridge.com> writes:

> 4. If working on the submodule, use a branch, not a commit - typically off main.

With the proposed UX, step (4) would happen automatically when using
"branch --recurse-submodules". Users would get a safer and more
convenient default.

> What I could see as a possible improvement is to add the branch ref to the submodule ref file - not replacing the commit but adding to it. I do worry that there are unintended (unforeseen) side-effects that will result from this, however, including potential merge conflicts. Two people working on the same commit but different branches may mess the ref file, so not really a good idea.

It's an interesting idea, but as you noted, it is quite thorny. I would
also like to see more information being captured by the superproject
tree (instead of just .gitmodules), but I'm also not sure how we might
do that.

> Just my musings.

I appreciate the effort taken :) Thanks!


* RE: [RFC] Branches with --recurse-submodules
  2021-11-10 19:35     ` Glen Choo
@ 2021-11-10 20:25       ` rsbecker
  2021-11-11 18:12         ` Glen Choo
  0 siblings, 1 reply; 11+ messages in thread
From: rsbecker @ 2021-11-10 20:25 UTC (permalink / raw)
  To: 'Glen Choo', git

On November 10, 2021 2:35 PM, Glen Choo wrote:
> 
> Overall, I think your workflow is not too dissimilar to the UX we are proposing
> :)
> 
> <rsbecker@nexbridge.com> writes:
> 
> > 4. If working on the submodule, use a branch, not a commit - typically off
> main.
> 
> With the proposed UX, step (4) would happen automatically when using
> "branch --recurse-submodules". Users would get a safer and more
> convenient default.

I think this might be more reliably done using a switch in .gitconfig to enable the capabilities. Perhaps something like:

submodule.fetch-branches=true

> > What I could see as a possible improvement is to add the branch ref to the
> submodule ref file - not replacing the commit but adding to it. I do worry that
> there are unintended (unforeseen) side-effects that will result from this,
> however, including potential merge conflicts. Two people working on the
> same commit but different branches may mess the ref file, so not really a
> good idea.
> 
> It's an interesting idea, but as you noted, it is quite thorny. I would also like to
> see more information being captured by the superproject tree (instead of
> just .gitmodules), but I'm also not sure how we might do that.

I'm suggesting this instead of a command-line option because this seems more like a policy-based process that you would either always want or never want. I would not like to depend on a developer making the call each time a clone occurred. I'm sad to admit that I don't really know where to start on this enhancement, though, even if approved.
-Randall



* RE: [RFC] Branches with --recurse-submodules
  2021-11-10 20:25       ` rsbecker
@ 2021-11-11 18:12         ` Glen Choo
  0 siblings, 0 replies; 11+ messages in thread
From: Glen Choo @ 2021-11-11 18:12 UTC (permalink / raw)
  To: rsbecker, git

<rsbecker@nexbridge.com> writes:

>> > 4. If working on the submodule, use a branch, not a commit - typically off
>> main.
>> 
>> With the proposed UX, step (4) would happen automatically when using
>> "branch --recurse-submodules". Users would get a safer and more
>> convenient default.
>
> I think this might be more reliably done using a switch in .gitconfig to enable the capabilities. Perhaps something like:
>
> submodule.fetch-branches=true

Correct me if I am wrong, but we might be suggesting different things
here. "submodule.fetch-branches" suggests to me that you're thinking of
submodule branches that track the remotes of the submodule. The proposed
UX is more about how we have submodule branches that work in tandem with
superproject branches.

A user who is only concerned about a single submodule can cd and make
their changes as if the submodule were a standalone repo - I think
that's pretty well-supported. The missing link is being able to
coordinate this work with the superproject.

>> > What I could see as a possible improvement is to add the branch ref to the
>> submodule ref file - not replacing the commit but adding to it. I do worry that
>> there are unintended (unforeseen) side-effects that will result from this,
>> however, including potential merge conflicts. Two people working on the
>> same commit but different branches may mess the ref file, so not really a
>> good idea.
>> 
>> It's an interesting idea, but as you noted, it is quite thorny. I would also like to
>> see more information being captured by the superproject tree (instead of
>> just .gitmodules), but I'm also not sure how we might do that.
>
> I'm suggesting this instead of a command-line option because this seems more like a policy-based process that you would either always want or never want. I would not like to depend on a developer making the call each time a clone occurred. I'm sad to admit that I don't really know where to start on this enhancement, though, even if approved.

Agreed with regard to both points, i.e. using a config instead of a CLI
option, and not knowing where to start.


* Re: [RFC] Branches with --recurse-submodules
  2021-11-08 22:33 [RFC] Branches with --recurse-submodules Glen Choo
  2021-11-10 18:21 ` Glen Choo
@ 2021-11-12  3:19 ` Philippe Blain
  2021-11-15 19:03   ` Glen Choo
  1 sibling, 1 reply; 11+ messages in thread
From: Philippe Blain @ 2021-11-12  3:19 UTC (permalink / raw)
  To: Glen Choo, git

Hi Glen,

Le 2021-11-08 à 17:33, Glen Choo a écrit :
> 
> Original Submodule UX RFC/Discussion:
> https://lore.kernel.org/git/YHofmWcIAidkvJiD@google.com/
> 
> Contributor Summit submodules Notes:
> https://lore.kernel.org/git/nycvar.QRO.7.76.6.2110211148060.56@tvgsbejvaqbjf.bet/
> 
> Submodule UX overhaul updates:
> https://lore.kernel.org/git/?q=Submodule+UX+overhaul+update
> 
> Hi all! Building on Emily’s original RFC, here is a more fleshed out
> vision of how `git {switch,checkout,branch}` will work with
> submodule-native branches.
> 
> The "Background" section reframes the justification and mental model
> behind our proposed workflow in more explicit terms (see "Submodule UX
> RFC:Overview"). The "Design" section presents the rules we are using to
> implement "Submodule UX RFC:Detailed Design", and how certain corner
> cases should be handled.
> 
> I’d appreciate any and all feedback :) In particular, readers may be
> interested in the "dirty worktree" approach behind `git switch`. If
> anything stands out as good, bad or missing, do let us know. Thanks!
> 
> == Background
> 
> The purpose of this effort is to bring the benefits of branches to
> superprojects. In Git, branches are used to name and track progress;
> submodules are used to incorporate other repos. However, because of how
> submodules are tracked by superprojects, submodules usually operate in
> detached HEAD and the benefits of branches are lost. For users
> uncomfortable with detached HEAD, this workflow seems risky and
> unintuitive. Other users may still prefer branches because they can have
> branch reflog and they can be confident that submodule work is being
> tracked by some branch and won’t be gc-ed.
> 
> The main ideas are:
> 
> * there is a single set of branch names that are used throughout the
>    repo tree
> * progress can be made on submodules and/or the superproject without
>    requiring a gitlink update on the superproject
> * the user can switch between branches like they would for a
>    non-submodule-using repo.
> 
> We do not require the branches to move in lockstep, thus this UX may be
> suboptimal for logical monorepos that are implemented as submodules.
> 
> == Design
> 
> This design uses the same branch name in the superproject and
> submodules; a user who sees the branch `topic` in the superproject and
> submodules knows that they are the same logical thing. Commands with
> --recurse-submodules maintain the invariant that branches in the
> superproject and submodules are {read,created,modified,deleted}
> together.
> 
> e.g.
> 
> * `git branch --recurse-submodules topic` should create the branch
>    `topic` in each of the repositories.

I guess for some workflows this would be good, but for others you might
not need to create submodule branches for each new superproject branch you
create.  I think I pointed that out before; I don't necessarily think that
creating branches in all submodules should *not* be the default behaviour,
but I think that it should be configurable. I mean that if I have 'submodule.recurse'
set to true, I would not like 'git branch topic' to create a 'topic' branch
in each submodule. So I would like to be able to add 'branch.recurseSubmodules = false'
to my config (or something similar) to have more granularity in behaviour.
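
One way such a precedence could work is sketched below in Python. This is
purely illustrative: 'branch.recurseSubmodules' is only the name suggested
above, the generic '<command>.recurseSubmodules' lookup is an assumption,
and `should_recurse` is a hypothetical helper rather than anything in Git.

```python
from typing import Mapping, Optional


def should_recurse(
    command: str,
    config: Mapping[str, bool],
    cli_flag: Optional[bool] = None,
) -> bool:
    """Most specific setting wins: command line, then the per-command
    config key, then the global 'submodule.recurse' default."""
    if cli_flag is not None:
        return cli_flag
    per_command = config.get(f"{command}.recurseSubmodules")
    if per_command is not None:
        return per_command
    return config.get("submodule.recurse", False)
```

With `{"submodule.recurse": True, "branch.recurseSubmodules": False}`,
`should_recurse("branch", cfg)` is False while `should_recurse("switch", cfg)`
stays True, and an explicit command-line flag overrides both.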

Also, I assume the new behaviour would carry over to 'git checkout -b' and
'git switch -c' ?

> * `git switch --recurse-submodules topic` should checkout the branch
>    `topic` in each of the repositories

Nit: I guess you also include 'git checkout --r topic' here?

> 
> In a superproject-submodule relationship there is some ambiguity in what
> ‘checkout the branch `topic`’ should mean (does the submodule use its
> topic branch, or the version recorded in the superproject’s gitlink?).
> Our approach is to preserve existing semantics where reasonable - the
> ref name refers to the superproject’s ref, just as it does without
> --recurse-submodules.
> 
> One wrinkle is that a user can act on submodules _without_ going through
> the superproject (e.g. by cd-ing into the submodule), thus the branch
> tips may not match the expected commits in the superproject or the set
> of submodules branches may not match the set of superproject branches.
> As such, submodule branch names are resolved on a best-effort basis:
> 
> * If the submodule branch commit matches the one in the superproject, we
>    can safely use the submodule branch.

That makes sense.

> * If the branch is in an unexpected state, we either:
> ** Fallback to the version that the user would expect (if it is safe to
>      do so).

What would be 'the version the user would expect' here? Checking out the 'topic' branch
in the submodule, even if it's ahead of the commit recorded in the superproject
(it could even be behind, if the submodule branch was manually reset)? Or falling
back to the old behaviour of checking out the commit recorded in the superproject
on a detached HEAD? I think I would prefer the latter, with sufficient warning.

> ** Reject the operation (if it is not safe).
> 
> As we expand submodule branches to other commands (merge, rebase,
> reset), the notions of ‘unexpected state’ and ‘safety’ become
> increasingly nebulous and difficult to define because they depend on the
> command being run. To manage this, we will start by supporting submodule
> branching under a limited set of circumstances and try to loosen them in
> the future. We will manage the user’s expectations by warning them if
> Git detects an unexpected state.
> 
> The proposed rules for submodule branching are as follows:
> 
> === Switching _from_ a branch `topic`, i.e. `git {switch,checkout}`
> 
> Check `topic` if each submodule’s worktree is clean (except for
> gitlinks), and has one of the following checked out:
> 
> * `topic`
> * the commit id in the superproject gitlink

I'm not sure what you mean here, if we are switching away from 'topic',
why do we want to checkout 'topic' ? (assuming "out" is missing from your sentence above).

Or maybe you really mean "check" ? But then I'm not sure either what you mean...


Re-reading it, and your next email, maybe that should read:

> === Switching _away from_ a branch `topic`, i.e. `git {switch,checkout} other-branch`
> 
> Checkout `other-branch` if each submodule’s worktree is clean (except for
> gitlinks), and has one of the following checked out:
> 
> * `topic`
> * the commit id in the superproject gitlink at the tip of 'topic'

Is that what you meant ? (that would indeed make sense).

> 
> This allows the user to switch with a dirty worktree (with respect to
> the superproject). We consider this acceptable because the submodule
> commits are tracked by the submodule branch. This is helpful when a user
> needs to switch branches before they are ready to commit to the
> superproject.
> 
> === Switching _to_ a branch `topic`, i.e. `git {switch,checkout} topic`
> 
> Switch to `topic` in the superproject. Then in each submodule, switch to:
> 
> * `topic`, if it exists
> * Otherwise, the commit id in the superproject gitlink (and warn the
>    user that HEAD is detached)
> 
> If the submodule `topic` points to a different commit from the
> superproject gitlink, this will leave the superproject with a dirty
> worktree with respect to the gitlinks. This allows a user to recover
> work if they had previously switched _away from_ "topic".

OK, so you seem to answer my question above about "what is the version
the user would expect?" with "the commit at the tip of 'topic' in the submodule,
if that branch exists".

> 
> If a dirty worktree is unacceptable, we may need an option that is
> guaranteed to check out the superproject’s `topic`.

Yes, I would think that should be configurable, maybe something like
'--recurse-submodules=branch' vs '--recurse-submodules=detached' (which
is the current behaviour). Just thinking out loud here.

> 
> === Creating a branch `topic`, i.e. `git branch topic start-point`
> 
> Check each submodule at the superproject’s `start-point` (not the
> submodule’s `start-point`) for the following:
> 
> * The submodule is initialized (in .git/modules)

The submodule should also be active, no ? Maybe it was cloned before,
so exists in .git/modules, but was then set as inactive (submodule.<name>.active=false)...

> * `topic` is a valid branch name
> 
> If so, create `topic` in the superproject and submodules based on the
> superproject’s `start-point`. Else, do not create any `topic` branches
> and guide the user towards a possible fix:
> 
> * A --force option that will move the branch tip to the commit in the
>    superproject. This will let the user overwrite the history of `topic`.
> * An --ignore option that ignores the existing `topic` branch. If used,
>    `git switch topic` would result in a dirty worktree.
> * (If needed) An --adopt option that creates a new superproject commit
>    that points to the existing submodule `topic` branch. This will let
>    the user checkout `topic` without ending up with a dirty worktree.
> * For uninitialized submodules, prompt them to initialize it via git
>    checkout start-point && git submodule update (we are working to
>    eliminate manual initialization in the long run, so this will become
>    obsolete eventually).

I think if the submodules are not initialized, they should be left alone, without
prompting the user. Projects that use non-optional submodules already instruct
their users to clone with --recurse-submodules or run
'git submodule update --init --recursive' after the clone, so I'm not sure that
sort of nagging would be necessary...


Cheers,

Philippe.  


* Re: [RFC] Branches with --recurse-submodules
  2021-11-10 18:21 ` Glen Choo
  2021-11-10 18:35   ` rsbecker
@ 2021-11-12  3:22   ` Philippe Blain
  1 sibling, 0 replies; 11+ messages in thread
From: Philippe Blain @ 2021-11-12  3:22 UTC (permalink / raw)
  To: Glen Choo, git

Hi Glen,

Le 2021-11-10 à 13:21, Glen Choo a écrit :
> 
> I found some points that I should have given more attention to in the
> RFC. I'd appreciate any and all feedback :)
> 
> Glen Choo <chooglen@google.com> writes:
> 
>> In a superproject-submodule relationship there is some ambiguity in what
>> ‘checkout the branch `topic`’ should mean (does the submodule use its
>> topic branch, or the version recorded in the superproject’s gitlink?).
>> Our approach is to preserve existing semantics where reasonable - the
>> ref name refers to the superproject’s ref, just as it does without
>> --recurse-submodules.
> 
> Because a gitlink only contains a commit id, the submodule branch will
> use a plain commit id as the branch point. This gives the correct ref,
> but it gives no hints as to what the submodule branch should track.
> 
> The current thought process is to set up tracking using the ref name and
> the submodule's config. Thus, a more complete description of
> 
>    git branch --recurse-submodules topic origin/main
> 
> is something like:
> 
> * for each repository, create the 'topic' branch where each 'topic'
>    branch points to the version recorded in the superproject's
>    'origin/main'
> * for each repository, setup tracking for the 'topic' branch using the
>    repository's own 'origin/main' as the branch point
> 
> Note that there is no guarantee that a submodule's 'origin/main' points
> to the same commit as the superproject's 'origin/main', or if the
> submodule's 'origin/main' even exists.
> 
> If tracking information cannot be setup, we will still create the
> branch; we will only warn users when they run a command that requires
> tracking information e.g. fetch or push.

OK. That makes sense. Another option could be to track the branch pointed
to by origin/HEAD in the submodule (in an ideal world, that would point to
the default branch, but that has to be done mostly manually as of today...)
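The quoted rules for 'git branch --recurse-submodules topic origin/main'
can be sketched as a small Python model. This is only an illustration of
the proposal as described above, not Git's implementation; the function
name, its parameters and the commit ids are hypothetical.

```python
from typing import Dict, Optional, Tuple


def plan_submodule_branch(
    gitlink_commit: str,
    submodule_remote_refs: Dict[str, str],
    start_point: str = "origin/main",
) -> Tuple[str, Optional[str]]:
    """Return (branch tip, upstream ref or None) for one submodule."""
    # The tip is the commit recorded in the superproject's gitlink,
    # not whatever the submodule's own start_point happens to point at.
    tip = gitlink_commit
    # Tracking uses the submodule's own ref of the same name, if it exists.
    upstream = start_point if start_point in submodule_remote_refs else None
    return tip, upstream


# sub1 has its own origin/main at a different commit; sub2 has none.
print(plan_submodule_branch("aaa111", {"origin/main": "bbb222"}))
print(plan_submodule_branch("ccc333", {}))
```

In the second call the branch is still created, but with no upstream,
which matches the "warn only when tracking is needed" behaviour quoted
above.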




* Re: [RFC] Branches with --recurse-submodules
  2021-11-12  3:19 ` Philippe Blain
@ 2021-11-15 19:03   ` Glen Choo
  2021-11-23 18:36     ` Philippe Blain
  0 siblings, 1 reply; 11+ messages in thread
From: Glen Choo @ 2021-11-15 19:03 UTC (permalink / raw)
  To: Philippe Blain, git

Thanks so much Philippe, your responses are very thoughtful.

Philippe Blain <levraiphilippeblain@gmail.com> writes:

>> * `git branch --recurse-submodules topic` should create the branch
>>    `topic` in each of the repositories.
>
> I guess for some workflow this would be the good, but for others you might
> not need to create submodule branches for each new superproject branch you
> create.  I think I pointed that out before; I don't necessarily think that
> creating branches in all submodules should *not* be the default behaviour,
> but I think that it should be configurable. I mean that if I have 'submodule.recurse'
> set to true, I would not like 'git branch topic' to create a 'topic' branch
> in each submodule. So I wish I'll be able to add 'branch.recurseSubmodules = false'
> to my config (or something similar) to have more granularity in behaviour.

Yes, as we discussed earlier, this behavior may not be desirable for
different workflows. I've come to suspect that the branching behavior
that I proposed should be the default, but I'm ambivalent on being able
to opt out of the branching.

In favor of letting users opt out: I'd imagine that behavior might be
disruptive to users who make frequent changes on the submodule and may
not appreciate having two sets of branch names (one from the
superproject and one from the submodule's remotes). I'm not clear on
whether or not this is disruptive primarily because it is a breaking
change, or if this is just an objectively bad fit for what these users
want.

In favor of not letting users opt out: exposing fewer switches to users
makes it easier for them to get a good user experience. Instead of
giving users the ability to build-your-own UX, maintaining a small
configuration surface makes configuration easy and puts the onus back on
Git (or me, really :P) to make sure that the UX really works well for
all users, instead of opting out and saying "oh the user has
branch.recurseSubmodules=false, so I'll choose not to support them".
I think this stance is good from a product excellence perspective, but
it's an obvious risk.

A way forward might be:

* mitigate the breaking changes by flagging this with
  feature.experimental
* test this behavior with real users (aka internal) and iterate from
  there

Does that make sense? I'd like to make sure I'm not missing something
very big here.

> Also, I assume the new behaviour would carry over to 'git checkout -b' and
> 'git switch -c' ?
>> * `git switch --recurse-submodules topic` should checkout the branch
>>    `topic` in each of the repositories
>
> Nit: I guess you also include 'git checkout --r topic' here also ?

Yes and yes (I believe --r refers to --recurse-submodules?).

>> * If the branch is in an unexpected state, we either:
>> ** Fallback to the version that the user would expect (if it is safe to
>>      do so).
>
> What would be 'the version the user would expect' here ?

The issue is that defaulting to 'the version the user would expect' is
a fairly uncontroversial opinion, but it leaves a lot of room for
interpretation. I suspect that there won't be a single set of rules that
can apply in every single command and situation; we would never make
progress if we tried to start with a top down approach.

Instead, this RFC prescribes one consistent set of 'expected versions'
under a subset of operations {branch,switch,checkout}...

>> === Switching _to_ a branch `topic`, i.e. `git {switch,checkout} topic`
>> 
>> Switch to `topic` in the superproject. Then in each submodule, switch to:
>> 
>> * `topic`, if it exists
>> * Otherwise, the commit id in the superproject gitlink (and warn the
>>    user that HEAD is detached)
>> 
>> If the submodule `topic` points to a different commit from the
>> superproject gitlink, this will leave the superproject with a dirty
>> worktree with respect to the gitlinks. This allows a user to recover
>> work if they had previously switched _away from_ "topic".
>
> OK, so you seem to answer my interrogation above about "what is the version
> the user would expect ?" with "the commit at the tip of 'topic' in the submodule,
> if that branch exists.".

which you have noted here :)
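As a rough Python sketch of the "switching to `topic`" rule quoted above
(not Git code; the type and function names are made up for illustration),
the choice made in each submodule could be modelled like this:

```python
from typing import Dict, NamedTuple, Optional


class SwitchResult(NamedTuple):
    commit: str            # commit that ends up checked out
    branch: Optional[str]  # None means detached HEAD (user is warned)
    dirty_gitlink: bool    # superproject sees a modified gitlink


def switch_submodule(
    branches: Dict[str, str], gitlink_commit: str, name: str
) -> SwitchResult:
    """Pick what one submodule checks out for `git switch <name>`."""
    if name in branches:
        tip = branches[name]
        # If the branch tip differs from the gitlink, the superproject
        # worktree is dirty with respect to that gitlink.
        return SwitchResult(tip, name, tip != gitlink_commit)
    # No such branch: fall back to the gitlink commit, detached.
    return SwitchResult(gitlink_commit, None, False)


print(switch_submodule({"topic": "abc123"}, "abc123", "topic"))
print(switch_submodule({"topic": "def456"}, "abc123", "topic"))
print(switch_submodule({}, "abc123", "topic"))
```

The second call is the "recover previous work" case: the submodule's
`topic` has moved past the gitlink, so the superproject reports a dirty
gitlink instead of discarding that work.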

>> === Switching _away from_ a branch `topic`, i.e. `git {switch,checkout} other-branch`
>> 
>> Checkout `other-branch` if each submodule’s worktree is clean (except for
>> gitlinks), and has one of the following checked out:
>> 
>> * `topic`
>> * the commit id in the superproject gitlink at the tip of 'topic'
>
> Is that what you meant ? (that would indeed make sense).

Yes, thanks for the wording suggestion.
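The precondition for switching away from `topic` can be sketched in
Python as well. This is a simplified model under two assumptions of mine:
"checked out the commit id" is read as a detached HEAD at that commit, and
worktree cleanliness is given as a precomputed flag. All names are
hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SubmoduleState:
    name: str
    head_branch: Optional[str]  # None when HEAD is detached
    head_commit: str
    worktree_clean: bool        # ignoring gitlink changes


def blockers_for_switch_away(
    subs: List[SubmoduleState],
    current_branch: str,
    gitlinks_at_tip: Dict[str, str],
) -> List[str]:
    """Names of submodules that prevent switching away from current_branch.

    gitlinks_at_tip maps submodule name to the commit recorded in the
    superproject at the tip of current_branch.
    """
    blockers = []
    for sub in subs:
        on_branch = sub.head_branch == current_branch
        at_gitlink = (
            sub.head_branch is None
            and sub.head_commit == gitlinks_at_tip[sub.name]
        )
        if not sub.worktree_clean or not (on_branch or at_gitlink):
            blockers.append(sub.name)
    return blockers


subs = [
    SubmoduleState("sub1", "topic", "abc123", True),
    SubmoduleState("sub2", None, "fff000", True),     # detached elsewhere
    SubmoduleState("sub3", "topic", "def456", False),  # uncommitted changes
]
print(blockers_for_switch_away(subs, "topic", {"sub1": "abc123", "sub2": "bbb222", "sub3": "def456"}))
```

An empty list means the switch to `other-branch` may proceed.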

>> If a dirty worktree is unacceptable, we may need an option that is
>> guaranteed to check out the superproject’s `topic`.
>
> Yes, I would think that should be configurable, maybe something like
> '--recurse-submodules=branch' vs '--recurse-submodules=detached' (which
> is the actual behaviour). Just thinking out loud here.

Yes, your wording is also similar to what I was thinking of. I'm holding
back from this because, as stated earlier, I'm worried about having a
build-your-own UX situation and bifurcating (or worse) our development
efforts.

>> Check each submodule at the superproject’s `start-point` (not the
>> submodule’s `start-point`) for the following:
>> 
>> * The submodule is initialized (in .git/modules)
>
> The submodule should also be active, no ? Maybe it was cloned before,
> so exists in .git/modules, but was then set as inactive (submodule.<name>.active=false)...

Ah yes, good catch.

>> * For uninitialized submodules, prompt them to initialize it via git
>>    checkout start-point && git submodule update (we are working to
>>    eliminate manual initialization in the long run, so this will become
>>    obsolete eventually).
>
> I think if the submodule are not initialized, they should be left alone, without
> prompting the user. Projects that use non-optional submodules already instruct
> their users to clone with --recurse-submodules or run 'git submodule
> update --init --recursive' after the clone, so I'm not sure that sort
> of nagging would be necessary...

To make sure we're on the same page, I'll give a motivating example.
Let's consider a superproject with remote-tracking branches
`origin/main` and `origin/topic`.

`origin/main` has submodules `sub1` and `sub2`.
`origin/topic` has submodules `sub1` and `sub3`.

Imagine a user has branched off `origin/main` and has initialized the
submodules using git submodule update --init. At this point, `sub1` and
`sub2` are initialized. `sub3` is not initialized (because it's not in
`origin/main`).

What happens when the user now wants to work off `origin/topic` with
`git branch --recurse-submodules topic origin/topic`? `sub3` isn't
initialized, so the branch can't be created.

So to get back to your main point, before we eliminate this problem,
what do we do? Do we abort and naggily warn the user? Do we try our best
and just ignore `sub3`?

Your suggestion might be superior to mine:

* for users who don't care about `sub3`, they are not blocked from
  creating branches
* for users who do care, they would checkout `topic`, realize
  that `sub3` isn't initialized and then perform the initialization and
  (at some point) realize that `sub3` doesn't have the `topic` branch
  and so they fix this manually.

I'll consider this more deeply, thanks.
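To make the two options in the `sub1`/`sub2`/`sub3` example concrete,
here is a small Python model. It only illustrates the difference between
"abort" and "skip"; the function and the policy names are hypothetical
and do not correspond to any existing Git option.

```python
from typing import Iterable, List, Set, Tuple


def create_submodule_branches(
    start_point_submodules: Iterable[str],
    initialized: Set[str],
    policy: str,
) -> Tuple[List[str], List[str]]:
    """Return (submodules that get the branch, submodules skipped).

    policy is "abort" (refuse if any submodule is uninitialized) or
    "skip" (leave uninitialized submodules alone).
    """
    wanted = list(start_point_submodules)
    missing = [s for s in wanted if s not in initialized]
    if missing and policy == "abort":
        raise RuntimeError(
            "uninitialized submodules: " + ", ".join(missing)
        )
    created = [s for s in wanted if s in initialized]
    return created, missing


# origin/topic has sub1 and sub3; only sub1 and sub2 are initialized.
print(create_submodule_branches(["sub1", "sub3"], {"sub1", "sub2"}, "skip"))
```

With "skip", the user who does not care about `sub3` is not blocked; the
user who does care finds out later that `sub3` has no `topic` branch.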


* Re: [RFC] Branches with --recurse-submodules
  2021-11-15 19:03   ` Glen Choo
@ 2021-11-23 18:36     ` Philippe Blain
  2021-11-23 19:04       ` Glen Choo
  0 siblings, 1 reply; 11+ messages in thread
From: Philippe Blain @ 2021-11-23 18:36 UTC (permalink / raw)
  To: Glen Choo, git

Hi Glen,

On 2021-11-15 at 14:03, Glen Choo wrote:
> Thanks so much Philippe, your responses are very thoughtful.
> 
> Philippe Blain <levraiphilippeblain@gmail.com> writes:
> 
>>> * `git branch --recurse-submodules topic` should create the branch
>>>     `topic` in each of the repositories.
>>
>> I guess for some workflow this would be the good, but for others you might
>> not need to create submodule branches for each new superproject branch you
>> create.  I think I pointed that out before; I don't necessarily think that
>> creating branches in all submodules should *not* be the default behaviour,
>> but I think that it should be configurable. I mean that if I have 'submodule.recurse'
>> set to true, I would not like 'git branch topic' to create a 'topic' branch
>> in each submodule. So I wish I'll be able to add 'branch.recurseSubmodules = false'
>> to my config (or something similar) to have more granularity in behaviour.
> 
> Yes, as we discussed earlier, this behavior may not be desirable for
> different workflows. I've come to suspect that the branching behavior
> that I proposed should be the default, but I'm ambivalent on being able
> to opt out of the branching.
> 
> In favor of letting users opt out: I'd imagine that behavior might be
> disruptive to users who make frequent changes on the submodule and may
> not appreciate having two sets of branch names (one from the
> superproject and one from the submodule's remotes). I'm not clear on
> whether or not this is disruptive primarily because it is a breaking
> change, or if this just an objectively bad fit for what these users
> want.
> 
> In favor of not letting users opt out: exposing fewer switches to users
> makes it easier for them to get a good user experience. Instead of
> giving users the ability to build-your-own UX, maintaining a small
> configuration surface makes configuration easy and puts the onus back on
> Git (or me, really :P) to make sure that the UX really works well for
> all users, instead of opting out and saying "oh the user has
> branches.recurseSubmodules=false, so I'll choose not to support them".
> I think this stance is good from a product excellence perspective, but
> it's an obvious risk.
> 
> A way forward might be:
> 
> * mitigate the breaking changes by flagging this with
>    feature.experimental
> * test this behavior with real users (aka internal) and iterate from
>    there
> 
> Does that make sense? I'd like to make sure I'm not missing something
> very big here.

It does, but I think that we can still build a flexible product without
compromising UI/UX :)

> 
>> Also, I assume the new behaviour would carry over to 'git checkout -b' and
>> 'git switch -c' ?
>>> * `git switch --recurse-submodules topic` should checkout the branch
>>>     `topic` in each of the repositories
>>
>> Nit: I guess you also include 'git checkout --r topic' here also ?
> 
> Yes and yes (I believe --r refers to --recurse-submodules?).

Yes, and it works on the command line! ;) Long options can be abbreviated
as long as the abbreviation is unambiguous; see 'man gitcli'.
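The idea of unique-prefix abbreviation can be sketched in Python as
follows. This is not Git's option parser, and the option list is an
illustrative subset chosen by hand rather than read from Git; it only
shows why '--r' resolves while a prefix such as '--p' would not.

```python
from typing import List

# Illustrative subset of long option names (an assumption for this sketch).
KNOWN_OPTIONS = [
    "recurse-submodules",
    "quiet",
    "force",
    "detach",
    "progress",
    "patch",
]


def resolve_long_option(arg: str, known: List[str] = KNOWN_OPTIONS) -> str:
    """Expand an abbreviated --option to its full name if it is unique."""
    if not arg.startswith("--"):
        raise ValueError("not a long option: " + arg)
    name = arg[2:]
    if name in known:
        return name  # exact match always wins
    matches = [opt for opt in known if opt.startswith(name)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError("unknown option: " + arg)
    raise ValueError("ambiguous option: " + arg + " could be " + ", ".join(matches))


print(resolve_long_option("--r"))
```

Here '--r' matches only 'recurse-submodules', whereas '--p' would match
both 'progress' and 'patch' and is rejected as ambiguous.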


* Re: [RFC] Branches with --recurse-submodules
  2021-11-23 18:36     ` Philippe Blain
@ 2021-11-23 19:04       ` Glen Choo
  0 siblings, 0 replies; 11+ messages in thread
From: Glen Choo @ 2021-11-23 19:04 UTC (permalink / raw)
  To: Philippe Blain, git

Philippe Blain <levraiphilippeblain@gmail.com> writes:

>> In favor of not letting users opt out: exposing fewer switches to users
>> makes it easier for them to get a good user experience. Instead of
>> giving users the ability to build-your-own UX, maintaining a small
>> configuration surface makes configuration easy and puts the onus back on
>> Git (or me, really :P) to make sure that the UX really works well for
>> all users, instead of opting out and saying "oh the user has
>> branches.recurseSubmodules=false, so I'll choose not to support them".
>> I think this stance is good from a product excellence perspective, but
>> it's an obvious risk.
>> 
>> A way forward might be:
>> 
>> * mitigate the breaking changes by flagging this with
>>    feature.experimental
>> * test this behavior with real users (aka internal) and iterate from
>>    there
>> 
>> Does that make sense? I'd like to make sure I'm not missing something
>> very big here.
>
> It does, but I think that we can still build a flexible product without
> compromising UI/UX :)

Agreed. The long term result might be that submodules + branches will
always live behind their own flag (though I hope not...). One person's
flexibility is another person's complexity, so we will need a lot of
finesse in order to find the right tradeoff.

>>> Also, I assume the new behaviour would carry over to 'git checkout -b' and
>>> 'git switch -c' ?
>>>> * `git switch --recurse-submodules topic` should checkout the branch
>>>>     `topic` in each of the repositories
>>>
>>> Nit: I guess you also include 'git checkout --r topic' here also ?
>> 
>> Yes and yes (I believe --r refers to --recurse-submodules?).
>
> Yes, and it works on the command-line ! ;) long options can be shortened
> if unambiguous, see 'man gitcli'.

Ah, TIL. Thanks :)



Thread overview: 11+ messages
2021-11-08 22:33 [RFC] Branches with --recurse-submodules Glen Choo
2021-11-10 18:21 ` Glen Choo
2021-11-10 18:35   ` rsbecker
2021-11-10 19:35     ` Glen Choo
2021-11-10 20:25       ` rsbecker
2021-11-11 18:12         ` Glen Choo
2021-11-12  3:22   ` Philippe Blain
2021-11-12  3:19 ` Philippe Blain
2021-11-15 19:03   ` Glen Choo
2021-11-23 18:36     ` Philippe Blain
2021-11-23 19:04       ` Glen Choo
