git@vger.kernel.org mailing list mirror (one of many)
From: Stefan Beller <sbeller@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jonathan Tan <jonathantanmy@google.com>, git <git@vger.kernel.org>
Subject: Re: [RFD] Long term plan with submodule refs?
Date: Thu, 9 Nov 2017 11:57:01 -0800
Message-ID: <CAGZ79kZL2_v5S5OJ_FnuZbHrKmPX9gXwoyX36G0br+5i87JQhw@mail.gmail.com>
In-Reply-To: <xmqqbmkcjaxo.fsf@gitster.mtv.corp.google.com>

On Wed, Nov 8, 2017 at 9:08 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>>> The relationship is indeed currently useful, but if the long term plan
>>> is to strongly discourage detached submodule HEAD, then I would think
>>> that these patches are in the wrong direction. (If the long term plan is
>>> to end up supporting both detached and linked submodule HEAD, then these
>>> patches are fine, of course.) So I think that the plan referenced in
>>> Junio's email (that you linked above) still needs to be discussed.
>>
>> This email presents different approaches.
>>
>> Objective
>> =========
>> This document should summarize the current situation of Git submodules
>> and start a discussion of where it can be headed long term.
>> Show different ways in which submodule refs could evolve.
>>
>> Background
>> ==========
>> Submodules in Git are currently considered independent repositories.
>> This is okay for current workflows, such as utilizing a library that is
>> rarely updated. Other workflows that require a tighter integration between
>> submodule and superproject are possible, but cumbersome as there is an
>> additional step that has to be performed, which is the update of the gitlink
>> pointer in the superproject.
>
> I do not think "rarely updated" is an issue.
>
> The problem is that we may want to make it easier to use a
> superproject and its submodules as if the combined whole were a
> single project, which currently is not easy, primarily because
> submodules are separate entities with different sets of branches that
> can be checked out independently from what branch the superproject
> is working on.

Well, and this fact does not seem to be a problem in the current use of
submodules, precisely because the workflow either (a) is not too cumbersome
or (b) is not executed often enough to bother anyone.

> These are good starting points for copying such a combined whole to
> your local machine and start working on it.  The more interesting,
> important, and potentially difficult part is how the result of such
> work is shared back to where you started from.  "push --recursive"
> may be a simple phrase, but a sensible definition of how it should
> work won't be that simple.
...
>
> We should make detached HEAD safe against gc if it is not,
> regardless of the use of submodules.  I thought it already was made
> safe a long time ago.

The detached HEAD itself is protected via its reflog (which is around
for, say, 2 weeks?).

If I were to develop using only a detached HEAD in today's world of
submodules, while using different branches in the superproject, I would
run the risk of losing some commits in the submodule, as they are not at
the detached HEAD all the time and might even be loose tips.
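
To make the gc concern concrete, here is a minimal sketch in a throwaway
repository (the commit messages and variable names are made up for
illustration). It creates a commit on a detached HEAD, switches away, and
then forces the reflog entries to expire instead of waiting for the
configured expiry (gc.reflogExpire and gc.reflogExpireUnreachable, which
are documented to default to 90 and 30 days):

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name "Example"
git config user.email "example@example.com"

git commit -q --allow-empty -m "base"
base_branch=$(git symbolic-ref --short HEAD)

# Commit on a detached HEAD, then leave that commit behind.
git checkout -q --detach
git commit -q --allow-empty -m "only on detached HEAD"
orphan=$(git rev-parse HEAD)
git checkout -q "$base_branch"

# A normal gc keeps the commit: the HEAD reflog still mentions it.
git gc -q
git cat-file -e "$orphan" && before=present

# Simulate the reflog entries expiring, then collect garbage again.
git reflog expire --expire=now --expire-unreachable=now --all
git gc -q --prune=now
if git cat-file -e "$orphan" 2>/dev/null; then after=present; else after=gone; fi
echo "before=$before after=$after"
```

No branch ever pointed at the second commit, so once the reflog entry is
gone nothing keeps it reachable.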

This combined with the previous paragraph brings in another important
concern:
Some projects would have a very different history when used as a
submodule compared to when used as a standalone project.
Other projects may be closely aligned between their branches and
what the superproject records.

So the more we deviate from the traditional branch model, the easier
we make it to have the submodule tips be very different from the
standalone tips, which may overexpose us to the gc issues as well as
to the general question of how much these projects have in common.

>> Use replicate refs in submodules
>> --------------------------------
>> This approach will replicate the superproject refs into the submodule
>> ref namespace, e.g. git-branch learns about --recurse-submodules, which
>> creates a branch of a given name in all submodules. These (topic) branches
>> should be kept in sync with the superproject
>>
>> Pros:
>>  * This seemed intuitive to Gerrit users
>>  * 'quick' to implement, most of the commands are already there,
>>    just git-branch is needed to have the workflows mentioned above complete.
>> Cons:
>>  * What does "git checkout -b A B" mean? (special case: B == HEAD)
>
> The command ran at which level?  In the superproject, or in a single
> submodule?

In the superproject, with --recurse-submodules, as the A and B would recurse
as strings, and not change meaning depending on the gitlink value.
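
A rough sketch of the "recurse as strings" reading, using only the
existing `git submodule foreach` rather than the proposed
--recurse-submodules mode of git-branch. The repository names `lib` and
`super` and the branch name `topic` are hypothetical:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical layout: "lib" is the submodule's upstream, "super" the superproject.
git init -q lib
git -C lib -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "lib: initial"

git init -q super
cd super
git config user.name "Example"
git config user.email "example@example.com"
# protocol.file.allow is only set because this demo clones from a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git commit -q -m "super: add lib"

# Replicate the name "topic" as a plain string: the same branch name is
# created at every level, each from that repository's own HEAD.
git branch topic
git submodule --quiet foreach 'git branch topic'

super_has=$(git branch --list topic | wc -l | tr -d ' ')
sub_has=$(git -C lib branch --list topic | wc -l | tr -d ' ')
echo "super=$super_has sub=$sub_has"
```

Note that nothing here consults the gitlink: the submodule branch starts
at whatever the submodule has checked out, which is exactly where the
two interpretations can differ.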

>
>>    Is the branch name replicated as a string into the submodule operation,
>>    or do we dereference the superproject's gitlink and walk from there?
>
> If they are "kept in sync with the superproject", then there should
> be no difference between the two, so I do not see any room for
> wondering about that.

Except you can still break out by issuing commands in the submodule
to change the submodule refs to be different from the superproject.
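
As an illustration of such a break-out, and of how it can be detected
with existing commands, here is a small sketch (again with hypothetical
repositories `lib` and `super`). It commits inside the submodule without
updating the superproject and then compares the recorded gitlink with
the submodule's HEAD:

```shell
set -e
work=$(mktemp -d)
cd "$work"

git init -q lib
git -C lib -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "lib: initial"

git init -q super
cd super
git config user.name "Example"
git config user.email "example@example.com"
# protocol.file.allow is only set because this demo clones from a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git commit -q -m "super: add lib"

# "Break out": commit inside the submodule without telling the superproject.
git -C lib -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "lib: local work"

recorded=$(git rev-parse HEAD:lib)    # gitlink stored in the superproject commit
actual=$(git -C lib rev-parse HEAD)   # what the submodule has checked out
if [ "$recorded" = "$actual" ]; then state=in-sync; else state=diverged; fi
echo "$state"
```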

This was also more along the lines of thinking about the (Gerrit) remote,
which does an okay, but not stellar, job of keeping the remote branches
for superproject and submodule in sync. I'd expect glitches there.

> In other words, if there is need to worry
> about the differences between the above two, then it probably is
> fundamentally impossible to keep these in sync, and a design that
> assumes it is possible would have to expose glitches to the end-user
> experience.

Yup. And by exposing you probably mean a patch series like the one presented
(git status/log/diff making noise about the glitch)?

> I do not know if glitches resulting from there would be so severe to
> be show-stoppers, though.  It might be possible to paper them over.

I think so, too, as long as the user is pointed at the glitches so they can
correct them.

>
>> No submodule refstore at all
>> ----------------------------
>> Use refs and commits in the superproject to stitch submodule changes
>> together. Disallow branches in the submodule. This is only restricted
>> to the working tree inside the superproject, such that the output of git-branch
>> changes depending on whether the working tree is in- or outside the superproject
>> working tree.
>
> This would need enhancement for reachability code, but it feels the
> cleanest from the philosophical standpoint---if you want to treat a
> superproject and its submodules as if it were a single project,
> ability to check out a branch in a submodule that does not match
> that of the superproject would only get in the way of preserving the
> illusion of "single project"-ness.

I wonder if we can combine this with the approach Jonathan gave above.
In the worktree (of the submodule inside the superproject) you are allowed
to use these "mirrored" refs, whereas in any other worktree you have full
access to the normal refs of the project.

>
>> New type of symbolic refs
>> =========================
>> A symbolic ref can currently only point at a ref or another symbolic ref.
>> This proposal showcases different scenarios on how this could change in the
>> future.
>>
>> HEAD pointing at the superproject's index
>> -----------------------------------------
>
> This looks to me a mere implementation detail for a (part of)
> necessary component to realize the above "No submodule refstore".

Ah ok.

If all branches used this new symref type, the handling would
seem to be very similar to what Jonathan described with a new type
of refstore instead.

>> Superproject operations spanning index and worktree
>>   E.g. git reset --mixed
>> As the submodule's HEAD is defined in the index, we would reset it to the
>> version in the last commit. As --mixed promises to not touch the working tree,
>> the submodule's worktree would not be touched. git reset --mixed in the
>> superproject is the same as --soft in the submodule.
>
> I am not sure if you want to take these promises low-level "single
> repository" plumbing operations make too literally.  "reset --mixed"
> may promise not to touch the working tree, but it also promises not
> to touch submodules at all.  If you are breaking the latter anyway,
> it would make more sense not to be afraid of breaking the former if
> it makes sense in the context of allowing the command to do more by
> breaking the latter.

ok.

Thanks,
Stefan