git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Lauri Alanko" <la@iki.fi>
To: git@vger.kernel.org
Subject: A design for distributed submodules
Date: Fri, 19 Oct 2012 03:31:58 +0300	[thread overview]
Message-ID: <20121019033158.93403x1gcanyjo8u.lealanko@webmail.helsinki.fi> (raw)
In-Reply-To: <7v4nlxylpm.fsf@alter.siamese.dyndns.org>


I think I finally agree that it's best to develop submodules further
rather than introduce a new tool for the functionality I require. Here
are some explicit proposals for submodules so we can at least establish
agreement on what should be done. These are in order of decreasing
importance (to me).


* Upstreamless submodules

If there is no 'url' key defined for a submodule in .gitconfig, there is
no "authoritative upstream" for it. When a recursive
fetch/pull/clone/push is performed on a remote superproject, its
upstreamless submodules are also fetched/pulled/cloned/pushed directly
from/to the submodule repositories under the superproject .git/modules.
If this is the first time that remote's submodules are accessed, that
remote is initialized for the local submodules: the submodule of the
remote superproject becomes a remote of the local submodule, and is
given the same name as the remote of the superproject.

So, suppose we have a superproject with .gitmodules:

[submodule "sub"]
	path = sub

which is hosted at repositories at URL1 and URL2. Then we do:

git clone --recursive URL1 super
cd super
git remote add other URL2
git fetch --recursive URL2

Now .git/modules/sub/config has:

[remote "origin"]
	url = URL1/.git/modules/sub
[remote "other"]
	url = URL2/.git/modules/sub

So the effect is similar to just setting the submodule's url as
".git/modules/sub", except that:

  - it hides the implementation detail of the exact location of the
    submodule repository from the publicly visible configuration file

  - it also works with bare remotes (where the actual remote submodule
    location would be URL/modules/sub)

  - it allows multiple simultaneous superproject remotes (where
    git-submodule currently always resolves relative urls against
    branch.$branch.remote with no option to fetch from a different
    remote).


* Submodule discovery across all refs

This is what Jens already mentioned. If we fetch multiple refs of a
remote superproject, we also need to fetch _all_ of the submodules
referenced by _any_ of the refs, not just the ones in the currently
active branch. Finding the complete list of submodules probably has to
be implemented by reading .gitmodules in all of the (updated) refs,
which is a bit ugly, but not too bad.


* Recording the active branch of a submodule

When a submodule is added, its active branch should be stored in
.gitmodules as submodule.$sub.branch. Then, when the submodule is
checked out, and the head of that branch is the same as the commit in
the gitlink (i.e. the superproject tree is "current"), then that branch
is set as the active branch in the checked-out submodule working tree.
Otherwise, a detached head is used.


* Multiple working trees for a submodule

A superproject may have multiple paths for the same submodule,
presumably for different commits. This is for cases where the
superproject is a snapshot of a developer's directory hierarchy, and the
developer is simultaneously working on multiple branches of a submodule
and it is convenient to have separate working trees for each of them.

This is a bit hard to express with the current .gitconfig format, since
paths are attributes of repository ids instead of vice versa. I'd
introduce an alternative section format where you can say:

[mount "path1"]
   module = sub
   branch = master

[mount "path2"]
   module = sub
   branch = topic

Implementing this is a bit intricate, since we need to use the
git-new-workdir method to create multiple working directories that share
the same refs, config, and object store, but have separate HEAD and
index. I think this is a problem with the repository layout: the
non-workdir-specific bits should all be in a single directory so that a
single symlink would be enough.


Obviously, I'm willing to implement the above functionalities since I
need them. However, I think I'm going to work in Dulwich (which doesn't
currently have any submodule support): a Python API is currently more
important to me than a command-line tool, and the git.git codebase
doesn't look like a very attractive place to contribute anyway. No
offense, it's just not to my tastes.

So the main reason I'd like to reach some tentative agreement about the
details of the proposal is to ensure that _once_ someone finally
implements this kind of functionality in git.git, it will use the same
configuration format and same conventions, so that it will be compatible
with my code. The compatibility between different tools is after all the
main reason for doing this stuff as an extension to submodules instead
of something completely different.


Lauri

  parent reply	other threads:[~2012-10-19  0:34 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-13 13:33 A design for subrepositories Lauri Alanko
2012-10-13 17:30 ` Junio C Hamano
2012-10-13 21:23   ` Lauri Alanko
2012-10-14  4:36     ` Junio C Hamano
2012-10-14 10:19       ` Lauri Alanko
2012-10-14 13:28         ` Jens Lehmann
2012-10-14 15:27           ` Lauri Alanko
2012-10-14 16:10             ` Jens Lehmann
2012-10-14 16:15             ` Jens Lehmann
2012-10-14 16:25             ` Jens Lehmann
2012-10-14 18:04               ` Junio C Hamano
2012-10-14 19:32                 ` Jens Lehmann
2012-10-19  0:31                 ` Lauri Alanko [this message]
2012-10-19 20:09                   ` A design for distributed submodules Jens Lehmann
2012-10-14 22:59               ` A design for subrepositories Lauri Alanko
2012-10-15 17:10                 ` Jens Lehmann
2012-10-13 21:20 ` perryh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121019033158.93403x1gcanyjo8u.lealanko@webmail.helsinki.fi \
    --to=la@iki.fi \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).