From: Junio C Hamano <gitster@pobox.com>
To: Stefan Beller <sbeller@google.com>
Cc: unlisted-recipients:; (no To-header on input)
	git@vger.kernel.org, bmwill@google.com
Subject: Re: [RFC-PATCHv2] submodules: add a background story
Date: Thu, 09 Feb 2017 15:32:48 -0800
Message-ID: <xmqqo9yblz33.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <20170209020855.23486-1-sbeller@google.com> (Stefan Beller's message of "Wed, 8 Feb 2017 18:08:55 -0800")

Stefan Beller <sbeller@google.com> writes:

> Just like gitmodules(5), gitattributes(5), gitcredentials(7),
> gitnamespaces(7), gittutorial(7), we'd like to provide some background
> on submodules, which is not specific to the `submodule` command, but
> elaborates on the background and its intended usage.
>
> Add gitsubmodules(7), that explains the states, structure and usage of
> submodules.
>
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
>
> This would replace the last patch of  sb/submodule-doc, though it's still
> RFC. In this revision I took care of the technical details (i.e. proper
> formatting, spelling), and only slight rewording of the text.
>
> The main issue persists; see bottom of the patch:
>
>   SAMPLE WORKFLOWS (RFC/TODO)
>   ---------------------------
>   
>   Do we need
>   
>   * an opinionated way to check for a specific state of a submodule
>   * (submodule helper to be plumbing?)
>   * expose the design mistake of having the (name->path) mapping inside the
>     working tree, i.e. never remove a name from the submodule config even when
>     the submodule doesn't exist any more.

I am not sure about the last item.  

Are you talking about a case where a submodule comes and goes (think:
"git checkout v1.0" that would make submodules added since that
version disappear)?  The .gitmodules that is checked out would not have
any entry, but .git/config needs to record the end-user preference
for the module, so that the user can do "git checkout -" to come
back, no?  IOW, a .git/config that mentions all the submodules the user
ever showed interest in is not a design mistake, so you must be
talking about something else, but I do not know what it is.
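
As a reduced sketch of that round trip (not from the patch: the
submodule name "lib" and its URL are made up, no real submodule is
cloned, and a plain "git config" call stands in for what
"git submodule init" would record), something like this shows the
entry in .git/config outliving the checked-out .gitmodules:

```shell
# Sketch: "lib" and its URL are hypothetical, and the plain `git config`
# call stands in for what `git submodule init` records in .git/config.
demo_config_survives() (
  set -e
  tmp=$(mktemp -d)
  cd "$tmp"
  g() { git -c user.name=demo -c user.email=demo@example.com "$@"; }
  url=https://example.com/lib.git

  g init -q super
  cd super

  # v1.0 predates the submodule.
  echo hello >README
  g add README
  g commit -q -m "initial"
  g tag v1.0

  # Later history describes a submodule "lib" in .gitmodules.
  g config -f .gitmodules submodule.lib.path lib
  g config -f .gitmodules submodule.lib.url "$url"
  g add .gitmodules
  g commit -q -m "describe submodule lib"

  # The user's interest in "lib" is recorded in .git/config.
  g config submodule.lib.url "$url"

  # Going back in time removes .gitmodules from the working tree...
  g checkout -q v1.0
  test -e .gitmodules || echo "v1.0: .gitmodules is gone"
  # ...but the preference in .git/config is untouched.
  echo "v1.0: .git/config still has $(g config submodule.lib.url)"

  # Coming back restores .gitmodules; nothing had to be re-entered.
  g checkout -q -
  test -e .gitmodules && echo "back: .gitmodules is here again"
)
demo_config_survives
```

The point is only that the two files have different lifetimes: one
follows the checked-out commit, the other follows the user.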

> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index 4a4cede144..d38aa2d53a 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -24,37 +24,7 @@ DESCRIPTION
>  -----------
>  Inspects, updates and manages submodules.
>  
> -A submodule allows you to keep another Git repository in a subdirectory
> ...
> -if you choose to go that route.
> +For more information about submodules, see linkgit:gitsubmodules[5]

OK.

> @@ -420,6 +390,10 @@ This file should be formatted in the same way as `$GIT_DIR/config`. The key
>  to each submodule url is "submodule.$name.url".  See linkgit:gitmodules[5]
>  for details.
>  
> +SEE ALSO
> +--------
> +linkgit:gitsubmodules[1], linkgit:gitmodules[1].

Are they both in section (1)?  I think the former (concepts) belongs
to section 7 and the latter (file formats) belongs to section 5.

> diff --git a/Documentation/gitsubmodules.txt b/Documentation/gitsubmodules.txt
> new file mode 100644
> index 0000000000..3369d55ae9
> --- /dev/null
> +++ b/Documentation/gitsubmodules.txt
> @@ -0,0 +1,194 @@
> +gitsubmodules(7)
> +================
> +
> +NAME
> +----
> +gitsubmodules - information about submodules
> +
> +SYNOPSIS
> +--------
> +$GIT_DIR/config, .gitmodules
> +
> +------------------
> +git submodule
> +------------------
> +
> +DESCRIPTION
> +-----------
> +
> +A submodule allows you to keep another Git repository in a subdirectory
> +...
> +When cloning or pulling a repository containing submodules however,
> +the submodules will not be checked out by default; You need to instruct
> +'clone' to recurse into submodules. The 'init' and 'update' subcommands

I think this is not "You need to", but rather "You can, if you want
to have each and every submodule."
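
To illustrate the two choices, here is a sketch with throwaway
repositories named "lib" and "super" (both made up).  The
protocol.file.allow setting is there only because the demo clones
submodules from local paths, which recent Git versions refuse by
default; it is not part of the workflow itself:

```shell
# Sketch: "lib" and "super" are hypothetical repositories built in a
# temporary directory. protocol.file.allow=always is needed only because
# the demo uses local paths as submodule URLs.
demo_clone_choices() (
  set -e
  tmp=$(mktemp -d)
  cd "$tmp"
  g() {
    git -c user.name=demo -c user.email=demo@example.com \
        -c protocol.file.allow=always "$@"
  }

  # A library and a superproject that binds it at path "lib".
  g init -q lib
  (cd lib && echo data >file && g add file && g commit -q -m "lib")
  g init -q super
  (cd super && g submodule --quiet add "$tmp/lib" lib &&
   g commit -q -m "add lib")

  # Choice 1: a plain clone leaves the submodule directory empty...
  g clone -q super plain
  test -e plain/lib/file || echo "plain: lib not checked out"

  # ...and it can be populated later, only for the paths you care about.
  (cd plain && g submodule --quiet update --init lib)
  test -e plain/lib/file && echo "plain: lib checked out after update --init"

  # Choice 2: ask clone to recurse and get every submodule up front.
  g clone -q --recurse-submodules super full
  test -e full/lib/file && echo "full: lib checked out by clone"
)
demo_clone_choices
```

Either way the superproject clone itself is complete; recursing is a
convenience for those who want all submodules, not a requirement.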

> +of 'git submodule' will maintain submodules checked out and at an
> +appropriate revision in your working tree.
> +
> +WHEN TO USE
> +-----------
> +
> +Submodules, repositories inside other repositories,
> +can be used for different use cases:
> +
> +* To have finer grained access control.
> +  The design principles of Git do not allow for partial repositories to be
> +  checked out or transferred. A repository is the smallest unit that a user
> +  can be given access to. Submodules are separate repositories, such that
> +  you can restrict access to parts of your project via the use of submodules.
> +
> +* To decouple Git histories.
> +  Decoupling histories has different benefits.
> +
> +** When you want to use a (third party) library tied to a specific version.
> +   Using submodules for a library allows you to have a clean history for
> +   your own project and only updating the library in the submodule when needed.

I somehow do not see this as decoupling; it is keeping what is
originally separate separate, isn't it?

> +** In its current form Git scales up poorly for very large repositories that
> +   change a lot, as the history grows very large. For that you may want to look
> +   at shallow clone, sparse checkout or git-lfs.
> +   However you can also use submodules to e.g. hold large binary assets
> +   and these repositories are then shallowly cloned such that you do not
> +   have a large history locally.

In other words, a better way to list these may be

 1. using another project that stands on its own.

 2. artificially splitting a (logically single) project into multiple
    repositories and tying them back together.

The access control and performance reasons are subclasses of 2.
IOW, if Git had per-path ACL and infinite scaling, you wouldn't be
splitting your project into submodules for 2.  You would still want
to use somebody else's project by binding it as a subproject, instead
of merging its history into yours.

> +When working with submodules, you can think of them as in a state machine.
> +So each submodule can be in a different state, the following indicators are used:
> +
> +* the existence of the setting of 'submodule.<name>.url' in the
> +  superprojects configuration
> +* the existence of the submodules working tree within the
> +  working tree of the superproject
> +* the existence of the submodules git directory within the superprojects
> +  git directory at $GIT_DIR/modules/<name> or within the submodules working
> +  tree
> +
> +      State      URL config        working tree     git dir
> +      -----------------------------------------------------
> +      uninitialized    no               no           no
> +      initialized     yes               no           no
> +      populated       yes              yes          yes
> +      depopulated     yes               no          yes
> +      deinitialized    no               no          yes
> +      uninteresting    no              yes          yes
> +
> +      invalid          no              yes           no
> +      invalid         yes              yes           no

I do not have strong opinions on these labels; are submodule folks
happy with the above vocabulary?
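
For what it is worth, the table assigns a label to every one of the
eight combinations of the three indicators, so it can be read as a
plain lookup.  A sketch (the function name classify_submodule is made
up here; it only encodes the table and does not probe a real
repository):

```shell
# classify_submodule URL_CONFIG WORKING_TREE GIT_DIR
# Each argument is "yes" or "no", in the column order of the quoted table.
# Prints the label the table assigns to that combination.
classify_submodule() {
  case "$1/$2/$3" in
    no/no/no)    echo uninitialized ;;
    yes/no/no)   echo initialized ;;
    yes/yes/yes) echo populated ;;
    yes/no/yes)  echo depopulated ;;
    no/no/yes)   echo deinitialized ;;
    no/yes/yes)  echo uninteresting ;;
    # A working tree without any git dir is the only invalid shape.
    no/yes/no|yes/yes/no) echo invalid ;;
    *) echo "usage: classify_submodule yes|no yes|no yes|no" >&2
       return 2 ;;
  esac
}

# Print the label for every combination of the three indicators.
for url in no yes; do
  for wt in no yes; do
    for gd in no yes; do
      printf '%-3s %-3s %-3s %s\n' "$url" "$wt" "$gd" \
        "$(classify_submodule "$url" "$wt" "$gd")"
    done
  done
done
```

Written this way, both "invalid" rows reduce to the same condition: a
working tree exists but no git dir does, regardless of the URL config.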

"uninteresting" is not explained in the below?

> ...
> +SEE ALSO
> +--------
> +linkgit:git-submodule[1], linkgit:gitmodules[1].

Ditto.
