git@vger.kernel.org mailing list mirror (one of many)
* gitmodules below root directory
@ 2017-09-06 13:53 Robert Dailey
  2017-09-06 18:35 ` Stefan Beller
  2017-09-06 19:58 ` Junio C Hamano
  0 siblings, 2 replies; 4+ messages in thread
From: Robert Dailey @ 2017-09-06 13:53 UTC (permalink / raw)
  To: Git

The gitmodules documentation[1] states that the .gitmodules file is at
the root. However, it would be nice if this could be supported in any
directory similar to how .gitignore works. Right now git-subrepo does
not support submodules inside of a subrepo[2] (I suspect subtrees
would have the same problem, but I did not verify). I think this is a
limitation of git, rather than subrepo itself. Perhaps there are
reasons why .gitmodules must be at the root, but I at least wanted to
point it out and see if this could be supported.

[1]: https://git-scm.com/docs/gitmodules
[2]: https://github.com/ingydotnet/git-subrepo/issues/262


* Re: gitmodules below root directory
  2017-09-06 13:53 gitmodules below root directory Robert Dailey
@ 2017-09-06 18:35 ` Stefan Beller
  2017-09-06 19:58 ` Junio C Hamano
  1 sibling, 0 replies; 4+ messages in thread
From: Stefan Beller @ 2017-09-06 18:35 UTC (permalink / raw)
  To: Robert Dailey, Prathamesh Chavan; +Cc: Git

On Wed, Sep 6, 2017 at 6:53 AM, Robert Dailey <rcdailey.lists@gmail.com> wrote:
> The gitmodules documentation[1] states that the .gitmodules file is at
> the root. However, it would be nice if this could be supported in any
> directory similar to how .gitignore works. Right now git-subrepo does
> not support submodules inside of a subrepo[2] (I suspect subtrees
> would have the same problem, but I did not verify). I think this is a
> limitation of git, rather than subrepo itself. Perhaps there are
> reasons why .gitmodules must be at the root, but I at least wanted to
> point it out and see if this could be supported.
>
> [1]: https://git-scm.com/docs/gitmodules
> [2]: https://github.com/ingydotnet/git-subrepo/issues/262

I agree that subtree likely suffers the same problem.
And at first it seems reasonable to want to have .gitmodules
at deeper trees supported, as that would fix subtree and subrepo
(and others) with ease.

Historically the need to store submodule URLs was the motivation
for having the .gitmodules file. An absolute URL for a submodule would
work fine no matter where the .gitmodules file would be located.
Relative URLs are currently defined as relative to the top level
of the project, which we would need to inspect if the anchor
is chosen well at the root or if we would want to allow anchoring
the relative URL within the tree. (This is no reason against
.gitmodules in deep trees, just pointing out the work required).

But does the URL still make sense? For absolute URLs this is likely
the case, for relative URLs my bets are off. Maybe?

It turned out that people want to e.g. move, delete and re-introduce
submodules, which is why the location of a submodule git directory
was changed: it can be either inside the tree (to keep supporting existing
git repos with submodules) or interned in the superproject.

In the example given in [2], the git dir of the submodule
("folder B") may be located at .git/modules/nameB as seen
from the root of RepoX:

    RepoX
    + folder A
    + folder B (submodule)
    + .gitmodules
    + .git # regular RepoX git dir
       + modules/<nameB>

An important mechanism of the .gitmodules file is
the resolution of the "name" and the "path" of the
submodule. (Given the path of a gitlink entry, where do
I find the git repository for the submodule? The reverse is slightly
less relevant: given this git repository deep inside my own git
directory, where is the working tree?)

So in the example we'd have

    RepoY
    + RepoX (subrepo)
      + folder A
      + folder B (submodule)
      + .gitmodules

The path entry in the .gitmodules file would not change via
subtree/subrepo merge, such that Git would need to know
that the actual path to the submodule is the
concatenation of 'path to tree in which the .gitmodules file is'
and the given path inside the .gitmodules file. Seems doable so far.
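
The concatenation described above could be sketched roughly as follows.
This is an illustration in Python, not how git works today (git only
reads .gitmodules at the top level); the helper name
effective_submodule_path is hypothetical.

```python
import posixpath


def effective_submodule_path(gitmodules_file: str, recorded_path: str) -> str:
    """Join the directory holding a .gitmodules file with the path it records.

    Hypothetical helper for illustration only.
    """
    base_dir = posixpath.dirname(gitmodules_file)  # "" for a top-level file
    return posixpath.normpath(posixpath.join(base_dir, recorded_path))


# RepoX was merged into RepoY as a subrepo, so its .gitmodules now sits at
# RepoX/.gitmodules and still records path = "folder B".
print(effective_submodule_path("RepoX/.gitmodules", "folder B"))
# A top-level .gitmodules keeps the recorded path unchanged.
print(effective_submodule_path(".gitmodules", "folder B"))
```

Anchoring at the directory of the .gitmodules file is what keeps the
recorded path valid after a subtree/subrepo merge without rewriting it.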

What about the name of a submodule? The .gitmodules file
follows the syntax of git config files, such that names cannot
occur twice as the names are stored as the section name:

  [submodule "nameB"]
    path = "folder B"

And I would think the property of having unique names
is important, such that each submodule has its unique
place to put its git dir inside the superprojects
"$GIT_DIR/modules/".

With multiple .gitmodules files, we would lose the
uniqueness property. (It may not be too bad, maybe
even a clever hack, haven't thought about it deeply,
but it seems ugly at first)
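
To make the uniqueness concern concrete, here is a small sketch that
detects names declared by more than one .gitmodules file. The entries
and the function name find_name_collisions are made up for illustration;
nothing here reflects an existing git API.

```python
from collections import defaultdict


def find_name_collisions(entries):
    """Return submodule names declared by more than one .gitmodules file.

    `entries` is an iterable of (gitmodules_file, submodule_name) pairs.
    """
    files_by_name = defaultdict(list)
    for gitmodules_file, name in entries:
        files_by_name[name].append(gitmodules_file)
    return {name: files for name, files in files_by_name.items() if len(files) > 1}


# Hypothetical declarations spread over two files.
entries = [
    (".gitmodules", "nameB"),
    ("RepoX/.gitmodules", "nameB"),  # same name, different file
    ("RepoX/.gitmodules", "docs"),
]
for name, files in find_name_collisions(entries).items():
    # Both declarations would compete for $GIT_DIR/modules/<name>.
    print(f"modules/{name} claimed by: {', '.join(files)}")
```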

As said above, the name<->path resolution is
important, (and shall be unique, deterministic and simple),
so how do we do it? What about the case where we have

  .gitmodules "name" -> dir/path
  dir/.gitmodules "name" -> ./path

In this case we'd have the same mapping, but using this
mechanism we can map multiple names at the same path,
and we could choose to resolve a given path in different
.gitmodules files, which is cumbersome.

  anotherdir/.gitmodules "name" -> ../dir/path

seems crazy, too.
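
The three declarations above can be checked mechanically. The sketch
below assumes (hypothetically) that a recorded path is resolved relative
to the directory of the .gitmodules file that contains it; under that
assumption all three files end up describing the same path.

```python
import posixpath


def resolve(gitmodules_file, recorded_path):
    """Resolve a recorded path relative to the .gitmodules file's directory."""
    base_dir = posixpath.dirname(gitmodules_file)
    return posixpath.normpath(posixpath.join(base_dir, recorded_path))


# The three declarations from the example, as (file, recorded path).
declarations = [
    (".gitmodules", "dir/path"),
    ("dir/.gitmodules", "./path"),
    ("anotherdir/.gitmodules", "../dir/path"),
]
resolved = {f: resolve(f, p) for f, p in declarations}
for gitmodules_file, path in resolved.items():
    print(f"{gitmodules_file} -> {path}")
```

With several files resolving to one path, git would need a rule for
which file wins, which is the cumbersome part.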

What about moving submodules?
Consider the example as in [2] again:

  $ git mv RepoX/folderB dir/sub
  $ git commit -m "move submodule"
  # ok fine, we can come up with a plan
  # where to put the submodule configuration,
  # maybe in dir/.gitmodules?

  $ git rm -r RepoX
  $ git commit -m "don't need the rest of RepoX"
  # observation: we would not want
  # RepoX/.gitmodules to still have impact on
  # the submodule.

  $ git revert HEAD^ # undo the initial move
  # we'd move the .gitmodules file back to RepoX/.

tl;dr: I think this idea produces lots of interesting
corner cases in the data model, let's not go there
without having an idea how to solve them.

From an implementation standpoint:
The submodule-config API could easily be enhanced
to support reading multiple .gitmodules files (in case
their location is well defined, we would not want to
walk the whole tree recursively). This API is only
easily accessible from within C, such that currently
implementing this idea in git-submodule.sh would
be a hassle to do. Prathamesh made good progress
in his GSoC project porting most of git-submodule.sh to
C, though, so once that is merged I'd claim that the
actual implementation of this idea is "rather easy".

Thanks,
Stefan


* Re: gitmodules below root directory
  2017-09-06 13:53 gitmodules below root directory Robert Dailey
  2017-09-06 18:35 ` Stefan Beller
@ 2017-09-06 19:58 ` Junio C Hamano
  2017-09-07  1:25   ` Jacob Keller
  1 sibling, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2017-09-06 19:58 UTC (permalink / raw)
  To: Robert Dailey; +Cc: Git

Robert Dailey <rcdailey.lists@gmail.com> writes:

> The gitmodules documentation[1] states that the .gitmodules file is at
> the root. However, it would be nice if this could be supported in any
> directory similar to how .gitignore works.

I have a mild suspicion that there would be a huge impedance
mismatch between what gitmodules file is meant to do and the way
ignore/attribute setting is done.

When the mechanism is primarily about expressing a few generic
traits that are shared by things that can be grouped by paths
(e.g. "all paths whose pathnames match '*.py' pattern contain text",
"all paths in sub/ directory are ignored"), it may make sense to
spread the information across multiple .gitignore files and make the
closest one take precedence over the further ones.  Even though
allowing multiple sources of information spread over the tree leads
to end-user confusion (e.g. "why is this path ignored?", which
triggered the debugging aid "git check-ignore"), such a grouping by
pattern matching on paths (which is what makes "closest file take
precedence" meaningful) to assign generic traits (e.g. "it's text")
makes it worthwhile by allowing to express the rules more concisely.

Compared to that, what .gitmodules file expresses is more specific
to each submodule---no two submodules in your single superproject
would share the same URL, unless you are doing something quite
unusual, for example.  Having a single file also means that updating
is much simpler---"git submodule add" and other things do not have
to choose among .gitmodules, a/.gitmodules and a/b/.gitmodules when
they update an entry for the submodule at path "a/b/c".
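
The choice problem can be illustrated with a short sketch that lists
every .gitmodules file a tool would have to consider if such files were
allowed in any directory. The function name candidate_gitmodules is
hypothetical.

```python
import posixpath


def candidate_gitmodules(submodule_path):
    """List every .gitmodules file that could describe `submodule_path`,
    from the top level down to the closest enclosing directory."""
    parts = submodule_path.split("/")[:-1]  # directories above the submodule
    candidates = [".gitmodules"]
    for depth in range(1, len(parts) + 1):
        candidates.append(posixpath.join(*parts[:depth], ".gitmodules"))
    return candidates


print(candidate_gitmodules("a/b/c"))
```

For the submodule at "a/b/c" this yields the three files named above;
with a single top-level file there is never more than one candidate.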

Having said that, I do not think the current ".gitmodules must be at
the top and nothing else matters" is ideal.  A possible change that
I suspect may make more sense is to get rid of .gitmodules file,
instead of spreading more of them all over the tree.

The current gitlink implementation records only the "what commit
from the subproject history is to be checked out at this path?" and
nothing else, by storing a single SHA-1 that happens to be the name
of the commit object (but the superproject does not even care whether
it is a commit or a random string).  We could substitute
that with the name of a blob object that belongs to the superproject
history and records the information about the submodule at the path
(e.g. "which repository the upstream project recommends to clone the
subproject from?", "what commit object is to be checked out").  

When you see a single tree of a superproject, you need to see what
commit is to be checked out from the tree object and everything else
needs to be read from the .gitmodules file in that tree in the
current system, but it does not have to be that way.
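
As a rough illustration of the blob idea: git names a blob by hashing a
"blob <size>\0" header followed by the content, so any change to the
recorded information yields a different object name. The record layout
below (url/commit lines) and the example URLs are purely hypothetical;
no such format exists in git.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object name git assigns to a blob with this content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical per-submodule records; the layout is an assumption.
commit = b"0" * 40
record_old = b"url = https://example.com/sub.git\ncommit = " + commit + b"\n"
record_new = b"url = https://example.org/sub.git\ncommit = " + commit + b"\n"

# Changing only the URL changes the blob name the tree entry would store.
print(git_blob_id(record_old))
print(git_blob_id(record_new))
```

Under this scheme a URL change would show up in the superproject tree
the same way a change of the checked-out commit does today.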







* Re: gitmodules below root directory
  2017-09-06 19:58 ` Junio C Hamano
@ 2017-09-07  1:25   ` Jacob Keller
  0 siblings, 0 replies; 4+ messages in thread
From: Jacob Keller @ 2017-09-07  1:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Robert Dailey, Git

On Wed, Sep 6, 2017 at 12:58 PM, Junio C Hamano <gitster@pobox.com> wrote:
> The current gitlink implementation records only the "what commit
> from the subproject history is to be checked out at this path?" and
> nothing else, by storing a single SHA-1 that happens to be the name
> of the commit object (but the superproject does not even care whether
> it is a commit or a random string).  We could substitute
> that with the name of a blob object that belongs to the superproject
> history and records the information about the submodule at the path
> (e.g. "which repository the upstream project recommends to clone the
> subproject from?", "what commit object is to be checked out").
>
> When you see a single tree of a superproject, you need to see what
> commit is to be checked out from the tree object and everything else
> needs to be read from the .gitmodules file in that tree in the
> current system, but it does not have to be that way.
>
>

IMO, the approach described here (pointing the gitlink at a blob which
describes the full contents, URL, etc.) would make more sense. The
trickiest parts I think are (a) it really requires tooling to change
the git module vs just editing a file, and (b) we'd need to prevent
the blobs from getting garbage collected.

I think it makes each individual submodule a bit more robust, since
the actual submodule pointer always points directly to the full data
about that submodule (its recommended URL, its path, etc.), and
changes to those things *are* changes to the submodule pointer.

Thanks,
Jake

