From: Junio C Hamano <gitster@pobox.com>
To: Nick Townsend <nick.townsend@mac.com>
Cc: "René Scharfe" <l.s.r@web.de>,
	"Jens Lehmann" <Jens.Lehmann@web.de>,
	git@vger.kernel.org, "Jeff King" <peff@peff.net>
Subject: Re: [PATCH] submodule recursion in git-archive
Date: Wed, 27 Nov 2013 11:43:44 -0800
Message-ID: <xmqqzjopsk9b.fsf@gitster.dls.corp.google.com>
In-Reply-To: <9AB10474-6DEF-4FFD-B6B3-ED2AB21424AC@mac.com> (Nick Townsend's message of "Tue, 26 Nov 2013 19:55:06 -0800")

Nick Townsend <nick.townsend@mac.com> writes:

> On 26 Nov 2013, at 14:18, Junio C Hamano <gitster@pobox.com> wrote:
>
>> Even if the code is run inside a repository with a working tree,
>> when producing a tarball out of an ancient commit that had a
>> submodule not at its current location, --recurse-submodules option
>> should do the right thing, so asking for working tree location of
>> that submodule to find its repository is wrong, I think.  It may
>> happen to find one if the archived revision is close enough to what
>> is currently checked out, but that may not necessarily be the case.
>> 
>> At that point when the code discovers an S_ISGITLINK entry, it
>> should have both a pathname to the submodule relative to the
>> toplevel and the commit object name bound to that submodule
>> location.  What it should do, when it does not find the repository
>> at the given path (maybe because there is no working tree, or the
>> submodule directory has moved over time) is roughly:
>> 
>> - Read from .gitmodules at the top-level from the tree it is
>>   creating the tarball out of;
>> 
>> - Find "submodule.$name.path" entry that records that path to the
>>   submodule; and then
>> 
>> - Using that $name, find the stashed-away location of the submodule
>>   repository in $GIT_DIR/modules/$name.
>> 
>> or something like that.
>> 
>> This is a related tangent, but when used in a repository that people
>> often use as their remote, the repository discovery may have to
>> interact with the relative URL.  People often ship .gitmodules with
>> 
>> 	[submodule "bar"]
>> 		URL = ../bar.git
>> 		path = barDir
>> 
>> for a top-level project "foo" that can be cloned thusly:
>> 
>> 	git clone git://site.xz/foo.git
>> 
>> and host bar.git to be clonable with
>> 
>> 	git clone git://site.xz/bar.git barDir/
>> 
>> inside the working tree of the foo project.  In such a case, when
>> "archive --recurse-submodules" is running, it would find the
>> repository for the "bar" submodule at "../bar.git", I would think.
>> 
>> So this part needs a bit more thought, I am afraid.
>
> I see that there is a lot of potential complexity around setting up a submodule:

No question about it.

> * The .gitmodules file can be dirty (easy to flag, but should we
> allow archive to proceed?)

As we are discussing "archive", which takes a tree object from the
top-level project that is recorded in the object database, the
information _about_ the submodule in question should come from the
given tree being archived.  There is no reason for the .gitmodules
file that happens to be sitting in the working tree of the top-level
project to be involved in the decision, so its dirtiness should not
matter, I think.  If the tree being archived has a submodule whose
name is "kernel" at path "linux/" (relative to the top-level
project), its repository should be at .git/modules/kernel in the
layout recent git-submodule prepares, and we should find that
path-and-name mapping from .gitmodules recorded in that tree object
we are archiving.  The version that happens to be checked out to the
working tree may have moved the submodule to a new path "linux-3.0/",
and "linux-3.0/.git" may have "gitdir: .git/modules/kernel" in it,
but that does not help when archiving a tree that has the submodule
at "linux/".  We would not know to look at "linux-3.0/.git" in the
first place, because .gitmodules in the working tree only says that
the submodule at path "linux-3.0/" has the name "kernel"; it says
nothing about "linux/".

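The path-to-name lookup described above can be sketched roughly as
follows.  This is Python for illustration only, not the C that
git-archive would need.  It assumes the text of .gitmodules has
already been read out of the tree being archived, and the helper
names (parse_gitmodules, name_for_path, modules_dir) are made up for
this sketch; the parser is deliberately simplified and ignores quoting
and line continuations.

```python
import re
from typing import Dict, Optional

# Matches a section header such as: [submodule "kernel"]
SECTION = re.compile(r'^\[submodule\s+"(?P<name>[^"]+)"\]\s*$')
# Matches a simple "key = value" line; keys are case-insensitive in git config.
KEYVAL = re.compile(r'^(?P<key>[A-Za-z][A-Za-z0-9-]*)\s*=\s*(?P<value>.*)$')


def parse_gitmodules(text: str) -> Dict[str, Dict[str, str]]:
    """Return {submodule name: {lowercased key: value}} (simplified parser)."""
    modules: Dict[str, Dict[str, str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        m = SECTION.match(line)
        if m:
            current = modules.setdefault(m.group("name"), {})
            continue
        if line.startswith("["):
            current = None  # some other kind of section; ignore its keys
            continue
        m = KEYVAL.match(line)
        if m and current is not None:
            current[m.group("key").lower()] = m.group("value").strip()
    return modules


def name_for_path(modules: Dict[str, Dict[str, str]], path: str) -> Optional[str]:
    """Find the submodule name whose recorded path matches `path`."""
    wanted = path.rstrip("/")
    for name, settings in modules.items():
        if settings.get("path", "").rstrip("/") == wanted:
            return name
    return None


def modules_dir(git_dir: str, name: str) -> str:
    """Location of the stashed-away repository in the recent layout."""
    return f"{git_dir}/modules/{name}"


# .gitmodules as recorded in the tree being archived (example content).
tree_gitmodules = '[submodule "kernel"]\n\tpath = linux\n'
name = name_for_path(parse_gitmodules(tree_gitmodules), "linux/")
print(modules_dir(".git", name))  # prints .git/modules/kernel
```

The point of the sketch is only that the input is the .gitmodules
blob from the archived tree, never the file in the working tree.
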
> * Users can mess with settings both prior to git submodule init
> and before git submodule update.

I think this is irrelevant for exactly the same reason as above.

What makes this trickier, however, is how to deal with an old-style
repository, where the submodule repositories are embedded in the
working tree that happens to be checked out.  In that case, we may
have to read .gitmodules from two places, i.e.

 (1) We are archiving a tree with a submodule at "linux/";

 (2) We read .gitmodules from that tree and learn that the submodule
     has name "kernel";

 (3) There is no ".git/modules/kernel" because the repository uses
     the old layout (if the user never was interested in this
     submodule, .git/modules/kernel may also be missing, and we
     should tell these two cases apart by checking .git/config to
     see if a corresponding entry for the "kernel" submodule exists
     there);

 (4) In a repository that uses the old layout, the submodule
     repository must be embedded somewhere in the current working
     tree (the inability to remove such a directory without losing
     the repository is why we use the new layout these days).
     We can learn where it is by looking at .gitmodules in the
     working tree: take the name "kernel" we learned earlier and
     map it to the current path ("linux-3.0/" if you have been
     following this example so far).
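
The four steps above can be sketched as one lookup function.  Again
this is illustrative Python under stated assumptions, not a proposed
implementation: the function name and its parameters are
hypothetical, the two name-to-path mappings are assumed to have been
parsed already from .gitmodules in the archived tree and in the
working tree, and "initialized" stands for the set of submodule names
that have an entry in .git/config.

```python
import os
from typing import Callable, Dict, Optional, Set


def locate_submodule_repo(
    path_in_tree: str,
    tree_paths: Dict[str, str],      # name -> path, from the archived tree
    worktree_paths: Dict[str, str],  # name -> path, from the working tree
    initialized: Set[str],           # names with an entry in .git/config
    git_dir: str,
    worktree: Optional[str],         # None when there is no working tree
    is_dir: Callable[[str], bool] = os.path.isdir,
) -> Optional[str]:
    """Return the submodule repository location, or None if not found."""
    wanted = path_in_tree.rstrip("/")

    # (1) + (2): map the path in the archived tree to the submodule name.
    name = next(
        (n for n, p in tree_paths.items() if p.rstrip("/") == wanted), None
    )
    if name is None:
        return None

    # New layout: the repository lives under $GIT_DIR/modules/$name.
    new_layout = f"{git_dir}/modules/{name}"
    if is_dir(new_layout):
        return new_layout

    # (3): missing either because of the old layout or because the user
    # never was interested; the .git/config entry tells the two apart.
    if name not in initialized:
        return None

    # (4): old layout; map the name to its current working tree path.
    current = worktree_paths.get(name)
    if worktree is None or current is None:
        return None
    embedded = f"{worktree}/{current.rstrip('/')}/.git"
    return embedded if is_dir(embedded) else None


# Old-layout example: only the embedded repository exists.
fake_dirs = {"/repo/linux-3.0/.git"}
print(locate_submodule_repo(
    "linux/", {"kernel": "linux"}, {"kernel": "linux-3.0"}, {"kernel"},
    "/repo/.git", "/repo", is_dir=fake_dirs.__contains__,
))  # prints /repo/linux-3.0/.git
```

Passing the directory check in as a parameter is only a convenience
for trying the logic without a real repository on disk.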

And in that fallback context, I would say that reading from a dirty
(or "messed with by the user") .gitmodules is the right thing to
do.  Perhaps the user may be in the process of moving the submodule
in his working tree with

    $ mv linux-3.0 linux-3.2
    $ git config -f .gitmodules submodule.kernel.path linux-3.2

but hasn't committed the change yet.

> For those reasons I deliberately decided not to reproduce the
> above logic all by myself.

As I already hinted, I agree that the "how to find the location of
submodule repository, given a particular tree in the top-level
project the submodule belongs to and the path to the submodule in
question" deserves a separate thread to discuss with area experts.
