git@vger.kernel.org mailing list mirror (one of many)
From: Nick Townsend <nick.townsend@mac.com>
To: git@vger.kernel.org
Subject: Fwd: [PATCH] submodule recursion in git-archive
Date: Mon, 02 Dec 2013 16:03:37 -0800
Message-ID: <D8D13DC5-0E93-4900-A738-A4A6700BC92F@mac.com>
In-Reply-To: 3651F1C2-741E-4170-9468-0EF07F120CB9@mac.com



Begin forwarded message:

> From: Nick Townsend <nick.townsend@mac.com>
> Subject: Re: [PATCH] submodule recursion in git-archive
> Date: 2 December 2013 16:00:50 GMT-8
> To: Junio C Hamano <gitster@pobox.com>
> Cc: René Scharfe <l.s.r@web.de>, Jens Lehmann <Jens.Lehmann@web.de>, git@vger.kernel.org, Jeff King <peff@peff.net>
> 
> 
> On 27 Nov 2013, at 11:43, Junio C Hamano <gitster@pobox.com> wrote:
> 
>> Nick Townsend <nick.townsend@mac.com> writes:
>> 
>>> On 26 Nov 2013, at 14:18, Junio C Hamano <gitster@pobox.com> wrote:
>>> 
>>>> Even if the code is run inside a repository with a working tree,
>>>> when producing a tarball out of an ancient commit that had a
>>>> submodule not at its current location, --recurse-submodules option
>>>> should do the right thing, so asking for working tree location of
>>>> that submodule to find its repository is wrong, I think.  It may
>>>> happen to find one if the archived revision is close enough to what
>>>> is currently checked out, but that may not necessarily be the case.
>>>> 
>>>> At that point when the code discovers an S_ISGITLINK entry, it
>>>> should have both a pathname to the submodule relative to the
>>>> toplevel and the commit object name bound to that submodule
>>>> location.  What it should do, when it does not find the repository
>>>> at the given path (maybe because there is no working tree, or the
>>>> submodule directory has moved over time) is roughly:
>>>> 
>>>> - Read from .gitmodules at the top-level from the tree it is
>>>> creating the tarball out of;
>>>> 
>>>> - Find "submodule.$name.path" entry that records that path to the
>>>> submodule; and then
>>>> 
>>>> - Using that $name, find the stashed-away location of the submodule
>>>> repository in $GIT_DIR/modules/$name.
>>>> 
>>>> or something like that.
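
The three lookup steps above can be sketched as follows. This is only an illustration in Python, not the C code under discussion: the function names are hypothetical, the parser handles just the simple `[submodule "name"]` / `key = value` form, and the .gitmodules text is assumed to have been read from the tree being archived (for example with `git show <tree>:.gitmodules`).

```python
import posixpath


def parse_gitmodules(text):
    """Return {name: {key: value}} from .gitmodules text (simplified parser)."""
    modules = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            header = line[1:-1].strip()
            if header.startswith("submodule ") and header.endswith('"'):
                name = header[len("submodule "):].strip().strip('"')
                current = modules.setdefault(name, {})
            else:
                current = None  # some other section; ignore its keys
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            # Config keys are case-insensitive, so normalise them.
            current[key.strip().lower()] = value.strip()
    return modules


def name_for_path(gitmodules_text, path):
    """Find the $name whose submodule.$name.path matches the gitlink path."""
    wanted = path.rstrip("/")
    for name, settings in parse_gitmodules(gitmodules_text).items():
        if settings.get("path", "").rstrip("/") == wanted:
            return name
    return None


def stashed_repo_path(git_dir, gitmodules_text, path):
    """Map a submodule path to $GIT_DIR/modules/$name, or None if unknown."""
    name = name_for_path(gitmodules_text, path)
    if name is None:
        return None
    return posixpath.join(git_dir, "modules", name)
```

The point of the sketch is that the path-to-name mapping comes from the archived tree, so the result does not depend on where the submodule currently sits in the working tree.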
>>>> 
>>>> This is a related tangent, but when used in a repository that people
>>>> often use as their remote, the repository discovery may have to
>>>> interact with the relative URL.  People often ship .gitmodules with
>>>> 
>>>> 	[submodule "bar"]
>>>> 		URL = ../bar.git
>>>> 		path = barDir
>>>> 
>>>> for a top-level project "foo" that can be cloned thusly:
>>>> 
>>>> 	git clone git://site.xz/foo.git
>>>> 
>>>> and host bar.git to be clonable with
>>>> 
>>>> 	git clone git://site.xz/bar.git barDir/
>>>> 
>>>> inside the working tree of the foo project.  In such a case, when
>>>> "archive --recurse-submodules" is running, it would find the
>>>> repository for the "bar" submodule at "../bar.git", I would think.
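
How "../bar.git" might be resolved against the location of foo.git can be sketched like this. It is a deliberately simplified illustration: `resolve_relative_url` is a hypothetical helper, and the resolution code in git itself handles more cases (scp-style URLs, missing path components) than this does.

```python
def resolve_relative_url(parent_url, sub_url):
    """Resolve a ./ or ../ submodule URL against the superproject location.

    Anything that does not start with "./" or "../" is treated as already
    absolute and returned unchanged.
    """
    if not sub_url.startswith(("./", "../")):
        return sub_url
    base = parent_url.rstrip("/")
    rest = sub_url
    while True:
        if rest.startswith("../"):
            # Drop the last component of the base for each "../".
            base = base.rsplit("/", 1)[0]
            rest = rest[3:]
        elif rest.startswith("./"):
            rest = rest[2:]
        else:
            break
    return base + "/" + rest
```

With the example above, resolving "../bar.git" against "git://site.xz/foo.git" gives "git://site.xz/bar.git"; the same logic applied to a local path for foo.git on the hosting machine yields the sibling bar.git directory.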
>>>> 
>>>> So this part needs a bit more thought, I am afraid.
>>> 
>>> I see that there is a lot of potential complexity around setting up a submodule:
>> 
>> No question about it.
>> 
>>> * The .gitmodules file can be dirty (easy to flag, but should we
>>> allow archive to proceed?)
>> 
>> As we are discussing "archive", which takes a tree object from the
>> top-level project that is recorded in the object database, the
>> information _about_ the submodule in question should come from the
>> given tree being archived.  There is no reason for the .gitmodules
>> file that happens to be sitting in the working tree of the top-level
>> project to be involved in the decision, so its dirtiness should not
>> matter, I think.  If the tree being archived has a submodule whose
>> name is "kernel" at path "linux/" (relative to the top-level
>> project), its repository should be at .git/modules/kernel in the
>> layout recent git-submodule prepares, and we should find that
>> path-and-name mapping from .gitmodules recorded in that tree object
>> we are archiving. The version that happens to be checked out to the
>> working tree may have moved the submodule to a new path "linux-3.0/"
>> and "linux-3.0/.git" may have "gitdir: .git/modules/kernel" in it,
>> but when archiving a tree that has the submodule at "linux/", it
>> would not help---we would not know to look at "linux-3.0/.git" to
>> learn that information anyway because .gitmodules in the working
>> tree would say that the submodule at path "linux-3.0/" is with name
>> "kernel", and would not tell us anything about "linux/".
>> 
>>> * Users can mess with settings both prior to git submodule init
>>> and before git submodule update.
>> 
>> I think this is irrelevant for exactly the same reason as above.
>> 
>> What makes this trickier, however, is how to deal with an old-style
>> repository, where the submodule repositories are embedded in the
>> working tree that happens to be checked out.  In that case, we may
>> have to read .gitmodules from two places, i.e.
>> 
>> (1) We are archiving a tree with a submodule at "linux/";
>> 
>> (2) We read .gitmodules from that tree and learn that the submodule
>>    has name "kernel";
>> 
>> (3) There is no ".git/modules/kernel" because the repository uses
>>    the old layout (if the user never was interested in this
>>    submodule, .git/modules/kernel may also be missing, and we
>>    should tell these two cases apart by checking .git/config to
>>    see if a corresponding entry for the "kernel" submodule exists
>>    there);
>> 
>> (4) In a repository that uses the old layout, there must be the
>>    repository somewhere embedded in the current working tree (not
>>    being able to remove such an embedded repository is why we use
>>    the new layout these days).
>>    We can learn where it is by looking at .gitmodules in the
>>    working tree---map the name "kernel" we learned earlier, and
>>    map it to the current path ("linux-3.0/" if you have been
>>    following this example so far).
>> 
>> And in that fallback context, I would say that reading from a dirty
>> (or "messed with by the user") .gitmodules is the right thing to
>> do.  Perhaps the user may be in the process of moving the submodule
>> in his working tree with
>> 
>>   $ mv linux-3.0 linux-3.2
>>   $ git config -f .gitmodules submodule.kernel.path linux-3.2
>> 
>> but hasn't committed the change yet.
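
The fallback order in steps (1) to (4) can be sketched as below. This is an illustration under stated assumptions rather than a proposed implementation: `locate_submodule_repo` and its parameters are hypothetical, the two name-to-path mappings are assumed to have been parsed already from the .gitmodules in the archived tree and in the working tree, `configured_names` stands for the submodule names that have an entry in .git/config, and `exists` is a caller-supplied predicate so that the logic can be exercised without a real repository.

```python
import posixpath


def locate_submodule_repo(path, tree_modules, worktree_modules,
                          configured_names, git_dir, worktree, exists):
    """Return the repository location for the submodule at `path`, or None.

    tree_modules / worktree_modules: {name: path} mappings (assumed inputs).
    """
    wanted = path.rstrip("/")

    # (1)/(2): learn the submodule name from the tree being archived.
    name = next((n for n, p in tree_modules.items()
                 if p.rstrip("/") == wanted), None)
    if name is None:
        return None

    # New layout: the repository is stashed under $GIT_DIR/modules/$name.
    stashed = posixpath.join(git_dir, "modules", name)
    if exists(stashed):
        return stashed

    # (3): no stashed repository; if .git/config has no entry either,
    # the user was never interested in this submodule.
    if name not in configured_names:
        return None

    # (4): old layout; map the name to its current path using the
    # working-tree .gitmodules, even if that file is dirty.
    current = worktree_modules.get(name)
    if current is None:
        return None
    embedded = posixpath.join(worktree, current.rstrip("/"), ".git")
    return embedded if exists(embedded) else None
```

In the running example, archiving a tree with the submodule at "linux/" in an old-layout repository whose working tree has it at "linux-3.0/" would yield the embedded repository under "linux-3.0/.git".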
>> 
>>> For those reasons I deliberately decided not to reproduce the
>>> above logic all by myself.
>> 
>> As I already hinted, I agree that the "how to find the location of
>> submodule repository, given a particular tree in the top-level
>> project the submodule belongs to and the path to the submodule in
>> question" deserves a separate thread to discuss with area experts.
> 
> As per my email to Heiko on this thread, I’m happy to start such a
> discussion - I’ll use your notes as a starting point. I’m much more
> comfortable using a wiki for this - is this common or should I start
> a new mail thread with RFC in the title or similar?
> 
> I did complete my work on my version of git-archive (for internal
> use) and added some regression tests for current behaviour. Also the
> add_submodule_odb patch should IMHO be incorporated anyway. I’ll
> resubmit those two for consideration in a new thread.
> 
> Kind Regards
> Nick Townsend
> 
