From: "Avery Pennarun" <apenwarr@gmail.com>
To: "Sam Vilain" <sam@vilain.net>
Cc: git@vger.kernel.org
Subject: Re: git-submodule getting submodules from the parent repository
Date: Sun, 30 Mar 2008 19:00:55 -0400
Message-ID: <32541b130803301600g5005876enf0fbcfe03e660fc8@mail.gmail.com>
In-Reply-To: <47EECF1F.60908@vilain.net>

On Sat, Mar 29, 2008 at 7:22 PM, Sam Vilain <sam@vilain.net> wrote:
> Avery Pennarun wrote:
>  > What if *all* the objects for A, B, and C were always in the *same*
>  > repository?  Almost all the problems would go away.  Imagine if it
>  > worked like this:
>
>  Well, that would create a lot of unnecessary work when cloning.
>  Partitioning by project is a natural way to divide the projects up.

What unnecessary work do you mean?  Certainly fetching only a
particular set of refs from a remote repository is possible, as that's
what 'git pull' does.
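
For illustration, here is a minimal sketch of fetching just one ref with
an explicit refspec.  The repository names (src, dst), the "topic" branch
and the identity settings are made up for the demo; only the fetch line
matters.

```shell
set -e
tmp=$(mktemp -d)

# Throwaway "remote" with two branches, master and topic (names are made up).
git init -q "$tmp/src"
cd "$tmp/src"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
echo one > file
git add file
git commit -q -m "first"
git branch topic

# A second repository that asks for master only, via an explicit refspec.
git init -q "$tmp/dst"
cd "$tmp/dst"
git fetch -q "$tmp/src" refs/heads/master:refs/remotes/origin/master

# List the refs that arrived; topic was never requested.
git for-each-ref --format='%(refname)'
```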

I agree that partitioning by project makes sense... but it also seems
to me that throwing extra objects into a repository that requires them
anyhow shouldn't have any major negative results.  After all, if you
can't build A without B, then downloading A might as well download the
objects from B too.  Which is not to say that B shouldn't *also* have
its own repository.

>  It's worth noting that the early implementations of submodules were
>  based on this design, of keeping everything together.

I'd like to read about the rationale behind this change.  Is there a
thread you can point to?

>  However, what you are suggesting should IMHO be allowed to work.  In
>  particular, if the submodule path is ".", then I think there's a good
>  case that they should come from within the same project.  If it's a
>  relative URL, it should initialize based on the remote URL that was used
>  for the original fetch (or, rather, the remote URL for the current branch).

I agree, there's no reason to take away the existing functionality of
allowing split repos.  I was more suggesting a new functionality so
that splitting isn't *required*.

>  > 2. You still check into C, then B, then A, but it doesn't actually
>  > matter if you put B and C on a branch first or not, because 'git push'
>  > will work properly, because it auto-pushes B and C revisions based on
>  > the fact that A refers to them (ie. implicit branches via the
>  > submodule mechanism).
>
>  This push failure thing is regrettable; however it's not clear which
>  branch name the submodules should get.  A given commit might exist on
>  several branches, which one do you choose to name it?

One option is to make a simple "git push origin" operation fail if
you're not on any branch; iirc, if you try that now, it just silently
*succeeds* without uploading anything at all, which is one reason I so
frequently screw it up.  Alternatively, is there a reason I can't
upload an object *without* giving it a branch name?  I guess that
would cause problems with garbage collection.

Now, the fail-on-branchless-push option still isn't really perfect,
because then I'll screw up like this:
- make change
- check in
- try to push: fails
- switch to branch
- realize I've lost my checkin(s) and have to go scrounge in the
reflog to try to find it
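
The scrounging step can be sketched roughly as follows, assuming the
stranded commit was the last thing HEAD pointed at before the branch
switch.  The repository, the file and the "rescued" branch name are made
up for the demo.

```shell
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
echo one > file
git add file
git commit -q -m "first"

# Check in while not on any branch.
git checkout -q --detach HEAD
echo two > file
git commit -q -a -m "stranded checkin"
stranded=$(git rev-parse HEAD)

# Switching to a branch leaves that commit unreachable from any ref.
git checkout -q master

# HEAD@{1} is where HEAD pointed just before the checkout above,
# so giving it a branch name brings the commit back.
git branch rescued 'HEAD@{1}'
git log --oneline -1 rescued
```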

If we could disallow checkins to disconnected heads, then I'd get an
error at the check-in step, before I had a chance to screw up.  I think that
would be a usability improvement to git in general.  For example, if I
screw up a git-rebase and forget to abort, my HEAD ends up
disconnected and I occasionally check things in by accident and then
lose them (only to be saved by the reflog).  Perhaps an extra option
to git-commit that must be used if you want to check into a
non-branch?  Is that too harsh?
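
Something close to this can be approximated without touching git-commit,
assuming a pre-commit hook is an acceptable place for the check.  This is
only a sketch: the ALLOW_DETACHED_COMMIT variable is a name I made up to
stand in for the "extra option", not an existing git setting.

```shell
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical guard: refuse to commit when HEAD is not on a branch,
# unless ALLOW_DETACHED_COMMIT (made-up name) is set.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# symbolic-ref fails when HEAD is not a symbolic ref, i.e. is detached.
if ! git symbolic-ref -q HEAD >/dev/null && [ -z "$ALLOW_DETACHED_COMMIT" ]; then
    echo "refusing to commit: HEAD is not on a branch" >&2
    exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

# On a branch the commit goes through.
echo one > file
git add file
git commit -q -m "first"

# On a detached HEAD the hook rejects it.
git checkout -q --detach HEAD
echo two > file
git add file
if git commit -q -m "second" 2>/dev/null; then
    guard=missed
else
    guard=blocked
fi
echo "$guard"
```

Running the last commit with ALLOW_DETACHED_COMMIT=1 in the environment
lets it through, which is the escape hatch for people who really do want
to check into a non-branch.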

Another option would be to simply *always* create/update a branch tag
when doing "git submodule update".  But then the question is which
branch tag.  One thing that would give pretty useful semantics would
be to create a local branch tag with the same name as it had when the
submodule ref was checked in in the first place.

That is, if I make a change to B on branch "master," and then check
that into A, then the next time someone checks out A, it would be
great if it retrieved B with the same commitid, then named that
commitid "master".  This is *despite* the fact that the
"remotes/master" branch might be *newer* than the newly created local
"master".  Why?  Because it would give standard git semantics to my
new checkout: I can "git pull" to pull in the latest changes from
remote/master, or I can checkin and try to "git push" and it'll fail,
just like it should, which would encourage me to either "git pull" or
create a *new* topic branch, just as it should.
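
Roughly what I have in mind, as a sketch: the branch name hint is just a
shell variable here (where it would really come from is the open
question), and refs/remotes/origin/master is faked with update-ref so the
demo needs no real remote.

```shell
set -e
tmp=$(mktemp -d)
git init -q "$tmp/B"
cd "$tmp/B"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# The commit of B that A recorded.
echo one > file
git add file
git commit -q -m "recorded in A"
recorded=$(git rev-parse HEAD)

# Newer upstream work; pretend it is the remote's master.
echo two > file
git commit -q -a -m "newer upstream work"
git update-ref refs/remotes/origin/master HEAD

# What "git submodule update" leaves you with: a detached HEAD.
git checkout -q --detach "$recorded"

# Proposed behaviour: name the recorded commit after the hinted branch.
hint=master    # assumption: the hint is available from somewhere
git checkout -q -B "$hint" "$recorded"

git symbolic-ref HEAD                      # now on a real local branch
git rev-list --count HEAD..origin/master   # commits a "git pull" would bring in
```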

I think that solution would be great for me, but it would require
changes to the tree format in order to store the branch name, which is
unfortunate, since trees currently don't know *anything* about branch
names afaict.  Is there a better way to achieve the same result?
(Note that in this case, "correct" operation doesn't *require* the new
branch name information; we just use it as a hint.)

>  > 4. You can 'git clone' a local copy of A, and B/C will be cloned
>  > automatically along with it.
>
>  > 6. git-pull should be modified to auto-download objects referred to by
>  > 'submodule' references in trees.
>
>  I think this could be a switch to git clone/pull, configurable to be the
>  default action.

Sure.  Or it could try by default, but not error out if it turned out
not to be able to find them.

>  > 5. B and C, when git-submodule checks them out, should have their own
>  > .git directories, but use A as an 'alternatives' entry.
>
>  There is also a Google Summer of Code project for this - see
>  http://git.or.cz/gitwiki/SoC2008Ideas#head-9215572f23513542a23d3555aa72775bc4b91038

OK.  I was hoping it wouldn't be so hard as to require an entire SoC
project, since setting up an alternate (e.g. cloning with --reference)
when checking out the child repo shouldn't be too hard.
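
A minimal sketch of the alternates setup, assuming "git clone --reference"
is good enough for the job.  The paths are made up, and for the demo the
child is simply cloned from A itself rather than from a real B.

```shell
set -e
tmp=$(mktemp -d)

# Parent repository A (stand-in: a single commit).
git init -q "$tmp/A"
cd "$tmp/A"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
echo one > file
git add file
git commit -q -m "first"

# Child checkout with its own .git directory that borrows objects
# from A instead of copying them.
git clone -q --reference "$tmp/A" "$tmp/A" "$tmp/child"

# The alternates file records A's object directory.
cat "$tmp/child/.git/objects/info/alternates"
```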

Have fun,

Avery
