git@vger.kernel.org mailing list mirror (one of many)
From: Stefan Beller <sbeller@google.com>
To: Avery Pennarun <apenwarr@gmail.com>,
	Jonathan Tan <jonathantanmy@google.com>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Git Mailing List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Stephen R Guglielmo" <srguglielmo@gmail.com>,
	"A . Wilcox" <AWilcox@wilcox-tech.com>,
	"David Aguilar" <davvid@gmail.com>
Subject: Re: [PATCH 0/4] subtree: move out of contrib
Date: Mon, 30 Apr 2018 15:18:20 -0700
Message-ID: <CAGZ79kbisif8D7EiWzR_rtsc2BRNE4kKdCXtQcUk8A_Bjwq=2w@mail.gmail.com>
In-Reply-To: <CAHqTa-1KCsbG=6T8M0PLuM5s-j972jiv=vvZHUiwOxwgpPWJeA@mail.gmail.com>

On Mon, Apr 30, 2018 at 2:53 PM, Avery Pennarun <apenwarr@gmail.com> wrote:

> For the best of both worlds, I've often thought that a good balance
> would be to use the same data structure that submodule uses, but to
> store all the code in a single git repo under different refs, which we
> might or might not download (or might or might not have different
> ACLs) under different circumstances.

There has been some experimentation with having a simpler ref
surface on the submodule side,
https://public-inbox.org/git/cover.1512168087.git.jonathantanmy@google.com/

The way you describe the future of submodules, all we'd have to do
is teach git-clone how to select the "interesting" refs for your use
case. Every other command would assume that all submodule data is
in the main repository.

The difference from Jonathan's proposal linked above would be that
the object store lives in the main repo and the refs are prefixed
per submodule instead of "shadowed".
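
To make "selecting the interesting refs" a bit more concrete, here is
a rough sketch. It assumes a purely hypothetical layout in which each
submodule's refs live under refs/submodules/<name>/ in the main
repository; neither that prefix nor the helper name exists in Git, they
are only for illustration.

```python
def interesting_refspecs(wanted_submodules, prefix="refs/submodules"):
    """Build fetch refspecs for the superproject plus chosen submodules.

    The "refs/submodules/<name>/" namespace is a hypothetical layout
    used only for illustration; Git does not define it.
    """
    # Always fetch the superproject's own branches.
    refspecs = ["+refs/heads/*:refs/remotes/origin/*"]
    # One refspec per selected submodule; duplicates are dropped and the
    # order is made deterministic by sorting.
    for name in sorted(set(wanted_submodules)):
        src = f"{prefix}/{name}/*"
        refspecs.append(f"+{src}:{src}")
    return refspecs


if __name__ == "__main__":
    for spec in interesting_refspecs(["libfoo", "libbar"]):
        print(spec)
```

A clone that only cares about some submodules would pass just those
names; everything else stays on the server.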

>  However, when some projects get
> really huge (lots of very big submodule dependencies), then repacking
> one-big-repo starts becoming unwieldy; in that situation git-subtree
> also fails completely.

Yes, but that is a general scaling problem of Git that could be tackled,
e.g. by repacking into multiple packs serially instead of putting
everything into one pack.

>> Submodules do not need to produce a synthetic project history
>> when splitting off again, as the history is genuine. This allows
>> for easier work with upstream.
>
> Splitting for easier work upstream is great, and there really ought to
> be an official version of 'git subtree split', which is good for all
> sorts of purposes.
>
> However, I suspect almost all uses of the split feature are a)
> splitting a subtree that you previously merged in, or b) splitting a
> subtree into a separate project that you want to maintain separately
> from now on.  Repeated splits in case (a) are only necessary because
> you're not using submodules, or in case (b) are only necessary because
> you didn't *switch* to submodules when it finally came time to split
> the projects.  (In both cases you probably didn't switch to submodules
> because you didn't like one of its tradeoffs, especially the need to
> track multiple repos when you fork.)

That makes sense.

>
> There's one exception, which is doing a one-time permanent merge of
> two projects into one.  That's a nice feature, but is probably used
> extremely rarely.  More often people get into a
> merge-split-merge-split cycle that would be better served by a
> slightly improved git-submodule.

AFAICT this rare use case is how git-subtree came into existence in
Git's contrib directory,
https://kernel.googlesource.com/pub/scm/git/git/+/634392b26275fe5436c0ea131bc89b46476aa4ae
which is interesting to view in git-show, but I think the defaults could
be tweaked there, as it currently shows me mostly a license file.

>> Conceptually Gerrit is doing
>>
>>   while true:
>>     git submodule update --remote
>>     if worktree is dirty:
>>         git commit "update the submodules"
>>
>> just that Gerrit doesn't poll but does it event based.
>
> ...and it's super handy :)  The problem is it's fundamentally
> centralized: because gerrit can serialize merges into the submodule,
> it also knows exactly how to update the link in the supermodule.  If
> there was wild branching and merging (as there often is in git) and
> you had to resolve conflicts between two submodules, I don't think it
> would be obvious at all how to do it automatically when pushing a
> submodule.  (This also works quite badly with git subtree --squash.)

With the poll-based solution I don't think you'd run into many more
problems than you would with Gerrit's solution.
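
For reference, the poll-based loop quoted above could be sketched
roughly as follows. This is not what Gerrit runs; the function names,
the injectable `git` callable and the 60 second interval are all
assumptions made for the example. The git invocations themselves
(submodule update --remote, status --porcelain, commit -a -m) are the
ordinary commands.

```python
import subprocess
import time


def run_git(args, cwd="."):
    """Run a git command and return its stdout; raises on non-zero exit."""
    result = subprocess.run(
        ["git", *args], cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


def poll_once(git=run_git):
    """Advance submodules to their remote tips; commit if anything moved.

    Returns True when a new superproject commit was created.
    """
    git(["submodule", "update", "--remote"])
    # Non-empty porcelain output means a tracked path changed. Untracked
    # files and dirty submodule worktrees are ignored here, because
    # "commit -a" would not record them anyway.
    status = git(["status", "--porcelain", "--untracked-files=no",
                  "--ignore-submodules=dirty"])
    if not status.strip():
        return False
    git(["commit", "-a", "-m", "update the submodules"])
    return True


def poll_forever(interval_seconds=60, git=run_git):
    """Polling variant of the loop; the interval is an arbitrary choice."""
    while True:
        poll_once(git)
        time.sleep(interval_seconds)
```

Passing the `git` callable in makes it possible to exercise poll_once
without touching a real repository.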

In a nearby thread, we were just discussing the submodule merging
strategies,
https://public-inbox.org/git/1524739599.20251.17.camel@klsmartin.com/
which might seem confusing, but the implementation is actually simple,
as we only ever fast-forward in submodules.
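
The fast-forward-only idea can be illustrated with a small sketch. It
is not Git's actual merge code: the commit graph is modelled as a plain
dict, and the function names are made up for this example. The ancestry
check mirrors what `git merge-base --is-ancestor` answers.

```python
def is_ancestor(parents, ancestor, descendant):
    """Return True if `ancestor` is reachable from `descendant`.

    `parents` maps each commit id to a list of its parent ids.
    A commit counts as its own ancestor, matching
    `git merge-base --is-ancestor`.
    """
    stack, seen = [descendant], set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


def merge_submodule_pointer(parents, ours, theirs):
    """Resolve two submodule pointers by fast-forward only.

    Returns the commit that contains the other one, or None when the
    two have diverged and a human has to decide.
    """
    if is_ancestor(parents, ours, theirs):
        return theirs
    if is_ancestor(parents, theirs, ours):
        return ours
    return None
```

When one side already contains the other, the newer commit wins;
anything else is left as a conflict rather than guessed at.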

>>
>> https://trends.google.com/trends/explore?date=all&q=git%20subtree,git%20submodule
>>
>> Not sure what to make of this data.
>
> Clearly people need a lot more help when using submodules than when
> using subtree :)

That could be true. :)

Thanks,
Stefan
