git@vger.kernel.org mailing list mirror (one of many)
 help / Atom feed
From: Stefan Beller <sbeller@google.com>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Jeff King <peff@peff.net>,
	John.Coiner@amd.com, git <git@vger.kernel.org>
Subject: Re: git, monorepos, and access control
Date: Thu, 6 Dec 2018 14:15:47 -0800
Message-ID: <CAGZ79kaQhQi9XVQVLresPWwQkRAKgXNmkt-jwZ-o0703FFW=Ew@mail.gmail.com> (raw)
In-Reply-To: <nycvar.QRO.7.76.6.1812062100020.41@tvgsbejvaqbjf.bet>

On Thu, Dec 6, 2018 at 12:09 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi,
>
> On Wed, 5 Dec 2018, Jeff King wrote:
>
> > The model that fits more naturally with how Git is implemented would be
> > to use submodules. There you leak the hash of the commit from the
> > private submodule, but that's probably obscure enough (and if you're
> > really worried, you can add a random nonce to the commit messages in the
> > submodule to make their hashes unguessable).
>
> I hear myself frequently saying: "Friends don't let friends use
> submodules". It's almost like: "Some people think their problem is solved
> by using submodules. Only now they have two problems."

Blaming tools for their lack of evolution/development is not necessarily the
right approach. I recall having patches rejected on this very mailing list
that fixed obvious but minor good things like whitespaces and coding style,
because it *might* produce merge conflicts. Would that situation warrant me
to blame the lacks in the merge algorithm, or could you imagine a better
way out? (No need to answer, it's purely to demonstrate that blaming
tooling is not always the right approach; only sometimes it may be)

> There are big reasons, after all, why some companies go for monorepos: it
> is not for lack of trying to go with submodules, it is the problems that
> were incurred by trying to treat entire repositories the same as single
> files (or even trees): they are just too different.

We could change that in more places.

One example you might think of is the output of git-status that displays
changed files. And in case of submodules it would just show
"submodule changes", which we already differentiate into "dirty tree" and
"different sha1 at HEAD".
Instead we could have the output of all changed files recursively in the
superprojects git-status output.

Another example is the diff machinery, which already knows some
basics such as embedding submodule logs or actual diffs.

> In a previous life, I also tried to go for submodules, was burned, and had
> to restart the whole thing. We ended up with something that might work in
> this instance, too, although our use case was not need-to-know type of
> encapsulation. What we went for was straight up modularization.

So this is a "Fix the data instead of the tool", which seems to be a local
optimization (i.e. you only have to do it once, such that it is cheaper to
do than fixing the tool for that workgroup)
... and because everyone does that the tool never gets fixed.

> What I mean is that we split the project up into over 100 individual
> projects that are now all maintained in individual repositories, and they
> are connected completely outside of Git, via a dependency management
> system (in this case, Maven, although that is probably too Java-centric
> for AMD's needs).

Once you have the dependency management system in place, you
will encounter the rare case of still wanting to change things across
repository boundaries at the same time. Submodules offer that, which
is why Android wants to migrate off of the repo tool, and there it seems
natural to go for submodules.

> I just wanted to throw that out here: if you can split up your project
> into individual projects, it might make sense not to maintain them as
> submodules but instead as individual repositories whose artifacts are
> uploaded into a central, versioned artifact store (Maven, NuGet, etc). And
> those artifacts would then be retrieved by the projects that need them.

This is cool and industry standard. But once you happen to run in a bug
that involves 2 new artifacts (but each of the new artifacts work fine
on their own), then you'd wish for something like "git-bisect but across
repositories". Submodules (in theory) allow for fine grained bisection
across these repository boundaries, I would think.

> I figure that that scheme might work for you better than submodules: I
> could imagine that you need to make the build artifacts available even to
> people who are not permitted to look at the corresponding source code,
> anyway.

This is a sensible suggestion, as they probably don't want to ramp up
development on submodules. :-)

  reply index

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-05 20:13 Coiner, John
2018-12-05 20:34 ` Ævar Arnfjörð Bjarmason
2018-12-05 20:43   ` Derrick Stolee
2018-12-05 20:58     ` Duy Nguyen
2018-12-05 21:12       ` Ævar Arnfjörð Bjarmason
2018-12-05 23:42         ` Coiner, John
2018-12-06  7:23           ` Jeff King
2018-12-05 21:01 ` Jeff King
2018-12-06  0:23   ` brian m. carlson
2018-12-06  1:08   ` Junio C Hamano
2018-12-06  7:20     ` Jeff King
2018-12-06  9:17       ` Ævar Arnfjörð Bjarmason
2018-12-06  9:30         ` Jeff King
2018-12-06 20:08   ` Johannes Schindelin
2018-12-06 22:15     ` Stefan Beller [this message]
2018-12-06 22:59     ` Coiner, John
2018-12-05 22:37 ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAGZ79kaQhQi9XVQVLresPWwQkRAKgXNmkt-jwZ-o0703FFW=Ew@mail.gmail.com' \
    --to=sbeller@google.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=John.Coiner@amd.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

git@vger.kernel.org mailing list mirror (one of many)

Archives are clonable:
	git clone --mirror https://public-inbox.org/git
	git clone --mirror http://ou63pmih66umazou.onion/git
	git clone --mirror http://czquwvybam4bgbro.onion/git
	git clone --mirror http://hjrcffqmbrq6wope.onion/git

Newsgroups are available over NNTP:
	nntp://news.public-inbox.org/inbox.comp.version-control.git
	nntp://ou63pmih66umazou.onion/inbox.comp.version-control.git
	nntp://czquwvybam4bgbro.onion/inbox.comp.version-control.git
	nntp://hjrcffqmbrq6wope.onion/inbox.comp.version-control.git
	nntp://news.gmane.org/gmane.comp.version-control.git

 note: .onion URLs require Tor: https://www.torproject.org/
       or Tor2web: https://www.tor2web.org/

AGPL code for this site: git clone https://public-inbox.org/ public-inbox