git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jacob Keller <jacob.keller@gmail.com>
To: Emily Shaffer <emilyshaffer@google.com>
Cc: "Git mailing list" <git@vger.kernel.org>,
	"Albert Cui" <albertcui@google.com>,
	"Phillip Wood" <phillip.wood123@gmail.com>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Matheus Tavares Bernardino" <matheus.bernardino@usp.br>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Atharva Raykar" <raykar.ath@gmail.com>
Subject: Re: [RFC PATCH 0/4] cache parent project's gitdir in submodules
Date: Sat, 12 Jun 2021 13:12:58 -0700	[thread overview]
Message-ID: <CA+P7+xp5NNb6zz1LkQrv_A3Ny41WE=-h2-9VswyL_qzwFdyifw@mail.gmail.com> (raw)
In-Reply-To: <20210611225428.1208973-1-emilyshaffer@google.com>

On Fri, Jun 11, 2021 at 3:54 PM Emily Shaffer <emilyshaffer@google.com> wrote:
>
> It's necessary for a superproject to know which submodules it contains.
> However, historically submodules do not know they are anything but a
> normal single-repo Git project (or a superproject, in nested-submodules
> cases). This decision does help prevent us from having to support
> divergent behaviors in submodule projects vs. superprojects, which makes
> sure Git is (somewhat) less confusing for the reader, and helps simplify
> our code.
>
> One could imagine, though, some conveniences we could gain from
> submodules learning added behavior (though not necessarily *different*
> behavior) to provide more context about the state of the project as a
> whole, and to make large submodule-based projects easier to work with.
> One example is a series[1] I sent some time ago, adding a config to be
> shared between the superproject and all its submodules. The RFC[2] I
> sent around the same time mentions a few other examples, such as "git
> status" run in the submodule noticing whether the superproject has a
> commit referencing the submodule's current HEAD.
>
> It's expensive and non-definitive to try and guess whether or not the
> current repo is a submodule. submodule.c:get_superproject_working_tree()
> does so by essentially running 'git -C .. ls-files -- <own-path>',
> invoking an additional process. get_superproject_working_tree() is not
> called often, so that's mostly fine. However, [1] attempted to include
> an additional config located in the superproject's gitdir by running
> 'git -C .. rev-parse --git-dir' during startup - a little expensive in
> the best case, because it's an extra process, but extremely expensive in
> the case when the current repo is *not* a submodule, because we hunt all
> the way up the filesystem looking for a '.git'. Adding that cost to
> every startup is infeasible.
>

It also adds cost for no benefit to the normal case where it's not a
submodule. The cost added for non-submodules ought to be as near-zero
as possible.

> To that end, in this series I propose caching a path to the
> superproject's gitdir - by having the superproject write that relative
> path to the submodule's config on creation or update. The goal here is
> *not* to say "If I am a submodule, I must have
> submodule.superprojectGitDir set" - but instead to say "If I have
> submodule.superprojectGitDir set, then I must be a submodule." That is,
> I expect we will find edge cases where a submodule was introduced in
> some interesting way that bypassed any of the patches below, and
> therefore doesn't have the superproject's gitdir cached.
>
> The combination of these two rules:
>  - Anything relying on submodule.superprojectGitDir must be nice to
>    have, but not essential, because
>  - It's possible for a submodule to be valid without having
>    submodule.superprojectGitDir set
> makes me feel more comfortable with the idea of submodules learning
> additional behavior based on this config. I feel pretty unconfident in
> our ability to ensure that *every* submodule has this config set.
>

I think this is a good direction. I do think having some information
about being a submodule could be very useful for tools such as git
status, and making it more of a "if we know for sure, we get some
small benefit, but if we don't know then it's no harm" is a good
direction.

> The series covers a few paths for introducing that config, which I'm
> hoping covers most cases.
>  - "git submodule update" (which seems to be part of the "git submodule
>    init" flow)
>  - "git submodule absorbgitdir" to convert a "git init"'d repo into a
>    submodule
>
> Notably, we can only really set this config when 'the_repository' is the
> superproject - that appears to be the only time when we know the gitdirs
> of both the superproject and the submodule.

We could also presumably add a new command for setting this?

>
> I'm expecting folks may have a lot to say about this, so I look forward
> to discussion :)
>
>  - Emily
>
> 1: https://lore.kernel.org/git/20210423001539.4059524-1-emilyshaffer@google.com
> 2: https://lore.kernel.org/git/YHofmWcIAidkvJiD@google.com
>
> Emily Shaffer (4):
>   t7400-submodule-basic: modernize inspect() helper
>   introduce submodule.superprojectGitDir cache
>   submodule: cache superproject gitdir during absorbgitdirs
>   submodule: cache superproject gitdir during 'update'
>
>  builtin/submodule--helper.c        |  4 +++
>  git-submodule.sh                   |  9 ++++++
>  submodule.c                        | 10 ++++++
>  t/t7400-submodule-basic.sh         | 49 ++++++++++++++----------------
>  t/t7406-submodule-update.sh        | 10 ++++++
>  t/t7412-submodule-absorbgitdirs.sh |  1 +
>  6 files changed, 57 insertions(+), 26 deletions(-)
>
> --
> 2.32.0.272.g935e593368-goog
>

  parent reply	other threads:[~2021-06-12 20:14 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-11 22:54 [RFC PATCH 0/4] cache parent project's gitdir in submodules Emily Shaffer
2021-06-11 22:54 ` [RFC PATCH 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-06-14  4:52   ` Junio C Hamano
2021-06-11 22:54 ` [RFC PATCH 2/4] introduce submodule.superprojectGitDir cache Emily Shaffer
2021-06-14  5:09   ` Junio C Hamano
2021-06-15 22:00     ` Emily Shaffer
2021-06-11 22:54 ` [RFC PATCH 3/4] submodule: cache superproject gitdir during absorbgitdirs Emily Shaffer
2021-06-14  6:18   ` Junio C Hamano
2021-06-11 22:54 ` [RFC PATCH 4/4] submodule: cache superproject gitdir during 'update' Emily Shaffer
2021-06-14  6:22   ` Junio C Hamano
2021-06-15 21:27     ` Emily Shaffer
2021-06-12 20:12 ` Jacob Keller [this message]
2021-06-14  7:26 ` [RFC PATCH 0/4] cache parent project's gitdir in submodules Junio C Hamano
2021-06-15 21:18   ` Emily Shaffer
2021-06-16  0:45 ` [PATCH v2 " Emily Shaffer
2021-06-16  0:45   ` [PATCH v2 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-07-27 17:12     ` Jonathan Tan
2021-08-19 17:46       ` Emily Shaffer
2021-06-16  0:45   ` [PATCH v2 2/4] introduce submodule.superprojectGitDir cache Emily Shaffer
2021-06-16  4:40     ` Junio C Hamano
2021-06-16  4:43       ` Junio C Hamano
2021-06-18  0:03         ` Emily Shaffer
2021-06-18  0:00       ` Emily Shaffer
2021-07-27 17:46     ` Jonathan Tan
2021-08-19 17:53       ` Emily Shaffer
2021-10-14 19:25     ` Ævar Arnfjörð Bjarmason
2021-06-16  0:45   ` [PATCH v2 3/4] submodule: cache superproject gitdir during absorbgitdirs Emily Shaffer
2021-06-16  0:45   ` [PATCH v2 4/4] submodule: cache superproject gitdir during 'update' Emily Shaffer
2021-07-27 17:51     ` Jonathan Tan
2021-08-19 18:02       ` Emily Shaffer
2021-08-19 20:09   ` [PATCH v3 0/4] cache parent project's gitdir in submodules Emily Shaffer
2021-08-19 20:09     ` [PATCH v3 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-08-19 20:09     ` [PATCH v3 2/4] introduce submodule.superprojectGitDir record Emily Shaffer
2021-08-20  0:38       ` Derrick Stolee
2021-10-13 19:36         ` Emily Shaffer
2021-09-04 17:20       ` Matheus Tavares
2021-10-13 19:39         ` Emily Shaffer
2021-08-19 20:09     ` [PATCH v3 3/4] submodule: record superproject gitdir during absorbgitdirs Emily Shaffer
2021-08-20  0:50       ` Derrick Stolee
2021-10-13 19:42         ` Emily Shaffer
2021-09-04 17:27       ` Matheus Tavares
2021-10-14 18:40         ` Emily Shaffer
2021-08-19 20:09     ` [PATCH v3 4/4] submodule: record superproject gitdir during 'update' Emily Shaffer
2021-08-20  0:59       ` Derrick Stolee
2021-10-14 18:45         ` Emily Shaffer
2021-08-19 21:56     ` [PATCH v3 0/4] cache parent project's gitdir in submodules Junio C Hamano
2021-08-20  1:09     ` Derrick Stolee
2021-10-13 18:51       ` Emily Shaffer
2021-10-14 17:12         ` Derrick Stolee
2021-10-14 18:52           ` Emily Shaffer
2021-09-04 17:50     ` Matheus Tavares Bernardino

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+P7+xp5NNb6zz1LkQrv_A3Ny41WE=-h2-9VswyL_qzwFdyifw@mail.gmail.com' \
    --to=jacob.keller@gmail.com \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=albertcui@google.com \
    --cc=avarab@gmail.com \
    --cc=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=matheus.bernardino@usp.br \
    --cc=phillip.wood123@gmail.com \
    --cc=raykar.ath@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).