From: Jacob Keller <jacob.keller@gmail.com>
To: Emily Shaffer <emilyshaffer@google.com>
Cc: "Git mailing list" <git@vger.kernel.org>,
"Albert Cui" <albertcui@google.com>,
"Phillip Wood" <phillip.wood123@gmail.com>,
"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Junio C Hamano" <gitster@pobox.com>,
"Matheus Tavares Bernardino" <matheus.bernardino@usp.br>,
"Jonathan Nieder" <jrnieder@gmail.com>,
"Atharva Raykar" <raykar.ath@gmail.com>
Subject: Re: [RFC PATCH 0/4] cache parent project's gitdir in submodules
Date: Sat, 12 Jun 2021 13:12:58 -0700 [thread overview]
Message-ID: <CA+P7+xp5NNb6zz1LkQrv_A3Ny41WE=-h2-9VswyL_qzwFdyifw@mail.gmail.com> (raw)
In-Reply-To: <20210611225428.1208973-1-emilyshaffer@google.com>
On Fri, Jun 11, 2021 at 3:54 PM Emily Shaffer <emilyshaffer@google.com> wrote:
>
> It's necessary for a superproject to know which submodules it contains.
> However, historically submodules do not know they are anything but a
> normal single-repo Git project (or a superproject, in nested-submodules
> cases). This decision does help prevent us from having to support
> divergent behaviors in submodule projects vs. superprojects, which makes
> sure Git is (somewhat) less confusing for the reader, and helps simplify
> our code.
>
> One could imagine, though, some conveniences we could gain from
> submodules learning added behavior (though not necessarily *different*
> behavior) to provide more context about the state of the project as a
> whole, and to make large submodule-based projects easier to work with.
> One example is a series[1] I sent some time ago, adding a config to be
> shared between the superproject and all its submodules. The RFC[2] I
> sent around the same time mentions a few other examples, such as "git
> status" run in the submodule noticing whether the superproject has a
> commit referencing the submodule's current HEAD.
>
> It's expensive and non-definitive to try and guess whether or not the
> current repo is a submodule. submodule.c:get_superproject_working_tree()
> does so by essentially running 'git -C .. ls-files -- <own-path>',
> invoking an additional process. get_superproject_working_tree() is not
> called often, so that's mostly fine. However, [1] attempted to include
> an additional config located in the superproject's gitdir by running
> 'git -C .. rev-parse --git-dir' during startup - a little expensive in
> the best case, because it's an extra process, but extremely expensive in
> the case when the current repo is *not* a submodule, because we hunt all
> the way up the filesystem looking for a '.git'. Adding that cost to
> every startup is infeasible.
>
It also adds cost for no benefit to the normal case where it's not a
submodule. The cost added for non-submodules ought to be as near-zero
as possible.
> To that end, in this series I propose caching a path to the
> superproject's gitdir - by having the superproject write that relative
> path to the submodule's config on creation or update. The goal here is
> *not* to say "If I am a submodule, I must have
> submodule.superprojectGitDir set" - but instead to say "If I have
> submodule.superprojectGitDir set, then I must be a submodule." That is,
> I expect we will find edge cases where a submodule was introduced in
> some interesting way that bypassed any of the patches below, and
> therefore doesn't have the superproject's gitdir cached.
>
> The combination of these two rules:
> - Anything relying on submodule.superprojectGitDir must be nice to
> have, but not essential, because
> - It's possible for a submodule to be valid without having
> submodule.superprojectGitDir set
> makes me feel more comfortable with the idea of submodules learning
> additional behavior based on this config. I feel pretty unconfident in
> our ability to ensure that *every* submodule has this config set.
>
I think this is a good direction. I do think having some information
about being a submodule could be very useful for tools such as git
status, and making it more of a "if we know for sure, we get some
small benefit, but if we don't know then it's no harm" is a good
direction.
> The series covers a few paths for introducing that config, which I'm
> hoping covers most cases.
> - "git submodule update" (which seems to be part of the "git submodule
> init" flow)
> - "git submodule absorbgitdir" to convert a "git init"'d repo into a
> submodule
>
> Notably, we can only really set this config when 'the_repository' is the
> superproject - that appears to be the only time when we know the gitdirs
> of both the superproject and the submodule.
We could also presumably add a new command for setting this?
>
> I'm expecting folks may have a lot to say about this, so I look forward
> to discussion :)
>
> - Emily
>
> 1: https://lore.kernel.org/git/20210423001539.4059524-1-emilyshaffer@google.com
> 2: https://lore.kernel.org/git/YHofmWcIAidkvJiD@google.com
>
> Emily Shaffer (4):
> t7400-submodule-basic: modernize inspect() helper
> introduce submodule.superprojectGitDir cache
> submodule: cache superproject gitdir during absorbgitdirs
> submodule: cache superproject gitdir during 'update'
>
> builtin/submodule--helper.c | 4 +++
> git-submodule.sh | 9 ++++++
> submodule.c | 10 ++++++
> t/t7400-submodule-basic.sh | 49 ++++++++++++++----------------
> t/t7406-submodule-update.sh | 10 ++++++
> t/t7412-submodule-absorbgitdirs.sh | 1 +
> 6 files changed, 57 insertions(+), 26 deletions(-)
>
> --
> 2.32.0.272.g935e593368-goog
>
next prev parent reply other threads:[~2021-06-12 20:14 UTC|newest]
Thread overview: 51+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-11 22:54 [RFC PATCH 0/4] cache parent project's gitdir in submodules Emily Shaffer
2021-06-11 22:54 ` [RFC PATCH 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-06-14 4:52 ` Junio C Hamano
2021-06-11 22:54 ` [RFC PATCH 2/4] introduce submodule.superprojectGitDir cache Emily Shaffer
2021-06-14 5:09 ` Junio C Hamano
2021-06-15 22:00 ` Emily Shaffer
2021-06-11 22:54 ` [RFC PATCH 3/4] submodule: cache superproject gitdir during absorbgitdirs Emily Shaffer
2021-06-14 6:18 ` Junio C Hamano
2021-06-11 22:54 ` [RFC PATCH 4/4] submodule: cache superproject gitdir during 'update' Emily Shaffer
2021-06-14 6:22 ` Junio C Hamano
2021-06-15 21:27 ` Emily Shaffer
2021-06-12 20:12 ` Jacob Keller [this message]
2021-06-14 7:26 ` [RFC PATCH 0/4] cache parent project's gitdir in submodules Junio C Hamano
2021-06-15 21:18 ` Emily Shaffer
2021-06-16 0:45 ` [PATCH v2 " Emily Shaffer
2021-06-16 0:45 ` [PATCH v2 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-07-27 17:12 ` Jonathan Tan
2021-08-19 17:46 ` Emily Shaffer
2021-06-16 0:45 ` [PATCH v2 2/4] introduce submodule.superprojectGitDir cache Emily Shaffer
2021-06-16 4:40 ` Junio C Hamano
2021-06-16 4:43 ` Junio C Hamano
2021-06-18 0:03 ` Emily Shaffer
2021-06-18 0:00 ` Emily Shaffer
2021-07-27 17:46 ` Jonathan Tan
2021-08-19 17:53 ` Emily Shaffer
2021-10-14 19:25 ` Ævar Arnfjörð Bjarmason
2021-06-16 0:45 ` [PATCH v2 3/4] submodule: cache superproject gitdir during absorbgitdirs Emily Shaffer
2021-06-16 0:45 ` [PATCH v2 4/4] submodule: cache superproject gitdir during 'update' Emily Shaffer
2021-07-27 17:51 ` Jonathan Tan
2021-08-19 18:02 ` Emily Shaffer
2021-08-19 20:09 ` [PATCH v3 0/4] cache parent project's gitdir in submodules Emily Shaffer
2021-08-19 20:09 ` [PATCH v3 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-08-19 20:09 ` [PATCH v3 2/4] introduce submodule.superprojectGitDir record Emily Shaffer
2021-08-20 0:38 ` Derrick Stolee
2021-10-13 19:36 ` Emily Shaffer
2021-09-04 17:20 ` Matheus Tavares
2021-10-13 19:39 ` Emily Shaffer
2021-08-19 20:09 ` [PATCH v3 3/4] submodule: record superproject gitdir during absorbgitdirs Emily Shaffer
2021-08-20 0:50 ` Derrick Stolee
2021-10-13 19:42 ` Emily Shaffer
2021-09-04 17:27 ` Matheus Tavares
2021-10-14 18:40 ` Emily Shaffer
2021-08-19 20:09 ` [PATCH v3 4/4] submodule: record superproject gitdir during 'update' Emily Shaffer
2021-08-20 0:59 ` Derrick Stolee
2021-10-14 18:45 ` Emily Shaffer
2021-08-19 21:56 ` [PATCH v3 0/4] cache parent project's gitdir in submodules Junio C Hamano
2021-08-20 1:09 ` Derrick Stolee
2021-10-13 18:51 ` Emily Shaffer
2021-10-14 17:12 ` Derrick Stolee
2021-10-14 18:52 ` Emily Shaffer
2021-09-04 17:50 ` Matheus Tavares Bernardino
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CA+P7+xp5NNb6zz1LkQrv_A3Ny41WE=-h2-9VswyL_qzwFdyifw@mail.gmail.com' \
--to=jacob.keller@gmail.com \
--cc=Johannes.Schindelin@gmx.de \
--cc=albertcui@google.com \
--cc=avarab@gmail.com \
--cc=emilyshaffer@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jrnieder@gmail.com \
--cc=matheus.bernardino@usp.br \
--cc=phillip.wood123@gmail.com \
--cc=raykar.ath@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).