git@vger.kernel.org mailing list mirror (one of many)
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>,
	"Emily Shaffer" <emilyshaffer@google.com>,
	"Albert Cui" <albertcui@google.com>,
	"Phillip Wood" <phillip.wood123@gmail.com>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"Matheus Tavares Bernardino" <matheus.bernardino@usp.br>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Jacob Keller" <jacob.keller@gmail.com>,
	"Atharva Raykar" <raykar.ath@gmail.com>,
	"Derrick Stolee" <stolee@gmail.com>,
	"Jonathan Tan" <jonathantanmy@google.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [RFC PATCH 0/2] submodule: test what happens if submodule.superprojectGitDir isn't around
Date: Wed, 17 Nov 2021 12:43:38 +0100
Message-ID: <RFC-cover-0.2-00000000000-20211117T113134Z-avarab@gmail.com>
In-Reply-To: <20211117005701.371808-1-emilyshaffer@google.com>

On Tue, Nov 16 2021, Emily Shaffer wrote:

> [...]
> A couple things. Firstly, a semantics change *back* to the semantics of
> v3 - we map from gitdir to gitdir, *not* from common dir to common dir,
> so that theoretically a submodule with multiple worktrees in multiple
> superproject worktrees will be able to figure out which worktree of the
> superproject it's in. (Realistically, that's not really possible right
> now, but I'd like to change that soon.)
>
> Secondly, a rewording of comments and commit messages to indicate that
> this isn't a cache of some expensive operation, but rather intended to
> be the source of truth for all submodules. I also added a fifth commit
> rewriting `git rev-parse --show-superproject-working-tree` to
> demonstrate what that means in practice - but from a practical
> standpoint, I'm a little worried about that fifth patch. More details in
> the patch 5 description.
>
> I did discuss Ævar's idea of relying on in-process filesystem digging to
> find the superproject's gitdir with the rest of the Google team, but in
> the end decided that there are some worries about filesystem digging in
> this way (namely, some ugly interactions with network drives that are
> actually already an issue for Googler Linux machines). Plus, the allure
> of being able to definitively know that we're a submodule is pretty
> strong. ;) But overall, this is the direction I'd prefer to keep going
> in, rather than trying to guess from the filesystem going forward.

Did you try running the ad-hoc benchmark I included in [1] on that
Google NFS? I've dealt with some slow-ish network filesystems, but if
it's slower than AIX's local FS (where I couldn't see a difference) I'd
put money on it being a cross-Atlantic mount or something :)

Re your:

    "this isn't a cache of some expensive operation, but rather intended to
    be the source of truth for all submodules."

In your 5/5 it says, in seeming contradiction to this:

    This commit may be more of an RFC - to demonstrate what life looks like
    if we use submodule.superprojectGitDir as the source of truth. But since
    'git rev-parse --show-superproject-working-tree' is used in a lot of
    scripts in the wild[1], I'm not so sure it's a great example.

    To be honest, I'd prefer to die("Try running 'git submodule update'")
    here, but I don't think that's very script-friendly. However, falling
    back on the old implementation kind of undermines the idea of treating
    submodule.superprojectGitDir as the point of truth.

Most of what I've been saying in [1] and related messages is that I'm
confused about whether and how this is a pure caching mechanism.

Removing the mentions of it being a cache while it seemingly still is
one at the tip of this series has just added to that confusion for
me :)

Anyway, I do think this caching mechanism is probably unnecessary in
the short to medium term, i.e. it seems that to the extent it was ever
needed, that was due to some bridging of *.sh<->*.c that we're *this*
close to eliminating anyway.

But maybe I'm wrong. The benchmark I suggested above on that Google
NFS might be indicative. I don't really see how something that'll be
doing a bunch of FS ops anyway is going to be noticeably slower with
that approach, but maybe opening the index/tree of the superproject is
more expensive than I'm expecting.

In any case, all of that's not the hill I'm picking to die on. If
you'd like to go ahead with this cache-or-not-a-cache then sure, I
won't belabor that point.

I *do* strongly think, though, that if we're doing so we should have
something like this on top. I.e. let's test what happens if we do and
don't have this "caching" variable, which is demonstrably easy to do.
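
As a rough illustration of what "testing both modes" amounts to, here
is a minimal sketch of a wrapper that runs a command once with the
variable set to "true" and once with "false". The helper name
run_both_modes is hypothetical and not part of the patches; only the
GIT_TEST_SUBMODULE_CACHE_SUPERPROJECT_DIR variable and the test script
name come from this series.

```shell
#!/bin/sh
# run_both_modes CMD [ARGS...]
# Hypothetical helper: run CMD once per mode of the test variable and
# stop at the first failure.
run_both_modes () {
	for mode in true false
	do
		GIT_TEST_SUBMODULE_CACHE_SUPERPROJECT_DIR=$mode "$@" || return 1
	done
}

# Example invocation from the top of a built git.git checkout:
#   (cd t && run_both_modes ./t7412-submodule-absorbgitdirs.sh)
```

Returning on the first failure keeps the exit status meaningful if
this is called from CI or another script.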

Benchmarking the two gives me:

    $ git hyperfine -L rev HEAD~0 -L s true,false -s 'make -j8 all' '(cd t && GIT_TEST_SUBMODULE_CACHE_SUPERPROJECT_DIR={s} ./t7412-submodule-absorbgitdirs.sh)'
    Benchmark 1: (cd t && GIT_TEST_SUBMODULE_CACHE_SUPERPROJECT_DIR=true ./t7412-submodule-absorbgitdirs.sh)' in 'HEAD~0
      Time (mean ± σ):     545.9 ms ±   1.6 ms    [User: 490.3 ms, System: 114.0 ms]
      Range (min … max):   543.5 ms … 548.1 ms    10 runs
     
    Benchmark 2: (cd t && GIT_TEST_SUBMODULE_CACHE_SUPERPROJECT_DIR=false ./t7412-submodule-absorbgitdirs.sh)' in 'HEAD~0
      Time (mean ± σ):     537.9 ms ±  11.4 ms    [User: 476.8 ms, System: 117.6 ms]
      Range (min … max):   532.7 ms … 570.1 ms    10 runs
     
    Summary
      '(cd t && GIT_TEST_SUBMODULE_CACHE_SUPERPROJECT_DIR=false ./t7412-submodule-absorbgitdirs.sh)' in 'HEAD~0' ran
        1.01 ± 0.02 times faster than '(cd t && GIT_TEST_SUBMODULE_CACHE_SUPERPROJECT_DIR=true ./t7412-submodule-absorbgitdirs.sh)' in 'HEAD~0'

I.e. not using the cache is either indistinguishable or a bit faster
(the "a bit faster" is definitely due to just running less test code
though).
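
For anyone wanting to sanity-check the "1.01 ± 0.02" in that summary,
here is a small sketch that recomputes it from the two means and
standard deviations above. It assumes ordinary first-order error
propagation for a quotient, which happens to reproduce the reported
figure; the function name speedup is made up for this example.

```shell
#!/bin/sh
# speedup SLOW_MEAN SLOW_SD FAST_MEAN FAST_SD
# Prints "ratio ± uncertainty", combining the relative standard
# deviations of both means (first-order error propagation).
speedup () {
	LC_ALL=C awk -v m1="$1" -v s1="$2" -v m2="$3" -v s2="$4" 'BEGIN {
		ratio = m1 / m2
		err = ratio * sqrt((s1 / m1) ^ 2 + (s2 / m2) ^ 2)
		printf "%.2f ± %.2f\n", ratio, err
	}'
}

# Means and standard deviations (in ms) of the "true" and "false" runs.
speedup 545.9 1.6 537.9 11.4
# prints: 1.01 ± 0.02
```

The uncertainty is about twice the measured 1% difference, which is
why the two modes are best read as indistinguishable here.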

I'm sending this before the CI run[2] finishes (which now tests both
modes), but both of these work for me locally on a full test suite
run.

1. https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@evledraar.gmail.com/
2. https://github.com/avar/git/runs/4237446991?check_suite_focus=true

Ævar Arnfjörð Bjarmason (2):
  submodule tests: fix potentially broken "config .. --unset"
  submodule: add test mode for checking absence of "superProjectGitDir"

 ci/run-build-and-tests.sh          |  1 +
 git-submodule.sh                   |  2 +-
 submodule.c                        |  7 +++++++
 t/lib-submodule-superproject.sh    | 24 ++++++++++++++++++++++++
 t/t7406-submodule-update.sh        | 13 ++++++-------
 t/t7412-submodule-absorbgitdirs.sh | 19 ++++++-------------
 6 files changed, 45 insertions(+), 21 deletions(-)
 create mode 100644 t/lib-submodule-superproject.sh

-- 
2.34.0.796.g2c87ed6146a

