git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Emily Shaffer <emilyshaffer@google.com>
Cc: git@vger.kernel.org, Atharva Raykar <raykar.ath@gmail.com>
Subject: Re: [PATCH v5 4/4] submodule: record superproject gitdir during 'update'
Date: Tue, 09 Nov 2021 02:12:24 +0100	[thread overview]
Message-ID: <211109.86r1bqdsmu.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <YYmxSptSDRPWJJ3t@google.com>


On Mon, Nov 08 2021, Emily Shaffer wrote:

> On Fri, Nov 05, 2021 at 09:51:12AM +0100, Ævar Arnfjörð Bjarmason wrote:
>> 
>> 
>> On Thu, Nov 04 2021, Emily Shaffer wrote:
>> 
>> > A recorded hint path to the superproject's gitdir might be added during
>> > 'git submodule add', but in some cases - like submodules which were
>> > created before 'git submodule add' learned to record that info - it might
>> > be useful to update the hint. Let's do it during 'git submodule
>> > update', when we already have a handle to the superproject while calling
>> > operations on the submodules.
>> >
>> > Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
>> > ---
>> >  git-submodule.sh            | 14 ++++++++++++++
>> >  t/t7406-submodule-update.sh | 12 ++++++++++++
>> >  2 files changed, 26 insertions(+)
>> >
>> > diff --git a/git-submodule.sh b/git-submodule.sh
>> > index 652861aa66..873d64eb99 100755
>> > --- a/git-submodule.sh
>> > +++ b/git-submodule.sh
>> > @@ -449,6 +449,20 @@ cmd_update()
>> >  			;;
>> >  		esac
>> >  
>> > +		# Cache a pointer to the superproject's common dir. This may have
>> > +		# changed, unless it's a fresh clone. Writes it to worktree
>> > +		# if applicable, otherwise to local.
>> > +		if test -z "$just_cloned"
>> > +		then
>> > +			sm_gitdir="$(git -C "$sm_path" rev-parse --absolute-git-dir)"
>> > +			relative_gitdir="$(git rev-parse --path-format=relative \
>> > +							 --prefix "${sm_gitdir}" \
>> > +							 --git-common-dir)"
>> > +
>> > +			git -C "$sm_path" config --worktree \
>> > +				submodule.superprojectgitdir "$relative_gitdir"
>> > +		fi
>> > +
>> 
>> Aside from the "is this really needed anymore?" comment on this caching
>> strategy in general I had in [1] does this really need to be adding new
>> shell code to git-submodule.sh given that we're actively trying to get
>> rid of it entirely and move it over to C.
>> 
>> I.e. here we've just called "git submodule--helper
>> run-update-procedure", and we pass it "$sm_path" (but not "$recursive",
>> but could easily do so).
>> 
>> It needs to clone this new submodule, so presumably it already has a
>> "$sm_gitdir" equivalent, and we can turn that into the same relative
>> path, no?
>> 
>> Can't we call this with a git_config_set*() somewhere in that code?
>> 
>> *investigates a bit*
>> 
>> Okey, so I see that [2] is part of a series that Atharva Raykar had a
>> version of (including this new git-submodule.sh code above) [3] rebased
>> on top of an older version of this topic. I.e. this bit is that part of that patch:
>> 
>> +       git_config_set_in_file(config_path, "submodule.superprojectGitdir",
>> +                              relative_path(get_git_dir(),
>> +                                            update_data->sm_path, &sb));
>> 
>> I also vaguely recall that Junio ejected his topic to make room for this
>> topic first.
>> 
>> Maybe I've missed some update on this but is his [2][3] broken in
>> combination with your topic here? And re[1] is whatever "caching" is
>> being done here still beneficial once this is all moved to C?
>> 
>> In your [4] there seems to be an agreement to do it the other way
>> around, but as noted in the mail linked from [1] maybe the caching isn't
>> needed anymore then?
>
> I answered vs. your other mail; yes, it's still needed, [...]

I replied just now (in [1], but it's not on lore yet, maybe vger ate my
mail again).

tl;dr: Ran some quick performance numbers, and can't reproduce any
slowdown at all when instrumenting the test suite to run another
setup_git_directory() that'll do a nested lookup on pretty much every
git command, and running the test suite (you mentioned a ~5x overall
slowdown).

Anyway, I think the two replies you've got here only partially address
what I pointed out, in particular in [2]:
    
    But I'm a bit iffy on a series that's structured in a way as to not
    start by asserting that we should have given semantics without the
    cache, and then adds the cache later as an optional optimization.

I.e. even if it was 10x as slow I think it should still be structured in
such a way as to at least have some GIT_TEST_* knob to turn it off in
favor of a slow path.

E.g. we've got commit-graph paths that are probably 100x or 1000x faster
than the slow path, but having the cache-less ones means we can validate
their correctness.
    
> and last I checked Atharva's series had been kicked out to make room
> for this one.

Indeed, but as a comment on this proposed series that doesn't really
address this in my above-quoted ([3])

    I.e. here we've just called "git submodule--helper
    run-update-procedure", and we pass it "$sm_path" (but not "$recursive",
    but could easily do so).

I.e. my understanding is that Junio ejected Atharva's because this
seemed like the smaller change, but perhaps it wasn't realized that we'd
be adding shellscript only to remove it again?

But in any case, even if we're patching git-submodule.sh not having
Atharva's go first in its entirety doesn't answer why we can't extract
the bits he/you came up with in C for this series.

I.e. that git-submodule.sh wouldn't need that shelling out (since it
just called a helper, that helper could just return this data too),
that's just a few lines above the hunk in this series.

I.e. if some remaining performance issue I couldn't reproduce in [1] is
due to the shellscript version of this v.s. calling a C function in
libgit isn't it better to focus on closing that gap than having the
caching mechanism?

(Per the above & linked mails I'm still not 100% sure this even *is* a
caching mechanism, or if we store primary data in it, but I'm just
trying to go by your commit message descriptions).

1. https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/87r1cnfkqx.fsf@evledraar.gmail.com/
3. https://lore.kernel.org/git/211105.86wnlngebr.gmgdl@evledraar.gmail.com/

>> 
>> 1. https://lore.kernel.org/git/211105.861r3vhtot.gmgdl@evledraar.gmail.com/
>> 2. https://lore.kernel.org/git/20211013051805.45662-8-raykar.ath@gmail.com/
>> 3. https://lore.kernel.org/git/20211013051805.45662-1-raykar.ath@gmail.com/
>> 4. https://lore.kernel.org/git/YWiXL+plA7GHfuVv@google.com/


  reply	other threads:[~2021-11-09  1:40 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-14 20:34 [PATCH v4 0/4] cache parent project's gitdir in submodules Emily Shaffer
2021-10-14 20:34 ` [PATCH v4 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-10-14 20:34 ` [PATCH v4 2/4] introduce submodule.superprojectGitDir record Emily Shaffer
2021-10-14 20:34 ` [PATCH v4 3/4] submodule: record superproject gitdir during absorbgitdirs Emily Shaffer
2021-10-18 23:18   ` Jonathan Tan
2021-10-25 16:14     ` Derrick Stolee
2021-11-04 23:22       ` Emily Shaffer
2021-11-08  1:07         ` Derrick Stolee
2021-11-04 22:09     ` Emily Shaffer
2021-10-14 20:34 ` [PATCH v4 4/4] submodule: record superproject gitdir during 'update' Emily Shaffer
2021-10-25 16:17   ` Derrick Stolee
2021-10-25 16:19 ` [PATCH v4 0/4] cache parent project's gitdir in submodules Derrick Stolee
2021-11-04 23:49 ` [PATCH v5 " Emily Shaffer
2021-11-04 23:49   ` [PATCH v5 1/4] t7400-submodule-basic: modernize inspect() helper Emily Shaffer
2021-11-04 23:49   ` [PATCH v5 2/4] introduce submodule.superprojectGitDir record Emily Shaffer
2021-11-05  7:50     ` Junio C Hamano
2021-11-08 23:16       ` Emily Shaffer
2021-11-04 23:49   ` [PATCH v5 3/4] submodule: record superproject gitdir during absorbgitdirs Emily Shaffer
2021-11-04 23:49   ` [PATCH v5 4/4] submodule: record superproject gitdir during 'update' Emily Shaffer
2021-11-05  4:49     ` Junio C Hamano
2021-11-05  8:43       ` Ævar Arnfjörð Bjarmason
2021-11-08 23:21         ` Emily Shaffer
2021-11-09  0:42           ` Ævar Arnfjörð Bjarmason
2021-11-09 20:36             ` Emily Shaffer
2021-11-09 21:46               ` Emily Shaffer
2021-11-05  8:51     ` Ævar Arnfjörð Bjarmason
2021-11-08 23:22       ` Emily Shaffer
2021-11-09  1:12         ` Ævar Arnfjörð Bjarmason [this message]
2021-11-08  1:24   ` [PATCH v5 0/4] cache parent project's gitdir in submodules Derrick Stolee
2021-11-08 23:11     ` Emily Shaffer
2021-11-09 15:58       ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=211109.86r1bqdsmu.gmgdl@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=raykar.ath@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).