From: Junio C Hamano <gitster@pobox.com>
To: Stefan Beller <sbeller@google.com>
Cc: Uma Srinivasan <usrinivasan@twitter.com>,
Jacob Keller <jacob.keller@gmail.com>,
Git Mailing List <git@vger.kernel.org>,
Jens Lehmann <Jens.Lehmann@web.de>,
Heiko Voigt <hvoigt@hvoigt.net>
Subject: Re: git submodules implementation question
Date: Thu, 01 Sep 2016 13:21:02 -0700
Message-ID: <xmqq4m5zwevl.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <xmqqa8frwhpr.fsf@gitster.mtv.corp.google.com> (Junio C. Hamano's message of "Thu, 01 Sep 2016 12:19:44 -0700")
Junio C Hamano <gitster@pobox.com> writes:
> Stefan Beller <sbeller@google.com> writes:
>
>>> The final version needs to be accompanied with tests to show the
>>> effect of this change for callers. A test would set up a top-level
>>> and submodule, deliberately break submodule/.git/ repository and
>>> show what breaks and how without this change.
>>
>> Tests are really good at providing this context as well, or to communicate
>> the actual underlying problem, which is not quite clear to me.
>> That is why I refrained from jumping into the discussion as I think the
>> first few emails were dropped from the mailing list and I am missing context.
>
> I do not know where you started reading, but the gist of it is that
> submodule.c spawns subprocess to run in the submodule's context by
> assuming that chdir'ing into the <path> of the submodule and running
> it (i.e. cp.dir set to <path> to drive start_command(&cp)) is
> sufficient. When <path>/.git (either it is a directory itself or it
> points at a directory in .git/module/<name> in the superproject) is
> a corrupt repository, running "git -C <path> command" would try to
> auto-detect the repository, because it thinks <path>/.git is not a
> repository and it thinks it is not at the top-level of the working
> tree, and instead finds the repository of the top-level, which is
> almost never what we want.
Here is a patch with a test that covers the call in
get_next_submodule() for the parallel fetch callback. I think many
other codepaths will end up recursing forever in the same way without
the fix when a submodule repository is broken in a similar way, but I
didn't check, so I do not consider this to be complete.
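The fallback described above can also be seen outside the test suite.
What follows is only a minimal sketch, assuming a git binary on PATH;
the directory names "super" and "sub" and the shell variable names are
made up for illustration and are not part of the patch. It breaks a
nested repository the same way the test below does (by removing HEAD)
and compares auto-discovery with an explicit GIT_DIR=.git.

```shell
#!/bin/sh
# Illustration only; "super" and "sub" are hypothetical names.
tmp=$(mktemp -d) || exit 1
git init -q "$tmp/super" || exit 1
git init -q "$tmp/super/sub" || exit 1

# Break the nested repository the same way the test does.
rm -f "$tmp/super/sub/.git/HEAD"

# Auto-discovery: sub/.git no longer looks like a repository, so git
# keeps walking up and reports the git directory of "super" instead.
discovered=$(git -C "$tmp/super/sub" rev-parse --git-dir)
echo "auto-discovery found: $discovered"

# Explicit GIT_DIR=.git: no walking up; the command fails instead.
if GIT_DIR=.git git -C "$tmp/super/sub" rev-parse --git-dir >/dev/null 2>&1
then
	explicit=found
else
	explicit=failed
fi
echo "with GIT_DIR=.git: $explicit"
```

The first command should report a path ending in super/.git, which is
the wrong repository from the submodule's point of view; the second
should fail rather than silently pick up the superproject.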
-- >8 --
Subject: submodule: avoid auto-discovery in prepare_submodule_repo_env()

The function is used to set up the environment variables used in a
subprocess we spawn in a submodule directory. The callers set up a
child_process structure, find the working tree path of one submodule
and set the .dir field to it, and then use the start_command() API to
spawn a subprocess such as "status", "fetch", etc.

When this happens, we expect that the ".git" (either a directory or
a gitfile that points at the real location) in the current working
directory of the subprocess MUST be the repository for the submodule.

If this ".git" thing is a corrupt repository, however, because
prepare_submodule_repo_env() unsets GIT_DIR and GIT_WORK_TREE, the
subprocess will see ".git", think it is not a repository, and
attempt to find one by going up, likely ending up finding the
repository of the superproject. In some codepaths, this will cause
a command run with the "--recurse-submodules" option to recurse
forever.

Export GIT_DIR=.git to disable the auto-discovery logic in the
subprocess, so that it stops and reports an error instead.

Not-signed-off-yet.
---
submodule.c | 1 +
t/t5526-fetch-submodules.sh | 29 +++++++++++++++++++++++++++++
2 files changed, 30 insertions(+)
diff --git a/submodule.c b/submodule.c
index 1b5cdfb..e8258f0 100644
--- a/submodule.c
+++ b/submodule.c
@@ -1160,4 +1160,5 @@ void prepare_submodule_repo_env(struct argv_array *out)
 		if (strcmp(*var, CONFIG_DATA_ENVIRONMENT))
 			argv_array_push(out, *var);
 	}
+	argv_array_push(out, "GIT_DIR=.git");
 }
diff --git a/t/t5526-fetch-submodules.sh b/t/t5526-fetch-submodules.sh
index 954d0e4..b2dee30 100755
--- a/t/t5526-fetch-submodules.sh
+++ b/t/t5526-fetch-submodules.sh
@@ -485,4 +485,33 @@ test_expect_success 'fetching submodules respects parallel settings' '
 	)
 '
 
+test_expect_success 'fetching submodule into a broken repository' '
+	# Prepare src and src/sub nested in it
+	git init src &&
+	(
+		cd src &&
+		git init sub &&
+		git -C sub commit --allow-empty -m "initial in sub" &&
+		git submodule add -- ./sub sub &&
+		git commit -m "initial in top"
+	) &&
+
+	# Clone the old-fashioned way
+	git clone src dst &&
+	git -C dst clone ../src/sub sub &&
+
+	# Make sure that old-fashioned layout is still supported
+	git -C dst status &&
+
+	# Recursive-fetch works fine
+	git -C dst fetch --recurse-submodules &&
+
+	# Break the receiving submodule
+	rm -f dst/sub/.git/HEAD &&
+
+	# Recursive-fetch must terminate
+	# NOTE: without fix this will recurse forever!
+	test_must_fail git -C dst fetch --recurse-submodules
+'
+
 test_done