git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: "SZEDER Gábor" <szeder.dev@gmail.com>
Cc: "Antonio Ospite" <ao2@ao2.it>,
	git@vger.kernel.org, "Daniel Graña" <dangra@gmail.com>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Richard Hartmann" <richih.mailinglist@gmail.com>,
	"Stefan Beller" <sbeller@google.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: Re: [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree
Date: Wed, 19 Sep 2018 12:24:30 -0700	[thread overview]
Message-ID: <xmqqfty547f5.fsf@gitster-ct.c.googlers.com> (raw)
In-Reply-To: <20180918171257.GC27036@localhost> ("SZEDER Gábor"'s message of "Tue, 18 Sep 2018 19:12:57 +0200")

SZEDER Gábor <szeder.dev@gmail.com> writes:

> it appears that this patch (and its previous versions as well) is
> responsible for triggering occasional test failures in
> 't7814-grep-recurse-submodules.sh', more frequently, about once in
> every ten runs, on macOS on Travis CI, less frequently, about once in
> a couple of hundred runs on Linux on my machine.

I see that among Cc'ed are people who are more familiar with the
submodule code and where it wants to go.  Thanks for a report and
analysis.

> The reason for the failure is memory corruption manifesting in various
> ways: segfault, malloc() or use after free() errors from libc, corrupt
> loose object, invalid ref, bogus output, etc.
> 
> Applying the following patch makes t7814 fail almost every time,
> though sometimes that loop has to iterate over 1000 times until that
> 'git grep' finally fails...  so good luck with debugging ;)
>
> diff --git a/t/t7814-grep-recurse-submodules.sh b/t/t7814-grep-recurse-submodules.sh
> index 7184113b9b..93ae2e8e7c 100755
> --- a/t/t7814-grep-recurse-submodules.sh
> +++ b/t/t7814-grep-recurse-submodules.sh
> @@ -337,6 +337,10 @@ test_expect_success 'grep --recurse-submodules should pass the pattern type alon
>  	test_must_fail git -c grep.patternType=fixed grep --recurse-submodules -e "(.|.)[\d]" &&
>  
>  	# Basic
> +	for i in $(seq 0 2000)
> +	do
> +		git grep --recurse-submodules 1 >/dev/null || return 1
> +	done &&
>  	git grep -G --recurse-submodules -e "(.|.)[\d]" >actual &&
>  	cat >expect <<-\EOF &&
>  	a:(1|2)d(3|4)
>
> On first look I didn't notice anything that is obviously wrong in this
> patch and could be responsible for the memory corruption, but there is
> one thing I found strange, though:
>
>
> On Mon, Sep 17, 2018 at 04:09:40PM +0200, Antonio Ospite wrote:
>> When the .gitmodules file is not available in the working tree, try
>> using the content from the index and from the current branch.
>
> "from the index and from the current branch" of which repository?
>
>> This
>> covers the case when the file is part of the repository but for some
>> reason it is not checked out, for example because of a sparse checkout.
>> 
>> This makes it possible to use at least the 'git submodule' commands
>> which *read* the gitmodules configuration file without fully populating
>> the working tree.
>> 
>> Writing to .gitmodules will still require that the file is checked out,
>> so check for that before calling config_set_in_gitmodules_file_gently.
>> 
>> Add a similar check also in git-submodule.sh::cmd_add() to anticipate
>> the eventual failure of the "git submodule add" command when .gitmodules
>> is not safely writeable; this prevents the command from leaving the
>> repository in a spurious state (e.g. the submodule repository was cloned
>> but .gitmodules was not updated because
>> config_set_in_gitmodules_file_gently failed).
>> 
>> Finally, add t7416-submodule-sparse-gitmodules.sh to verify that reading
>> from .gitmodules succeeds and that writing to it fails when the file is
>> not checked out.
>> 
>> Signed-off-by: Antonio Ospite <ao2@ao2.it>
>> ---
>
>> diff --git a/submodule-config.c b/submodule-config.c
>> index 61a555e920..bdb1d0e2c9 100644
>> --- a/submodule-config.c
>> +++ b/submodule-config.c
>
>> @@ -603,8 +604,21 @@ static void submodule_cache_check_init(struct repository *repo)
>>  static void config_from_gitmodules(config_fn_t fn, struct repository *repo, void *data)
>>  {
>>  	if (repo->worktree) {
>> -		char *file = repo_worktree_path(repo, GITMODULES_FILE);
>> -		git_config_from_file(fn, file, data);
>> +		struct git_config_source config_source = { 0 };
>> +		const struct config_options opts = { 0 };
>> +		struct object_id oid;
>> +		char *file;
>> +
>> +		file = repo_worktree_path(repo, GITMODULES_FILE);
>> +		if (file_exists(file))
>> +			config_source.file = file;
>> +		else if (get_oid(GITMODULES_INDEX, &oid) >= 0)
>> +			config_source.blob = GITMODULES_INDEX;
>
> The repository used in t7814 contains nested submodules, which means
> that config_from_gitmodules() is invoked three times.
>
> Now, the first two of those calls look at the superproject and at
> 'submodule', and find the existing files '.../trash
> directory.t7814-grep-recurse-submodules/.gitmodules' and '.../trash
> directory.t7814-grep-recurse-submodules/submodule/.gitmodules',
> respectively.  So far so good.
>
> The third call, however, looks at the nested submodule at
> 'submodule/sub', which doesn't contain a '.gitmodules' file.  So this
> function goes on with the second condition and calls
> get_oid(GITMODULES_INDEX, &oid), which then appears to find the blob
> in the _superproject's_ index.
>
> I'm no expert on submodules, but my gut feeling says that this can't
> be right.  But if it _is_ right, then I would say that the commit
> message should explain in detail, why it is right.
>
> Anyway, even if it is indeed wrong, I'm not sure whether this is the
> root cause of the memory corruption.
>
>
>> +		else if (get_oid(GITMODULES_HEAD, &oid) >= 0)
>> +			config_source.blob = GITMODULES_HEAD;
>> +
>> +		config_with_options(fn, data, &config_source, &opts);
>> +
>>  		free(file);
>>  	}
>>  }

  reply	other threads:[~2018-09-19 19:24 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-17 14:09 [PATCH v5 0/9] Make submodules work if .gitmodules is not checked out Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 1/9] submodule: add a print_config_from_gitmodules() helper Antonio Ospite
2018-09-24 10:25   ` Antonio Ospite
2018-09-24 23:06     ` Stefan Beller
2018-09-17 14:09 ` [PATCH v5 2/9] submodule: factor out a config_set_in_gitmodules_file_gently function Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 3/9] t7411: merge tests 5 and 6 Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 4/9] t7411: be nicer to future tests and really clean things up Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 5/9] submodule--helper: add a new 'config' subcommand Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 6/9] submodule: use the 'submodule--helper config' command Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 7/9] t7506: clean up .gitmodules properly before setting up new scenario Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 8/9] submodule: add a helper to check if it is safe to write to .gitmodules Antonio Ospite
2018-09-17 14:09 ` [PATCH v5 9/9] submodule: support reading .gitmodules when it's not in the working tree Antonio Ospite
2018-09-18 17:12   ` SZEDER Gábor
2018-09-19 19:24     ` Junio C Hamano [this message]
2018-09-20 15:35     ` Antonio Ospite
2018-09-21 16:19       ` Junio C Hamano
2018-09-27 14:49         ` Antonio Ospite
2018-09-24 10:20     ` Antonio Ospite
2018-09-24 21:00       ` Stefan Beller
2018-09-27 14:44         ` Antonio Ospite
2018-09-27 18:00           ` Stefan Beller
2018-10-01 15:45             ` Antonio Ospite
2018-10-01 19:42               ` Stefan Beller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqfty547f5.fsf@gitster-ct.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=ao2@ao2.it \
    --cc=avarab@gmail.com \
    --cc=dangra@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jrnieder@gmail.com \
    --cc=richih.mailinglist@gmail.com \
    --cc=sbeller@google.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).