git@vger.kernel.org mailing list mirror (one of many)
From: Junio C Hamano <gitster@pobox.com>
To: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, me@ttaylorr.com, newren@gmail.com,
	vdye@github.com, Derrick Stolee <stolee@gmail.com>,
	Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH v3 1/3] sparse-checkout: fix segfault on malformed patterns
Date: Wed, 15 Dec 2021 12:56:13 -0800
Message-ID: <xmqqv8zp4mfm.fsf@gitster.g>
In-Reply-To: <1744a26845fbe4d7dbc80f387be1d842b5f8fe94.1639575968.git.gitgitgadget@gmail.com> (Derrick Stolee via GitGitGadget's message of "Wed, 15 Dec 2021 13:46:06 +0000")

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> When core.sparseCheckoutCone is enabled, the sparse-checkout patterns are
> used to populate two hashsets that accelerate pattern matching. If the user
> modifies the sparse-checkout file outside of the 'sparse-checkout' builtin,
> then strange patterns can happen, triggering some error checks.
>
> One of these error checks is possible to hit when some special characters
> exist in a line. A warning message is correctly written to stderr, but then
> there is additional logic that attempts to remove the line from the hashset
> and free the data.

Makes sense.

> This leads to a segfault in the 'git sparse-checkout
> list' command because it iterates over the contents of the hashset, which is
> now invalid.

Understandable.
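
To make the hazard concrete: if an entry is freed while the container
can still reach it, a later walk over the container reads freed
memory.  Below is a toy sketch of the warn-and-keep alternative.  It
is not dir.c and does not use git's hashmap API; struct toy_set and
every toy_*() function are made up for illustration only.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a pattern hashset; NOT git's hashmap API. */
struct toy_entry {
	char *pattern;
	struct toy_entry *next;
};

struct toy_set {
	struct toy_entry *head;
	size_t nr;
};

/* Return the entry owned by the set that matches, or NULL. */
struct toy_entry *toy_find(const struct toy_set *set, const char *pattern)
{
	struct toy_entry *e;

	for (e = set->head; e; e = e->next)
		if (!strcmp(e->pattern, pattern))
			return e;
	return NULL;
}

/*
 * Add a pattern.  On a repeat, warn and leave the set untouched.
 * Returns 1 if added, 0 if repeated, -1 on allocation failure.
 */
int toy_add(struct toy_set *set, const char *pattern)
{
	struct toy_entry *e;

	assert(set && pattern);
	if (toy_find(set, pattern)) {
		fprintf(stderr, "warning: pattern '%s' is repeated\n", pattern);
		return 0; /* nothing is freed, so later walks stay valid */
	}
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->pattern = malloc(strlen(pattern) + 1);
	if (!e->pattern) {
		free(e);
		return -1;
	}
	strcpy(e->pattern, pattern);
	e->next = set->head;
	set->head = e;
	set->nr++;
	return 1;
}

/* Walk every entry, as a "list" command would; return entries visited. */
size_t toy_list(const struct toy_set *set, FILE *out)
{
	const struct toy_entry *e;
	size_t n = 0;

	for (e = set->head; e; e = e->next, n++)
		fprintf(out, "%s\n", e->pattern);
	return n;
}

/* Release everything the set owns, in one place. */
void toy_clear(struct toy_set *set)
{
	while (set->head) {
		struct toy_entry *e = set->head;

		set->head = e->next;
		free(e->pattern);
		free(e);
	}
	set->nr = 0;
}
```

The point of the sketch is only that ownership stays with the
container: entries are released in toy_clear() and nowhere else, so
toy_list() never sees a dangling pointer.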

> The fix here is to stop trying to remove from the hashset. Better to leave
> bad data in the sparse-checkout matching logic (with a warning) than to
> segfault.

True, as long as it won't make the situation worse by depending on
that bad data to further damage working tree data or in-repo data
when damaged working tree data gets committed.  And "list segfaults
with freed/NULLed data---so leave the bad ones in to print these bad
ones" feels OK-ish.

As long as the user is not transporting the listed output to another
repository, which may fall into "making the situation worse"
category by spreading an existing breakage, that is.

In other words, this may paper over the segfault, and it may be safe
only for "sparse-checkout list", but is it safe for other operations
that actually use this bad data to further affect other things in
the repository?  If not, I wonder if we want to hard die to lock the
repository down before the issue is fixed to avoid spreading the
damage?
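
One possible shape for such a policy, as a toy sketch under stated
assumptions: read-only operations warn and continue, anything that
would act on the patterns refuses.  The enum, the function name and
the set of operations are hypothetical; real code would presumably
call die() instead of returning -1.

```c
#include <stdio.h>

/* Hypothetical operation kinds; not taken from git's source. */
enum toy_op {
	TOY_OP_LIST,	/* only prints the patterns */
	TOY_OP_ADD,	/* rewrites the sparse-checkout file */
	TOY_OP_APPLY	/* updates the working tree from the patterns */
};

/*
 * Decide whether an operation may run when 'bad_patterns' malformed
 * lines were seen while parsing.
 * Returns 0 to proceed, -1 to refuse.
 */
int toy_check_patterns(enum toy_op op, unsigned int bad_patterns)
{
	if (!bad_patterns)
		return 0;
	if (op == TOY_OP_LIST) {
		fprintf(stderr,
			"warning: %u malformed pattern(s); listing anyway\n",
			bad_patterns);
		return 0;
	}
	fprintf(stderr,
		"fatal: %u malformed pattern(s); refusing to continue\n",
		bad_patterns);
	return -1;
}
```

Whether "list" deserves the exemption is exactly the open question
above; the sketch only shows where such a check could sit.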

> diff --git a/dir.c b/dir.c
> index 5aa6fbad0b7..0693c7cb3ee 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -819,9 +819,6 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
>  		/* we already included this at the parent level */
>  		warning(_("your sparse-checkout file may have issues: pattern '%s' is repeated"),
>  			given->pattern);
> -		hashmap_remove(&pl->parent_hashmap, &translated->ent, &data);
> -		free(data);
> -		free(translated);
>  	}
>  
>  	return;
> diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
> index 272ba1b566b..c72b8ee2e7b 100755
> --- a/t/t1091-sparse-checkout-builtin.sh
> +++ b/t/t1091-sparse-checkout-builtin.sh
> @@ -708,4 +708,19 @@ test_expect_success 'cone mode clears ignored subdirectories' '
>  	test_cmp expect out
>  '
>  
> +test_expect_success 'malformed cone-mode patterns' '
> +	git -C repo sparse-checkout init --cone &&
> +	mkdir -p repo/foo/bar &&
> +	touch repo/foo/bar/x repo/foo/y &&
> +	cat >repo/.git/info/sparse-checkout <<-\EOF &&
> +	/*
> +	!/*/
> +	/foo/
> +	!/foo/*/
> +	/foo/\*/
> +	EOF
> +	cat repo/.git/info/sparse-checkout &&

Stray debugging output?

> +	git -C repo sparse-checkout list

And we are happy as long as the command does not segfault, and we do
not care what the output is.

> +'
> +
>  test_done

Will queue, but not convinced yet.

