From: Derrick Stolee <stolee@gmail.com>
To: Jeff King <peff@peff.net>,
Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, me@ttaylorr.com, newren@gmail.com,
Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH v3 09/12] sparse-checkout: properly match escaped characters
Date: Wed, 29 Jan 2020 08:58:59 -0500
Message-ID: <6003bbf2-ad16-0686-dc58-2010fe02ce05@gmail.com>
In-Reply-To: <20200129100309.GA4218@coredump.intra.peff.net>
On 1/29/2020 5:03 AM, Jeff King wrote:
> On Tue, Jan 28, 2020 at 06:26:40PM +0000, Derrick Stolee via GitGitGadget wrote:
>
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> In cone mode, the sparse-checkout feature uses hashset containment
>> queries to match paths. Make this algorithm respect escaped asterisk
>> (*) and backslash (\) characters.
>
> Do we also need to worry about other glob metacharacters? E.g., "?" or
> ranges like "[A-Z]"?
These are not part of the .gitignore patterns [1].
[1] https://git-scm.com/docs/gitignore#_pattern_format
>> +static char *dup_and_filter_pattern(const char *pattern)
>> +{
>> +	char *set, *read;
>> +	char *result = xstrdup(pattern);
>> +
>> +	set = result;
>> +	read = result;
>> +
>> +	while (*read) {
>> +		/* skip escape characters (once) */
>> +		if (*read == '\\')
>> +			read++;
>> +
>> +		*set = *read;
>> +
>> +		set++;
>> +		read++;
>> +	}
>> +	*set = 0;
>> +
>> +	if (*(read - 2) == '/' && *(read - 1) == '*')
>> +		*(read - 2) = 0;
>> +
>> +	return result;
>> +}
>
> Do we need to check that the pattern is longer than 1 character here? If
> it's a single character, it seems like this "*(read - 2)" will
> dereference the byte before the string.
This method is only called by add_pattern_to_hashsets(), which
has a guard against paths of length less than 2, but that's no
excuse for dangerous pointer arithmetic here.
But you also point out an even more confusing thing: why are we
modifying based on the 'read' pointer, and not the 'set' pointer?
This seems to work _accidentally_ only when the pattern has "<something>/*"
and "<something>" has no escape characters.
I had to recall exactly why we are dropping this "/*", but it's because
the pattern _actually_ ends with "/*/" but the in-memory pattern has
already dropped that last slash and applied PATTERN_FLAG_MUSTBEDIR.
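To make that concrete, here is a minimal standalone sketch of the two
steps (it is not the code in dir.c). The struct, the function names,
and SKETCH_FLAG_MUSTBEDIR are hypothetical stand-ins for the pattern
parser and PATTERN_FLAG_MUSTBEDIR: the trailing slash is removed first
and remembered as a flag, and only then is the remaining slash-star
suffix dropped to get the directory name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for PATTERN_FLAG_MUSTBEDIR. */
#define SKETCH_FLAG_MUSTBEDIR 1u

struct sketch_pattern {
	char text[256];
	unsigned flags;
};

/* Step 1: strip one trailing slash and record it as a flag. */
static void sketch_parse(const char *line, struct sketch_pattern *out)
{
	size_t len = strlen(line);

	assert(len < sizeof(out->text));
	memcpy(out->text, line, len + 1);
	out->flags = 0;
	if (len && out->text[len - 1] == '/') {
		out->text[len - 1] = '\0';
		out->flags |= SKETCH_FLAG_MUSTBEDIR;
	}
}

/* Step 2: drop a trailing slash-star pair; returns 1 if it was removed. */
static int sketch_trim_glob(struct sketch_pattern *p)
{
	size_t len = strlen(p->text);

	if (len > 2 && p->text[len - 2] == '/' && p->text[len - 1] == '*') {
		p->text[len - 2] = '\0';
		return 1;
	}
	return 0;
}
```

Feeding "/deep/*/" through sketch_parse() leaves "/deep/*" with the
flag set, and sketch_trim_glob() then reduces it to "/deep".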
Here is a diff that I can apply to this patch to fix this problem
_and_ demonstrate it in the tests:
diff --git a/dir.c b/dir.c
index 579f274d13..277577c8bf 100644
--- a/dir.c
+++ b/dir.c
@@ -633,6 +633,7 @@ int pl_hashmap_cmp(const void *unused_cmp_data,
 static char *dup_and_filter_pattern(const char *pattern)
 {
 	char *set, *read;
+	size_t count = 0;
 	char *result = xstrdup(pattern);
 
 	set = result;
@@ -647,11 +648,14 @@ static char *dup_and_filter_pattern(const char *pattern)
 
 		set++;
 		read++;
+		count++;
 	}
 	*set = 0;
 
-	if (*(read - 2) == '/' && *(read - 1) == '*')
-		*(read - 2) = 0;
+	if (count > 2 &&
+	    *(set - 1) == '*' &&
+	    *(set - 2) == '/')
+		*(set - 2) = 0;
 
 	return result;
 }
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 0a21a5e15d..20b0465f77 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -383,6 +383,7 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	/*
 	!/*/
 	/zbad\\dir/
+	!/zbad\\dir/*/
 	/zdoes\*not\*exist/
 	/zdoes\*exist/
 	EOF
With this extra line in the test, but compiling the old version of this patch,
the test fails with:
'err' is not empty, it contains:
+ cat err
warning: unrecognized negative pattern: '/zbad\\dir/*'
warning: disabling cone pattern matching
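The difference between the two checks can also be seen outside of git
with a small sketch. This is an illustration under assumptions, not
the code in dir.c: filter_pattern() and its use_set switch are
hypothetical names, malloc() plus memcpy() stand in for xstrdup(), and
the read[1] test is an extra guard added only in this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Standalone sketch: copy the pattern, remove one level of backslash
 * escaping in place, then drop a trailing slash-star pair. The use_set
 * switch exists only to compare the two tail checks.
 */
static char *filter_pattern(const char *pattern, int use_set)
{
	char *result, *set, *read;
	size_t len, count = 0;

	assert(pattern != NULL);
	len = strlen(pattern);
	result = malloc(len + 1);
	if (!result)
		return NULL;
	memcpy(result, pattern, len + 1);
	set = read = result;

	while (*read) {
		/*
		 * Skip one escape character. The read[1] test is an
		 * addition of this sketch so that a trailing backslash
		 * cannot step past the terminator.
		 */
		if (*read == '\\' && read[1])
			read++;
		*set++ = *read++;
		count++;
	}
	*set = 0;

	if (use_set) {
		/* Revised check: inspect the end of the filtered string. */
		if (count > 2 && set[-1] == '*' && set[-2] == '/')
			set[-2] = 0;
	} else {
		/* Earlier check: inspect the end of the unfiltered input. */
		if (read - result >= 2 && read[-2] == '/' && read[-1] == '*')
			read[-2] = 0;
	}
	return result;
}
```

For the pattern text /zbad\\dir/* the read-based check leaves the
trailing glob in place, while the set-based check yields /zbad\dir.
For a pattern without escapes, such as /deep/*, both checks agree.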
To ensure this negative pattern exists in the later patch where we set
the patterns using the builtin, I'll add "zbad\\dir/bogus" to the list
of directories to include, which will add another pattern to the set.
Thanks,
-Stolee