git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH 0/8] Harden the sparse-checkout builtin
@ 2020-01-14 19:25 Derrick Stolee via GitGitGadget
  2020-01-14 19:25 ` [PATCH 1/8] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
                   ` (10 more replies)
  0 siblings, 11 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-14 19:25 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee

This series is based on ds/sparse-list-in-cone-mode.

This series attempts to clean up some rough edges in the sparse-checkout
feature, especially around the cone mode.

Unfortunately, after the v2.25.0 release, we noticed an issue with the "git
clone --sparse" option when using a URL instead of a local path. This is
fixed and properly tested here.

Also, let's improve Git's response to these more complicated scenarios:

 1. Running "git sparse-checkout init" in a worktree would complain because
    the "info" dir doesn't exist.
 2. Tracked paths that include "*" and "" in their filenames.
 3. If a user edits the sparse-checkout file to have non-cone pattern, such
    as "*" anywhere or "" in the wrong place, then we should respond
    appropriately. That is: warn that the patterns are not cone-mode, then
    revert to the old logic.

Thanks, -Stolee

Derrick Stolee (8):
  t1091: use check_files to reduce boilerplate
  sparse-checkout: create leading directories
  clone: fix --sparse option with URLs
  sparse-checkout: cone mode does not recognize "**"
  sparse-checkout: detect short patterns
  sparse-checkout: warn on incorrect '*' in patterns
  sparse-checkout: properly match escaped characters
  sparse-checkout: write escaped patterns in cone mode

 builtin/clone.c                    |   2 +-
 builtin/sparse-checkout.c          |  52 ++++-
 dir.c                              |  69 ++++++-
 dir.h                              |   1 +
 t/t1091-sparse-checkout-builtin.sh | 320 ++++++++++++++++-------------
 5 files changed, 296 insertions(+), 148 deletions(-)


base-commit: 4fd683b6a35eabd23dd5183da7f654a1e1f00325
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-513%2Fderrickstolee%2Fsparse-harden-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-513/derrickstolee/sparse-harden-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/513
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH 1/8] t1091: use check_files to reduce boilerplate
  2020-01-14 19:25 [PATCH 0/8] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
@ 2020-01-14 19:25 ` Derrick Stolee via GitGitGadget
  2020-01-16 21:40   ` Junio C Hamano
  2020-01-14 19:25 ` [PATCH 2/8] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-14 19:25 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When testing the sparse-checkout feature, we need to compare the
contents of the working-directory against some expected output.
Using here-docs was useful in the beginning, but became repetetive
as the test script grew.

Create a check_files helper to make the tests simpler and easier
to extend. It also reduces instances of bad here-doc whitespace.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1091-sparse-checkout-builtin.sh | 215 ++++++++++-------------------
 1 file changed, 71 insertions(+), 144 deletions(-)

diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index ff7f8f7a1f..20caefe155 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -12,6 +12,13 @@ list_files() {
 	(cd "$1" && printf '%s\n' *)
 }
 
+check_files() {
+	DIR=$1
+	printf "%s\n" $2 >expect &&
+	list_files $DIR >actual &&
+	test_cmp expect actual
+}
+
 test_expect_success 'setup' '
 	git init repo &&
 	(
@@ -39,11 +46,11 @@ test_expect_success 'git sparse-checkout list (empty)' '
 
 test_expect_success 'git sparse-checkout list (populated)' '
 	test_when_finished rm -f repo/.git/info/sparse-checkout &&
-	cat >repo/.git/info/sparse-checkout <<-EOF &&
-		/folder1/*
-		/deep/
-		**/a
-		!*bin*
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/folder1/*
+	/deep/
+	**/a
+	!*bin*
 	EOF
 	cp repo/.git/info/sparse-checkout expect &&
 	git -C repo sparse-checkout list >list &&
@@ -52,22 +59,20 @@ test_expect_success 'git sparse-checkout list (populated)' '
 
 test_expect_success 'git sparse-checkout init' '
 	git -C repo sparse-checkout init &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
 	test_cmp_config -C repo true core.sparsecheckout &&
-	list_files repo >dir  &&
-	echo a >expect &&
-	test_cmp expect dir
+	check_files repo a
 '
 
 test_expect_success 'git sparse-checkout list after init' '
 	git -C repo sparse-checkout list >actual &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
 	EOF
 	test_cmp expect actual
 '
@@ -75,32 +80,24 @@ test_expect_success 'git sparse-checkout list after init' '
 test_expect_success 'init with existing sparse-checkout' '
 	echo "*folder*" >> repo/.git/info/sparse-checkout &&
 	git -C repo sparse-checkout init &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		*folder*
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	*folder*
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo "a folder1 folder2"
 '
 
 test_expect_success 'clone --sparse' '
 	git clone --sparse repo clone &&
 	git -C clone sparse-checkout list >actual &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
 	EOF
 	test_cmp expect actual &&
-	list_files clone >dir &&
-	echo a >expect &&
-	test_cmp expect dir
+	check_files clone a
 '
 
 test_expect_success 'set enables config' '
@@ -119,41 +116,29 @@ test_expect_success 'set enables config' '
 
 test_expect_success 'set sparse-checkout using builtin' '
 	git -C repo sparse-checkout set "/*" "!/*/" "*folder*" &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		*folder*
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	*folder*
 	EOF
 	git -C repo sparse-checkout list >actual &&
 	test_cmp expect actual &&
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo "a folder1 folder2"
 '
 
 test_expect_success 'set sparse-checkout using --stdin' '
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/folder1/
-		/folder2/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/folder1/
+	/folder2/
 	EOF
 	git -C repo sparse-checkout set --stdin <expect &&
 	git -C repo sparse-checkout list >actual &&
 	test_cmp expect actual &&
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo "a folder1 folder2"
 '
 
 test_expect_success 'cone mode: match patterns' '
@@ -162,13 +147,7 @@ test_expect_success 'cone mode: match patterns' '
 	git -C repo read-tree -mu HEAD 2>err &&
 	test_i18ngrep ! "disabling cone patterns" err &&
 	git -C repo reset --hard &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo "a folder1 folder2"
 '
 
 test_expect_success 'cone mode: warn on bad pattern' '
@@ -185,14 +164,7 @@ test_expect_success 'sparse-checkout disable' '
 	test_path_is_file repo/.git/info/sparse-checkout &&
 	git -C repo config --list >config &&
 	test_must_fail git config core.sparseCheckout &&
-	list_files repo >dir &&
-	cat >expect <<-EOF &&
-		a
-		deep
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo "a deep folder1 folder2"
 '
 
 test_expect_success 'cone mode: init and set' '
@@ -204,52 +176,31 @@ test_expect_success 'cone mode: init and set' '
 	test_cmp expect dir &&
 	git -C repo sparse-checkout set deep/deeper1/deepest/ 2>err &&
 	test_must_be_empty err &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		deep
-	EOF
-	test_cmp expect dir &&
-	list_files repo/deep >dir  &&
-	cat >expect <<-EOF &&
-		a
-		deeper1
-	EOF
-	test_cmp expect dir &&
-	list_files repo/deep/deeper1 >dir  &&
-	cat >expect <<-EOF &&
-		a
-		deepest
-	EOF
-	test_cmp expect dir &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/deep/
-		!/deep/*/
-		/deep/deeper1/
-		!/deep/deeper1/*/
-		/deep/deeper1/deepest/
+	check_files repo "a deep" &&
+	check_files repo/deep "a deeper1" &&
+	check_files repo/deep/deeper1 "a deepest" &&
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/deep/
+	!/deep/*/
+	/deep/deeper1/
+	!/deep/deeper1/*/
+	/deep/deeper1/deepest/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	git -C repo sparse-checkout set --stdin 2>err <<-EOF &&
-		folder1
-		folder2
+	git -C repo sparse-checkout set --stdin 2>err <<-\EOF &&
+	folder1
+	folder2
 	EOF
 	test_must_be_empty err &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	list_files repo >dir &&
-	test_cmp expect dir
+	check_files repo "a folder1 folder2"
 '
 
 test_expect_success 'cone mode: list' '
-	cat >expect <<-EOF &&
-		folder1
-		folder2
+	cat >expect <<-\EOF &&
+	folder1
+	folder2
 	EOF
 	git -C repo sparse-checkout set --stdin <expect &&
 	git -C repo sparse-checkout list >actual 2>err &&
@@ -260,10 +211,10 @@ test_expect_success 'cone mode: list' '
 test_expect_success 'cone mode: set with nested folders' '
 	git -C repo sparse-checkout set deep deep/deeper1/deepest 2>err &&
 	test_line_count = 0 err &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/deep/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/deep/
 	EOF
 	test_cmp repo/.git/info/sparse-checkout expect
 '
@@ -275,13 +226,7 @@ test_expect_success 'revert to old sparse-checkout on bad update' '
 	test_must_fail git -C repo sparse-checkout set deep/deeper1 2>err &&
 	test_i18ngrep "cannot set sparse-checkout patterns" err &&
 	test_cmp repo/.git/info/sparse-checkout expect &&
-	list_files repo/deep >dir &&
-	cat >expect <<-EOF &&
-		a
-		deeper1
-		deeper2
-	EOF
-	test_cmp dir expect
+	check_files repo/deep "a deeper1 deeper2"
 '
 
 test_expect_success 'revert to old sparse-checkout on empty update' '
@@ -326,18 +271,13 @@ test_expect_success 'sparse-checkout (init|set|disable) fails with dirty status'
 test_expect_success 'cone mode: set with core.ignoreCase=true' '
 	git -C repo sparse-checkout init --cone &&
 	git -C repo -c core.ignoreCase=true sparse-checkout set folder1 &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/folder1/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/folder1/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-	EOF
-	test_cmp expect dir
+	check_files repo "a folder1"
 '
 
 test_expect_success 'interaction with submodules' '
@@ -351,21 +291,8 @@ test_expect_success 'interaction with submodules' '
 		git sparse-checkout init --cone &&
 		git sparse-checkout set folder1
 	) &&
-	list_files super >dir &&
-	cat >expect <<-\EOF &&
-		a
-		folder1
-		modules
-	EOF
-	test_cmp expect dir &&
-	list_files super/modules/child >dir &&
-	cat >expect <<-\EOF &&
-		a
-		deep
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files super "a folder1 modules" &&
+	check_files super/modules/child "a deep folder1 folder2"
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH 2/8] sparse-checkout: create leading directories
  2020-01-14 19:25 [PATCH 0/8] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
  2020-01-14 19:25 ` [PATCH 1/8] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
@ 2020-01-14 19:25 ` Derrick Stolee via GitGitGadget
  2020-01-16 21:46   ` Junio C Hamano
  2020-01-14 19:25 ` [PATCH 3/8] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-14 19:25 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'git init' command creates the ".git/info" directory and fills it
with some default files. However, 'git worktree add' does not create
the info directory for that worktree. This causes a problem when running
"git sparse-checkout init" inside a worktree. While care was taken to
allow the sparse-checkout config to be specific to a worktree, this
initialization was untested.

Safely create the leading directories for the sparse-checkout file. This
is the safest thing to do even without worktrees, as a user could delete
their ".git/info" directory and expect Git to recover safely.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/sparse-checkout.c          |  4 ++++
 t/t1091-sparse-checkout-builtin.sh | 10 ++++++++++
 2 files changed, 14 insertions(+)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index b3bed891cb..3cee8ab46e 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -199,6 +199,10 @@ static int write_patterns_and_update(struct pattern_list *pl)
 	int result;
 
 	sparse_filename = get_sparse_checkout_filename();
+
+	if (safe_create_leading_directories(sparse_filename))
+		die(_("failed to create directory for sparse-checkout file"));
+
 	fd = hold_lock_file_for_update(&lk, sparse_filename,
 				      LOCK_DIE_ON_ERROR);
 
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 20caefe155..37365dc668 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -295,4 +295,14 @@ test_expect_success 'interaction with submodules' '
 	check_files super/modules/child "a deep folder1 folder2"
 '
 
+test_expect_success 'different sparse-checkouts with worktrees' '
+	git -C repo worktree add --detach ../worktree &&
+	check_files worktree "a deep folder1 folder2" &&
+	git -C worktree sparse-checkout init --cone &&
+	git -C repo sparse-checkout set folder1 &&
+	git -C worktree sparse-checkout set deep/deeper1 &&
+	check_files repo "a folder1" &&
+	check_files worktree "a deep"
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH 3/8] clone: fix --sparse option with URLs
  2020-01-14 19:25 [PATCH 0/8] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
  2020-01-14 19:25 ` [PATCH 1/8] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
  2020-01-14 19:25 ` [PATCH 2/8] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
@ 2020-01-14 19:25 ` Derrick Stolee via GitGitGadget
  2020-01-14 19:30   ` Taylor Blau
  2020-01-14 19:25 ` [PATCH 4/8] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-14 19:25 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The --sparse option was added to the clone builtin in d89f09c (clone:
add --sparse mode, 2019-11-21) and was tested with a local path clone
in t1091-sparse-checkout-builtin.sh. However, due to a difference in
how local paths are handled versus URLs, this mechanism does not work
with URLs.

Modify the test to use a "file://" URL, which would output this error
before the code change:

  Cloning into 'clone'...
  fatal: cannot change to 'file://.../repo': No such file or directory
  error: failed to initialize sparse-checkout

These errors are due to using a "-C <path>" option to call 'git -C
<path> sparse-checkout init' but the URL is being given instead of
the target directory.

Update that target directory to evaluate this correctly. I have also
manually tested that https:// URLs are handled correctly as well.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/clone.c                    | 2 +-
 t/t1091-sparse-checkout-builtin.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 4348d962c9..2caefc44fb 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1130,7 +1130,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (option_required_reference.nr || option_optional_reference.nr)
 		setup_reference();
 
-	if (option_sparse_checkout && git_sparse_checkout_init(repo))
+	if (option_sparse_checkout && git_sparse_checkout_init(dir))
 		return 1;
 
 	remote = remote_get(option_origin);
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 37365dc668..58d9c69163 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -90,7 +90,7 @@ test_expect_success 'init with existing sparse-checkout' '
 '
 
 test_expect_success 'clone --sparse' '
-	git clone --sparse repo clone &&
+	git clone --sparse "file://$(pwd)/repo" clone &&
 	git -C clone sparse-checkout list >actual &&
 	cat >expect <<-\EOF &&
 	/*
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH 4/8] sparse-checkout: cone mode does not recognize "**"
  2020-01-14 19:25 [PATCH 0/8] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                   ` (2 preceding siblings ...)
  2020-01-14 19:25 ` [PATCH 3/8] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
@ 2020-01-14 19:25 ` Derrick Stolee via GitGitGadget
  2020-01-14 21:16   ` Jeff King
  2020-01-14 19:25 ` [PATCH 5/8] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-14 19:25 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When core.sparseCheckoutCone is enabled, the 'git sparse-checkout set'
command creates a restricted set of possible patterns that are used
by a custom algorithm to quickly match those patterns.

If a user manually edits the sparse-checkout file, then they could
create patterns that do not match these expectations. The cone-mode
matching algorithm can return incorrect results. The solution is to
detect these incorrect patterns, warn that we do not recognize them,
and revert to the standard algorithm.

Check each pattern for the "**" substring, and revert to the old
logic if seen. While technically a "/<dir>/**" pattern matches
the meaning of "/<dir>/", it is not one that would be written by
the sparse-checkout builtin in cone mode. Attempting to accept that
pattern change complicates the logic and instead we punt and do
not accept any instance of "**".

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              |  7 ++++++
 t/t1091-sparse-checkout-builtin.sh | 34 ++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/dir.c b/dir.c
index 22d08e61c2..f8e350dda2 100644
--- a/dir.c
+++ b/dir.c
@@ -651,6 +651,13 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 		return;
 	}
 
+	if (strstr(given->pattern, "**")) {
+		/* Not a cone pattern. */
+		pl->use_cone_patterns = 0;
+		warning(_("unrecognized pattern: '%s'"), given->pattern);
+		goto clear_hashmaps;
+	}
+
 	if (given->patternlen > 2 &&
 	    !strcmp(given->pattern + given->patternlen - 2, "/*")) {
 		if (!(given->flags & PATTERN_FLAG_NEGATIVE)) {
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 58d9c69163..e532a52f89 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -305,4 +305,38 @@ test_expect_success 'different sparse-checkouts with worktrees' '
 	check_files worktree "a deep"
 '
 
+check_read_tree_errors () {
+	REPO=$1
+	FILES=$2
+	ERRORS=$3
+	git -C $REPO read-tree -mu HEAD 2>err &&
+	if test -z "$ERRORS"
+	then
+		test_must_be_empty err
+	else
+		test_i18ngrep "$ERRORS" err
+	fi &&
+	check_files $REPO "$FILES"
+}
+
+test_expect_success 'pattern-checks: /A/**' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/folder1/**
+	EOF
+	check_read_tree_errors repo "a folder1" "disabling cone pattern matching"
+'
+
+test_expect_success 'pattern-checks: /A/**/B/' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/deep/**/deepest
+	EOF
+	check_read_tree_errors repo "a deep" "disabling cone pattern matching" &&
+	check_files repo/deep "deeper1" &&
+	check_files repo/deep/deeper1 "deepest"
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH 5/8] sparse-checkout: detect short patterns
  2020-01-14 19:25 [PATCH 0/8] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                   ` (3 preceding siblings ...)
  2020-01-14 19:25 ` [PATCH 4/8] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
@ 2020-01-14 19:25 ` Derrick Stolee via GitGitGadget
  2020-01-14 19:26 ` [PATCH 6/8] sparse-checkout: warn on incorrect '*' in patterns Derrick Stolee via GitGitGadget
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-14 19:25 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In cone mode, the shortest pattern the sparse-checkout command will
write into the sparse-checkout file is "/*". This is handled carefully
in add_pattern_to_hashsets(), so warn if any other pattern is this
short. This will assist future pattern checks by allowing us to assume
there are at least three characters in the pattern.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              | 3 ++-
 t/t1091-sparse-checkout-builtin.sh | 9 +++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index f8e350dda2..1c96ddf5e3 100644
--- a/dir.c
+++ b/dir.c
@@ -651,7 +651,8 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 		return;
 	}
 
-	if (strstr(given->pattern, "**")) {
+	if (given->patternlen <= 2 ||
+	    strstr(given->pattern, "**")) {
 		/* Not a cone pattern. */
 		pl->use_cone_patterns = 0;
 		warning(_("unrecognized pattern: '%s'"), given->pattern);
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index e532a52f89..974a4fec8f 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -339,4 +339,13 @@ test_expect_success 'pattern-checks: /A/**/B/' '
 	check_files repo/deep/deeper1 "deepest"
 '
 
+test_expect_success 'pattern-checks: too short' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/a
+	EOF
+	check_read_tree_errors repo "a" "disabling cone pattern matching"
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH 6/8] sparse-checkout: warn on incorrect '*' in patterns
  2020-01-14 19:25 [PATCH 0/8] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                   ` (4 preceding siblings ...)
  2020-01-14 19:25 ` [PATCH 5/8] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
@ 2020-01-14 19:26 ` Derrick Stolee via GitGitGadget
  2020-01-14 19:26 ` [PATCH 7/8] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-14 19:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In cone mode, the sparse-checkout commmand will write patterns that
allow faster pattern matching. This matching only works if the patterns
in the sparse-checkout file are those written by that command. Users
can edit the sparse-checkout file and create patterns that cause the
cone mode matching to fail.

The cone mode patterns may end in "/*" but otherwise an un-escaped
asterisk is invalid. Add checks to disable cone mode when seeing these
values.

A later change will properly handle escaped asterisks.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              | 30 ++++++++++++++++++++++++++++++
 t/t1091-sparse-checkout-builtin.sh | 27 +++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/dir.c b/dir.c
index 1c96ddf5e3..150c05f4de 100644
--- a/dir.c
+++ b/dir.c
@@ -635,6 +635,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 	struct pattern_entry *translated;
 	char *truncated;
 	char *data = NULL;
+	const char *prev, *cur, *next;
 
 	if (!pl->use_cone_patterns)
 		return;
@@ -652,6 +653,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 	}
 
 	if (given->patternlen <= 2 ||
+	    *given->pattern == '*' ||
 	    strstr(given->pattern, "**")) {
 		/* Not a cone pattern. */
 		pl->use_cone_patterns = 0;
@@ -659,6 +661,34 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 		goto clear_hashmaps;
 	}
 
+	prev = given->pattern;
+	cur = given->pattern + 1;
+	next = given->pattern + 2;
+
+	while (*cur) {
+		/* We care about *cur == '*' */
+		if (*cur != '*')
+			goto increment;
+
+		/* But only if *prev != '\\' */
+		if (*prev == '\\')
+			goto increment;
+
+		/* But a trailing '/' then '*' is fine */
+		if (*prev == '/' && *next == 0)
+			goto increment;
+
+		/* Not a cone pattern. */
+		pl->use_cone_patterns = 0;
+		warning(_("unrecognized pattern: '%s'"), given->pattern);
+		goto clear_hashmaps;
+
+	increment:
+		prev++;
+		cur++;
+		next++;
+	}
+
 	if (given->patternlen > 2 &&
 	    !strcmp(given->pattern + given->patternlen - 2, "/*")) {
 		if (!(given->flags & PATTERN_FLAG_NEGATIVE)) {
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 974a4fec8f..5b50be53a4 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -348,4 +348,31 @@ test_expect_success 'pattern-checks: too short' '
 	check_read_tree_errors repo "a" "disabling cone pattern matching"
 '
 
+test_expect_success 'pattern-checks: trailing "*"' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/a*
+	EOF
+	check_read_tree_errors repo "a" "disabling cone pattern matching"
+'
+
+test_expect_success 'pattern-checks: starting "*"' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	*eep/
+	EOF
+	check_read_tree_errors repo "a deep" "disabling cone pattern matching"
+'
+
+test_expect_success 'pattern-checks: escaped "*"' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/does\*not\*exist/
+	EOF
+	check_read_tree_errors repo "a" ""
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH 7/8] sparse-checkout: properly match escaped characters
  2020-01-14 19:25 [PATCH 0/8] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                   ` (5 preceding siblings ...)
  2020-01-14 19:26 ` [PATCH 6/8] sparse-checkout: warn on incorrect '*' in patterns Derrick Stolee via GitGitGadget
@ 2020-01-14 19:26 ` Derrick Stolee via GitGitGadget
  2020-01-14 21:21   ` Jeff King
  2020-01-14 19:26 ` [PATCH 8/8] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-14 19:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In cone mode, the sparse-checkout feature uses hashset containment
queries to match paths. Make this algorithm respect escaped asterisk
(*) and backslash (\) characters.

Create dup_and_filter_pattern() method to convert a pattern by
removing escape characters and dropping an optional "/*" at the end.
This method is available in dir.h as we will use it in
builtin/sparse-chekcout.c in a later change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              | 31 +++++++++++++++++++++++++++---
 dir.h                              |  1 +
 t/t1091-sparse-checkout-builtin.sh | 22 +++++++++++++++++----
 3 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/dir.c b/dir.c
index 150c05f4de..2840bacd40 100644
--- a/dir.c
+++ b/dir.c
@@ -630,6 +630,32 @@ int pl_hashmap_cmp(const void *unused_cmp_data,
 	return strncmp(ee1->pattern, ee2->pattern, min_len);
 }
 
+char *dup_and_filter_pattern(const char *pattern)
+{
+	char *set, *read;
+	char *result = xstrdup(pattern);
+
+	set = result;
+	read = result;
+
+	while (*read) {
+		/* skip escape characters (once) */
+		if (*read == '\\')
+			read++;
+
+		*set = *read;
+
+		set++;
+		read++;
+	}
+	*set = 0;
+
+	if (*(read - 2) == '/' && *(read - 1) == '*')
+		*(read - 2) = 0;
+
+	return result;
+}
+
 static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern *given)
 {
 	struct pattern_entry *translated;
@@ -698,8 +724,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 			goto clear_hashmaps;
 		}
 
-		truncated = xstrdup(given->pattern);
-		truncated[given->patternlen - 2] = 0;
+		truncated = dup_and_filter_pattern(given->pattern);
 
 		translated = xmalloc(sizeof(struct pattern_entry));
 		translated->pattern = truncated;
@@ -733,7 +758,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 
 	translated = xmalloc(sizeof(struct pattern_entry));
 
-	translated->pattern = xstrdup(given->pattern);
+	translated->pattern = dup_and_filter_pattern(given->pattern);
 	translated->patternlen = given->patternlen;
 	hashmap_entry_init(&translated->ent,
 			   ignore_case ?
diff --git a/dir.h b/dir.h
index 77a43dbf89..6dcd9d33e7 100644
--- a/dir.h
+++ b/dir.h
@@ -304,6 +304,7 @@ int pl_hashmap_cmp(const void *unused_cmp_data,
 		   const struct hashmap_entry *a,
 		   const struct hashmap_entry *b,
 		   const void *key);
+char *dup_and_filter_pattern(const char *pattern);
 int hashmap_contains_parent(struct hashmap *map,
 			    const char *path,
 			    struct strbuf *buffer);
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 5b50be53a4..051c1f3bf2 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -366,13 +366,27 @@ test_expect_success 'pattern-checks: starting "*"' '
 	check_read_tree_errors repo "a deep" "disabling cone pattern matching"
 '
 
-test_expect_success 'pattern-checks: escaped "*"' '
-	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
+	git clone repo escaped &&
+	TREEOID=$(git -C escaped rev-parse HEAD:folder1) &&
+	NEWTREE=$(git -C escaped mktree <<-EOF
+	$(git -C escaped ls-tree HEAD)
+	040000 tree $TREEOID	zbad\\dir
+	040000 tree $TREEOID	zdoes*exist
+	EOF
+	) &&
+	COMMIT=$(git -C escaped commit-tree $NEWTREE -p HEAD) &&
+	git -C escaped reset --hard $COMMIT &&
+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
+	git -C escaped sparse-checkout init --cone &&
+	cat >escaped/.git/info/sparse-checkout <<-\EOF &&
 	/*
 	!/*/
-	/does\*not\*exist/
+	/zbad\\dir/
+	/zdoes\*not\*exist/
+	/zdoes\*exist/
 	EOF
-	check_read_tree_errors repo "a" ""
+	check_read_tree_errors escaped "a zbad\\dir zdoes*exist"
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH 8/8] sparse-checkout: write escaped patterns in cone mode
  2020-01-14 19:25 [PATCH 0/8] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                   ` (6 preceding siblings ...)
  2020-01-14 19:26 ` [PATCH 7/8] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
@ 2020-01-14 19:26 ` Derrick Stolee via GitGitGadget
  2020-01-14 21:25   ` Jeff King
  2020-01-14 19:34 ` [PATCH 0/8] Harden the sparse-checkout builtin Taylor Blau
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-14 19:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

If a user somehow creates a directory with an asterisk (*) or backslash
(\), then the "git sparse-checkout set" command will struggle to provide
the correct pattern in the sparse-checkout file. When not in cone mode,
the provided pattern is written directly into the sparse-checkout file.
However, in cone mode we expect a list of paths to directories and then
we convert those into patterns.

Even more specifically, the goal is to always allow the following from
the root of a repo:

  git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin

The ls-tree command provides directory names with an unescaped asterisk.
It also quotes the directories that contain an escaped backslash. We
must remove these quotes, then keep the escaped backslashes.

However, there is some care needed for the timing of these escapes. The
in-memory pattern list is used to update the working directory before
writing the patterns to disk. Thus, we need the command to have the
unescaped names in the hashsets for the cone comparisons, then escape
the patterns later.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/sparse-checkout.c          | 48 ++++++++++++++++++++++++++++--
 t/t1091-sparse-checkout-builtin.sh | 21 +++++++++++--
 2 files changed, 64 insertions(+), 5 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 3cee8ab46e..61d2c30036 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -140,6 +140,22 @@ static int update_working_directory(struct pattern_list *pl)
 	return result;
 }
 
+static char *escaped_pattern(char *pattern)
+{
+	char *p = pattern;
+	struct strbuf final = STRBUF_INIT;
+
+	while (*p) {
+		if (*p == '*' || *p == '\\')
+			strbuf_addch(&final, '\\');
+
+		strbuf_addch(&final, *p);
+		p++;
+	}
+
+	return strbuf_detach(&final, NULL);
+}
+
 static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 {
 	int i;
@@ -164,10 +180,11 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 	fprintf(fp, "/*\n!/*/\n");
 
 	for (i = 0; i < sl.nr; i++) {
-		char *pattern = sl.items[i].string;
+		char *pattern = escaped_pattern(sl.items[i].string);
 
 		if (strlen(pattern))
 			fprintf(fp, "%s/\n!%s/*/\n", pattern, pattern);
+		free(pattern);
 	}
 
 	string_list_clear(&sl, 0);
@@ -185,8 +202,9 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 	string_list_remove_duplicates(&sl, 0);
 
 	for (i = 0; i < sl.nr; i++) {
-		char *pattern = sl.items[i].string;
+		char *pattern = escaped_pattern(sl.items[i].string);
 		fprintf(fp, "%s/\n", pattern);
+		free(pattern);
 	}
 }
 
@@ -337,7 +355,9 @@ static void insert_recursive_pattern(struct pattern_list *pl, struct strbuf *pat
 {
 	struct pattern_entry *e = xmalloc(sizeof(*e));
 	e->patternlen = path->len;
-	e->pattern = strbuf_detach(path, NULL);
+	e->pattern = dup_and_filter_pattern(path->buf);
+	strbuf_release(path);
+
 	hashmap_entry_init(&e->ent,
 			   ignore_case ?
 			   strihash(e->pattern) :
@@ -369,6 +389,7 @@ static void insert_recursive_pattern(struct pattern_list *pl, struct strbuf *pat
 
 static void strbuf_to_cone_pattern(struct strbuf *line, struct pattern_list *pl)
 {
+	int i;
 	strbuf_trim(line);
 
 	strbuf_trim_trailing_dir_sep(line);
@@ -376,6 +397,27 @@ static void strbuf_to_cone_pattern(struct strbuf *line, struct pattern_list *pl)
 	if (!line->len)
 		return;
 
+	for (i = 0; i < line->len; i++) {
+		if (line->buf[i] == '*') {
+			strbuf_insert(line, i, "\\", 1);
+			i++;
+		}
+
+		if (line->buf[i] == '\\') {
+			if (i < line->len - 1 && line->buf[i + 1] == '\\')
+				i++;
+			else
+				strbuf_insert(line, i, "\\", 1);
+
+			i++;
+		}
+	}
+
+	if (line->buf[0] == '"' && line->buf[line->len - 1] == '"') {
+		strbuf_remove(line, 0, 1);
+		strbuf_remove(line, line->len - 1, 1);
+	}
+
 	if (line->buf[0] != '/')
 		strbuf_insert(line, 0, "/", 1);
 
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 051c1f3bf2..3da7c10bd9 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -309,6 +309,9 @@ check_read_tree_errors () {
 	REPO=$1
 	FILES=$2
 	ERRORS=$3
+	git -C $REPO -c core.sparseCheckoutCone=false read-tree -mu HEAD 2>err &&
+	test_must_be_empty err &&
+	check_files $REPO "$FILES" &&
 	git -C $REPO read-tree -mu HEAD 2>err &&
 	if test -z "$ERRORS"
 	then
@@ -379,14 +382,28 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	git -C escaped reset --hard $COMMIT &&
 	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
 	git -C escaped sparse-checkout init --cone &&
-	cat >escaped/.git/info/sparse-checkout <<-\EOF &&
+	git -C escaped sparse-checkout set zbad\\dir zdoes\*not\*exist zdoes\*exist &&
+	cat >expect <<-\EOF &&
 	/*
 	!/*/
 	/zbad\\dir/
+	/zdoes\*exist/
 	/zdoes\*not\*exist/
+	EOF
+	test_cmp expect escaped/.git/info/sparse-checkout &&
+	check_read_tree_errors escaped "a zbad\\dir zdoes*exist" &&
+	git -C escaped ls-tree -d --name-only HEAD | git -C escaped sparse-checkout set --stdin &&
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/deep/
+	/folder1/
+	/folder2/
+	/zbad\\dir/
 	/zdoes\*exist/
 	EOF
-	check_read_tree_errors escaped "a zbad\\dir zdoes*exist"
+	test_cmp expect escaped/.git/info/sparse-checkout &&
+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist"
 '
 
 test_done
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* Re: [PATCH 3/8] clone: fix --sparse option with URLs
  2020-01-14 19:25 ` [PATCH 3/8] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
@ 2020-01-14 19:30   ` Taylor Blau
  0 siblings, 0 replies; 82+ messages in thread
From: Taylor Blau @ 2020-01-14 19:30 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, me, peff, newren, Derrick Stolee

Hi Stolee,

On Tue, Jan 14, 2020 at 07:25:57PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
>
> The --sparse option was added to the clone builtin in d89f09c (clone:
> add --sparse mode, 2019-11-21) and was tested with a local path clone
> in t1091-sparse-checkout-builtin.sh. However, due to a difference in
> how local paths are handled versus URLs, this mechanism does not work
> with URLs.

As we discussed off-list, both of us (as well as Peff) were able to
reproduce this issue. I think that this paragraph is a good description
of what's going on heee.

> Modify the test to use a "file://" URL, which would output this error
> before the code change:
>
>   Cloning into 'clone'...
>   fatal: cannot change to 'file://.../repo': No such file or directory
>   error: failed to initialize sparse-checkout

Nice, this should give us confidence that there won't be a regression
here in the future. I don't think that the explanation is complicated
enough for a single commit which introduced an expected failure, so
grouping it all together in this patch seems good to me.

> These errors are due to using a "-C <path>" option to call 'git -C
> <path> sparse-checkout init' but the URL is being given instead of
> the target directory.
>
> Update that target directory to evaluate this correctly. I have also
> manually tested that https:// URLs are handled correctly as well.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/clone.c                    | 2 +-
>  t/t1091-sparse-checkout-builtin.sh | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 4348d962c9..2caefc44fb 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1130,7 +1130,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	if (option_required_reference.nr || option_optional_reference.nr)
>  		setup_reference();
>
> -	if (option_sparse_checkout && git_sparse_checkout_init(repo))
> +	if (option_sparse_checkout && git_sparse_checkout_init(dir))

I agree that 'dir' is the right thing to use here. It's the string we
read from to print "Cloning into ...", which always displays the
directory relative to the cwd. Looking at the implementation in
'git_sparse_checkout_init', this matches my understanding, too.

>  		return 1;
>
>  	remote = remote_get(option_origin);
> diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
> index 37365dc668..58d9c69163 100755
> --- a/t/t1091-sparse-checkout-builtin.sh
> +++ b/t/t1091-sparse-checkout-builtin.sh
> @@ -90,7 +90,7 @@ test_expect_success 'init with existing sparse-checkout' '
>  '
>
>  test_expect_success 'clone --sparse' '
> -	git clone --sparse repo clone &&
> +	git clone --sparse "file://$(pwd)/repo" clone &&
>  	git -C clone sparse-checkout list >actual &&
>  	cat >expect <<-\EOF &&
>  	/*
> --
> gitgitgadget

This all looks good to me.

  Acked-by: Taylor Blau <me@ttaylorr.com>

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 0/8] Harden the sparse-checkout builtin
  2020-01-14 19:25 [PATCH 0/8] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                   ` (7 preceding siblings ...)
  2020-01-14 19:26 ` [PATCH 8/8] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
@ 2020-01-14 19:34 ` Taylor Blau
  2020-01-14 19:44   ` Derrick Stolee
  2020-01-15 19:16 ` Junio C Hamano
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
  10 siblings, 1 reply; 82+ messages in thread
From: Taylor Blau @ 2020-01-14 19:34 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, me, peff, newren, Derrick Stolee

Hi Stolee,

On Tue, Jan 14, 2020 at 07:25:54PM +0000, Derrick Stolee via GitGitGadget wrote:
> This series is based on ds/sparse-list-in-cone-mode.
>
> This series attempts to clean up some rough edges in the sparse-checkout
> feature, especially around the cone mode.
>
> Unfortunately, after the v2.25.0 release, we noticed an issue with the "git
> clone --sparse" option when using a URL instead of a local path. This is
> fixed and properly tested here.

I haven't had a chance to look at the other patches (besides the one
that I have already reviewed on- and off-list), so take my comments here
with a grain of salt.

It's too bad that 'git clone --sparse' isn't working with URLs in
v2.25.0, but it happens, and I don't think that this is a grave-enough
issue to warrant a new release. At least, since '--sparse' is new in
v2.25.0, we're not breaking existing workflows that have already relied
on it.

And, since sparse-checkout is still relatively niche (at, least for now
;-)), I think that this not handling cloning from URLs is fine until
v2.26.0.

Of course, if there's ever another need for v2.25.1, I don't think that
this would *hurt* to release then, which is to say that we definitely
should have these patches in a release, soon, but I don't think that
there's a terrible sense of urgency in the meantime.

> Also, let's improve Git's response to these more complicated scenarios:
>
>  1. Running "git sparse-checkout init" in a worktree would complain because
>     the "info" dir doesn't exist.
>  2. Tracked paths that include "*" and "" in their filenames.
>  3. If a user edits the sparse-checkout file to have non-cone pattern, such
>     as "*" anywhere or "" in the wrong place, then we should respond
>     appropriately. That is: warn that the patterns are not cone-mode, then
>     revert to the old logic.
>
> Thanks, -Stolee
>
> Derrick Stolee (8):
>   t1091: use check_files to reduce boilerplate
>   sparse-checkout: create leading directories
>   clone: fix --sparse option with URLs
>   sparse-checkout: cone mode does not recognize "**"
>   sparse-checkout: detect short patterns
>   sparse-checkout: warn on incorrect '*' in patterns
>   sparse-checkout: properly match escaped characters
>   sparse-checkout: write escaped patterns in cone mode
>
>  builtin/clone.c                    |   2 +-
>  builtin/sparse-checkout.c          |  52 ++++-
>  dir.c                              |  69 ++++++-
>  dir.h                              |   1 +
>  t/t1091-sparse-checkout-builtin.sh | 320 ++++++++++++++++-------------
>  5 files changed, 296 insertions(+), 148 deletions(-)
>
>
> base-commit: 4fd683b6a35eabd23dd5183da7f654a1e1f00325
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-513%2Fderrickstolee%2Fsparse-harden-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-513/derrickstolee/sparse-harden-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/513
> --
> gitgitgadget
Thanks,
Taylor

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 0/8] Harden the sparse-checkout builtin
  2020-01-14 19:34 ` [PATCH 0/8] Harden the sparse-checkout builtin Taylor Blau
@ 2020-01-14 19:44   ` Derrick Stolee
  2020-01-14 21:31     ` Jeff King
  0 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee @ 2020-01-14 19:44 UTC (permalink / raw)
  To: Taylor Blau, Derrick Stolee via GitGitGadget
  Cc: git, peff, newren, Derrick Stolee

On 1/14/2020 2:34 PM, Taylor Blau wrote:
> Hi Stolee,
> 
> On Tue, Jan 14, 2020 at 07:25:54PM +0000, Derrick Stolee via GitGitGadget wrote:
>> This series is based on ds/sparse-list-in-cone-mode.
>>
>> This series attempts to clean up some rough edges in the sparse-checkout
>> feature, especially around the cone mode.
>>
>> Unfortunately, after the v2.25.0 release, we noticed an issue with the "git
>> clone --sparse" option when using a URL instead of a local path. This is
>> fixed and properly tested here.
> 
> I haven't had a chance to look at the other patches (besides the one
> that I have already reviewed on- and off-list), so take my comments here
> with a grain of salt.
> 
> It's too bad that 'git clone --sparse' isn't working with URLs in
> v2.25.0, but it happens, and I don't think that this is a grave-enough
> issue to warrant a new release. At least, since '--sparse' is new in
> v2.25.0, we're not breaking existing workflows that have already relied
> on it.
> 
> And, since sparse-checkout is still relatively niche (at, least for now
> ;-)), I think that this not handling cloning from URLs is fine until
> v2.26.0.

Since we've already worked out the workaround to be:

	git clone --no-checkout <url> <dir>
	cd <dir>
	git sparse-checkout init --cone

there is no rush to fix this. Users _may_ discover the --sparse option
from the clone docs and complain, but we can point them to the above
directions for now.

> Of course, if there's ever another need for v2.25.1, I don't think that
> this would *hurt* to release then, which is to say that we definitely
> should have these patches in a release, soon, but I don't think that
> there's a terrible sense of urgency in the meantime.

I wouldn't complain to have patches 1-3 in an otherwise-warranted .1 release.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 4/8] sparse-checkout: cone mode does not recognize "**"
  2020-01-14 19:25 ` [PATCH 4/8] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
@ 2020-01-14 21:16   ` Jeff King
  0 siblings, 0 replies; 82+ messages in thread
From: Jeff King @ 2020-01-14 21:16 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, me, newren, Derrick Stolee

On Tue, Jan 14, 2020 at 07:25:58PM +0000, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
> 
> When core.sparseCheckoutCone is enabled, the 'git sparse-checkout set'
> command creates a restricted set of possible patterns that are used
> by a custom algorithm to quickly match those patterns.
> 
> If a user manually edits the sparse-checkout file, then they could
> create patterns that do not match these expectations. The cone-mode
> matching algorithm can return incorrect results. The solution is to
> detect these incorrect patterns, warn that we do not recognize them,
> and revert to the standard algorithm.
> 
> Check each pattern for the "**" substring, and revert to the old
> logic if seen. While technically a "/<dir>/**" pattern matches
> the meaning of "/<dir>/", it is not one that would be written by
> the sparse-checkout builtin in cone mode. Attempting to accept that
> pattern change complicates the logic and instead we punt and do
> not accept any instance of "**".

That all makes sense.

> diff --git a/dir.c b/dir.c
> index 22d08e61c2..f8e350dda2 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -651,6 +651,13 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
>  		return;
>  	}
>  
> +	if (strstr(given->pattern, "**")) {
> +		/* Not a cone pattern. */
> +		pl->use_cone_patterns = 0;
> +		warning(_("unrecognized pattern: '%s'"), given->pattern);
> +		goto clear_hashmaps;
> +	}

The clear_hashmaps label already unsets pl->use_cone_patterns, so the
first line is redundant (the same is true of existing goto jumps, as
well, though).

I wondered whether this warning could be triggered accidentally by
somebody who just happened to add such a pattern. But we'd exit
immediately from add_pattern_to_hashsets() immediately unless the user
has set core.sparseCheckoutCone. And if that's set, then warning is
definitely the right thing to do.

-Peff

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 7/8] sparse-checkout: properly match escaped characters
  2020-01-14 19:26 ` [PATCH 7/8] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
@ 2020-01-14 21:21   ` Jeff King
  2020-01-14 22:08     ` Derrick Stolee
  0 siblings, 1 reply; 82+ messages in thread
From: Jeff King @ 2020-01-14 21:21 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, me, newren, Derrick Stolee

On Tue, Jan 14, 2020 at 07:26:01PM +0000, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
> 
> In cone mode, the sparse-checkout feature uses hashset containment
> queries to match paths. Make this algorithm respect escaped asterisk
> (*) and backslash (\) characters.
> 
> Create dup_and_filter_pattern() method to convert a pattern by
> removing escape characters and dropping an optional "/*" at the end.
> This method is available in dir.h as we will use it in
> builtin/sparse-chekcout.c in a later change.

s/chekcout/checkout/

It took me a minute to understand the problem here, but I think it's: if
a path in the sparse-checkout file has "\*" in it, we'd try to match a
literal "\*" in the hash, not "*"?

But we wouldn't run into that yet because we don't properly _write_ the
escaped names until patch 8.

Is that right?

-Peff

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 8/8] sparse-checkout: write escaped patterns in cone mode
  2020-01-14 19:26 ` [PATCH 8/8] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
@ 2020-01-14 21:25   ` Jeff King
  2020-01-14 22:11     ` Derrick Stolee
  0 siblings, 1 reply; 82+ messages in thread
From: Jeff King @ 2020-01-14 21:25 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, me, newren, Derrick Stolee

On Tue, Jan 14, 2020 at 07:26:02PM +0000, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
> 
> If a user somehow creates a directory with an asterisk (*) or backslash
> (\), then the "git sparse-checkout set" command will struggle to provide
> the correct pattern in the sparse-checkout file. When not in cone mode,
> the provided pattern is written directly into the sparse-checkout file.
> However, in cone mode we expect a list of paths to directories and then
> we convert those into patterns.
> 
> Even more specifically, the goal is to always allow the following from
> the root of a repo:
> 
>   git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin
> 
> The ls-tree command provides directory names with an unescaped asterisk.
> It also quotes the directories that contain an escaped backslash. We
> must remove these quotes, then keep the escaped backslashes.

Do we need to document these rules somewhere? Naively I'd expect
"--stdin" to take in literal pathnames. But of course it can't represent
a path with a newline. So perhaps it makes sense to take quoted names by
default, and allow literal NUL-separated input with "-z" if anybody
wants it.

-Peff

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 0/8] Harden the sparse-checkout builtin
  2020-01-14 19:44   ` Derrick Stolee
@ 2020-01-14 21:31     ` Jeff King
  0 siblings, 0 replies; 82+ messages in thread
From: Jeff King @ 2020-01-14 21:31 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Taylor Blau, Derrick Stolee via GitGitGadget, git, newren,
	Derrick Stolee

On Tue, Jan 14, 2020 at 02:44:50PM -0500, Derrick Stolee wrote:

> Since we've already worked out the workaround to be:
> 
> 	git clone --no-checkout <url> <dir>
> 	cd <dir>
> 	git sparse-checkout init --cone
> 
> there is no rush to fix this. Users _may_ discover the --sparse option
> from the clone docs and complain, but we can point them to the above
> directions for now.

Yes, but part of the beauty of the new system is just having to say
"--sparse" to make something useful happen. :)

> > Of course, if there's ever another need for v2.25.1, I don't think that
> > this would *hurt* to release then, which is to say that we definitely
> > should have these patches in a release, soon, but I don't think that
> > there's a terrible sense of urgency in the meantime.
> 
> I wouldn't complain to have patches 1-3 in an otherwise-warranted .1 release.

Agreed. 1-3 look obviously correct to me. The quoting bits I'm a little
more fuzzy on, just because I haven't really looked hard into cone mode.
Ditto for the "disable cone mode" checks.

My gut instinct is that you should be able to deduce whether the pattern
hashmap can be used purely from the patterns you see, and
core.sparseCheckoutCone would not be needed (and so if you violate the
rules by writing something manual, then it just gets slower; or maybe
we're even able to apply the literal cone-mode rules quickly and handle
the other separately). But it's much more likely I'm showing off my lack
of knowledge of the details of the problem space. You can feel free to
educate me, and/or roll your eyes and ignore me if this was already
discussed earlier.

By the way, I did notice this while poking about, which could go on top
(or hopefully be lumped in with the 1-3 as "obviously correct"):

-- >8 --

Subject: [PATCH] sparse-checkout: fix documentation typo for core.sparseCheckoutCone

Signed-off-by: Jeff King <peff@peff.net>
---
 Documentation/git-sparse-checkout.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
index 974ade2238..285893b069 100644
--- a/Documentation/git-sparse-checkout.txt
+++ b/Documentation/git-sparse-checkout.txt
@@ -106,7 +106,7 @@ The full pattern set allows for arbitrary pattern matches and complicated
 inclusion/exclusion rules. These can result in O(N*M) pattern matches when
 updating the index, where N is the number of patterns and M is the number
 of paths in the index. To combat this performance issue, a more restricted
-pattern set is allowed when `core.spareCheckoutCone` is enabled.
+pattern set is allowed when `core.sparseCheckoutCone` is enabled.
 
 The accepted patterns in the cone pattern set are:
 
-- 
2.25.0.639.gb9b1511416


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* Re: [PATCH 7/8] sparse-checkout: properly match escaped characters
  2020-01-14 21:21   ` Jeff King
@ 2020-01-14 22:08     ` Derrick Stolee
  0 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee @ 2020-01-14 22:08 UTC (permalink / raw)
  To: Jeff King, Derrick Stolee via GitGitGadget
  Cc: git, me, newren, Derrick Stolee

On 1/14/2020 4:21 PM, Jeff King wrote:
> On Tue, Jan 14, 2020 at 07:26:01PM +0000, Derrick Stolee via GitGitGadget wrote:
> 
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> In cone mode, the sparse-checkout feature uses hashset containment
>> queries to match paths. Make this algorithm respect escaped asterisk
>> (*) and backslash (\) characters.
>>
>> Create dup_and_filter_pattern() method to convert a pattern by
>> removing escape characters and dropping an optional "/*" at the end.
>> This method is available in dir.h as we will use it in
>> builtin/sparse-chekcout.c in a later change.
> 
> s/chekcout/checkout/

Thanks.

> It took me a minute to understand the problem here, but I think it's: if
> a path in the sparse-checkout file has "\*" in it, we'd try to match a
> literal "\*" in the hash, not "*"?

Yes, the hashset would have the string "\*" instead of the string "*". This
would lead to missing directories when cone mode is enabled compared to
cone mode not being enabled.
 
> But we wouldn't run into that yet because we don't properly _write_ the
> escaped names until patch 8.

We wouldn't run into it when using the builtin, but also a user could
edit their sparse-checkout file manually OR figure out how to get the
"right" pattern by running "git sparse-checkout set "my\\*dir" (where the
escaped backslash is collapsed by the shell and Git sees "my\*dir".

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 8/8] sparse-checkout: write escaped patterns in cone mode
  2020-01-14 21:25   ` Jeff King
@ 2020-01-14 22:11     ` Derrick Stolee
  2020-01-14 22:48       ` Jeff King
  0 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee @ 2020-01-14 22:11 UTC (permalink / raw)
  To: Jeff King, Derrick Stolee via GitGitGadget
  Cc: git, me, newren, Derrick Stolee

On 1/14/2020 4:25 PM, Jeff King wrote:
> On Tue, Jan 14, 2020 at 07:26:02PM +0000, Derrick Stolee via GitGitGadget wrote:
> 
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> If a user somehow creates a directory with an asterisk (*) or backslash
>> (\), then the "git sparse-checkout set" command will struggle to provide
>> the correct pattern in the sparse-checkout file. When not in cone mode,
>> the provided pattern is written directly into the sparse-checkout file.
>> However, in cone mode we expect a list of paths to directories and then
>> we convert those into patterns.
>>
>> Even more specifically, the goal is to always allow the following from
>> the root of a repo:
>>
>>   git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin
>>
>> The ls-tree command provides directory names with an unescaped asterisk.
>> It also quotes the directories that contain an escaped backslash. We
>> must remove these quotes, then keep the escaped backslashes.
> 
> Do we need to document these rules somewhere? Naively I'd expect
> "--stdin" to take in literal pathnames. But of course it can't represent
> a path with a newline. So perhaps it makes sense to take quoted names by
> default, and allow literal NUL-separated input with "-z" if anybody
> wants it.

This is worth thinking about the right way to describe the rules:

1. You don't _need_ quotes. They happen to come along for the ride in
  'git ls-tree' so it doesn't mess up shell scripts that iterate on
  those entries. At least, that's why I think they are quoted.

2. If you use quotes, the first layer of quotes will be removed.

How much of this needs to be documented explicitly, or how much should
we say "The input format matches what we would expect from 'git ls-tree
--name-only'"?

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 8/8] sparse-checkout: write escaped patterns in cone mode
  2020-01-14 22:11     ` Derrick Stolee
@ 2020-01-14 22:48       ` Jeff King
  2020-01-24 21:10         ` Derrick Stolee
  0 siblings, 1 reply; 82+ messages in thread
From: Jeff King @ 2020-01-14 22:48 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, me, newren, Derrick Stolee

On Tue, Jan 14, 2020 at 05:11:03PM -0500, Derrick Stolee wrote:

> > Do we need to document these rules somewhere? Naively I'd expect
> > "--stdin" to take in literal pathnames. But of course it can't represent
> > a path with a newline. So perhaps it makes sense to take quoted names by
> > default, and allow literal NUL-separated input with "-z" if anybody
> > wants it.
> 
> This is worth thinking about the right way to describe the rules:
> 
> 1. You don't _need_ quotes. They happen to come along for the ride in
>   'git ls-tree' so it doesn't mess up shell scripts that iterate on
>   those entries. At least, that's why I think they are quoted.

It's not just shell scripts. Without quoting, the syntax becomes
ambiguous (e.g., imagine a file with a newline in it). So most Git
output that shows a filename will quote it if necessary, unless
NUL separators are being used.

> 2. If you use quotes, the first layer of quotes will be removed.

I take this to mean that anything starting with a double-quote will have
the outer layer removed, and backslash escapes inside expanded. And
anything without a starting double quote (even if it has internal
backslash escapes!) will be taken literally.

That would match how things like "update-index --index-info" work.

As far as implementation, I know you're trying to keep some of the
escaping, but I think it might make more sense to do use
unquote_c_style() to parse the input (see update-index's use for some
prior art), and then re-quote as necessary to put things into the
sparse-checkout file (I guess quoting more than just quote_c_style()
would do, since you need to quote glob metacharacters like '*' and
probably "!"). But as much as possible, I think you'd want literal
strings inside the program, and just quoting/unquoting at the edges.

> How much of this needs to be documented explicitly, or how much should
> we say "The input format matches what we would expect from 'git ls-tree
> --name-only'"?

I think it's fine to say that, and maybe call attention to the quoting.
Like:

  The input format matches the output of `git ls-tree --name-only`. This
  includes interpreting pathnames that begin with a double quote (") as
  C-style quoted strings.

Disappointingly, update-index does not seem to explain the rules
anywhere. fast-import does cover it. Maybe it's something that ought to
be hoisted out into gitcli(7) or similar (or maybe it has been and I
just can't find it).

-Peff

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 0/8] Harden the sparse-checkout builtin
  2020-01-14 19:25 [PATCH 0/8] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                   ` (8 preceding siblings ...)
  2020-01-14 19:34 ` [PATCH 0/8] Harden the sparse-checkout builtin Taylor Blau
@ 2020-01-15 19:16 ` Junio C Hamano
  2020-01-15 20:32   ` Derrick Stolee
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
  10 siblings, 1 reply; 82+ messages in thread
From: Junio C Hamano @ 2020-01-15 19:16 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, me, peff, newren, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Also, let's improve Git's response to these more complicated scenarios:
>
>  1. Running "git sparse-checkout init" in a worktree would complain because
>     the "info" dir doesn't exist.
>  2. Tracked paths that include "*" and "" in their filenames.
>  3. If a user edits the sparse-checkout file to have non-cone pattern, such
>     as "*" anywhere or "" in the wrong place, then we should respond
>     appropriately. That is: warn that the patterns are not cone-mode, then
>     revert to the old logic.

It seems somebody ate a letter to make "<some letter>" into "" an
empty string, so I cannot quite grok the above list---two out of
three bullet points are not quite readable.


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 0/8] Harden the sparse-checkout builtin
  2020-01-15 19:16 ` Junio C Hamano
@ 2020-01-15 20:32   ` Derrick Stolee
  0 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee @ 2020-01-15 20:32 UTC (permalink / raw)
  To: Junio C Hamano, Derrick Stolee via GitGitGadget
  Cc: git, me, peff, newren, Derrick Stolee

On 1/15/2020 2:16 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> Also, let's improve Git's response to these more complicated scenarios:
>>
>>  1. Running "git sparse-checkout init" in a worktree would complain because
>>     the "info" dir doesn't exist.
>>  2. Tracked paths that include "*" and "" in their filenames.
>>  3. If a user edits the sparse-checkout file to have non-cone pattern, such
>>     as "*" anywhere or "" in the wrong place, then we should respond
>>     appropriately. That is: warn that the patterns are not cone-mode, then
>>     revert to the old logic.
> 
> It seems somebody ate a letter to make "<some letter>" into "" an
> empty string, so I cannot quite grok the above list---two out of
> three bullet points are not quite readable.

In 2., the "" should be "\".

In 3., the "*" and "" should be "**" and "*" respectively.

These are things that are being collapsed by GitHub's markdown
processing. Sorry that this affects GitGitGadget's cover letter.

By escaping appropriately, these show up correctly and hopefully
will be fixed in v2.

-Stolee

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 1/8] t1091: use check_files to reduce boilerplate
  2020-01-14 19:25 ` [PATCH 1/8] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
@ 2020-01-16 21:40   ` Junio C Hamano
  0 siblings, 0 replies; 82+ messages in thread
From: Junio C Hamano @ 2020-01-16 21:40 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, me, peff, newren, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> When testing the sparse-checkout feature, we need to compare the
> contents of the working-directory against some expected output.
> Using here-docs was useful in the beginning, but became repetetive
> as the test script grew.
>
> Create a check_files helper to make the tests simpler and easier
> to extend. It also reduces instances of bad here-doc whitespace.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  t/t1091-sparse-checkout-builtin.sh | 215 ++++++++++-------------------
>  1 file changed, 71 insertions(+), 144 deletions(-)
>
> diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
> index ff7f8f7a1f..20caefe155 100755
> --- a/t/t1091-sparse-checkout-builtin.sh
> +++ b/t/t1091-sparse-checkout-builtin.sh
> @@ -12,6 +12,13 @@ list_files() {
>  	(cd "$1" && printf '%s\n' *)
>  }
>  
> +check_files() {
> +	DIR=$1
> +	printf "%s\n" $2 >expect &&
> +	list_files $DIR >actual  &&

It is unclear if the script is being deliberate or sloppy.

It turns out that not quoting $2 is deliberate (i.e. it wants to
pass more than one words in $2, have them split at $IFS and show
each of them on a separate line), at the same time not quoting $DIR
is simply sloppy.

And it is totally unnecessary to confuse readers like this.

Unless you plan to extend this helper further, I think this would be
much less burdensome to the readers:

        check_files () {
                list_files "$1" >actual  &&
                shift &&
                printf "%s\n" "$@" >expect &&
                test_cmp expect actual
        }

This ...

>  	test_cmp expect repo/.git/info/sparse-checkout &&
> -	list_files repo >dir  &&
> -	cat >expect <<-EOF &&
> -		a
> -		folder1
> -		folder2
> -	EOF
> -	test_cmp expect dir
> +	check_files repo "a folder1 folder2"

... is a kind of change that the log message advertises, which is a
very nice rewrite.

And ...

>  test_expect_success 'clone --sparse' '
>  	git clone --sparse repo clone &&
>  	git -C clone sparse-checkout list >actual &&
> -	cat >expect <<-EOF &&
> -		/*
> -		!/*/
> +	cat >expect <<-\EOF &&
> +	/*
> +	!/*/
>  	EOF

... this is a style-fix that is another nice rewrite but in a
different category.  I wonder if they should be done in separate
commits.

Other than that, makes sense.

Thanks.

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 2/8] sparse-checkout: create leading directories
  2020-01-14 19:25 ` [PATCH 2/8] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
@ 2020-01-16 21:46   ` Junio C Hamano
  0 siblings, 0 replies; 82+ messages in thread
From: Junio C Hamano @ 2020-01-16 21:46 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, me, peff, newren, Derrick Stolee

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> The 'git init' command creates the ".git/info" directory and fills it
> with some default files. However, 'git worktree add' does not create
> the info directory for that worktree. This causes a problem when running
> "git sparse-checkout init" inside a worktree. While care was taken to
> allow the sparse-checkout config to be specific to a worktree, this
> initialization was untested.
>
> Safely create the leading directories for the sparse-checkout file. This
> is the safest thing to do even without worktrees, as a user could delete
> their ".git/info" directory and expect Git to recover safely.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/sparse-checkout.c          |  4 ++++
>  t/t1091-sparse-checkout-builtin.sh | 10 ++++++++++
>  2 files changed, 14 insertions(+)
>
> diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
> index b3bed891cb..3cee8ab46e 100644
> --- a/builtin/sparse-checkout.c
> +++ b/builtin/sparse-checkout.c
> @@ -199,6 +199,10 @@ static int write_patterns_and_update(struct pattern_list *pl)
>  	int result;
>  
>  	sparse_filename = get_sparse_checkout_filename();
> +
> +	if (safe_create_leading_directories(sparse_filename))
> +		die(_("failed to create directory for sparse-checkout file"));
> +

The use of safe_create_leading_directories() here, which uses
adjust_shared_perm(), is the right thing to do.

Looks good.

> diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
> index 20caefe155..37365dc668 100755
> --- a/t/t1091-sparse-checkout-builtin.sh
> +++ b/t/t1091-sparse-checkout-builtin.sh
> @@ -295,4 +295,14 @@ test_expect_success 'interaction with submodules' '
>  	check_files super/modules/child "a deep folder1 folder2"
>  '
>  
> +test_expect_success 'different sparse-checkouts with worktrees' '
> +	git -C repo worktree add --detach ../worktree &&
> +	check_files worktree "a deep folder1 folder2" &&
> +	git -C worktree sparse-checkout init --cone &&
> +	git -C repo sparse-checkout set folder1 &&
> +	git -C worktree sparse-checkout set deep/deeper1 &&
> +	check_files repo "a folder1" &&
> +	check_files worktree "a deep"
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 8/8] sparse-checkout: write escaped patterns in cone mode
  2020-01-14 22:48       ` Jeff King
@ 2020-01-24 21:10         ` Derrick Stolee
  2020-01-24 21:42           ` Jeff King
  0 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee @ 2020-01-24 21:10 UTC (permalink / raw)
  To: Jeff King
  Cc: Derrick Stolee via GitGitGadget, git, me, newren, Derrick Stolee

On 1/14/2020 5:48 PM, Jeff King wrote:
> On Tue, Jan 14, 2020 at 05:11:03PM -0500, Derrick Stolee wrote:
> 
>>> Do we need to document these rules somewhere? Naively I'd expect
>>> "--stdin" to take in literal pathnames. But of course it can't represent
>>> a path with a newline. So perhaps it makes sense to take quoted names by
>>> default, and allow literal NUL-separated input with "-z" if anybody
>>> wants it.
>>
>> This is worth thinking about the right way to describe the rules:
>>
>> 1. You don't _need_ quotes. They happen to come along for the ride in
>>   'git ls-tree' so it doesn't mess up shell scripts that iterate on
>>   those entries. At least, that's why I think they are quoted.
> 
> It's not just shell scripts. Without quoting, the syntax becomes
> ambiguous (e.g., imagine a file with a newline in it). So most Git
> output that shows a filename will quote it if necessary, unless
> NUL separators are being used.

Good to know.

>> 2. If you use quotes, the first layer of quotes will be removed.
> 
> I take this to mean that anything starting with a double-quote will have
> the outer layer removed, and backslash escapes inside expanded. And
> anything without a starting double quote (even if it has internal
> backslash escapes!) will be taken literally.

Hm. Perhaps you are right! The ls-tree output for the test example
is:

	deep
	folder1
	folder2
	"zbad\\dir"
	zdoes*exist

so the "zdoes*exist" value is not escaped. I believe the current
logic does something extra: consider supplying this input to
'git sparse-checkout set --stdin':

	deep
	folder1
	folder2
	"zbad\\dir"
	zdoes\*exist

then should we un-escape "\*" to "*"? Or is this not a valid input
because a backslash should have been quoted into C-style quotes?

The behavior in the current series allows this output that would
never be written by "git ls-tree".

> That would match how things like "update-index --index-info" work.
> 
> As far as implementation, I know you're trying to keep some of the
> escaping, but I think it might make more sense to do use
> unquote_c_style() to parse the input (see update-index's use for some
> prior art), and then re-quote as necessary to put things into the
> sparse-checkout file (I guess quoting more than just quote_c_style()
> would do, since you need to quote glob metacharacters like '*' and
> probably "!"). But as much as possible, I think you'd want literal
> strings inside the program, and just quoting/unquoting at the edges.

I was playing around with this, and I think that quote_c_style() is
necessary for the output, but we have a strange in-memory situation
for the other escaping: we both fill the hashsets with the un-escaped
data and fill the pattern list with the escaped patterns.

I'll add a commit with the quote_c_style() calls during the 'list'
subcommand.

>> How much of this needs to be documented explicitly, or how much should
>> we say "The input format matches what we would expect from 'git ls-tree
>> --name-only'"?
> 
> I think it's fine to say that, and maybe call attention to the quoting.
> Like:
> 
>   The input format matches the output of `git ls-tree --name-only`. This
>   includes interpreting pathnames that begin with a double quote (") as
>   C-style quoted strings.
> 
> Disappointingly, update-index does not seem to explain the rules
> anywhere. fast-import does cover it. Maybe it's something that ought to
> be hoisted out into gitcli(7) or similar (or maybe it has been and I
> just can't find it).

I'll start the process by using your recommended language. I noticed
also that the 'set' command doesn't actually document what happens
when in cone mode, so I will correct that, too.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH v2 00/12] Harden the sparse-checkout builtin
  2020-01-14 19:25 [PATCH 0/8] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                   ` (9 preceding siblings ...)
  2020-01-15 19:16 ` Junio C Hamano
@ 2020-01-24 21:19 ` Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 01/12] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
                     ` (12 more replies)
  10 siblings, 13 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-24 21:19 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee

This series is based on ds/sparse-list-in-cone-mode.

This series attempts to clean up some rough edges in the sparse-checkout
feature, especially around the cone mode.

Unfortunately, after the v2.25.0 release, we noticed an issue with the "git
clone --sparse" option when using a URL instead of a local path. This is
fixed and properly tested here.

Also, let's improve Git's response to these more complicated scenarios:

 1. Running "git sparse-checkout init" in a worktree would complain because
    the "info" dir doesn't exist.
 2. Tracked paths that include "*" and "\" in their filenames.
 3. If a user edits the sparse-checkout file to have non-cone pattern, such
    as "**" anywhere or "*" in the wrong place, then we should respond
    appropriately. That is: warn that the patterns are not cone-mode, then
    revert to the old logic.

Updates in V2:

 * Added C-style quoting to the output of "git sparse-checkout list" in cone
   mode.
 * Improved documentation.
 * Responded to most style feedback. Hopefully I didn't miss anything.
 * I was lingering on this a little to see if I could also fix the issue
   raised in [1], but I have not figured that one out, yet.

[1] 
https://lore.kernel.org/git/062301d5d0bc$c3e17760$4ba46620$@Frontier.com/

Thanks, -Stolee

Derrick Stolee (11):
  t1091: use check_files to reduce boilerplate
  t1091: improve here-docs
  sparse-checkout: create leading directories
  clone: fix --sparse option with URLs
  sparse-checkout: cone mode does not recognize "**"
  sparse-checkout: detect short patterns
  sparse-checkout: warn on incorrect '*' in patterns
  sparse-checkout: properly match escaped characters
  sparse-checkout: write escaped patterns in cone mode
  sparse-checkout: use C-style quotes in 'list' subcommand
  sparse-checkout: improve docs around 'set' in cone mode

Jeff King (1):
  sparse-checkout: fix documentation typo for core.sparseCheckoutCone

 Documentation/git-sparse-checkout.txt |  19 +-
 builtin/clone.c                       |   2 +-
 builtin/sparse-checkout.c             |  59 ++++-
 dir.c                                 |  68 +++++-
 dir.h                                 |   1 +
 t/t1091-sparse-checkout-builtin.sh    | 323 +++++++++++++++-----------
 6 files changed, 317 insertions(+), 155 deletions(-)


base-commit: 4fd683b6a35eabd23dd5183da7f654a1e1f00325
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-513%2Fderrickstolee%2Fsparse-harden-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-513/derrickstolee/sparse-harden-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/513

Range-diff vs v1:

  -:  ---------- >  1:  1cc825412f t1091: use check_files to reduce boilerplate
  1:  9f7791ae5e !  2:  b7a6ad145a t1091: use check_files to reduce boilerplate
     @@ -1,14 +1,11 @@
      Author: Derrick Stolee <dstolee@microsoft.com>
      
     -    t1091: use check_files to reduce boilerplate
     +    t1091: improve here-docs
      
     -    When testing the sparse-checkout feature, we need to compare the
     -    contents of the working-directory against some expected output.
     -    Using here-docs was useful in the beginning, but became repetetive
     -    as the test script grew.
     -
     -    Create a check_files helper to make the tests simpler and easier
     -    to extend. It also reduces instances of bad here-doc whitespace.
     +    t1091-sparse-checkout-builtin.sh uses here-docs to populate the
     +    expected contents of the sparse-checkout file. These do not use
     +    shell interpolation, so use "-\EOF" instead of "-EOF". Also use
     +    proper tabbing.
      
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
     @@ -16,20 +13,6 @@
       --- a/t/t1091-sparse-checkout-builtin.sh
       +++ b/t/t1091-sparse-checkout-builtin.sh
      @@
     - 	(cd "$1" && printf '%s\n' *)
     - }
     - 
     -+check_files() {
     -+	DIR=$1
     -+	printf "%s\n" $2 >expect &&
     -+	list_files $DIR >actual &&
     -+	test_cmp expect actual
     -+}
     -+
     - test_expect_success 'setup' '
     - 	git init repo &&
     - 	(
     -@@
       
       test_expect_success 'git sparse-checkout list (populated)' '
       	test_when_finished rm -f repo/.git/info/sparse-checkout &&
     @@ -59,11 +42,7 @@
       	EOF
       	test_cmp expect repo/.git/info/sparse-checkout &&
       	test_cmp_config -C repo true core.sparsecheckout &&
     --	list_files repo >dir  &&
     --	echo a >expect &&
     --	test_cmp expect dir
     -+	check_files repo a
     - '
     +@@
       
       test_expect_success 'git sparse-checkout list after init' '
       	git -C repo sparse-checkout list >actual &&
     @@ -90,16 +69,8 @@
      +	*folder*
       	EOF
       	test_cmp expect repo/.git/info/sparse-checkout &&
     --	list_files repo >dir  &&
     --	cat >expect <<-EOF &&
     --		a
     --		folder1
     --		folder2
     --	EOF
     --	test_cmp expect dir
     -+	check_files repo "a folder1 folder2"
     - '
     - 
     + 	check_files repo a folder1 folder2
     +@@
       test_expect_success 'clone --sparse' '
       	git clone --sparse repo clone &&
       	git -C clone sparse-checkout list >actual &&
     @@ -111,13 +82,7 @@
      +	!/*/
       	EOF
       	test_cmp expect actual &&
     --	list_files clone >dir &&
     --	echo a >expect &&
     --	test_cmp expect dir
     -+	check_files clone a
     - '
     - 
     - test_expect_success 'set enables config' '
     + 	check_files clone a
      @@
       
       test_expect_success 'set sparse-checkout using builtin' '
     @@ -133,15 +98,7 @@
       	EOF
       	git -C repo sparse-checkout list >actual &&
       	test_cmp expect actual &&
     - 	test_cmp expect repo/.git/info/sparse-checkout &&
     --	list_files repo >dir  &&
     --	cat >expect <<-EOF &&
     --		a
     --		folder1
     --		folder2
     --	EOF
     --	test_cmp expect dir
     -+	check_files repo "a folder1 folder2"
     +@@
       '
       
       test_expect_success 'set sparse-checkout using --stdin' '
     @@ -158,72 +115,10 @@
       	EOF
       	git -C repo sparse-checkout set --stdin <expect &&
       	git -C repo sparse-checkout list >actual &&
     - 	test_cmp expect actual &&
     - 	test_cmp expect repo/.git/info/sparse-checkout &&
     --	list_files repo >dir  &&
     --	cat >expect <<-EOF &&
     --		a
     --		folder1
     --		folder2
     --	EOF
     --	test_cmp expect dir
     -+	check_files repo "a folder1 folder2"
     - '
     - 
     - test_expect_success 'cone mode: match patterns' '
     -@@
     - 	git -C repo read-tree -mu HEAD 2>err &&
     - 	test_i18ngrep ! "disabling cone patterns" err &&
     - 	git -C repo reset --hard &&
     --	list_files repo >dir  &&
     --	cat >expect <<-EOF &&
     --		a
     --		folder1
     --		folder2
     --	EOF
     --	test_cmp expect dir
     -+	check_files repo "a folder1 folder2"
     - '
     - 
     - test_expect_success 'cone mode: warn on bad pattern' '
      @@
     - 	test_path_is_file repo/.git/info/sparse-checkout &&
     - 	git -C repo config --list >config &&
     - 	test_must_fail git config core.sparseCheckout &&
     --	list_files repo >dir &&
     --	cat >expect <<-EOF &&
     --		a
     --		deep
     --		folder1
     --		folder2
     --	EOF
     --	test_cmp expect dir
     -+	check_files repo "a deep folder1 folder2"
     - '
     - 
     - test_expect_success 'cone mode: init and set' '
     -@@
     - 	test_cmp expect dir &&
     - 	git -C repo sparse-checkout set deep/deeper1/deepest/ 2>err &&
     - 	test_must_be_empty err &&
     --	list_files repo >dir  &&
     --	cat >expect <<-EOF &&
     --		a
     --		deep
     --	EOF
     --	test_cmp expect dir &&
     --	list_files repo/deep >dir  &&
     --	cat >expect <<-EOF &&
     --		a
     --		deeper1
     --	EOF
     --	test_cmp expect dir &&
     --	list_files repo/deep/deeper1 >dir  &&
     --	cat >expect <<-EOF &&
     --		a
     --		deepest
     --	EOF
     --	test_cmp expect dir &&
     + 	check_files repo a deep &&
     + 	check_files repo/deep a deeper1 &&
     + 	check_files repo/deep/deeper1 a deepest &&
      -	cat >expect <<-EOF &&
      -		/*
      -		!/*/
     @@ -232,9 +127,6 @@
      -		/deep/deeper1/
      -		!/deep/deeper1/*/
      -		/deep/deeper1/deepest/
     -+	check_files repo "a deep" &&
     -+	check_files repo/deep "a deeper1" &&
     -+	check_files repo/deep/deeper1 "a deepest" &&
      +	cat >expect <<-\EOF &&
      +	/*
      +	!/*/
     @@ -253,14 +145,7 @@
      +	folder2
       	EOF
       	test_must_be_empty err &&
     --	cat >expect <<-EOF &&
     --		a
     --		folder1
     --		folder2
     --	EOF
     --	list_files repo >dir &&
     --	test_cmp expect dir
     -+	check_files repo "a folder1 folder2"
     + 	check_files repo a folder1 folder2
       '
       
       test_expect_success 'cone mode: list' '
     @@ -288,21 +173,6 @@
       	EOF
       	test_cmp repo/.git/info/sparse-checkout expect
       '
     -@@
     - 	test_must_fail git -C repo sparse-checkout set deep/deeper1 2>err &&
     - 	test_i18ngrep "cannot set sparse-checkout patterns" err &&
     - 	test_cmp repo/.git/info/sparse-checkout expect &&
     --	list_files repo/deep >dir &&
     --	cat >expect <<-EOF &&
     --		a
     --		deeper1
     --		deeper2
     --	EOF
     --	test_cmp dir expect
     -+	check_files repo/deep "a deeper1 deeper2"
     - '
     - 
     - test_expect_success 'revert to old sparse-checkout on empty update' '
      @@
       test_expect_success 'cone mode: set with core.ignoreCase=true' '
       	git -C repo sparse-checkout init --cone &&
     @@ -317,37 +187,4 @@
      +	/folder1/
       	EOF
       	test_cmp expect repo/.git/info/sparse-checkout &&
     --	list_files repo >dir &&
     --	cat >expect <<-EOF &&
     --		a
     --		folder1
     --	EOF
     --	test_cmp expect dir
     -+	check_files repo "a folder1"
     - '
     - 
     - test_expect_success 'interaction with submodules' '
     -@@
     - 		git sparse-checkout init --cone &&
     - 		git sparse-checkout set folder1
     - 	) &&
     --	list_files super >dir &&
     --	cat >expect <<-\EOF &&
     --		a
     --		folder1
     --		modules
     --	EOF
     --	test_cmp expect dir &&
     --	list_files super/modules/child >dir &&
     --	cat >expect <<-\EOF &&
     --		a
     --		deep
     --		folder1
     --		folder2
     --	EOF
     --	test_cmp expect dir
     -+	check_files super "a folder1 modules" &&
     -+	check_files super/modules/child "a deep folder1 folder2"
     - '
     - 
     - test_done
     + 	check_files repo a folder1
  2:  53a266f9aa !  3:  5497ad8778 sparse-checkout: create leading directories
     @@ -34,7 +34,7 @@
       --- a/t/t1091-sparse-checkout-builtin.sh
       +++ b/t/t1091-sparse-checkout-builtin.sh
      @@
     - 	check_files super/modules/child "a deep folder1 folder2"
     + 	check_files super/modules/child a deep folder1 folder2
       '
       
      +test_expect_success 'different sparse-checkouts with worktrees' '
     @@ -43,8 +43,8 @@
      +	git -C worktree sparse-checkout init --cone &&
      +	git -C repo sparse-checkout set folder1 &&
      +	git -C worktree sparse-checkout set deep/deeper1 &&
     -+	check_files repo "a folder1" &&
     -+	check_files worktree "a deep"
     ++	check_files repo a folder1 &&
     ++	check_files worktree a deep
      +'
      +
       test_done
  3:  3ef8e021a5 !  4:  4991a51f6d clone: fix --sparse option with URLs
     @@ -22,6 +22,7 @@
          Update that target directory to evaluate this correctly. I have also
          manually tested that https:// URLs are handled correctly as well.
      
     +    Acked-by: Taylor Blau <me@ttaylorr.com>
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       diff --git a/builtin/clone.c b/builtin/clone.c
  -:  ---------- >  5:  ae78c3069b sparse-checkout: fix documentation typo for core.sparseCheckoutCone
  4:  dfa7e20444 !  6:  2ad4d3e467 sparse-checkout: cone mode does not recognize "**"
     @@ -30,7 +30,6 @@
       
      +	if (strstr(given->pattern, "**")) {
      +		/* Not a cone pattern. */
     -+		pl->use_cone_patterns = 0;
      +		warning(_("unrecognized pattern: '%s'"), given->pattern);
      +		goto clear_hashmaps;
      +	}
     @@ -38,12 +37,17 @@
       	if (given->patternlen > 2 &&
       	    !strcmp(given->pattern + given->patternlen - 2, "/*")) {
       		if (!(given->flags & PATTERN_FLAG_NEGATIVE)) {
     + 			/* Not a cone pattern. */
     +-			pl->use_cone_patterns = 0;
     + 			warning(_("unrecognized pattern: '%s'"), given->pattern);
     + 			goto clear_hashmaps;
     + 		}
      
       diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
       --- a/t/t1091-sparse-checkout-builtin.sh
       +++ b/t/t1091-sparse-checkout-builtin.sh
      @@
     - 	check_files worktree "a deep"
     + 	check_files worktree a deep
       '
       
      +check_read_tree_errors () {
     @@ -57,7 +61,7 @@
      +	else
      +		test_i18ngrep "$ERRORS" err
      +	fi &&
     -+	check_files $REPO "$FILES"
     ++	check_files $REPO $FILES
      +}
      +
      +test_expect_success 'pattern-checks: /A/**' '
  5:  9be49908fd !  7:  aace064510 sparse-checkout: detect short patterns
     @@ -21,8 +21,8 @@
      +	if (given->patternlen <= 2 ||
      +	    strstr(given->pattern, "**")) {
       		/* Not a cone pattern. */
     - 		pl->use_cone_patterns = 0;
       		warning(_("unrecognized pattern: '%s'"), given->pattern);
     + 		goto clear_hashmaps;
      
       diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
       --- a/t/t1091-sparse-checkout-builtin.sh
  6:  77a514f50b !  8:  d2a510a3bb sparse-checkout: warn on incorrect '*' in patterns
     @@ -34,8 +34,7 @@
      +	    *given->pattern == '*' ||
       	    strstr(given->pattern, "**")) {
       		/* Not a cone pattern. */
     - 		pl->use_cone_patterns = 0;
     -@@
     + 		warning(_("unrecognized pattern: '%s'"), given->pattern);
       		goto clear_hashmaps;
       	}
       
     @@ -57,7 +56,6 @@
      +			goto increment;
      +
      +		/* Not a cone pattern. */
     -+		pl->use_cone_patterns = 0;
      +		warning(_("unrecognized pattern: '%s'"), given->pattern);
      +		goto clear_hashmaps;
      +
  7:  09dbe1f902 !  9:  65c53d7526 sparse-checkout: properly match escaped characters
     @@ -9,7 +9,7 @@
          Create dup_and_filter_pattern() method to convert a pattern by
          removing escape characters and dropping an optional "/*" at the end.
          This method is available in dir.h as we will use it in
     -    builtin/sparse-chekcout.c in a later change.
     +    builtin/sparse-checkout.c in a later change.
      
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
  8:  79b6e9a565 = 10:  c27a17a2fc sparse-checkout: write escaped patterns in cone mode
  -:  ---------- > 11:  526d5becbc sparse-checkout: use C-style quotes in 'list' subcommand
  -:  ---------- > 12:  1b5858adee sparse-checkout: improve docs around 'set' in cone mode

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH v2 01/12] t1091: use check_files to reduce boilerplate
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
@ 2020-01-24 21:19   ` Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 02/12] t1091: improve here-docs Derrick Stolee via GitGitGadget
                     ` (11 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-24 21:19 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When testing the sparse-checkout feature, we need to compare the
contents of the working-directory against some expected output.
Using here-docs was useful in the beginning, but became repetetive
as the test script grew.

Create a check_files helper to make the tests simpler and easier
to extend. It also reduces instances of bad here-doc whitespace.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1091-sparse-checkout-builtin.sh | 117 ++++++-----------------------
 1 file changed, 22 insertions(+), 95 deletions(-)

diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index ff7f8f7a1f..e058a20ad6 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -12,6 +12,13 @@ list_files() {
 	(cd "$1" && printf '%s\n' *)
 }
 
+check_files() {
+	list_files "$1" >actual &&
+	shift &&
+	printf "%s\n" $@ >expect &&
+	test_cmp expect actual
+}
+
 test_expect_success 'setup' '
 	git init repo &&
 	(
@@ -58,9 +65,7 @@ test_expect_success 'git sparse-checkout init' '
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
 	test_cmp_config -C repo true core.sparsecheckout &&
-	list_files repo >dir  &&
-	echo a >expect &&
-	test_cmp expect dir
+	check_files repo a
 '
 
 test_expect_success 'git sparse-checkout list after init' '
@@ -81,13 +86,7 @@ test_expect_success 'init with existing sparse-checkout' '
 		*folder*
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo a folder1 folder2
 '
 
 test_expect_success 'clone --sparse' '
@@ -98,9 +97,7 @@ test_expect_success 'clone --sparse' '
 		!/*/
 	EOF
 	test_cmp expect actual &&
-	list_files clone >dir &&
-	echo a >expect &&
-	test_cmp expect dir
+	check_files clone a
 '
 
 test_expect_success 'set enables config' '
@@ -127,13 +124,7 @@ test_expect_success 'set sparse-checkout using builtin' '
 	git -C repo sparse-checkout list >actual &&
 	test_cmp expect actual &&
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo a folder1 folder2
 '
 
 test_expect_success 'set sparse-checkout using --stdin' '
@@ -147,13 +138,7 @@ test_expect_success 'set sparse-checkout using --stdin' '
 	git -C repo sparse-checkout list >actual &&
 	test_cmp expect actual &&
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo "a folder1 folder2"
 '
 
 test_expect_success 'cone mode: match patterns' '
@@ -162,13 +147,7 @@ test_expect_success 'cone mode: match patterns' '
 	git -C repo read-tree -mu HEAD 2>err &&
 	test_i18ngrep ! "disabling cone patterns" err &&
 	git -C repo reset --hard &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo a folder1 folder2
 '
 
 test_expect_success 'cone mode: warn on bad pattern' '
@@ -185,14 +164,7 @@ test_expect_success 'sparse-checkout disable' '
 	test_path_is_file repo/.git/info/sparse-checkout &&
 	git -C repo config --list >config &&
 	test_must_fail git config core.sparseCheckout &&
-	list_files repo >dir &&
-	cat >expect <<-EOF &&
-		a
-		deep
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo a deep folder1 folder2
 '
 
 test_expect_success 'cone mode: init and set' '
@@ -204,24 +176,9 @@ test_expect_success 'cone mode: init and set' '
 	test_cmp expect dir &&
 	git -C repo sparse-checkout set deep/deeper1/deepest/ 2>err &&
 	test_must_be_empty err &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		deep
-	EOF
-	test_cmp expect dir &&
-	list_files repo/deep >dir  &&
-	cat >expect <<-EOF &&
-		a
-		deeper1
-	EOF
-	test_cmp expect dir &&
-	list_files repo/deep/deeper1 >dir  &&
-	cat >expect <<-EOF &&
-		a
-		deepest
-	EOF
-	test_cmp expect dir &&
+	check_files repo a deep &&
+	check_files repo/deep a deeper1 &&
+	check_files repo/deep/deeper1 a deepest &&
 	cat >expect <<-EOF &&
 		/*
 		!/*/
@@ -237,13 +194,7 @@ test_expect_success 'cone mode: init and set' '
 		folder2
 	EOF
 	test_must_be_empty err &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	list_files repo >dir &&
-	test_cmp expect dir
+	check_files repo a folder1 folder2
 '
 
 test_expect_success 'cone mode: list' '
@@ -275,13 +226,7 @@ test_expect_success 'revert to old sparse-checkout on bad update' '
 	test_must_fail git -C repo sparse-checkout set deep/deeper1 2>err &&
 	test_i18ngrep "cannot set sparse-checkout patterns" err &&
 	test_cmp repo/.git/info/sparse-checkout expect &&
-	list_files repo/deep >dir &&
-	cat >expect <<-EOF &&
-		a
-		deeper1
-		deeper2
-	EOF
-	test_cmp dir expect
+	check_files repo/deep a deeper1 deeper2
 '
 
 test_expect_success 'revert to old sparse-checkout on empty update' '
@@ -332,12 +277,7 @@ test_expect_success 'cone mode: set with core.ignoreCase=true' '
 		/folder1/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-	EOF
-	test_cmp expect dir
+	check_files repo a folder1
 '
 
 test_expect_success 'interaction with submodules' '
@@ -351,21 +291,8 @@ test_expect_success 'interaction with submodules' '
 		git sparse-checkout init --cone &&
 		git sparse-checkout set folder1
 	) &&
-	list_files super >dir &&
-	cat >expect <<-\EOF &&
-		a
-		folder1
-		modules
-	EOF
-	test_cmp expect dir &&
-	list_files super/modules/child >dir &&
-	cat >expect <<-\EOF &&
-		a
-		deep
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files super a folder1 modules &&
+	check_files super/modules/child a deep folder1 folder2
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v2 02/12] t1091: improve here-docs
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 01/12] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
@ 2020-01-24 21:19   ` Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 03/12] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
                     ` (10 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-24 21:19 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

t1091-sparse-checkout-builtin.sh uses here-docs to populate the
expected contents of the sparse-checkout file. These do not use
shell interpolation, so use "-\EOF" instead of "-EOF". Also use
proper tabbing.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1091-sparse-checkout-builtin.sh | 98 +++++++++++++++---------------
 1 file changed, 49 insertions(+), 49 deletions(-)

diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index e058a20ad6..e28e1c797f 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -46,11 +46,11 @@ test_expect_success 'git sparse-checkout list (empty)' '
 
 test_expect_success 'git sparse-checkout list (populated)' '
 	test_when_finished rm -f repo/.git/info/sparse-checkout &&
-	cat >repo/.git/info/sparse-checkout <<-EOF &&
-		/folder1/*
-		/deep/
-		**/a
-		!*bin*
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/folder1/*
+	/deep/
+	**/a
+	!*bin*
 	EOF
 	cp repo/.git/info/sparse-checkout expect &&
 	git -C repo sparse-checkout list >list &&
@@ -59,9 +59,9 @@ test_expect_success 'git sparse-checkout list (populated)' '
 
 test_expect_success 'git sparse-checkout init' '
 	git -C repo sparse-checkout init &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
 	test_cmp_config -C repo true core.sparsecheckout &&
@@ -70,9 +70,9 @@ test_expect_success 'git sparse-checkout init' '
 
 test_expect_success 'git sparse-checkout list after init' '
 	git -C repo sparse-checkout list >actual &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
 	EOF
 	test_cmp expect actual
 '
@@ -80,10 +80,10 @@ test_expect_success 'git sparse-checkout list after init' '
 test_expect_success 'init with existing sparse-checkout' '
 	echo "*folder*" >> repo/.git/info/sparse-checkout &&
 	git -C repo sparse-checkout init &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		*folder*
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	*folder*
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
 	check_files repo a folder1 folder2
@@ -92,9 +92,9 @@ test_expect_success 'init with existing sparse-checkout' '
 test_expect_success 'clone --sparse' '
 	git clone --sparse repo clone &&
 	git -C clone sparse-checkout list >actual &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
 	EOF
 	test_cmp expect actual &&
 	check_files clone a
@@ -116,10 +116,10 @@ test_expect_success 'set enables config' '
 
 test_expect_success 'set sparse-checkout using builtin' '
 	git -C repo sparse-checkout set "/*" "!/*/" "*folder*" &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		*folder*
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	*folder*
 	EOF
 	git -C repo sparse-checkout list >actual &&
 	test_cmp expect actual &&
@@ -128,11 +128,11 @@ test_expect_success 'set sparse-checkout using builtin' '
 '
 
 test_expect_success 'set sparse-checkout using --stdin' '
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/folder1/
-		/folder2/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/folder1/
+	/folder2/
 	EOF
 	git -C repo sparse-checkout set --stdin <expect &&
 	git -C repo sparse-checkout list >actual &&
@@ -179,28 +179,28 @@ test_expect_success 'cone mode: init and set' '
 	check_files repo a deep &&
 	check_files repo/deep a deeper1 &&
 	check_files repo/deep/deeper1 a deepest &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/deep/
-		!/deep/*/
-		/deep/deeper1/
-		!/deep/deeper1/*/
-		/deep/deeper1/deepest/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/deep/
+	!/deep/*/
+	/deep/deeper1/
+	!/deep/deeper1/*/
+	/deep/deeper1/deepest/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	git -C repo sparse-checkout set --stdin 2>err <<-EOF &&
-		folder1
-		folder2
+	git -C repo sparse-checkout set --stdin 2>err <<-\EOF &&
+	folder1
+	folder2
 	EOF
 	test_must_be_empty err &&
 	check_files repo a folder1 folder2
 '
 
 test_expect_success 'cone mode: list' '
-	cat >expect <<-EOF &&
-		folder1
-		folder2
+	cat >expect <<-\EOF &&
+	folder1
+	folder2
 	EOF
 	git -C repo sparse-checkout set --stdin <expect &&
 	git -C repo sparse-checkout list >actual 2>err &&
@@ -211,10 +211,10 @@ test_expect_success 'cone mode: list' '
 test_expect_success 'cone mode: set with nested folders' '
 	git -C repo sparse-checkout set deep deep/deeper1/deepest 2>err &&
 	test_line_count = 0 err &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/deep/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/deep/
 	EOF
 	test_cmp repo/.git/info/sparse-checkout expect
 '
@@ -271,10 +271,10 @@ test_expect_success 'sparse-checkout (init|set|disable) fails with dirty status'
 test_expect_success 'cone mode: set with core.ignoreCase=true' '
 	git -C repo sparse-checkout init --cone &&
 	git -C repo -c core.ignoreCase=true sparse-checkout set folder1 &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/folder1/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/folder1/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
 	check_files repo a folder1
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v2 03/12] sparse-checkout: create leading directories
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 01/12] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 02/12] t1091: improve here-docs Derrick Stolee via GitGitGadget
@ 2020-01-24 21:19   ` Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 04/12] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
                     ` (9 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-24 21:19 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'git init' command creates the ".git/info" directory and fills it
with some default files. However, 'git worktree add' does not create
the info directory for that worktree. This causes a problem when running
"git sparse-checkout init" inside a worktree. While care was taken to
allow the sparse-checkout config to be specific to a worktree, this
initialization was untested.

Safely create the leading directories for the sparse-checkout file. This
is the safest thing to do even without worktrees, as a user could delete
their ".git/info" directory and expect Git to recover safely.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/sparse-checkout.c          |  4 ++++
 t/t1091-sparse-checkout-builtin.sh | 10 ++++++++++
 2 files changed, 14 insertions(+)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index b3bed891cb..3cee8ab46e 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -199,6 +199,10 @@ static int write_patterns_and_update(struct pattern_list *pl)
 	int result;
 
 	sparse_filename = get_sparse_checkout_filename();
+
+	if (safe_create_leading_directories(sparse_filename))
+		die(_("failed to create directory for sparse-checkout file"));
+
 	fd = hold_lock_file_for_update(&lk, sparse_filename,
 				      LOCK_DIE_ON_ERROR);
 
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index e28e1c797f..43d1f7520c 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -295,4 +295,14 @@ test_expect_success 'interaction with submodules' '
 	check_files super/modules/child a deep folder1 folder2
 '
 
+test_expect_success 'different sparse-checkouts with worktrees' '
+	git -C repo worktree add --detach ../worktree &&
+	check_files worktree "a deep folder1 folder2" &&
+	git -C worktree sparse-checkout init --cone &&
+	git -C repo sparse-checkout set folder1 &&
+	git -C worktree sparse-checkout set deep/deeper1 &&
+	check_files repo a folder1 &&
+	check_files worktree a deep
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v2 04/12] clone: fix --sparse option with URLs
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
                     ` (2 preceding siblings ...)
  2020-01-24 21:19   ` [PATCH v2 03/12] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
@ 2020-01-24 21:19   ` Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 05/12] sparse-checkout: fix documentation typo for core.sparseCheckoutCone Jeff King via GitGitGadget
                     ` (8 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-24 21:19 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The --sparse option was added to the clone builtin in d89f09c (clone:
add --sparse mode, 2019-11-21) and was tested with a local path clone
in t1091-sparse-checkout-builtin.sh. However, due to a difference in
how local paths are handled versus URLs, this mechanism does not work
with URLs.

Modify the test to use a "file://" URL, which would output this error
before the code change:

  Cloning into 'clone'...
  fatal: cannot change to 'file://.../repo': No such file or directory
  error: failed to initialize sparse-checkout

These errors are due to using a "-C <path>" option to call 'git -C
<path> sparse-checkout init' but the URL is being given instead of
the target directory.

Update that target directory to evaluate this correctly. I have also
manually tested that https:// URLs are handled correctly as well.

Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/clone.c                    | 2 +-
 t/t1091-sparse-checkout-builtin.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 4348d962c9..2caefc44fb 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1130,7 +1130,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (option_required_reference.nr || option_optional_reference.nr)
 		setup_reference();
 
-	if (option_sparse_checkout && git_sparse_checkout_init(repo))
+	if (option_sparse_checkout && git_sparse_checkout_init(dir))
 		return 1;
 
 	remote = remote_get(option_origin);
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 43d1f7520c..cf4a595c86 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -90,7 +90,7 @@ test_expect_success 'init with existing sparse-checkout' '
 '
 
 test_expect_success 'clone --sparse' '
-	git clone --sparse repo clone &&
+	git clone --sparse "file://$(pwd)/repo" clone &&
 	git -C clone sparse-checkout list >actual &&
 	cat >expect <<-\EOF &&
 	/*
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v2 05/12] sparse-checkout: fix documentation typo for core.sparseCheckoutCone
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
                     ` (3 preceding siblings ...)
  2020-01-24 21:19   ` [PATCH v2 04/12] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
@ 2020-01-24 21:19   ` Jeff King via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 06/12] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
                     ` (7 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Jeff King via GitGitGadget @ 2020-01-24 21:19 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Jeff King

From: Jeff King <peff@peff.net>

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-sparse-checkout.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
index 3b341cf0fc..4834fb434d 100644
--- a/Documentation/git-sparse-checkout.txt
+++ b/Documentation/git-sparse-checkout.txt
@@ -106,7 +106,7 @@ The full pattern set allows for arbitrary pattern matches and complicated
 inclusion/exclusion rules. These can result in O(N*M) pattern matches when
 updating the index, where N is the number of patterns and M is the number
 of paths in the index. To combat this performance issue, a more restricted
-pattern set is allowed when `core.spareCheckoutCone` is enabled.
+pattern set is allowed when `core.sparseCheckoutCone` is enabled.
 
 The accepted patterns in the cone pattern set are:
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v2 06/12] sparse-checkout: cone mode does not recognize "**"
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
                     ` (4 preceding siblings ...)
  2020-01-24 21:19   ` [PATCH v2 05/12] sparse-checkout: fix documentation typo for core.sparseCheckoutCone Jeff King via GitGitGadget
@ 2020-01-24 21:19   ` Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 07/12] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
                     ` (6 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-24 21:19 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When core.sparseCheckoutCone is enabled, the 'git sparse-checkout set'
command creates a restricted set of possible patterns that are used
by a custom algorithm to quickly match those patterns.

If a user manually edits the sparse-checkout file, then they could
create patterns that do not match these expectations. The cone-mode
matching algorithm can return incorrect results. The solution is to
detect these incorrect patterns, warn that we do not recognize them,
and revert to the standard algorithm.

Check each pattern for the "**" substring, and revert to the old
logic if seen. While technically a "/<dir>/**" pattern matches
the meaning of "/<dir>/", it is not one that would be written by
the sparse-checkout builtin in cone mode. Attempting to accept that
pattern change complicates the logic and instead we punt and do
not accept any instance of "**".

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              |  7 +++++-
 t/t1091-sparse-checkout-builtin.sh | 34 ++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 22d08e61c2..40fed73a94 100644
--- a/dir.c
+++ b/dir.c
@@ -651,11 +651,16 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 		return;
 	}
 
+	if (strstr(given->pattern, "**")) {
+		/* Not a cone pattern. */
+		warning(_("unrecognized pattern: '%s'"), given->pattern);
+		goto clear_hashmaps;
+	}
+
 	if (given->patternlen > 2 &&
 	    !strcmp(given->pattern + given->patternlen - 2, "/*")) {
 		if (!(given->flags & PATTERN_FLAG_NEGATIVE)) {
 			/* Not a cone pattern. */
-			pl->use_cone_patterns = 0;
 			warning(_("unrecognized pattern: '%s'"), given->pattern);
 			goto clear_hashmaps;
 		}
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index cf4a595c86..e2e45dc7fd 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -305,4 +305,38 @@ test_expect_success 'different sparse-checkouts with worktrees' '
 	check_files worktree a deep
 '
 
+check_read_tree_errors () {
+	REPO=$1
+	FILES=$2
+	ERRORS=$3
+	git -C $REPO read-tree -mu HEAD 2>err &&
+	if test -z "$ERRORS"
+	then
+		test_must_be_empty err
+	else
+		test_i18ngrep "$ERRORS" err
+	fi &&
+	check_files $REPO $FILES
+}
+
+test_expect_success 'pattern-checks: /A/**' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/folder1/**
+	EOF
+	check_read_tree_errors repo "a folder1" "disabling cone pattern matching"
+'
+
+test_expect_success 'pattern-checks: /A/**/B/' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/deep/**/deepest
+	EOF
+	check_read_tree_errors repo "a deep" "disabling cone pattern matching" &&
+	check_files repo/deep "deeper1" &&
+	check_files repo/deep/deeper1 "deepest"
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v2 07/12] sparse-checkout: detect short patterns
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
                     ` (5 preceding siblings ...)
  2020-01-24 21:19   ` [PATCH v2 06/12] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
@ 2020-01-24 21:19   ` Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 08/12] sparse-checkout: warn on incorrect '*' in patterns Derrick Stolee via GitGitGadget
                     ` (5 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-24 21:19 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In cone mode, the shortest pattern the sparse-checkout command will
write into the sparse-checkout file is "/*". This is handled carefully
in add_pattern_to_hashsets(), so warn if any other pattern is this
short. This will assist future pattern checks by allowing us to assume
there are at least three characters in the pattern.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              | 3 ++-
 t/t1091-sparse-checkout-builtin.sh | 9 +++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 40fed73a94..c2e585607e 100644
--- a/dir.c
+++ b/dir.c
@@ -651,7 +651,8 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 		return;
 	}
 
-	if (strstr(given->pattern, "**")) {
+	if (given->patternlen <= 2 ||
+	    strstr(given->pattern, "**")) {
 		/* Not a cone pattern. */
 		warning(_("unrecognized pattern: '%s'"), given->pattern);
 		goto clear_hashmaps;
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index e2e45dc7fd..2e57534799 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -339,4 +339,13 @@ test_expect_success 'pattern-checks: /A/**/B/' '
 	check_files repo/deep/deeper1 "deepest"
 '
 
+test_expect_success 'pattern-checks: too short' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/a
+	EOF
+	check_read_tree_errors repo "a" "disabling cone pattern matching"
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v2 08/12] sparse-checkout: warn on incorrect '*' in patterns
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
                     ` (6 preceding siblings ...)
  2020-01-24 21:19   ` [PATCH v2 07/12] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
@ 2020-01-24 21:19   ` Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 09/12] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
                     ` (4 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-24 21:19 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In cone mode, the sparse-checkout commmand will write patterns that
allow faster pattern matching. This matching only works if the patterns
in the sparse-checkout file are those written by that command. Users
can edit the sparse-checkout file and create patterns that cause the
cone mode matching to fail.

The cone mode patterns may end in "/*" but otherwise an un-escaped
asterisk is invalid. Add checks to disable cone mode when seeing these
values.

A later change will properly handle escaped asterisks.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              | 29 +++++++++++++++++++++++++++++
 t/t1091-sparse-checkout-builtin.sh | 27 +++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

diff --git a/dir.c b/dir.c
index c2e585607e..7cb78c8b87 100644
--- a/dir.c
+++ b/dir.c
@@ -635,6 +635,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 	struct pattern_entry *translated;
 	char *truncated;
 	char *data = NULL;
+	const char *prev, *cur, *next;
 
 	if (!pl->use_cone_patterns)
 		return;
@@ -652,12 +653,40 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 	}
 
 	if (given->patternlen <= 2 ||
+	    *given->pattern == '*' ||
 	    strstr(given->pattern, "**")) {
 		/* Not a cone pattern. */
 		warning(_("unrecognized pattern: '%s'"), given->pattern);
 		goto clear_hashmaps;
 	}
 
+	prev = given->pattern;
+	cur = given->pattern + 1;
+	next = given->pattern + 2;
+
+	while (*cur) {
+		/* We care about *cur == '*' */
+		if (*cur != '*')
+			goto increment;
+
+		/* But only if *prev != '\\' */
+		if (*prev == '\\')
+			goto increment;
+
+		/* But a trailing '/' then '*' is fine */
+		if (*prev == '/' && *next == 0)
+			goto increment;
+
+		/* Not a cone pattern. */
+		warning(_("unrecognized pattern: '%s'"), given->pattern);
+		goto clear_hashmaps;
+
+	increment:
+		prev++;
+		cur++;
+		next++;
+	}
+
 	if (given->patternlen > 2 &&
 	    !strcmp(given->pattern + given->patternlen - 2, "/*")) {
 		if (!(given->flags & PATTERN_FLAG_NEGATIVE)) {
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 2e57534799..470900f6f4 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -348,4 +348,31 @@ test_expect_success 'pattern-checks: too short' '
 	check_read_tree_errors repo "a" "disabling cone pattern matching"
 '
 
+test_expect_success 'pattern-checks: trailing "*"' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/a*
+	EOF
+	check_read_tree_errors repo "a" "disabling cone pattern matching"
+'
+
+test_expect_success 'pattern-checks: starting "*"' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	*eep/
+	EOF
+	check_read_tree_errors repo "a deep" "disabling cone pattern matching"
+'
+
+test_expect_success 'pattern-checks: escaped "*"' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/does\*not\*exist/
+	EOF
+	check_read_tree_errors repo "a" ""
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v2 09/12] sparse-checkout: properly match escaped characters
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
                     ` (7 preceding siblings ...)
  2020-01-24 21:19   ` [PATCH v2 08/12] sparse-checkout: warn on incorrect '*' in patterns Derrick Stolee via GitGitGadget
@ 2020-01-24 21:19   ` Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 10/12] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
                     ` (3 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-24 21:19 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In cone mode, the sparse-checkout feature uses hashset containment
queries to match paths. Make this algorithm respect escaped asterisk
(*) and backslash (\) characters.

Create dup_and_filter_pattern() method to convert a pattern by
removing escape characters and dropping an optional "/*" at the end.
This method is available in dir.h as we will use it in
builtin/sparse-checkout.c in a later change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              | 31 +++++++++++++++++++++++++++---
 dir.h                              |  1 +
 t/t1091-sparse-checkout-builtin.sh | 22 +++++++++++++++++----
 3 files changed, 47 insertions(+), 7 deletions(-)

diff --git a/dir.c b/dir.c
index 7cb78c8b87..6d8abc09c3 100644
--- a/dir.c
+++ b/dir.c
@@ -630,6 +630,32 @@ int pl_hashmap_cmp(const void *unused_cmp_data,
 	return strncmp(ee1->pattern, ee2->pattern, min_len);
 }
 
+char *dup_and_filter_pattern(const char *pattern)
+{
+	char *set, *read;
+	char *result = xstrdup(pattern);
+
+	set = result;
+	read = result;
+
+	while (*read) {
+		/* skip escape characters (once) */
+		if (*read == '\\')
+			read++;
+
+		*set = *read;
+
+		set++;
+		read++;
+	}
+	*set = 0;
+
+	if (*(read - 2) == '/' && *(read - 1) == '*')
+		*(read - 2) = 0;
+
+	return result;
+}
+
 static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern *given)
 {
 	struct pattern_entry *translated;
@@ -695,8 +721,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 			goto clear_hashmaps;
 		}
 
-		truncated = xstrdup(given->pattern);
-		truncated[given->patternlen - 2] = 0;
+		truncated = dup_and_filter_pattern(given->pattern);
 
 		translated = xmalloc(sizeof(struct pattern_entry));
 		translated->pattern = truncated;
@@ -730,7 +755,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 
 	translated = xmalloc(sizeof(struct pattern_entry));
 
-	translated->pattern = xstrdup(given->pattern);
+	translated->pattern = dup_and_filter_pattern(given->pattern);
 	translated->patternlen = given->patternlen;
 	hashmap_entry_init(&translated->ent,
 			   ignore_case ?
diff --git a/dir.h b/dir.h
index 77a43dbf89..6dcd9d33e7 100644
--- a/dir.h
+++ b/dir.h
@@ -304,6 +304,7 @@ int pl_hashmap_cmp(const void *unused_cmp_data,
 		   const struct hashmap_entry *a,
 		   const struct hashmap_entry *b,
 		   const void *key);
+char *dup_and_filter_pattern(const char *pattern);
 int hashmap_contains_parent(struct hashmap *map,
 			    const char *path,
 			    struct strbuf *buffer);
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 470900f6f4..0a21a5e15d 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -366,13 +366,27 @@ test_expect_success 'pattern-checks: starting "*"' '
 	check_read_tree_errors repo "a deep" "disabling cone pattern matching"
 '
 
-test_expect_success 'pattern-checks: escaped "*"' '
-	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
+	git clone repo escaped &&
+	TREEOID=$(git -C escaped rev-parse HEAD:folder1) &&
+	NEWTREE=$(git -C escaped mktree <<-EOF
+	$(git -C escaped ls-tree HEAD)
+	040000 tree $TREEOID	zbad\\dir
+	040000 tree $TREEOID	zdoes*exist
+	EOF
+	) &&
+	COMMIT=$(git -C escaped commit-tree $NEWTREE -p HEAD) &&
+	git -C escaped reset --hard $COMMIT &&
+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
+	git -C escaped sparse-checkout init --cone &&
+	cat >escaped/.git/info/sparse-checkout <<-\EOF &&
 	/*
 	!/*/
-	/does\*not\*exist/
+	/zbad\\dir/
+	/zdoes\*not\*exist/
+	/zdoes\*exist/
 	EOF
-	check_read_tree_errors repo "a" ""
+	check_read_tree_errors escaped "a zbad\\dir zdoes*exist"
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v2 10/12] sparse-checkout: write escaped patterns in cone mode
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
                     ` (8 preceding siblings ...)
  2020-01-24 21:19   ` [PATCH v2 09/12] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
@ 2020-01-24 21:19   ` Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 11/12] sparse-checkout: use C-style quotes in 'list' subcommand Derrick Stolee via GitGitGadget
                     ` (2 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-24 21:19 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

If a user somehow creates a directory with an asterisk (*) or backslash
(\), then the "git sparse-checkout set" command will struggle to provide
the correct pattern in the sparse-checkout file. When not in cone mode,
the provided pattern is written directly into the sparse-checkout file.
However, in cone mode we expect a list of paths to directories and then
we convert those into patterns.

Even more specifically, the goal is to always allow the following from
the root of a repo:

  git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin

The ls-tree command provides directory names with an unescaped asterisk.
It also quotes the directories that contain an escaped backslash. We
must remove these quotes, then keep the escaped backslashes.

However, there is some care needed for the timing of these escapes. The
in-memory pattern list is used to update the working directory before
writing the patterns to disk. Thus, we need the command to have the
unescaped names in the hashsets for the cone comparisons, then escape
the patterns later.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/sparse-checkout.c          | 48 ++++++++++++++++++++++++++++--
 t/t1091-sparse-checkout-builtin.sh | 21 +++++++++++--
 2 files changed, 64 insertions(+), 5 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 3cee8ab46e..61d2c30036 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -140,6 +140,22 @@ static int update_working_directory(struct pattern_list *pl)
 	return result;
 }
 
+static char *escaped_pattern(char *pattern)
+{
+	char *p = pattern;
+	struct strbuf final = STRBUF_INIT;
+
+	while (*p) {
+		if (*p == '*' || *p == '\\')
+			strbuf_addch(&final, '\\');
+
+		strbuf_addch(&final, *p);
+		p++;
+	}
+
+	return strbuf_detach(&final, NULL);
+}
+
 static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 {
 	int i;
@@ -164,10 +180,11 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 	fprintf(fp, "/*\n!/*/\n");
 
 	for (i = 0; i < sl.nr; i++) {
-		char *pattern = sl.items[i].string;
+		char *pattern = escaped_pattern(sl.items[i].string);
 
 		if (strlen(pattern))
 			fprintf(fp, "%s/\n!%s/*/\n", pattern, pattern);
+		free(pattern);
 	}
 
 	string_list_clear(&sl, 0);
@@ -185,8 +202,9 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 	string_list_remove_duplicates(&sl, 0);
 
 	for (i = 0; i < sl.nr; i++) {
-		char *pattern = sl.items[i].string;
+		char *pattern = escaped_pattern(sl.items[i].string);
 		fprintf(fp, "%s/\n", pattern);
+		free(pattern);
 	}
 }
 
@@ -337,7 +355,9 @@ static void insert_recursive_pattern(struct pattern_list *pl, struct strbuf *pat
 {
 	struct pattern_entry *e = xmalloc(sizeof(*e));
 	e->patternlen = path->len;
-	e->pattern = strbuf_detach(path, NULL);
+	e->pattern = dup_and_filter_pattern(path->buf);
+	strbuf_release(path);
+
 	hashmap_entry_init(&e->ent,
 			   ignore_case ?
 			   strihash(e->pattern) :
@@ -369,6 +389,7 @@ static void insert_recursive_pattern(struct pattern_list *pl, struct strbuf *pat
 
 static void strbuf_to_cone_pattern(struct strbuf *line, struct pattern_list *pl)
 {
+	int i;
 	strbuf_trim(line);
 
 	strbuf_trim_trailing_dir_sep(line);
@@ -376,6 +397,27 @@ static void strbuf_to_cone_pattern(struct strbuf *line, struct pattern_list *pl)
 	if (!line->len)
 		return;
 
+	for (i = 0; i < line->len; i++) {
+		if (line->buf[i] == '*') {
+			strbuf_insert(line, i, "\\", 1);
+			i++;
+		}
+
+		if (line->buf[i] == '\\') {
+			if (i < line->len - 1 && line->buf[i + 1] == '\\')
+				i++;
+			else
+				strbuf_insert(line, i, "\\", 1);
+
+			i++;
+		}
+	}
+
+	if (line->buf[0] == '"' && line->buf[line->len - 1] == '"') {
+		strbuf_remove(line, 0, 1);
+		strbuf_remove(line, line->len - 1, 1);
+	}
+
 	if (line->buf[0] != '/')
 		strbuf_insert(line, 0, "/", 1);
 
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 0a21a5e15d..2bb30cbe29 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -309,6 +309,9 @@ check_read_tree_errors () {
 	REPO=$1
 	FILES=$2
 	ERRORS=$3
+	git -C $REPO -c core.sparseCheckoutCone=false read-tree -mu HEAD 2>err &&
+	test_must_be_empty err &&
+	check_files $REPO "$FILES" &&
 	git -C $REPO read-tree -mu HEAD 2>err &&
 	if test -z "$ERRORS"
 	then
@@ -379,14 +382,28 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	git -C escaped reset --hard $COMMIT &&
 	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
 	git -C escaped sparse-checkout init --cone &&
-	cat >escaped/.git/info/sparse-checkout <<-\EOF &&
+	git -C escaped sparse-checkout set zbad\\dir zdoes\*not\*exist zdoes\*exist &&
+	cat >expect <<-\EOF &&
 	/*
 	!/*/
 	/zbad\\dir/
+	/zdoes\*exist/
 	/zdoes\*not\*exist/
+	EOF
+	test_cmp expect escaped/.git/info/sparse-checkout &&
+	check_read_tree_errors escaped "a zbad\\dir zdoes*exist" &&
+	git -C escaped ls-tree -d --name-only HEAD | git -C escaped sparse-checkout set --stdin &&
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/deep/
+	/folder1/
+	/folder2/
+	/zbad\\dir/
 	/zdoes\*exist/
 	EOF
-	check_read_tree_errors escaped "a zbad\\dir zdoes*exist"
+	test_cmp expect escaped/.git/info/sparse-checkout &&
+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist"
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v2 11/12] sparse-checkout: use C-style quotes in 'list' subcommand
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
                     ` (9 preceding siblings ...)
  2020-01-24 21:19   ` [PATCH v2 10/12] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
@ 2020-01-24 21:19   ` Derrick Stolee via GitGitGadget
  2020-01-24 21:19   ` [PATCH v2 12/12] sparse-checkout: improve docs around 'set' in cone mode Derrick Stolee via GitGitGadget
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-24 21:19 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When in cone mode, the 'git sparse-checkout list' subcommand lists
the directories included in the sparse cone. When these directories
contain odd characters, such as a backslash, then we need to use
C-style quotes similar to 'git ls-tree'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/sparse-checkout.c          | 7 +++++--
 t/t1091-sparse-checkout-builtin.sh | 7 +++++--
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 61d2c30036..83c8e9bb0c 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -13,6 +13,7 @@
 #include "resolve-undo.h"
 #include "unpack-trees.h"
 #include "wt-status.h"
+#include "quote.h"
 
 static const char *empty_base = "";
 
@@ -77,8 +78,10 @@ static int sparse_checkout_list(int argc, const char **argv)
 
 		string_list_sort(&sl);
 
-		for (i = 0; i < sl.nr; i++)
-			printf("%s\n", sl.items[i].string);
+		for (i = 0; i < sl.nr; i++) {
+			quote_c_style(sl.items[i].string, NULL, stdout, 0);
+			printf("\n");
+		}
 
 		return 0;
 	}
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 2bb30cbe29..16dd924291 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -392,7 +392,8 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	EOF
 	test_cmp expect escaped/.git/info/sparse-checkout &&
 	check_read_tree_errors escaped "a zbad\\dir zdoes*exist" &&
-	git -C escaped ls-tree -d --name-only HEAD | git -C escaped sparse-checkout set --stdin &&
+	git -C escaped ls-tree -d --name-only HEAD >list-expect &&
+	git -C escaped sparse-checkout set --stdin <list-expect &&
 	cat >expect <<-\EOF &&
 	/*
 	!/*/
@@ -403,7 +404,9 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	/zdoes\*exist/
 	EOF
 	test_cmp expect escaped/.git/info/sparse-checkout &&
-	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist"
+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
+	git -C escaped sparse-checkout list >list-actual &&
+	test_cmp list-expect list-actual
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v2 12/12] sparse-checkout: improve docs around 'set' in cone mode
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
                     ` (10 preceding siblings ...)
  2020-01-24 21:19   ` [PATCH v2 11/12] sparse-checkout: use C-style quotes in 'list' subcommand Derrick Stolee via GitGitGadget
@ 2020-01-24 21:19   ` Derrick Stolee via GitGitGadget
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-24 21:19 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The existing documentation does not clarify how the 'set' subcommand
changes when core.sparseCheckoutCone is enabled. Correct this by
changing some language around the "A/B/C" example. Also include a
description of the input format matching the output of 'git ls-tree
--name-only'.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-sparse-checkout.txt | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
index 4834fb434d..0914619881 100644
--- a/Documentation/git-sparse-checkout.txt
+++ b/Documentation/git-sparse-checkout.txt
@@ -50,6 +50,14 @@ To avoid interfering with other worktrees, it first enables the
 +
 When the `--stdin` option is provided, the patterns are read from
 standard in as a newline-delimited list instead of from the arguments.
++
+When `core.sparseCheckoutCone` is enabled, the input list is considered a
+list of directories instead of sparse-checkout patterns. The command writes
+patterns to the sparse-checkout file to include all files contained in those
+directories (recursively) as well as files that are siblings of ancestor
+directories. The input format matches the output of `git ls-tree --name-only`.
+This includes interpreting pathnames that begin with a double quote (") as
+C-style quoted strings.
 
 'disable'::
 	Disable the `core.sparseCheckout` config setting, and restore the
@@ -128,9 +136,12 @@ the following patterns:
 ----------------
 
 This says "include everything in root, but nothing two levels below root."
-If we then add the folder `A/B/C` as a recursive pattern, the folders `A` and
-`A/B` are added as parent patterns. The resulting sparse-checkout file is
-now
+
+When in cone mode, the `git sparse-checkout set` subcommand takes a list of
+directories instead of a list of sparse-checkout patterns. In this mode,
+the command `git sparse-checkout set A/B/C` sets the directory `A/B/C` as
+a recursive pattern, the directories `A` and `A/B` are added as parent
+patterns. The resulting sparse-checkout file is now
 
 ----------------
 /*
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* Re: [PATCH 8/8] sparse-checkout: write escaped patterns in cone mode
  2020-01-24 21:10         ` Derrick Stolee
@ 2020-01-24 21:42           ` Jeff King
  2020-01-28 15:03             ` Derrick Stolee
  0 siblings, 1 reply; 82+ messages in thread
From: Jeff King @ 2020-01-24 21:42 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, me, newren, Derrick Stolee

On Fri, Jan 24, 2020 at 04:10:21PM -0500, Derrick Stolee wrote:

> Hm. Perhaps you are right! The ls-tree output for the test example
> is:
> 
> 	deep
> 	folder1
> 	folder2
> 	"zbad\\dir"
> 	zdoes*exist
> 
> so the "zdoes*exist" value is not escaped. I believe the current
> logic does something extra: consider supplying this input to
> 'git sparse-checkout set --stdin':
> 
> 	deep
> 	folder1
> 	folder2
> 	"zbad\\dir"
> 	zdoes\*exist
> 
> then should we un-escape "\*" to "*"? Or is this not a valid input
> because a backslash should have been quoted into C-style quotes?

I'd think we should not un-escape anything, because we weren't told that
this was a C-style quoted string by the presence of an opening
double-quote. And that's how, say, update-index behaves:

  $ blob=$(echo foo | git hash-object -w --stdin)
  $ printf '100644 %s\t%s\n' \
      $blob 'just*asterisk' \
      $blob 'backslash\without\quotes' \
      $blob '"backslash\\with\\quotes"' |
    git update-index --index-info

which results in:

  $ git ls-files
  "backslash\\with\\quotes"
  "backslash\\without\\quotes"
  just*asterisk

  [same, but without quoting]
  $ git ls-files -z | tr '\0' '\n'
  backslash\with\quotes
  backslash\without\quotes
  just*asterisk

> The behavior in the current series allows this output that would
> never be written by "git ls-tree".

Yes, I think we'd never write that, because ls-tree would quote anything
with a backslash in it, even though it's not strictly necessary. But it
would be valid input to specify a file that has backslashes but not
double-quotes, and I think sparse-checkout should be changed to match
update-index here.

> I was playing around with this, and I think that quote_c_style() is
> necessary for the output, but we have a strange in-memory situation
> for the other escaping: we both fill the hashsets with the un-escaped
> data and fill the pattern list with the escaped patterns.

Yeah, but I think that the syntactic escaping on input might not have
identical rules to the escaping needed for the patterns.

So it makes sense to me to handle input as a separate mechanism, get a
pristine copy of what the user was trying to communicate to us, and then
re-escape whatever we need to put into the pattern list.

And ultimately the flow would be something like:

  - read input
    - if argument is from command-line, use it verbatim
    - else if reading stdin with "-z", use it verbatim
    - else if line starts with double-quote, unquote_c_style()
    - else use line verbatim
    - the result is a single pristine filename
  - fill hashset with pristine filenames
  - generate pattern list to write to sparse file, escaping filenames as
    necessary according to sparse-pattern rules

Obviously you don't have a "-z" yet, but I think it's something we'd
probably want in the long run. And anything coming from the command-line
shouldn't need quoting to get it to us either (and so we'd need to
escape before writing to the sparse file).

-Peff

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH 8/8] sparse-checkout: write escaped patterns in cone mode
  2020-01-24 21:42           ` Jeff King
@ 2020-01-28 15:03             ` Derrick Stolee
  0 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee @ 2020-01-28 15:03 UTC (permalink / raw)
  To: Jeff King
  Cc: Derrick Stolee via GitGitGadget, git, me, newren, Derrick Stolee

On 1/24/2020 4:42 PM, Jeff King wrote:
> And ultimately the flow would be something like:
> 
>   - read input
>     - if argument is from command-line, use it verbatim
>     - else if reading stdin with "-z", use it verbatim
>     - else if line starts with double-quote, unquote_c_style()
>     - else use line verbatim
>     - the result is a single pristine filename
>   - fill hashset with pristine filenames
>   - generate pattern list to write to sparse file, escaping filenames as
>     necessary according to sparse-pattern rules
> 
> Obviously you don't have a "-z" yet, but I think it's something we'd
> probably want in the long run. And anything coming from the command-line
> shouldn't need quoting to get it to us either (and so we'd need to
> escape before writing to the sparse file).

This recommendation came async with my v2, so I'll follow shortly with
a v3 that uses this flow. I have something that I think works, after
slightly adapting my tests, but now I need to make sure that all the
patches still make sense and build cleanly.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH v3 00/12] Harden the sparse-checkout builtin
  2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
                     ` (11 preceding siblings ...)
  2020-01-24 21:19   ` [PATCH v2 12/12] sparse-checkout: improve docs around 'set' in cone mode Derrick Stolee via GitGitGadget
@ 2020-01-28 18:26   ` Derrick Stolee via GitGitGadget
  2020-01-28 18:26     ` [PATCH v3 01/12] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
                       ` (12 more replies)
  12 siblings, 13 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-28 18:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee

This series is based on ds/sparse-list-in-cone-mode.

This series attempts to clean up some rough edges in the sparse-checkout
feature, especially around the cone mode.

Unfortunately, after the v2.25.0 release, we noticed an issue with the "git
clone --sparse" option when using a URL instead of a local path. This is
fixed and properly tested here.

Also, let's improve Git's response to these more complicated scenarios:

 1. Running "git sparse-checkout init" in a worktree would complain because
    the "info" dir doesn't exist.
 2. Tracked paths that include "*" and "\" in their filenames.
 3. If a user edits the sparse-checkout file to have non-cone pattern, such
    as "**" anywhere or "*" in the wrong place, then we should respond
    appropriately. That is: warn that the patterns are not cone-mode, then
    revert to the old logic.

Updates in V2:

 * Added C-style quoting to the output of "git sparse-checkout list" in cone
   mode.
 * Improved documentation.
 * Responded to most style feedback. Hopefully I didn't miss anything.
 * I was lingering on this a little to see if I could also fix the issue
   raised in [1], but I have not figured that one out, yet.

Update in V3:

 * Input now uses Peff's recommended pattern: unquote C-style strings over
   stdin and otherwise do not un-escape input.

[1] 
https://lore.kernel.org/git/062301d5d0bc$c3e17760$4ba46620$@Frontier.com/

Thanks, -Stolee

Derrick Stolee (11):
  t1091: use check_files to reduce boilerplate
  t1091: improve here-docs
  sparse-checkout: create leading directories
  clone: fix --sparse option with URLs
  sparse-checkout: cone mode does not recognize "**"
  sparse-checkout: detect short patterns
  sparse-checkout: warn on incorrect '*' in patterns
  sparse-checkout: properly match escaped characters
  sparse-checkout: write escaped patterns in cone mode
  sparse-checkout: use C-style quotes in 'list' subcommand
  sparse-checkout: improve docs around 'set' in cone mode

Jeff King (1):
  sparse-checkout: fix documentation typo for core.sparseCheckoutCone

 Documentation/git-sparse-checkout.txt |  19 +-
 builtin/clone.c                       |   2 +-
 builtin/sparse-checkout.c             |  48 +++-
 dir.c                                 |  68 +++++-
 t/t1091-sparse-checkout-builtin.sh    | 323 +++++++++++++++-----------
 5 files changed, 305 insertions(+), 155 deletions(-)


base-commit: 4fd683b6a35eabd23dd5183da7f654a1e1f00325
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-513%2Fderrickstolee%2Fsparse-harden-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-513/derrickstolee/sparse-harden-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/513

Range-diff vs v2:

  1:  1cc825412f =  1:  1cc825412f t1091: use check_files to reduce boilerplate
  2:  b7a6ad145a =  2:  b7a6ad145a t1091: improve here-docs
  3:  5497ad8778 =  3:  5497ad8778 sparse-checkout: create leading directories
  4:  4991a51f6d =  4:  4991a51f6d clone: fix --sparse option with URLs
  5:  ae78c3069b =  5:  ae78c3069b sparse-checkout: fix documentation typo for core.sparseCheckoutCone
  6:  2ad4d3e467 =  6:  2ad4d3e467 sparse-checkout: cone mode does not recognize "**"
  7:  aace064510 =  7:  aace064510 sparse-checkout: detect short patterns
  8:  d2a510a3bb =  8:  d2a510a3bb sparse-checkout: warn on incorrect '*' in patterns
  9:  65c53d7526 !  9:  9ea69e9069 sparse-checkout: properly match escaped characters
     @@ -20,7 +20,7 @@
       	return strncmp(ee1->pattern, ee2->pattern, min_len);
       }
       
     -+char *dup_and_filter_pattern(const char *pattern)
     ++static char *dup_and_filter_pattern(const char *pattern)
      +{
      +	char *set, *read;
      +	char *result = xstrdup(pattern);
     @@ -69,18 +69,6 @@
       	hashmap_entry_init(&translated->ent,
       			   ignore_case ?
      
     - diff --git a/dir.h b/dir.h
     - --- a/dir.h
     - +++ b/dir.h
     -@@
     - 		   const struct hashmap_entry *a,
     - 		   const struct hashmap_entry *b,
     - 		   const void *key);
     -+char *dup_and_filter_pattern(const char *pattern);
     - int hashmap_contains_parent(struct hashmap *map,
     - 			    const char *path,
     - 			    struct strbuf *buffer);
     -
       diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
       --- a/t/t1091-sparse-checkout-builtin.sh
       +++ b/t/t1091-sparse-checkout-builtin.sh
 10:  c27a17a2fc ! 10:  e2f9afc70c sparse-checkout: write escaped patterns in cone mode
     @@ -24,11 +24,24 @@
          unescaped names in the hashsets for the cone comparisons, then escape
          the patterns later.
      
     +    Use unquote_c_style() when parsing lines from stdin. Command-line
     +    arguments will be parsed as-is, assuming the user can do the correct
     +    level of escaping from their environment to match the exact directory
     +    names.
     +
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
       --- a/builtin/sparse-checkout.c
       +++ b/builtin/sparse-checkout.c
     +@@
     + #include "resolve-undo.h"
     + #include "unpack-trees.h"
     + #include "wt-status.h"
     ++#include "quote.h"
     + 
     + static const char *empty_base = "";
     + 
      @@
       	return result;
       }
     @@ -77,52 +90,28 @@
       }
       
      @@
     - {
     - 	struct pattern_entry *e = xmalloc(sizeof(*e));
     - 	e->patternlen = path->len;
     --	e->pattern = strbuf_detach(path, NULL);
     -+	e->pattern = dup_and_filter_pattern(path->buf);
     -+	strbuf_release(path);
     -+
     - 	hashmap_entry_init(&e->ent,
     - 			   ignore_case ?
     - 			   strihash(e->pattern) :
     -@@
     + 		pl.use_cone_patterns = 1;
       
     - static void strbuf_to_cone_pattern(struct strbuf *line, struct pattern_list *pl)
     - {
     -+	int i;
     - 	strbuf_trim(line);
     - 
     - 	strbuf_trim_trailing_dir_sep(line);
     -@@
     - 	if (!line->len)
     - 		return;
     - 
     -+	for (i = 0; i < line->len; i++) {
     -+		if (line->buf[i] == '*') {
     -+			strbuf_insert(line, i, "\\", 1);
     -+			i++;
     -+		}
     + 		if (set_opts.use_stdin) {
     +-			while (!strbuf_getline(&line, stdin))
     ++			struct strbuf unquoted = STRBUF_INIT;
     ++			while (!strbuf_getline(&line, stdin)) {
     ++				if (line.buf[0] == '"') {
     ++					strbuf_setlen(&unquoted, 0);
     ++					if (unquote_c_style(&unquoted, line.buf, NULL))
     ++						die(_("unable to unquote C-style string '%s'"),
     ++						line.buf);
      +
     -+		if (line->buf[i] == '\\') {
     -+			if (i < line->len - 1 && line->buf[i + 1] == '\\')
     -+				i++;
     -+			else
     -+				strbuf_insert(line, i, "\\", 1);
     ++					strbuf_swap(&unquoted, &line);
     ++				}
      +
     -+			i++;
     -+		}
     -+	}
     + 				strbuf_to_cone_pattern(&line, &pl);
     ++			}
      +
     -+	if (line->buf[0] == '"' && line->buf[line->len - 1] == '"') {
     -+		strbuf_remove(line, 0, 1);
     -+		strbuf_remove(line, line->len - 1, 1);
     -+	}
     -+
     - 	if (line->buf[0] != '/')
     - 		strbuf_insert(line, 0, "/", 1);
     - 
     ++			strbuf_release(&unquoted);
     + 		} else {
     + 			for (i = 0; i < argc; i++) {
     + 				strbuf_setlen(&line, 0);
      
       diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
       --- a/t/t1091-sparse-checkout-builtin.sh
     @@ -142,7 +131,7 @@
       	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
       	git -C escaped sparse-checkout init --cone &&
      -	cat >escaped/.git/info/sparse-checkout <<-\EOF &&
     -+	git -C escaped sparse-checkout set zbad\\dir zdoes\*not\*exist zdoes\*exist &&
     ++	git -C escaped sparse-checkout set zbad\\dir "zdoes*not*exist" "zdoes*exist" &&
      +	cat >expect <<-\EOF &&
       	/*
       	!/*/
 11:  526d5becbc ! 11:  ec714a4cf0 sparse-checkout: use C-style quotes in 'list' subcommand
     @@ -12,14 +12,6 @@
       diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
       --- a/builtin/sparse-checkout.c
       +++ b/builtin/sparse-checkout.c
     -@@
     - #include "resolve-undo.h"
     - #include "unpack-trees.h"
     - #include "wt-status.h"
     -+#include "quote.h"
     - 
     - static const char *empty_base = "";
     - 
      @@
       
       		string_list_sort(&sl);
 12:  1b5858adee = 12:  1867746d97 sparse-checkout: improve docs around 'set' in cone mode

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH v3 01/12] t1091: use check_files to reduce boilerplate
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
@ 2020-01-28 18:26     ` Derrick Stolee via GitGitGadget
  2020-01-28 18:26     ` [PATCH v3 02/12] t1091: improve here-docs Derrick Stolee via GitGitGadget
                       ` (11 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-28 18:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When testing the sparse-checkout feature, we need to compare the
contents of the working-directory against some expected output.
Using here-docs was useful in the beginning, but became repetetive
as the test script grew.

Create a check_files helper to make the tests simpler and easier
to extend. It also reduces instances of bad here-doc whitespace.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1091-sparse-checkout-builtin.sh | 117 ++++++-----------------------
 1 file changed, 22 insertions(+), 95 deletions(-)

diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index ff7f8f7a1f..e058a20ad6 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -12,6 +12,13 @@ list_files() {
 	(cd "$1" && printf '%s\n' *)
 }
 
+check_files() {
+	list_files "$1" >actual &&
+	shift &&
+	printf "%s\n" $@ >expect &&
+	test_cmp expect actual
+}
+
 test_expect_success 'setup' '
 	git init repo &&
 	(
@@ -58,9 +65,7 @@ test_expect_success 'git sparse-checkout init' '
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
 	test_cmp_config -C repo true core.sparsecheckout &&
-	list_files repo >dir  &&
-	echo a >expect &&
-	test_cmp expect dir
+	check_files repo a
 '
 
 test_expect_success 'git sparse-checkout list after init' '
@@ -81,13 +86,7 @@ test_expect_success 'init with existing sparse-checkout' '
 		*folder*
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo a folder1 folder2
 '
 
 test_expect_success 'clone --sparse' '
@@ -98,9 +97,7 @@ test_expect_success 'clone --sparse' '
 		!/*/
 	EOF
 	test_cmp expect actual &&
-	list_files clone >dir &&
-	echo a >expect &&
-	test_cmp expect dir
+	check_files clone a
 '
 
 test_expect_success 'set enables config' '
@@ -127,13 +124,7 @@ test_expect_success 'set sparse-checkout using builtin' '
 	git -C repo sparse-checkout list >actual &&
 	test_cmp expect actual &&
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo a folder1 folder2
 '
 
 test_expect_success 'set sparse-checkout using --stdin' '
@@ -147,13 +138,7 @@ test_expect_success 'set sparse-checkout using --stdin' '
 	git -C repo sparse-checkout list >actual &&
 	test_cmp expect actual &&
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo "a folder1 folder2"
 '
 
 test_expect_success 'cone mode: match patterns' '
@@ -162,13 +147,7 @@ test_expect_success 'cone mode: match patterns' '
 	git -C repo read-tree -mu HEAD 2>err &&
 	test_i18ngrep ! "disabling cone patterns" err &&
 	git -C repo reset --hard &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo a folder1 folder2
 '
 
 test_expect_success 'cone mode: warn on bad pattern' '
@@ -185,14 +164,7 @@ test_expect_success 'sparse-checkout disable' '
 	test_path_is_file repo/.git/info/sparse-checkout &&
 	git -C repo config --list >config &&
 	test_must_fail git config core.sparseCheckout &&
-	list_files repo >dir &&
-	cat >expect <<-EOF &&
-		a
-		deep
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo a deep folder1 folder2
 '
 
 test_expect_success 'cone mode: init and set' '
@@ -204,24 +176,9 @@ test_expect_success 'cone mode: init and set' '
 	test_cmp expect dir &&
 	git -C repo sparse-checkout set deep/deeper1/deepest/ 2>err &&
 	test_must_be_empty err &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		deep
-	EOF
-	test_cmp expect dir &&
-	list_files repo/deep >dir  &&
-	cat >expect <<-EOF &&
-		a
-		deeper1
-	EOF
-	test_cmp expect dir &&
-	list_files repo/deep/deeper1 >dir  &&
-	cat >expect <<-EOF &&
-		a
-		deepest
-	EOF
-	test_cmp expect dir &&
+	check_files repo a deep &&
+	check_files repo/deep a deeper1 &&
+	check_files repo/deep/deeper1 a deepest &&
 	cat >expect <<-EOF &&
 		/*
 		!/*/
@@ -237,13 +194,7 @@ test_expect_success 'cone mode: init and set' '
 		folder2
 	EOF
 	test_must_be_empty err &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	list_files repo >dir &&
-	test_cmp expect dir
+	check_files repo a folder1 folder2
 '
 
 test_expect_success 'cone mode: list' '
@@ -275,13 +226,7 @@ test_expect_success 'revert to old sparse-checkout on bad update' '
 	test_must_fail git -C repo sparse-checkout set deep/deeper1 2>err &&
 	test_i18ngrep "cannot set sparse-checkout patterns" err &&
 	test_cmp repo/.git/info/sparse-checkout expect &&
-	list_files repo/deep >dir &&
-	cat >expect <<-EOF &&
-		a
-		deeper1
-		deeper2
-	EOF
-	test_cmp dir expect
+	check_files repo/deep a deeper1 deeper2
 '
 
 test_expect_success 'revert to old sparse-checkout on empty update' '
@@ -332,12 +277,7 @@ test_expect_success 'cone mode: set with core.ignoreCase=true' '
 		/folder1/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-	EOF
-	test_cmp expect dir
+	check_files repo a folder1
 '
 
 test_expect_success 'interaction with submodules' '
@@ -351,21 +291,8 @@ test_expect_success 'interaction with submodules' '
 		git sparse-checkout init --cone &&
 		git sparse-checkout set folder1
 	) &&
-	list_files super >dir &&
-	cat >expect <<-\EOF &&
-		a
-		folder1
-		modules
-	EOF
-	test_cmp expect dir &&
-	list_files super/modules/child >dir &&
-	cat >expect <<-\EOF &&
-		a
-		deep
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files super a folder1 modules &&
+	check_files super/modules/child a deep folder1 folder2
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 02/12] t1091: improve here-docs
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
  2020-01-28 18:26     ` [PATCH v3 01/12] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
@ 2020-01-28 18:26     ` Derrick Stolee via GitGitGadget
  2020-01-28 18:26     ` [PATCH v3 03/12] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
                       ` (10 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-28 18:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

t1091-sparse-checkout-builtin.sh uses here-docs to populate the
expected contents of the sparse-checkout file. These do not use
shell interpolation, so use "-\EOF" instead of "-EOF". Also use
proper tabbing.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1091-sparse-checkout-builtin.sh | 98 +++++++++++++++---------------
 1 file changed, 49 insertions(+), 49 deletions(-)

diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index e058a20ad6..e28e1c797f 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -46,11 +46,11 @@ test_expect_success 'git sparse-checkout list (empty)' '
 
 test_expect_success 'git sparse-checkout list (populated)' '
 	test_when_finished rm -f repo/.git/info/sparse-checkout &&
-	cat >repo/.git/info/sparse-checkout <<-EOF &&
-		/folder1/*
-		/deep/
-		**/a
-		!*bin*
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/folder1/*
+	/deep/
+	**/a
+	!*bin*
 	EOF
 	cp repo/.git/info/sparse-checkout expect &&
 	git -C repo sparse-checkout list >list &&
@@ -59,9 +59,9 @@ test_expect_success 'git sparse-checkout list (populated)' '
 
 test_expect_success 'git sparse-checkout init' '
 	git -C repo sparse-checkout init &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
 	test_cmp_config -C repo true core.sparsecheckout &&
@@ -70,9 +70,9 @@ test_expect_success 'git sparse-checkout init' '
 
 test_expect_success 'git sparse-checkout list after init' '
 	git -C repo sparse-checkout list >actual &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
 	EOF
 	test_cmp expect actual
 '
@@ -80,10 +80,10 @@ test_expect_success 'git sparse-checkout list after init' '
 test_expect_success 'init with existing sparse-checkout' '
 	echo "*folder*" >> repo/.git/info/sparse-checkout &&
 	git -C repo sparse-checkout init &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		*folder*
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	*folder*
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
 	check_files repo a folder1 folder2
@@ -92,9 +92,9 @@ test_expect_success 'init with existing sparse-checkout' '
 test_expect_success 'clone --sparse' '
 	git clone --sparse repo clone &&
 	git -C clone sparse-checkout list >actual &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
 	EOF
 	test_cmp expect actual &&
 	check_files clone a
@@ -116,10 +116,10 @@ test_expect_success 'set enables config' '
 
 test_expect_success 'set sparse-checkout using builtin' '
 	git -C repo sparse-checkout set "/*" "!/*/" "*folder*" &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		*folder*
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	*folder*
 	EOF
 	git -C repo sparse-checkout list >actual &&
 	test_cmp expect actual &&
@@ -128,11 +128,11 @@ test_expect_success 'set sparse-checkout using builtin' '
 '
 
 test_expect_success 'set sparse-checkout using --stdin' '
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/folder1/
-		/folder2/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/folder1/
+	/folder2/
 	EOF
 	git -C repo sparse-checkout set --stdin <expect &&
 	git -C repo sparse-checkout list >actual &&
@@ -179,28 +179,28 @@ test_expect_success 'cone mode: init and set' '
 	check_files repo a deep &&
 	check_files repo/deep a deeper1 &&
 	check_files repo/deep/deeper1 a deepest &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/deep/
-		!/deep/*/
-		/deep/deeper1/
-		!/deep/deeper1/*/
-		/deep/deeper1/deepest/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/deep/
+	!/deep/*/
+	/deep/deeper1/
+	!/deep/deeper1/*/
+	/deep/deeper1/deepest/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	git -C repo sparse-checkout set --stdin 2>err <<-EOF &&
-		folder1
-		folder2
+	git -C repo sparse-checkout set --stdin 2>err <<-\EOF &&
+	folder1
+	folder2
 	EOF
 	test_must_be_empty err &&
 	check_files repo a folder1 folder2
 '
 
 test_expect_success 'cone mode: list' '
-	cat >expect <<-EOF &&
-		folder1
-		folder2
+	cat >expect <<-\EOF &&
+	folder1
+	folder2
 	EOF
 	git -C repo sparse-checkout set --stdin <expect &&
 	git -C repo sparse-checkout list >actual 2>err &&
@@ -211,10 +211,10 @@ test_expect_success 'cone mode: list' '
 test_expect_success 'cone mode: set with nested folders' '
 	git -C repo sparse-checkout set deep deep/deeper1/deepest 2>err &&
 	test_line_count = 0 err &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/deep/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/deep/
 	EOF
 	test_cmp repo/.git/info/sparse-checkout expect
 '
@@ -271,10 +271,10 @@ test_expect_success 'sparse-checkout (init|set|disable) fails with dirty status'
 test_expect_success 'cone mode: set with core.ignoreCase=true' '
 	git -C repo sparse-checkout init --cone &&
 	git -C repo -c core.ignoreCase=true sparse-checkout set folder1 &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/folder1/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/folder1/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
 	check_files repo a folder1
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 03/12] sparse-checkout: create leading directories
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
  2020-01-28 18:26     ` [PATCH v3 01/12] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
  2020-01-28 18:26     ` [PATCH v3 02/12] t1091: improve here-docs Derrick Stolee via GitGitGadget
@ 2020-01-28 18:26     ` Derrick Stolee via GitGitGadget
  2020-01-28 18:26     ` [PATCH v3 04/12] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
                       ` (9 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-28 18:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'git init' command creates the ".git/info" directory and fills it
with some default files. However, 'git worktree add' does not create
the info directory for that worktree. This causes a problem when running
"git sparse-checkout init" inside a worktree. While care was taken to
allow the sparse-checkout config to be specific to a worktree, this
initialization was untested.

Safely create the leading directories for the sparse-checkout file. This
is the safest thing to do even without worktrees, as a user could delete
their ".git/info" directory and expect Git to recover safely.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/sparse-checkout.c          |  4 ++++
 t/t1091-sparse-checkout-builtin.sh | 10 ++++++++++
 2 files changed, 14 insertions(+)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index b3bed891cb..3cee8ab46e 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -199,6 +199,10 @@ static int write_patterns_and_update(struct pattern_list *pl)
 	int result;
 
 	sparse_filename = get_sparse_checkout_filename();
+
+	if (safe_create_leading_directories(sparse_filename))
+		die(_("failed to create directory for sparse-checkout file"));
+
 	fd = hold_lock_file_for_update(&lk, sparse_filename,
 				      LOCK_DIE_ON_ERROR);
 
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index e28e1c797f..43d1f7520c 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -295,4 +295,14 @@ test_expect_success 'interaction with submodules' '
 	check_files super/modules/child a deep folder1 folder2
 '
 
+test_expect_success 'different sparse-checkouts with worktrees' '
+	git -C repo worktree add --detach ../worktree &&
+	check_files worktree "a deep folder1 folder2" &&
+	git -C worktree sparse-checkout init --cone &&
+	git -C repo sparse-checkout set folder1 &&
+	git -C worktree sparse-checkout set deep/deeper1 &&
+	check_files repo a folder1 &&
+	check_files worktree a deep
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 04/12] clone: fix --sparse option with URLs
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                       ` (2 preceding siblings ...)
  2020-01-28 18:26     ` [PATCH v3 03/12] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
@ 2020-01-28 18:26     ` Derrick Stolee via GitGitGadget
  2020-01-28 18:26     ` [PATCH v3 05/12] sparse-checkout: fix documentation typo for core.sparseCheckoutCone Jeff King via GitGitGadget
                       ` (8 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-28 18:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The --sparse option was added to the clone builtin in d89f09c (clone:
add --sparse mode, 2019-11-21) and was tested with a local path clone
in t1091-sparse-checkout-builtin.sh. However, due to a difference in
how local paths are handled versus URLs, this mechanism does not work
with URLs.

Modify the test to use a "file://" URL, which would output this error
before the code change:

  Cloning into 'clone'...
  fatal: cannot change to 'file://.../repo': No such file or directory
  error: failed to initialize sparse-checkout

These errors are due to using a "-C <path>" option to call 'git -C
<path> sparse-checkout init' but the URL is being given instead of
the target directory.

Update that target directory to evaluate this correctly. I have also
manually tested that https:// URLs are handled correctly as well.

Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/clone.c                    | 2 +-
 t/t1091-sparse-checkout-builtin.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 4348d962c9..2caefc44fb 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1130,7 +1130,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (option_required_reference.nr || option_optional_reference.nr)
 		setup_reference();
 
-	if (option_sparse_checkout && git_sparse_checkout_init(repo))
+	if (option_sparse_checkout && git_sparse_checkout_init(dir))
 		return 1;
 
 	remote = remote_get(option_origin);
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 43d1f7520c..cf4a595c86 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -90,7 +90,7 @@ test_expect_success 'init with existing sparse-checkout' '
 '
 
 test_expect_success 'clone --sparse' '
-	git clone --sparse repo clone &&
+	git clone --sparse "file://$(pwd)/repo" clone &&
 	git -C clone sparse-checkout list >actual &&
 	cat >expect <<-\EOF &&
 	/*
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 05/12] sparse-checkout: fix documentation typo for core.sparseCheckoutCone
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                       ` (3 preceding siblings ...)
  2020-01-28 18:26     ` [PATCH v3 04/12] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
@ 2020-01-28 18:26     ` Jeff King via GitGitGadget
  2020-01-28 18:26     ` [PATCH v3 06/12] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
                       ` (7 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Jeff King via GitGitGadget @ 2020-01-28 18:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Jeff King

From: Jeff King <peff@peff.net>

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-sparse-checkout.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
index 3b341cf0fc..4834fb434d 100644
--- a/Documentation/git-sparse-checkout.txt
+++ b/Documentation/git-sparse-checkout.txt
@@ -106,7 +106,7 @@ The full pattern set allows for arbitrary pattern matches and complicated
 inclusion/exclusion rules. These can result in O(N*M) pattern matches when
 updating the index, where N is the number of patterns and M is the number
 of paths in the index. To combat this performance issue, a more restricted
-pattern set is allowed when `core.spareCheckoutCone` is enabled.
+pattern set is allowed when `core.sparseCheckoutCone` is enabled.
 
 The accepted patterns in the cone pattern set are:
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 06/12] sparse-checkout: cone mode does not recognize "**"
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                       ` (4 preceding siblings ...)
  2020-01-28 18:26     ` [PATCH v3 05/12] sparse-checkout: fix documentation typo for core.sparseCheckoutCone Jeff King via GitGitGadget
@ 2020-01-28 18:26     ` Derrick Stolee via GitGitGadget
  2020-01-28 18:26     ` [PATCH v3 07/12] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
                       ` (6 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-28 18:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When core.sparseCheckoutCone is enabled, the 'git sparse-checkout set'
command creates a restricted set of possible patterns that are used
by a custom algorithm to quickly match those patterns.

If a user manually edits the sparse-checkout file, then they could
create patterns that do not match these expectations. The cone-mode
matching algorithm can return incorrect results. The solution is to
detect these incorrect patterns, warn that we do not recognize them,
and revert to the standard algorithm.

Check each pattern for the "**" substring, and revert to the old
logic if seen. While technically a "/<dir>/**" pattern matches
the meaning of "/<dir>/", it is not one that would be written by
the sparse-checkout builtin in cone mode. Attempting to accept that
pattern change complicates the logic and instead we punt and do
not accept any instance of "**".

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              |  7 +++++-
 t/t1091-sparse-checkout-builtin.sh | 34 ++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 22d08e61c2..40fed73a94 100644
--- a/dir.c
+++ b/dir.c
@@ -651,11 +651,16 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 		return;
 	}
 
+	if (strstr(given->pattern, "**")) {
+		/* Not a cone pattern. */
+		warning(_("unrecognized pattern: '%s'"), given->pattern);
+		goto clear_hashmaps;
+	}
+
 	if (given->patternlen > 2 &&
 	    !strcmp(given->pattern + given->patternlen - 2, "/*")) {
 		if (!(given->flags & PATTERN_FLAG_NEGATIVE)) {
 			/* Not a cone pattern. */
-			pl->use_cone_patterns = 0;
 			warning(_("unrecognized pattern: '%s'"), given->pattern);
 			goto clear_hashmaps;
 		}
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index cf4a595c86..e2e45dc7fd 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -305,4 +305,38 @@ test_expect_success 'different sparse-checkouts with worktrees' '
 	check_files worktree a deep
 '
 
+check_read_tree_errors () {
+	REPO=$1
+	FILES=$2
+	ERRORS=$3
+	git -C $REPO read-tree -mu HEAD 2>err &&
+	if test -z "$ERRORS"
+	then
+		test_must_be_empty err
+	else
+		test_i18ngrep "$ERRORS" err
+	fi &&
+	check_files $REPO $FILES
+}
+
+test_expect_success 'pattern-checks: /A/**' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/folder1/**
+	EOF
+	check_read_tree_errors repo "a folder1" "disabling cone pattern matching"
+'
+
+test_expect_success 'pattern-checks: /A/**/B/' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/deep/**/deepest
+	EOF
+	check_read_tree_errors repo "a deep" "disabling cone pattern matching" &&
+	check_files repo/deep "deeper1" &&
+	check_files repo/deep/deeper1 "deepest"
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 07/12] sparse-checkout: detect short patterns
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                       ` (5 preceding siblings ...)
  2020-01-28 18:26     ` [PATCH v3 06/12] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
@ 2020-01-28 18:26     ` Derrick Stolee via GitGitGadget
  2020-01-28 18:26     ` [PATCH v3 08/12] sparse-checkout: warn on incorrect '*' in patterns Derrick Stolee via GitGitGadget
                       ` (5 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-28 18:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In cone mode, the shortest pattern the sparse-checkout command will
write into the sparse-checkout file is "/*". This is handled carefully
in add_pattern_to_hashsets(), so warn if any other pattern is this
short. This will assist future pattern checks by allowing us to assume
there are at least three characters in the pattern.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              | 3 ++-
 t/t1091-sparse-checkout-builtin.sh | 9 +++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 40fed73a94..c2e585607e 100644
--- a/dir.c
+++ b/dir.c
@@ -651,7 +651,8 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 		return;
 	}
 
-	if (strstr(given->pattern, "**")) {
+	if (given->patternlen <= 2 ||
+	    strstr(given->pattern, "**")) {
 		/* Not a cone pattern. */
 		warning(_("unrecognized pattern: '%s'"), given->pattern);
 		goto clear_hashmaps;
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index e2e45dc7fd..2e57534799 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -339,4 +339,13 @@ test_expect_success 'pattern-checks: /A/**/B/' '
 	check_files repo/deep/deeper1 "deepest"
 '
 
+test_expect_success 'pattern-checks: too short' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/a
+	EOF
+	check_read_tree_errors repo "a" "disabling cone pattern matching"
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 08/12] sparse-checkout: warn on incorrect '*' in patterns
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                       ` (6 preceding siblings ...)
  2020-01-28 18:26     ` [PATCH v3 07/12] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
@ 2020-01-28 18:26     ` Derrick Stolee via GitGitGadget
  2020-01-28 18:26     ` [PATCH v3 09/12] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
                       ` (4 subsequent siblings)
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-28 18:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In cone mode, the sparse-checkout commmand will write patterns that
allow faster pattern matching. This matching only works if the patterns
in the sparse-checkout file are those written by that command. Users
can edit the sparse-checkout file and create patterns that cause the
cone mode matching to fail.

The cone mode patterns may end in "/*" but otherwise an un-escaped
asterisk is invalid. Add checks to disable cone mode when seeing these
values.

A later change will properly handle escaped asterisks.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              | 29 +++++++++++++++++++++++++++++
 t/t1091-sparse-checkout-builtin.sh | 27 +++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

diff --git a/dir.c b/dir.c
index c2e585607e..7cb78c8b87 100644
--- a/dir.c
+++ b/dir.c
@@ -635,6 +635,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 	struct pattern_entry *translated;
 	char *truncated;
 	char *data = NULL;
+	const char *prev, *cur, *next;
 
 	if (!pl->use_cone_patterns)
 		return;
@@ -652,12 +653,40 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 	}
 
 	if (given->patternlen <= 2 ||
+	    *given->pattern == '*' ||
 	    strstr(given->pattern, "**")) {
 		/* Not a cone pattern. */
 		warning(_("unrecognized pattern: '%s'"), given->pattern);
 		goto clear_hashmaps;
 	}
 
+	prev = given->pattern;
+	cur = given->pattern + 1;
+	next = given->pattern + 2;
+
+	while (*cur) {
+		/* We care about *cur == '*' */
+		if (*cur != '*')
+			goto increment;
+
+		/* But only if *prev != '\\' */
+		if (*prev == '\\')
+			goto increment;
+
+		/* But a trailing '/' then '*' is fine */
+		if (*prev == '/' && *next == 0)
+			goto increment;
+
+		/* Not a cone pattern. */
+		warning(_("unrecognized pattern: '%s'"), given->pattern);
+		goto clear_hashmaps;
+
+	increment:
+		prev++;
+		cur++;
+		next++;
+	}
+
 	if (given->patternlen > 2 &&
 	    !strcmp(given->pattern + given->patternlen - 2, "/*")) {
 		if (!(given->flags & PATTERN_FLAG_NEGATIVE)) {
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 2e57534799..470900f6f4 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -348,4 +348,31 @@ test_expect_success 'pattern-checks: too short' '
 	check_read_tree_errors repo "a" "disabling cone pattern matching"
 '
 
+test_expect_success 'pattern-checks: trailing "*"' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/a*
+	EOF
+	check_read_tree_errors repo "a" "disabling cone pattern matching"
+'
+
+test_expect_success 'pattern-checks: starting "*"' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	*eep/
+	EOF
+	check_read_tree_errors repo "a deep" "disabling cone pattern matching"
+'
+
+test_expect_success 'pattern-checks: escaped "*"' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/does\*not\*exist/
+	EOF
+	check_read_tree_errors repo "a" ""
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 09/12] sparse-checkout: properly match escaped characters
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                       ` (7 preceding siblings ...)
  2020-01-28 18:26     ` [PATCH v3 08/12] sparse-checkout: warn on incorrect '*' in patterns Derrick Stolee via GitGitGadget
@ 2020-01-28 18:26     ` Derrick Stolee via GitGitGadget
  2020-01-29 10:03       ` Jeff King
  2020-01-28 18:26     ` [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
                       ` (3 subsequent siblings)
  12 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-28 18:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In cone mode, the sparse-checkout feature uses hashset containment
queries to match paths. Make this algorithm respect escaped asterisk
(*) and backslash (\) characters.

Create dup_and_filter_pattern() method to convert a pattern by
removing escape characters and dropping an optional "/*" at the end.
This method is available in dir.h as we will use it in
builtin/sparse-checkout.c in a later change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              | 31 +++++++++++++++++++++++++++---
 t/t1091-sparse-checkout-builtin.sh | 22 +++++++++++++++++----
 2 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/dir.c b/dir.c
index 7cb78c8b87..579f274d13 100644
--- a/dir.c
+++ b/dir.c
@@ -630,6 +630,32 @@ int pl_hashmap_cmp(const void *unused_cmp_data,
 	return strncmp(ee1->pattern, ee2->pattern, min_len);
 }
 
+static char *dup_and_filter_pattern(const char *pattern)
+{
+	char *set, *read;
+	char *result = xstrdup(pattern);
+
+	set = result;
+	read = result;
+
+	while (*read) {
+		/* skip escape characters (once) */
+		if (*read == '\\')
+			read++;
+
+		*set = *read;
+
+		set++;
+		read++;
+	}
+	*set = 0;
+
+	if (*(read - 2) == '/' && *(read - 1) == '*')
+		*(read - 2) = 0;
+
+	return result;
+}
+
 static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern *given)
 {
 	struct pattern_entry *translated;
@@ -695,8 +721,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 			goto clear_hashmaps;
 		}
 
-		truncated = xstrdup(given->pattern);
-		truncated[given->patternlen - 2] = 0;
+		truncated = dup_and_filter_pattern(given->pattern);
 
 		translated = xmalloc(sizeof(struct pattern_entry));
 		translated->pattern = truncated;
@@ -730,7 +755,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 
 	translated = xmalloc(sizeof(struct pattern_entry));
 
-	translated->pattern = xstrdup(given->pattern);
+	translated->pattern = dup_and_filter_pattern(given->pattern);
 	translated->patternlen = given->patternlen;
 	hashmap_entry_init(&translated->ent,
 			   ignore_case ?
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 470900f6f4..0a21a5e15d 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -366,13 +366,27 @@ test_expect_success 'pattern-checks: starting "*"' '
 	check_read_tree_errors repo "a deep" "disabling cone pattern matching"
 '
 
-test_expect_success 'pattern-checks: escaped "*"' '
-	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
+	git clone repo escaped &&
+	TREEOID=$(git -C escaped rev-parse HEAD:folder1) &&
+	NEWTREE=$(git -C escaped mktree <<-EOF
+	$(git -C escaped ls-tree HEAD)
+	040000 tree $TREEOID	zbad\\dir
+	040000 tree $TREEOID	zdoes*exist
+	EOF
+	) &&
+	COMMIT=$(git -C escaped commit-tree $NEWTREE -p HEAD) &&
+	git -C escaped reset --hard $COMMIT &&
+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
+	git -C escaped sparse-checkout init --cone &&
+	cat >escaped/.git/info/sparse-checkout <<-\EOF &&
 	/*
 	!/*/
-	/does\*not\*exist/
+	/zbad\\dir/
+	/zdoes\*not\*exist/
+	/zdoes\*exist/
 	EOF
-	check_read_tree_errors repo "a" ""
+	check_read_tree_errors escaped "a zbad\\dir zdoes*exist"
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                       ` (8 preceding siblings ...)
  2020-01-28 18:26     ` [PATCH v3 09/12] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
@ 2020-01-28 18:26     ` Derrick Stolee via GitGitGadget
  2020-01-29 10:17       ` Jeff King
  2020-01-28 18:26     ` [PATCH v3 11/12] sparse-checkout: use C-style quotes in 'list' subcommand Derrick Stolee via GitGitGadget
                       ` (2 subsequent siblings)
  12 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-28 18:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

If a user somehow creates a directory with an asterisk (*) or backslash
(\), then the "git sparse-checkout set" command will struggle to provide
the correct pattern in the sparse-checkout file. When not in cone mode,
the provided pattern is written directly into the sparse-checkout file.
However, in cone mode we expect a list of paths to directories and then
we convert those into patterns.

Even more specifically, the goal is to always allow the following from
the root of a repo:

  git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin

The ls-tree command provides directory names with an unescaped asterisk.
It also quotes the directories that contain an escaped backslash. We
must remove these quotes, then keep the escaped backslashes.

However, there is some care needed for the timing of these escapes. The
in-memory pattern list is used to update the working directory before
writing the patterns to disk. Thus, we need the command to have the
unescaped names in the hashsets for the cone comparisons, then escape
the patterns later.

Use unquote_c_style() when parsing lines from stdin. Command-line
arguments will be parsed as-is, assuming the user can do the correct
level of escaping from their environment to match the exact directory
names.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/sparse-checkout.c          | 38 +++++++++++++++++++++++++++---
 t/t1091-sparse-checkout-builtin.sh | 21 +++++++++++++++--
 2 files changed, 54 insertions(+), 5 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 3cee8ab46e..61414fef18 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -13,6 +13,7 @@
 #include "resolve-undo.h"
 #include "unpack-trees.h"
 #include "wt-status.h"
+#include "quote.h"
 
 static const char *empty_base = "";
 
@@ -140,6 +141,22 @@ static int update_working_directory(struct pattern_list *pl)
 	return result;
 }
 
+static char *escaped_pattern(char *pattern)
+{
+	char *p = pattern;
+	struct strbuf final = STRBUF_INIT;
+
+	while (*p) {
+		if (*p == '*' || *p == '\\')
+			strbuf_addch(&final, '\\');
+
+		strbuf_addch(&final, *p);
+		p++;
+	}
+
+	return strbuf_detach(&final, NULL);
+}
+
 static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 {
 	int i;
@@ -164,10 +181,11 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 	fprintf(fp, "/*\n!/*/\n");
 
 	for (i = 0; i < sl.nr; i++) {
-		char *pattern = sl.items[i].string;
+		char *pattern = escaped_pattern(sl.items[i].string);
 
 		if (strlen(pattern))
 			fprintf(fp, "%s/\n!%s/*/\n", pattern, pattern);
+		free(pattern);
 	}
 
 	string_list_clear(&sl, 0);
@@ -185,8 +203,9 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 	string_list_remove_duplicates(&sl, 0);
 
 	for (i = 0; i < sl.nr; i++) {
-		char *pattern = sl.items[i].string;
+		char *pattern = escaped_pattern(sl.items[i].string);
 		fprintf(fp, "%s/\n", pattern);
+		free(pattern);
 	}
 }
 
@@ -423,8 +442,21 @@ static int sparse_checkout_set(int argc, const char **argv, const char *prefix)
 		pl.use_cone_patterns = 1;
 
 		if (set_opts.use_stdin) {
-			while (!strbuf_getline(&line, stdin))
+			struct strbuf unquoted = STRBUF_INIT;
+			while (!strbuf_getline(&line, stdin)) {
+				if (line.buf[0] == '"') {
+					strbuf_setlen(&unquoted, 0);
+					if (unquote_c_style(&unquoted, line.buf, NULL))
+						die(_("unable to unquote C-style string '%s'"),
+						line.buf);
+
+					strbuf_swap(&unquoted, &line);
+				}
+
 				strbuf_to_cone_pattern(&line, &pl);
+			}
+
+			strbuf_release(&unquoted);
 		} else {
 			for (i = 0; i < argc; i++) {
 				strbuf_setlen(&line, 0);
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 0a21a5e15d..459715d541 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -309,6 +309,9 @@ check_read_tree_errors () {
 	REPO=$1
 	FILES=$2
 	ERRORS=$3
+	git -C $REPO -c core.sparseCheckoutCone=false read-tree -mu HEAD 2>err &&
+	test_must_be_empty err &&
+	check_files $REPO "$FILES" &&
 	git -C $REPO read-tree -mu HEAD 2>err &&
 	if test -z "$ERRORS"
 	then
@@ -379,14 +382,28 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	git -C escaped reset --hard $COMMIT &&
 	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
 	git -C escaped sparse-checkout init --cone &&
-	cat >escaped/.git/info/sparse-checkout <<-\EOF &&
+	git -C escaped sparse-checkout set zbad\\dir "zdoes*not*exist" "zdoes*exist" &&
+	cat >expect <<-\EOF &&
 	/*
 	!/*/
 	/zbad\\dir/
+	/zdoes\*exist/
 	/zdoes\*not\*exist/
+	EOF
+	test_cmp expect escaped/.git/info/sparse-checkout &&
+	check_read_tree_errors escaped "a zbad\\dir zdoes*exist" &&
+	git -C escaped ls-tree -d --name-only HEAD | git -C escaped sparse-checkout set --stdin &&
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/deep/
+	/folder1/
+	/folder2/
+	/zbad\\dir/
 	/zdoes\*exist/
 	EOF
-	check_read_tree_errors escaped "a zbad\\dir zdoes*exist"
+	test_cmp expect escaped/.git/info/sparse-checkout &&
+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist"
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 11/12] sparse-checkout: use C-style quotes in 'list' subcommand
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                       ` (9 preceding siblings ...)
  2020-01-28 18:26     ` [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
@ 2020-01-28 18:26     ` Derrick Stolee via GitGitGadget
  2020-01-29 10:23       ` Jeff King
  2020-01-28 18:26     ` [PATCH v3 12/12] sparse-checkout: improve docs around 'set' in cone mode Derrick Stolee via GitGitGadget
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
  12 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-28 18:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When in cone mode, the 'git sparse-checkout list' subcommand lists
the directories included in the sparse cone. When these directories
contain odd characters, such as a backslash, then we need to use
C-style quotes similar to 'git ls-tree'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/sparse-checkout.c          | 6 ++++--
 t/t1091-sparse-checkout-builtin.sh | 7 +++++--
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 61414fef18..b3c1e97dba 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -78,8 +78,10 @@ static int sparse_checkout_list(int argc, const char **argv)
 
 		string_list_sort(&sl);
 
-		for (i = 0; i < sl.nr; i++)
-			printf("%s\n", sl.items[i].string);
+		for (i = 0; i < sl.nr; i++) {
+			quote_c_style(sl.items[i].string, NULL, stdout, 0);
+			printf("\n");
+		}
 
 		return 0;
 	}
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 459715d541..7617fb027a 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -392,7 +392,8 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	EOF
 	test_cmp expect escaped/.git/info/sparse-checkout &&
 	check_read_tree_errors escaped "a zbad\\dir zdoes*exist" &&
-	git -C escaped ls-tree -d --name-only HEAD | git -C escaped sparse-checkout set --stdin &&
+	git -C escaped ls-tree -d --name-only HEAD >list-expect &&
+	git -C escaped sparse-checkout set --stdin <list-expect &&
 	cat >expect <<-\EOF &&
 	/*
 	!/*/
@@ -403,7 +404,9 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	/zdoes\*exist/
 	EOF
 	test_cmp expect escaped/.git/info/sparse-checkout &&
-	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist"
+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
+	git -C escaped sparse-checkout list >list-actual &&
+	test_cmp list-expect list-actual
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v3 12/12] sparse-checkout: improve docs around 'set' in cone mode
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                       ` (10 preceding siblings ...)
  2020-01-28 18:26     ` [PATCH v3 11/12] sparse-checkout: use C-style quotes in 'list' subcommand Derrick Stolee via GitGitGadget
@ 2020-01-28 18:26     ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
  12 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-28 18:26 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The existing documentation does not clarify how the 'set' subcommand
changes when core.sparseCheckoutCone is enabled. Correct this by
changing some language around the "A/B/C" example. Also include a
description of the input format matching the output of 'git ls-tree
--name-only'.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-sparse-checkout.txt | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
index 4834fb434d..0914619881 100644
--- a/Documentation/git-sparse-checkout.txt
+++ b/Documentation/git-sparse-checkout.txt
@@ -50,6 +50,14 @@ To avoid interfering with other worktrees, it first enables the
 +
 When the `--stdin` option is provided, the patterns are read from
 standard in as a newline-delimited list instead of from the arguments.
++
+When `core.sparseCheckoutCone` is enabled, the input list is considered a
+list of directories instead of sparse-checkout patterns. The command writes
+patterns to the sparse-checkout file to include all files contained in those
+directories (recursively) as well as files that are siblings of ancestor
+directories. The input format matches the output of `git ls-tree --name-only`.
+This includes interpreting pathnames that begin with a double quote (") as
+C-style quoted strings.
 
 'disable'::
 	Disable the `core.sparseCheckout` config setting, and restore the
@@ -128,9 +136,12 @@ the following patterns:
 ----------------
 
 This says "include everything in root, but nothing two levels below root."
-If we then add the folder `A/B/C` as a recursive pattern, the folders `A` and
-`A/B` are added as parent patterns. The resulting sparse-checkout file is
-now
+
+When in cone mode, the `git sparse-checkout set` subcommand takes a list of
+directories instead of a list of sparse-checkout patterns. In this mode,
+the command `git sparse-checkout set A/B/C` sets the directory `A/B/C` as
+a recursive pattern, the directories `A` and `A/B` are added as parent
+patterns. The resulting sparse-checkout file is now
 
 ----------------
 /*
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 09/12] sparse-checkout: properly match escaped characters
  2020-01-28 18:26     ` [PATCH v3 09/12] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
@ 2020-01-29 10:03       ` Jeff King
  2020-01-29 13:58         ` Derrick Stolee
  0 siblings, 1 reply; 82+ messages in thread
From: Jeff King @ 2020-01-29 10:03 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, me, newren, Derrick Stolee

On Tue, Jan 28, 2020 at 06:26:40PM +0000, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
> 
> In cone mode, the sparse-checkout feature uses hashset containment
> queries to match paths. Make this algorithm respect escaped asterisk
> (*) and backslash (\) characters.

Do we also need to worry about other glob metacharacters? E.g., "?" or
ranges like "[A-Z]"?

> +static char *dup_and_filter_pattern(const char *pattern)
> +{
> +	char *set, *read;
> +	char *result = xstrdup(pattern);
> +
> +	set = result;
> +	read = result;
> +
> +	while (*read) {
> +		/* skip escape characters (once) */
> +		if (*read == '\\')
> +			read++;
> +
> +		*set = *read;
> +
> +		set++;
> +		read++;
> +	}
> +	*set = 0;
> +
> +	if (*(read - 2) == '/' && *(read - 1) == '*')
> +		*(read - 2) = 0;
> +
> +	return result;
> +}

Do we need to check that the pattern is longer than 1 character here? If
it's a single character, it seems like this "*(read - 2)" will
dereference the byte before the string.

-Peff

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode
  2020-01-28 18:26     ` [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
@ 2020-01-29 10:17       ` Jeff King
  2020-01-29 10:33         ` Jeff King
  0 siblings, 1 reply; 82+ messages in thread
From: Jeff King @ 2020-01-29 10:17 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, me, newren, Derrick Stolee

On Tue, Jan 28, 2020 at 06:26:41PM +0000, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
> 
> If a user somehow creates a directory with an asterisk (*) or backslash
> (\), then the "git sparse-checkout set" command will struggle to provide
> the correct pattern in the sparse-checkout file. When not in cone mode,
> the provided pattern is written directly into the sparse-checkout file.
> However, in cone mode we expect a list of paths to directories and then
> we convert those into patterns.

Is this really about cone mode? It seems more like it is about --stdin.
I.e., what are the rules for when the input is a filename and when it is
a pattern? In our earlier discussion, I had assumed that command-line
arguments to "sparse-checkout set" were actual filenames, and "--stdin"
just read them from stdin.

But looking at the documentation, they are always called "patterns" on
the command-line. Should the "--stdin" documentation make it clear that
we are no longer taking patterns, but instead actual filenames?

Or am I confused, and in non-cone-mode the "ls-tree | sparse-checkout"
pipeline is not supposed to work at all? (I.e., they really are always
patterns)?

> Even more specifically, the goal is to always allow the following from
> the root of a repo:
> 
>   git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin
> 
> The ls-tree command provides directory names with an unescaped asterisk.
> It also quotes the directories that contain an escaped backslash. We
> must remove these quotes, then keep the escaped backslashes.
> 
> However, there is some care needed for the timing of these escapes. The
> in-memory pattern list is used to update the working directory before
> writing the patterns to disk. Thus, we need the command to have the
> unescaped names in the hashsets for the cone comparisons, then escape
> the patterns later.

OK, this part make sense.

> Use unquote_c_style() when parsing lines from stdin. Command-line
> arguments will be parsed as-is, assuming the user can do the correct
> level of escaping from their environment to match the exact directory
> names.

I think there's two issues here: escaping characters from the shell so
that they make it intact to Git, and the question of whether Git is
expecting patterns or raw filenames. I agree the user is responsible for
the shell half, but I think we need to clarify what we're expecting.
I.e., if I say:

 git sparse-checkout set 'f*'

am I trying to match "foo", or the literal file "f*"?

> +static char *escaped_pattern(char *pattern)
> +{
> +	char *p = pattern;
> +	struct strbuf final = STRBUF_INIT;
> +
> +	while (*p) {
> +		if (*p == '*' || *p == '\\')
> +			strbuf_addch(&final, '\\');
> +
> +		strbuf_addch(&final, *p);
> +		p++;
> +	}
> +
> +	return strbuf_detach(&final, NULL);
> +}

Do we need to catch other metacharacters here (using is_glob_special()
perhaps)?

> @@ -423,8 +442,21 @@ static int sparse_checkout_set(int argc, const char **argv, const char *prefix)
>  		pl.use_cone_patterns = 1;
>  
>  		if (set_opts.use_stdin) {
> -			while (!strbuf_getline(&line, stdin))
> +			struct strbuf unquoted = STRBUF_INIT;
> +			while (!strbuf_getline(&line, stdin)) {
> +				if (line.buf[0] == '"') {
> +					strbuf_setlen(&unquoted, 0);

A minor nit, but strbuf_reset(&unquoted) would be more idiomatic here.

> +					if (unquote_c_style(&unquoted, line.buf, NULL))
> +						die(_("unable to unquote C-style string '%s'"),
> +						line.buf);
> +
> +					strbuf_swap(&unquoted, &line);
> +				}
> +
>  				strbuf_to_cone_pattern(&line, &pl);
> +			}

OK, overall this input procedure makes sense to me.

-Peff

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 11/12] sparse-checkout: use C-style quotes in 'list' subcommand
  2020-01-28 18:26     ` [PATCH v3 11/12] sparse-checkout: use C-style quotes in 'list' subcommand Derrick Stolee via GitGitGadget
@ 2020-01-29 10:23       ` Jeff King
  0 siblings, 0 replies; 82+ messages in thread
From: Jeff King @ 2020-01-29 10:23 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, me, newren, Derrick Stolee

On Tue, Jan 28, 2020 at 06:26:42PM +0000, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@microsoft.com>
> 
> When in cone mode, the 'git sparse-checkout list' subcommand lists
> the directories included in the sparse cone. When these directories
> contain odd characters, such as a backslash, then we need to use
> C-style quotes similar to 'git ls-tree'.

Makes sense, and the code looks correct to me.

-Peff

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode
  2020-01-29 10:17       ` Jeff King
@ 2020-01-29 10:33         ` Jeff King
  2020-01-29 14:16           ` Derrick Stolee
  0 siblings, 1 reply; 82+ messages in thread
From: Jeff King @ 2020-01-29 10:33 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget; +Cc: git, me, newren, Derrick Stolee

On Wed, Jan 29, 2020 at 05:17:13AM -0500, Jeff King wrote:

> > From: Derrick Stolee <dstolee@microsoft.com>
> > 
> > If a user somehow creates a directory with an asterisk (*) or backslash
> > (\), then the "git sparse-checkout set" command will struggle to provide
> > the correct pattern in the sparse-checkout file. When not in cone mode,
> > the provided pattern is written directly into the sparse-checkout file.
> > However, in cone mode we expect a list of paths to directories and then
> > we convert those into patterns.
> 
> Is this really about cone mode? It seems more like it is about --stdin.
> I.e., what are the rules for when the input is a filename and when it is
> a pattern? In our earlier discussion, I had assumed that command-line
> arguments to "sparse-checkout set" were actual filenames, and "--stdin"
> just read them from stdin.
> 
> But looking at the documentation, they are always called "patterns" on
> the command-line. Should the "--stdin" documentation make it clear that
> we are no longer taking patterns, but instead actual filenames?
> 
> Or am I confused, and in non-cone-mode the "ls-tree | sparse-checkout"
> pipeline is not supposed to work at all? (I.e., they really are always
> patterns)?

Hmph, sorry, I _was_ just confused. I was reading a copy of the manpage
without your final patch, which made things much clearer.

So OK, I think the resulting documentation does make things clear. And
this is just about cone mode, not --stdin. Please ignore my ramblings in
the rest of the replied-to message. But...

> > Even more specifically, the goal is to always allow the following from
> > the root of a repo:
> > 
> >   git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin
> > 
> > The ls-tree command provides directory names with an unescaped asterisk.
> > It also quotes the directories that contain an escaped backslash. We
> > must remove these quotes, then keep the escaped backslashes.
> > 
> > However, there is some care needed for the timing of these escapes. The
> > in-memory pattern list is used to update the working directory before
> > writing the patterns to disk. Thus, we need the command to have the
> > unescaped names in the hashsets for the cone comparisons, then escape
> > the patterns later.
> 
> OK, this part make sense.

You could also demonstrate this even without --stdin with something
like:

  git config core.sparsecheckoutcone true
  git sparse-checkout set 'foo*bar'

which should take that as a literal filename and put the pattern
'foo\*bar' in the sparse-checkout file. And your tests do cover that.

So really there are two separate bugs here, and it might be a little
easier to explain the "timing of these escapes" thing by doing them
separately. I.e., the case above needs escaping and we could demonstrate
the bug with a command-line "set".  And then follow up by fixing the
problem with correctly de-quoting --stdin.

> > +static char *escaped_pattern(char *pattern)
> [...]
> Do we need to catch other metacharacters here (using is_glob_special()
> perhaps)?

After de-confusing myself, I think the individual code comments I wrote
still apply though (especially this one).

-Peff

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 09/12] sparse-checkout: properly match escaped characters
  2020-01-29 10:03       ` Jeff King
@ 2020-01-29 13:58         ` Derrick Stolee
  2020-01-29 14:04           ` Derrick Stolee
  0 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee @ 2020-01-29 13:58 UTC (permalink / raw)
  To: Jeff King, Derrick Stolee via GitGitGadget
  Cc: git, me, newren, Derrick Stolee

On 1/29/2020 5:03 AM, Jeff King wrote:
> On Tue, Jan 28, 2020 at 06:26:40PM +0000, Derrick Stolee via GitGitGadget wrote:
> 
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> In cone mode, the sparse-checkout feature uses hashset containment
>> queries to match paths. Make this algorithm respect escaped asterisk
>> (*) and backslash (\) characters.
> 
> Do we also need to worry about other glob metacharacters? E.g., "?" or
> ranges like "[A-Z]"?

These are not part of the .gitignore patterns [1].

[1] https://git-scm.com/docs/gitignore#_pattern_format

>> +static char *dup_and_filter_pattern(const char *pattern)
>> +{
>> +	char *set, *read;
>> +	char *result = xstrdup(pattern);
>> +
>> +	set = result;
>> +	read = result;
>> +
>> +	while (*read) {
>> +		/* skip escape characters (once) */
>> +		if (*read == '\\')
>> +			read++;
>> +
>> +		*set = *read;
>> +
>> +		set++;
>> +		read++;
>> +	}
>> +	*set = 0;
>> +
>> +	if (*(read - 2) == '/' && *(read - 1) == '*')
>> +		*(read - 2) = 0;
>> +
>> +	return result;
>> +}
> 
> Do we need to check that the pattern is longer than 1 character here? If
> it's a single character, it seems like this "*(read - 2)" will
> dereference the byte before the string.

This method is only called by add_pattern_to_hashsets(), which
has a guard against paths of length less than 2, but thats' no
excuse for dangerous pointer arithmetic here.

But you also point out an even more confusing thing: why are we
modifying based on the 'read' pointer, and not the 'set' pointer?
This seems to work _accidentally_ only when the pattern has "<something>/*"
and "<something>" has no escape characters.

I had to recall exactly why we are dropping this "/*", but it's because
the pattern _actually_ ends with "/*/" but the in-memory pattern has
already dropped that last slash and applied PATTERN_FLAG_MUSTBEDIR.

Here is a diff that I can apply to this patch to fix this problem
_and_ demonstrate it in the tests:

diff --git a/dir.c b/dir.c
index 579f274d13..277577c8bf 100644
--- a/dir.c
+++ b/dir.c
@@ -633,6 +633,7 @@ int pl_hashmap_cmp(const void *unused_cmp_data,
 static char *dup_and_filter_pattern(const char *pattern)
 {
        char *set, *read;
+       size_t count  = 0;
        char *result = xstrdup(pattern);
 
        set = result;
@@ -647,11 +648,14 @@ static char *dup_and_filter_pattern(const char *pattern)
 
                set++;
                read++;
+               count++;
        }
        *set = 0;
 
-       if (*(read - 2) == '/' && *(read - 1) == '*')
-               *(read - 2) = 0;
+       if (count > 2 &&
+           *(set - 1) == '*' &&
+           *(set - 2) == '/')
+               *(set - 2) = 0;
 
        return result;
 }
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 0a21a5e15d..20b0465f77 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -383,6 +383,7 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
        /*
        !/*/
        /zbad\\dir/
+       !/zbad\\dir/*/
        /zdoes\*not\*exist/
        /zdoes\*exist/
        EOF

With this extra line in the test, but compiling the old version of this patch,
the test fails with:

'err' is not empty, it contains:
+ cat err
warning: unrecognized negative pattern: '/zbad\\dir/*'
warning: disabling cone pattern matching

To ensure this negative pattern exists in the later patch where we set
the patterns using the builtin, I'll add "zbad\\dir/bogus" to the list
of directories to include, which will add another pattern to the set.

Thanks,
-Stolee


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 09/12] sparse-checkout: properly match escaped characters
  2020-01-29 13:58         ` Derrick Stolee
@ 2020-01-29 14:04           ` Derrick Stolee
  0 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee @ 2020-01-29 14:04 UTC (permalink / raw)
  To: Jeff King, Derrick Stolee via GitGitGadget
  Cc: git, me, newren, Derrick Stolee

On 1/29/2020 8:58 AM, Derrick Stolee wrote:
> On 1/29/2020 5:03 AM, Jeff King wrote:
>> On Tue, Jan 28, 2020 at 06:26:40PM +0000, Derrick Stolee via GitGitGadget wrote:
>>
>>> From: Derrick Stolee <dstolee@microsoft.com>
>>>
>>> In cone mode, the sparse-checkout feature uses hashset containment
>>> queries to match paths. Make this algorithm respect escaped asterisk
>>> (*) and backslash (\) characters.
>>
>> Do we also need to worry about other glob metacharacters? E.g., "?" or
>> ranges like "[A-Z]"?
> 
> These are not part of the .gitignore patterns [1].
> 
> [1] https://git-scm.com/docs/gitignore#_pattern_format

I should read things more carefully. There is also this information in
one of the bullets:

	An asterisk "*" matches anything except a slash. The character
	"?" matches any one character except "/". The range notation,
	e.g. [a-zA-Z], can be used to match one of the characters in a range.
	See fnmatch(3) and the FNM_PATHNAME flag for a more detailed
	description.

So this series does not attempt to properly work with globs, and I'll
need to test those a bit. Certainly they shouldn't work in cone mode,
so an extra patch to remove those would be simple. Input sanitizing
would be interesting, and I'll see what `git ls-tree` would output
with paths containing these characters.

-Stolee

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode
  2020-01-29 10:33         ` Jeff King
@ 2020-01-29 14:16           ` Derrick Stolee
  2020-01-29 14:39             ` Derrick Stolee
  2020-01-30  7:29             ` Jeff King
  0 siblings, 2 replies; 82+ messages in thread
From: Derrick Stolee @ 2020-01-29 14:16 UTC (permalink / raw)
  To: Jeff King, Derrick Stolee via GitGitGadget
  Cc: git, me, newren, Derrick Stolee

On 1/29/2020 5:33 AM, Jeff King wrote:
> On Wed, Jan 29, 2020 at 05:17:13AM -0500, Jeff King wrote:
> 
>>> From: Derrick Stolee <dstolee@microsoft.com>
>>>
>>> If a user somehow creates a directory with an asterisk (*) or backslash
>>> (\), then the "git sparse-checkout set" command will struggle to provide
>>> the correct pattern in the sparse-checkout file. When not in cone mode,
>>> the provided pattern is written directly into the sparse-checkout file.
>>> However, in cone mode we expect a list of paths to directories and then
>>> we convert those into patterns.
>>
>> Is this really about cone mode? It seems more like it is about --stdin.
>> I.e., what are the rules for when the input is a filename and when it is
>> a pattern? In our earlier discussion, I had assumed that command-line
>> arguments to "sparse-checkout set" were actual filenames, and "--stdin"
>> just read them from stdin.
>>
>> But looking at the documentation, they are always called "patterns" on
>> the command-line. Should the "--stdin" documentation make it clear that
>> we are no longer taking patterns, but instead actual filenames?
>>
>> Or am I confused, and in non-cone-mode the "ls-tree | sparse-checkout"
>> pipeline is not supposed to work at all? (I.e., they really are always
>> patterns)?
> 
> Hmph, sorry, I _was_ just confused. I was reading a copy of the manpage
> without your final patch, which made things much clearer.
> 
> So OK, I think the resulting documentation does make things clear. And
> this is just about cone mode, not --stdin. Please ignore my ramblings in
> the rest of the replied-to message. But...
> 
>>> Even more specifically, the goal is to always allow the following from
>>> the root of a repo:
>>>
>>>   git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin
>>>
>>> The ls-tree command provides directory names with an unescaped asterisk.
>>> It also quotes the directories that contain an escaped backslash. We
>>> must remove these quotes, then keep the escaped backslashes.
>>>
>>> However, there is some care needed for the timing of these escapes. The
>>> in-memory pattern list is used to update the working directory before
>>> writing the patterns to disk. Thus, we need the command to have the
>>> unescaped names in the hashsets for the cone comparisons, then escape
>>> the patterns later.
>>
>> OK, this part make sense.
> 
> You could also demonstrate this even without --stdin with something
> like:
> 
>   git config core.sparsecheckoutcone true
>   git sparse-checkout set 'foo*bar'
> 
> which should take that as a literal filename and put the pattern
> 'foo\*bar' in the sparse-checkout file. And your tests do cover that.
> 
> So really there are two separate bugs here, and it might be a little
> easier to explain the "timing of these escapes" thing by doing them
> separately. I.e., the case above needs escaping and we could demonstrate
> the bug with a command-line "set".  And then follow up by fixing the
> problem with correctly de-quoting --stdin.

I've locally split the commit into two parts. That makes things much
simpler to read.

>>> +static char *escaped_pattern(char *pattern)
>> [...]
>> Do we need to catch other metacharacters here (using is_glob_special()
>> perhaps)?
> 
> After de-confusing myself, I think the individual code comments I wrote
> still apply though (especially this one).

I've applied the smaller comments and am now investigating the right
thing to do with other is_glob_special() characters. There is a small
chance that I can replace any "c == '*' || c == '\'" with is_glob_special(),
but we shall see. At the very least, I'll need to expand my tests.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode
  2020-01-29 14:16           ` Derrick Stolee
@ 2020-01-29 14:39             ` Derrick Stolee
  2020-01-30  7:29             ` Jeff King
  1 sibling, 0 replies; 82+ messages in thread
From: Derrick Stolee @ 2020-01-29 14:39 UTC (permalink / raw)
  To: Jeff King, Derrick Stolee via GitGitGadget
  Cc: git, me, newren, Derrick Stolee

On 1/29/2020 9:16 AM, Derrick Stolee wrote:
> On 1/29/2020 5:33 AM, Jeff King wrote:
>> On Wed, Jan 29, 2020 at 05:17:13AM -0500, Jeff King wrote:
>>>> +static char *escaped_pattern(char *pattern)
>>> [...]
>>> Do we need to catch other metacharacters here (using is_glob_special()
>>> perhaps)?
>>
>> After de-confusing myself, I think the individual code comments I wrote
>> still apply though (especially this one).
> 
> I've applied the smaller comments and am now investigating the right
> thing to do with other is_glob_special() characters. There is a small
> chance that I can replace any "c == '*' || c == '\'" with is_glob_special(),
> but we shall see. At the very least, I'll need to expand my tests.

I think I have a handle on these cases, and I've pushed it to my GGG PR.
I'll let this version settle a bit for more review before updating it
with a v4.

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode
  2020-01-29 14:16           ` Derrick Stolee
  2020-01-29 14:39             ` Derrick Stolee
@ 2020-01-30  7:29             ` Jeff King
  2020-01-30 15:01               ` Derrick Stolee
  1 sibling, 1 reply; 82+ messages in thread
From: Jeff King @ 2020-01-30  7:29 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Derrick Stolee via GitGitGadget, git, me, newren, Derrick Stolee

On Wed, Jan 29, 2020 at 09:16:11AM -0500, Derrick Stolee wrote:

> I've applied the smaller comments and am now investigating the right
> thing to do with other is_glob_special() characters. There is a small
> chance that I can replace any "c == '*' || c == '\'" with is_glob_special(),
> but we shall see. At the very least, I'll need to expand my tests.

Yeah, that's all I'd expect to need. You mentioned earlier about how
ls-tree would output them, but I don't think it would matter. Now that
you're using unquote_c_style(), you'll get the literal filenames no
matter which way ls-tree decides to quote them (and I don't think it
would quote '?', just as it wouldn't '*', because those are not
syntactically significant in its output).

-Peff

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode
  2020-01-30  7:29             ` Jeff King
@ 2020-01-30 15:01               ` Derrick Stolee
  0 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee @ 2020-01-30 15:01 UTC (permalink / raw)
  To: Jeff King
  Cc: Derrick Stolee via GitGitGadget, git, me, newren, Derrick Stolee

On 1/30/2020 2:29 AM, Jeff King wrote:
> On Wed, Jan 29, 2020 at 09:16:11AM -0500, Derrick Stolee wrote:
> 
>> I've applied the smaller comments and am now investigating the right
>> thing to do with other is_glob_special() characters. There is a small
>> chance that I can replace any "c == '*' || c == '\'" with is_glob_special(),
>> but we shall see. At the very least, I'll need to expand my tests.
> 
> Yeah, that's all I'd expect to need. You mentioned earlier about how
> ls-tree would output them, but I don't think it would matter. Now that
> you're using unquote_c_style(), you'll get the literal filenames no
> matter which way ls-tree decides to quote them (and I don't think it
> would quote '?', just as it wouldn't '*', because those are not
> syntactically significant in its output).

Yes, even this case for 'git ls-tree' gets covered in the final
version of the test:

test_expect_success BSLASHPSPEC 'pattern-checks: escaped characters' '
	git clone repo escaped &&
	TREEOID=$(git -C escaped rev-parse HEAD:folder1) &&
	NEWTREE=$(git -C escaped mktree <<-EOF
	$(git -C escaped ls-tree HEAD)
	040000 tree $TREEOID	zbad\\dir
	040000 tree $TREEOID	zdoes*exist
	040000 tree $TREEOID	zglob[!a]?
	EOF
	) &&
	COMMIT=$(git -C escaped commit-tree $NEWTREE -p HEAD) &&
	git -C escaped reset --hard $COMMIT &&
	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" zglob[!a]? &&
	git -C escaped sparse-checkout init --cone &&
	git -C escaped sparse-checkout set zbad\\dir/bogus "zdoes*not*exist" "zdoes*exist" "zglob[!a]?" &&
	cat >expect <<-\EOF &&
	/*
	!/*/
	/zbad\\dir/
	!/zbad\\dir/*/
	/zbad\\dir/bogus/
	/zdoes\*exist/
	/zdoes\*not\*exist/
	/zglob\[!a]\?/
	EOF
	test_cmp expect escaped/.git/info/sparse-checkout &&
	check_read_tree_errors escaped "a zbad\\dir zdoes*exist zglob[!a]?" &&
	git -C escaped ls-tree -d --name-only HEAD >list-expect &&
	git -C escaped sparse-checkout set --stdin <list-expect &&
	cat >expect <<-\EOF &&
	/*
	!/*/
	/deep/
	/folder1/
	/folder2/
	/zbad\\dir/
	/zdoes\*exist/
	/zglob\[!a]\?/
	EOF
	test_cmp expect escaped/.git/info/sparse-checkout &&
	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" zglob[!a]? &&
	git -C escaped sparse-checkout list >list-actual &&
	test_cmp list-expect list-actual
'

Thanks,
-Stolee



^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH v4 00/15] Harden the sparse-checkout builtin
  2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                       ` (11 preceding siblings ...)
  2020-01-28 18:26     ` [PATCH v3 12/12] sparse-checkout: improve docs around 'set' in cone mode Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16     ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 01/15] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
                         ` (15 more replies)
  12 siblings, 16 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee

This series is based on ds/sparse-list-in-cone-mode.

This series attempts to clean up some rough edges in the sparse-checkout
feature, especially around the cone mode.

Unfortunately, after the v2.25.0 release, we noticed an issue with the "git
clone --sparse" option when using a URL instead of a local path. This is
fixed and properly tested here.

Also, let's improve Git's response to these more complicated scenarios:

 1. Running "git sparse-checkout init" in a worktree would complain because
    the "info" dir doesn't exist.
 2. Tracked paths that include "*" and "\" in their filenames.
 3. If a user edits the sparse-checkout file to have non-cone pattern, such
    as "**" anywhere or "*" in the wrong place, then we should respond
    appropriately. That is: warn that the patterns are not cone-mode, then
    revert to the old logic.

Updates in V2:

 * Added C-style quoting to the output of "git sparse-checkout list" in cone
   mode.
 * Improved documentation.
 * Responded to most style feedback. Hopefully I didn't miss anything.
 * I was lingering on this a little to see if I could also fix the issue
   raised in [1], but I have not figured that one out, yet.

Update in V3:

 * Input now uses Peff's recommended pattern: unquote C-style strings over
   stdin and otherwise do not un-escape input.

[1] 
https://lore.kernel.org/git/062301d5d0bc$c3e17760$4ba46620$@Frontier.com/

Thanks, -Stolee

Derrick Stolee (14):
  t1091: use check_files to reduce boilerplate
  t1091: improve here-docs
  sparse-checkout: create leading directories
  clone: fix --sparse option with URLs
  sparse-checkout: cone mode does not recognize "**"
  sparse-checkout: detect short patterns
  sparse-checkout: warn on globs in cone patterns
  sparse-checkout: properly match escaped characters
  sparse-checkout: write escaped patterns in cone mode
  sparse-checkout: unquote C-style strings over --stdin
  sparse-checkout: use C-style quotes in 'list' subcommand
  sparse-checkout: escape all glob characters on write
  sparse-checkout: improve docs around 'set' in cone mode
  sparse-checkout: fix cone mode behavior mismatch

Jeff King (1):
  sparse-checkout: fix documentation typo for core.sparseCheckoutCone

 Documentation/git-sparse-checkout.txt |  19 +-
 builtin/clone.c                       |   2 +-
 builtin/sparse-checkout.c             |  48 +++-
 dir.c                                 |  79 +++++-
 t/t1091-sparse-checkout-builtin.sh    | 352 +++++++++++++++-----------
 unpack-trees.c                        |   2 +-
 6 files changed, 346 insertions(+), 156 deletions(-)


base-commit: 4fd683b6a35eabd23dd5183da7f654a1e1f00325
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-513%2Fderrickstolee%2Fsparse-harden-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-513/derrickstolee/sparse-harden-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/513

Range-diff vs v3:

  1:  1cc825412f =  1:  1cc825412f t1091: use check_files to reduce boilerplate
  2:  b7a6ad145a =  2:  b7a6ad145a t1091: improve here-docs
  3:  5497ad8778 =  3:  5497ad8778 sparse-checkout: create leading directories
  4:  4991a51f6d =  4:  4991a51f6d clone: fix --sparse option with URLs
  5:  ae78c3069b =  5:  ae78c3069b sparse-checkout: fix documentation typo for core.sparseCheckoutCone
  6:  2ad4d3e467 =  6:  2ad4d3e467 sparse-checkout: cone mode does not recognize "**"
  7:  aace064510 =  7:  aace064510 sparse-checkout: detect short patterns
  8:  d2a510a3bb !  8:  66caabef5f sparse-checkout: warn on incorrect '*' in patterns
     @@ -1,6 +1,6 @@
      Author: Derrick Stolee <dstolee@microsoft.com>
      
     -    sparse-checkout: warn on incorrect '*' in patterns
     +    sparse-checkout: warn on globs in cone patterns
      
          In cone mode, the sparse-checkout commmand will write patterns that
          allow faster pattern matching. This matching only works if the patterns
     @@ -9,10 +9,10 @@
          cone mode matching to fail.
      
          The cone mode patterns may end in "/*" but otherwise an un-escaped
     -    asterisk is invalid. Add checks to disable cone mode when seeing these
     -    values.
     +    asterisk or other glob character is invalid. Add checks to disable
     +    cone mode when seeing these values.
      
     -    A later change will properly handle escaped asterisks.
     +    A later change will properly handle escaped globs.
      
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
     @@ -43,16 +43,23 @@
      +	next = given->pattern + 2;
      +
      +	while (*cur) {
     -+		/* We care about *cur == '*' */
     -+		if (*cur != '*')
     ++		/* Watch for glob characters '*', '\', '[', '?' */
     ++		if (!is_glob_special(*cur))
      +			goto increment;
      +
      +		/* But only if *prev != '\\' */
      +		if (*prev == '\\')
      +			goto increment;
      +
     ++		/* But allow the initial '\' */
     ++		if (*cur == '\\' &&
     ++		    is_glob_special(*next))
     ++			goto increment;
     ++
      +		/* But a trailing '/' then '*' is fine */
     -+		if (*prev == '/' && *next == 0)
     ++		if (*prev == '/' &&
     ++		    *cur == '*' &&
     ++		    *next == 0)
      +			goto increment;
      +
      +		/* Not a cone pattern. */
     @@ -94,6 +101,18 @@
      +	check_read_tree_errors repo "a deep" "disabling cone pattern matching"
      +'
      +
     ++test_expect_success 'pattern-checks: contained glob characters' '
     ++	for c in "[a]" "\\" "?" "*"
     ++	do
     ++		cat >repo/.git/info/sparse-checkout <<-EOF &&
     ++		/*
     ++		!/*/
     ++		something$c-else/
     ++		EOF
     ++		check_read_tree_errors repo "a" "disabling cone pattern matching"
     ++	done
     ++'
     ++
      +test_expect_success 'pattern-checks: escaped "*"' '
      +	cat >repo/.git/info/sparse-checkout <<-\EOF &&
      +	/*
  9:  9ea69e9069 !  9:  4c86d01f0e sparse-checkout: properly match escaped characters
     @@ -23,6 +23,7 @@
      +static char *dup_and_filter_pattern(const char *pattern)
      +{
      +	char *set, *read;
     ++	size_t count  = 0;
      +	char *result = xstrdup(pattern);
      +
      +	set = result;
     @@ -37,11 +38,14 @@
      +
      +		set++;
      +		read++;
     ++		count++;
      +	}
      +	*set = 0;
      +
     -+	if (*(read - 2) == '/' && *(read - 1) == '*')
     -+		*(read - 2) = 0;
     ++	if (count > 2 &&
     ++	    *(set - 1) == '*' &&
     ++	    *(set - 2) == '/')
     ++		*(set - 2) = 0;
      +
      +	return result;
      +}
     @@ -73,7 +77,7 @@
       --- a/t/t1091-sparse-checkout-builtin.sh
       +++ b/t/t1091-sparse-checkout-builtin.sh
      @@
     - 	check_read_tree_errors repo "a deep" "disabling cone pattern matching"
     + 	done
       '
       
      -test_expect_success 'pattern-checks: escaped "*"' '
     @@ -96,6 +100,7 @@
       	!/*/
      -	/does\*not\*exist/
      +	/zbad\\dir/
     ++	!/zbad\\dir/*/
      +	/zdoes\*not\*exist/
      +	/zdoes\*exist/
       	EOF
 10:  e2f9afc70c ! 10:  0b9346f67b sparse-checkout: write escaped patterns in cone mode
     @@ -9,26 +9,12 @@
          However, in cone mode we expect a list of paths to directories and then
          we convert those into patterns.
      
     -    Even more specifically, the goal is to always allow the following from
     -    the root of a repo:
     -
     -      git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin
     -
     -    The ls-tree command provides directory names with an unescaped asterisk.
     -    It also quotes the directories that contain an escaped backslash. We
     -    must remove these quotes, then keep the escaped backslashes.
     -
          However, there is some care needed for the timing of these escapes. The
          in-memory pattern list is used to update the working directory before
          writing the patterns to disk. Thus, we need the command to have the
          unescaped names in the hashsets for the cone comparisons, then escape
          the patterns later.
      
     -    Use unquote_c_style() when parsing lines from stdin. Command-line
     -    arguments will be parsed as-is, assuming the user can do the correct
     -    level of escaping from their environment to match the exact directory
     -    names.
     -
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
     @@ -89,29 +75,6 @@
       	}
       }
       
     -@@
     - 		pl.use_cone_patterns = 1;
     - 
     - 		if (set_opts.use_stdin) {
     --			while (!strbuf_getline(&line, stdin))
     -+			struct strbuf unquoted = STRBUF_INIT;
     -+			while (!strbuf_getline(&line, stdin)) {
     -+				if (line.buf[0] == '"') {
     -+					strbuf_setlen(&unquoted, 0);
     -+					if (unquote_c_style(&unquoted, line.buf, NULL))
     -+						die(_("unable to unquote C-style string '%s'"),
     -+						line.buf);
     -+
     -+					strbuf_swap(&unquoted, &line);
     -+				}
     -+
     - 				strbuf_to_cone_pattern(&line, &pl);
     -+			}
     -+
     -+			strbuf_release(&unquoted);
     - 		} else {
     - 			for (i = 0; i < argc; i++) {
     - 				strbuf_setlen(&line, 0);
      
       diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
       --- a/t/t1091-sparse-checkout-builtin.sh
     @@ -131,29 +94,18 @@
       	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
       	git -C escaped sparse-checkout init --cone &&
      -	cat >escaped/.git/info/sparse-checkout <<-\EOF &&
     -+	git -C escaped sparse-checkout set zbad\\dir "zdoes*not*exist" "zdoes*exist" &&
     ++	git -C escaped sparse-checkout set zbad\\dir/bogus "zdoes*not*exist" "zdoes*exist" &&
      +	cat >expect <<-\EOF &&
       	/*
       	!/*/
       	/zbad\\dir/
     -+	/zdoes\*exist/
     - 	/zdoes\*not\*exist/
     -+	EOF
     -+	test_cmp expect escaped/.git/info/sparse-checkout &&
     -+	check_read_tree_errors escaped "a zbad\\dir zdoes*exist" &&
     -+	git -C escaped ls-tree -d --name-only HEAD | git -C escaped sparse-checkout set --stdin &&
     -+	cat >expect <<-\EOF &&
     -+	/*
     -+	!/*/
     -+	/deep/
     -+	/folder1/
     -+	/folder2/
     -+	/zbad\\dir/
     + 	!/zbad\\dir/*/
     +-	/zdoes\*not\*exist/
     ++	/zbad\\dir/bogus/
       	/zdoes\*exist/
     ++	/zdoes\*not\*exist/
       	EOF
     --	check_read_tree_errors escaped "a zbad\\dir zdoes*exist"
      +	test_cmp expect escaped/.git/info/sparse-checkout &&
     -+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist"
     + 	check_read_tree_errors escaped "a zbad\\dir zdoes*exist"
       '
       
     - test_done
  -:  ---------- > 11:  9f682e6076 sparse-checkout: unquote C-style strings over --stdin
 11:  ec714a4cf0 = 12:  e2c6f85617 sparse-checkout: use C-style quotes in 'list' subcommand
  -:  ---------- > 13:  54be8e89eb sparse-checkout: escape all glob characters on write
 12:  1867746d97 = 14:  3dd8f97b3a sparse-checkout: improve docs around 'set' in cone mode
  -:  ---------- > 15:  5e9fcce75f sparse-checkout: fix cone mode behavior mismatch

-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 82+ messages in thread

* [PATCH v4 01/15] t1091: use check_files to reduce boilerplate
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 02/15] t1091: improve here-docs Derrick Stolee via GitGitGadget
                         ` (14 subsequent siblings)
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When testing the sparse-checkout feature, we need to compare the
contents of the working-directory against some expected output.
Using here-docs was useful in the beginning, but became repetetive
as the test script grew.

Create a check_files helper to make the tests simpler and easier
to extend. It also reduces instances of bad here-doc whitespace.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1091-sparse-checkout-builtin.sh | 117 ++++++-----------------------
 1 file changed, 22 insertions(+), 95 deletions(-)

diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index ff7f8f7a1f..e058a20ad6 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -12,6 +12,13 @@ list_files() {
 	(cd "$1" && printf '%s\n' *)
 }
 
+check_files() {
+	list_files "$1" >actual &&
+	shift &&
+	printf "%s\n" $@ >expect &&
+	test_cmp expect actual
+}
+
 test_expect_success 'setup' '
 	git init repo &&
 	(
@@ -58,9 +65,7 @@ test_expect_success 'git sparse-checkout init' '
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
 	test_cmp_config -C repo true core.sparsecheckout &&
-	list_files repo >dir  &&
-	echo a >expect &&
-	test_cmp expect dir
+	check_files repo a
 '
 
 test_expect_success 'git sparse-checkout list after init' '
@@ -81,13 +86,7 @@ test_expect_success 'init with existing sparse-checkout' '
 		*folder*
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo a folder1 folder2
 '
 
 test_expect_success 'clone --sparse' '
@@ -98,9 +97,7 @@ test_expect_success 'clone --sparse' '
 		!/*/
 	EOF
 	test_cmp expect actual &&
-	list_files clone >dir &&
-	echo a >expect &&
-	test_cmp expect dir
+	check_files clone a
 '
 
 test_expect_success 'set enables config' '
@@ -127,13 +124,7 @@ test_expect_success 'set sparse-checkout using builtin' '
 	git -C repo sparse-checkout list >actual &&
 	test_cmp expect actual &&
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo a folder1 folder2
 '
 
 test_expect_success 'set sparse-checkout using --stdin' '
@@ -147,13 +138,7 @@ test_expect_success 'set sparse-checkout using --stdin' '
 	git -C repo sparse-checkout list >actual &&
 	test_cmp expect actual &&
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo "a folder1 folder2"
 '
 
 test_expect_success 'cone mode: match patterns' '
@@ -162,13 +147,7 @@ test_expect_success 'cone mode: match patterns' '
 	git -C repo read-tree -mu HEAD 2>err &&
 	test_i18ngrep ! "disabling cone patterns" err &&
 	git -C repo reset --hard &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo a folder1 folder2
 '
 
 test_expect_success 'cone mode: warn on bad pattern' '
@@ -185,14 +164,7 @@ test_expect_success 'sparse-checkout disable' '
 	test_path_is_file repo/.git/info/sparse-checkout &&
 	git -C repo config --list >config &&
 	test_must_fail git config core.sparseCheckout &&
-	list_files repo >dir &&
-	cat >expect <<-EOF &&
-		a
-		deep
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files repo a deep folder1 folder2
 '
 
 test_expect_success 'cone mode: init and set' '
@@ -204,24 +176,9 @@ test_expect_success 'cone mode: init and set' '
 	test_cmp expect dir &&
 	git -C repo sparse-checkout set deep/deeper1/deepest/ 2>err &&
 	test_must_be_empty err &&
-	list_files repo >dir  &&
-	cat >expect <<-EOF &&
-		a
-		deep
-	EOF
-	test_cmp expect dir &&
-	list_files repo/deep >dir  &&
-	cat >expect <<-EOF &&
-		a
-		deeper1
-	EOF
-	test_cmp expect dir &&
-	list_files repo/deep/deeper1 >dir  &&
-	cat >expect <<-EOF &&
-		a
-		deepest
-	EOF
-	test_cmp expect dir &&
+	check_files repo a deep &&
+	check_files repo/deep a deeper1 &&
+	check_files repo/deep/deeper1 a deepest &&
 	cat >expect <<-EOF &&
 		/*
 		!/*/
@@ -237,13 +194,7 @@ test_expect_success 'cone mode: init and set' '
 		folder2
 	EOF
 	test_must_be_empty err &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-		folder2
-	EOF
-	list_files repo >dir &&
-	test_cmp expect dir
+	check_files repo a folder1 folder2
 '
 
 test_expect_success 'cone mode: list' '
@@ -275,13 +226,7 @@ test_expect_success 'revert to old sparse-checkout on bad update' '
 	test_must_fail git -C repo sparse-checkout set deep/deeper1 2>err &&
 	test_i18ngrep "cannot set sparse-checkout patterns" err &&
 	test_cmp repo/.git/info/sparse-checkout expect &&
-	list_files repo/deep >dir &&
-	cat >expect <<-EOF &&
-		a
-		deeper1
-		deeper2
-	EOF
-	test_cmp dir expect
+	check_files repo/deep a deeper1 deeper2
 '
 
 test_expect_success 'revert to old sparse-checkout on empty update' '
@@ -332,12 +277,7 @@ test_expect_success 'cone mode: set with core.ignoreCase=true' '
 		/folder1/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	list_files repo >dir &&
-	cat >expect <<-EOF &&
-		a
-		folder1
-	EOF
-	test_cmp expect dir
+	check_files repo a folder1
 '
 
 test_expect_success 'interaction with submodules' '
@@ -351,21 +291,8 @@ test_expect_success 'interaction with submodules' '
 		git sparse-checkout init --cone &&
 		git sparse-checkout set folder1
 	) &&
-	list_files super >dir &&
-	cat >expect <<-\EOF &&
-		a
-		folder1
-		modules
-	EOF
-	test_cmp expect dir &&
-	list_files super/modules/child >dir &&
-	cat >expect <<-\EOF &&
-		a
-		deep
-		folder1
-		folder2
-	EOF
-	test_cmp expect dir
+	check_files super a folder1 modules &&
+	check_files super/modules/child a deep folder1 folder2
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 02/15] t1091: improve here-docs
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 01/15] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 03/15] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
                         ` (13 subsequent siblings)
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

t1091-sparse-checkout-builtin.sh uses here-docs to populate the
expected contents of the sparse-checkout file. These do not use
shell interpolation, so use "-\EOF" instead of "-EOF". Also use
proper tabbing.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1091-sparse-checkout-builtin.sh | 98 +++++++++++++++---------------
 1 file changed, 49 insertions(+), 49 deletions(-)

diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index e058a20ad6..e28e1c797f 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -46,11 +46,11 @@ test_expect_success 'git sparse-checkout list (empty)' '
 
 test_expect_success 'git sparse-checkout list (populated)' '
 	test_when_finished rm -f repo/.git/info/sparse-checkout &&
-	cat >repo/.git/info/sparse-checkout <<-EOF &&
-		/folder1/*
-		/deep/
-		**/a
-		!*bin*
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/folder1/*
+	/deep/
+	**/a
+	!*bin*
 	EOF
 	cp repo/.git/info/sparse-checkout expect &&
 	git -C repo sparse-checkout list >list &&
@@ -59,9 +59,9 @@ test_expect_success 'git sparse-checkout list (populated)' '
 
 test_expect_success 'git sparse-checkout init' '
 	git -C repo sparse-checkout init &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
 	test_cmp_config -C repo true core.sparsecheckout &&
@@ -70,9 +70,9 @@ test_expect_success 'git sparse-checkout init' '
 
 test_expect_success 'git sparse-checkout list after init' '
 	git -C repo sparse-checkout list >actual &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
 	EOF
 	test_cmp expect actual
 '
@@ -80,10 +80,10 @@ test_expect_success 'git sparse-checkout list after init' '
 test_expect_success 'init with existing sparse-checkout' '
 	echo "*folder*" >> repo/.git/info/sparse-checkout &&
 	git -C repo sparse-checkout init &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		*folder*
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	*folder*
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
 	check_files repo a folder1 folder2
@@ -92,9 +92,9 @@ test_expect_success 'init with existing sparse-checkout' '
 test_expect_success 'clone --sparse' '
 	git clone --sparse repo clone &&
 	git -C clone sparse-checkout list >actual &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
 	EOF
 	test_cmp expect actual &&
 	check_files clone a
@@ -116,10 +116,10 @@ test_expect_success 'set enables config' '
 
 test_expect_success 'set sparse-checkout using builtin' '
 	git -C repo sparse-checkout set "/*" "!/*/" "*folder*" &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		*folder*
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	*folder*
 	EOF
 	git -C repo sparse-checkout list >actual &&
 	test_cmp expect actual &&
@@ -128,11 +128,11 @@ test_expect_success 'set sparse-checkout using builtin' '
 '
 
 test_expect_success 'set sparse-checkout using --stdin' '
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/folder1/
-		/folder2/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/folder1/
+	/folder2/
 	EOF
 	git -C repo sparse-checkout set --stdin <expect &&
 	git -C repo sparse-checkout list >actual &&
@@ -179,28 +179,28 @@ test_expect_success 'cone mode: init and set' '
 	check_files repo a deep &&
 	check_files repo/deep a deeper1 &&
 	check_files repo/deep/deeper1 a deepest &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/deep/
-		!/deep/*/
-		/deep/deeper1/
-		!/deep/deeper1/*/
-		/deep/deeper1/deepest/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/deep/
+	!/deep/*/
+	/deep/deeper1/
+	!/deep/deeper1/*/
+	/deep/deeper1/deepest/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
-	git -C repo sparse-checkout set --stdin 2>err <<-EOF &&
-		folder1
-		folder2
+	git -C repo sparse-checkout set --stdin 2>err <<-\EOF &&
+	folder1
+	folder2
 	EOF
 	test_must_be_empty err &&
 	check_files repo a folder1 folder2
 '
 
 test_expect_success 'cone mode: list' '
-	cat >expect <<-EOF &&
-		folder1
-		folder2
+	cat >expect <<-\EOF &&
+	folder1
+	folder2
 	EOF
 	git -C repo sparse-checkout set --stdin <expect &&
 	git -C repo sparse-checkout list >actual 2>err &&
@@ -211,10 +211,10 @@ test_expect_success 'cone mode: list' '
 test_expect_success 'cone mode: set with nested folders' '
 	git -C repo sparse-checkout set deep deep/deeper1/deepest 2>err &&
 	test_line_count = 0 err &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/deep/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/deep/
 	EOF
 	test_cmp repo/.git/info/sparse-checkout expect
 '
@@ -271,10 +271,10 @@ test_expect_success 'sparse-checkout (init|set|disable) fails with dirty status'
 test_expect_success 'cone mode: set with core.ignoreCase=true' '
 	git -C repo sparse-checkout init --cone &&
 	git -C repo -c core.ignoreCase=true sparse-checkout set folder1 &&
-	cat >expect <<-EOF &&
-		/*
-		!/*/
-		/folder1/
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/folder1/
 	EOF
 	test_cmp expect repo/.git/info/sparse-checkout &&
 	check_files repo a folder1
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 03/15] sparse-checkout: create leading directories
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 01/15] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 02/15] t1091: improve here-docs Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 04/15] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
                         ` (12 subsequent siblings)
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The 'git init' command creates the ".git/info" directory and fills it
with some default files. However, 'git worktree add' does not create
the info directory for that worktree. This causes a problem when running
"git sparse-checkout init" inside a worktree. While care was taken to
allow the sparse-checkout config to be specific to a worktree, this
initialization was untested.

Safely create the leading directories for the sparse-checkout file. This
is the safest thing to do even without worktrees, as a user could delete
their ".git/info" directory and expect Git to recover safely.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/sparse-checkout.c          |  4 ++++
 t/t1091-sparse-checkout-builtin.sh | 10 ++++++++++
 2 files changed, 14 insertions(+)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index b3bed891cb..3cee8ab46e 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -199,6 +199,10 @@ static int write_patterns_and_update(struct pattern_list *pl)
 	int result;
 
 	sparse_filename = get_sparse_checkout_filename();
+
+	if (safe_create_leading_directories(sparse_filename))
+		die(_("failed to create directory for sparse-checkout file"));
+
 	fd = hold_lock_file_for_update(&lk, sparse_filename,
 				      LOCK_DIE_ON_ERROR);
 
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index e28e1c797f..43d1f7520c 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -295,4 +295,14 @@ test_expect_success 'interaction with submodules' '
 	check_files super/modules/child a deep folder1 folder2
 '
 
+test_expect_success 'different sparse-checkouts with worktrees' '
+	git -C repo worktree add --detach ../worktree &&
+	check_files worktree "a deep folder1 folder2" &&
+	git -C worktree sparse-checkout init --cone &&
+	git -C repo sparse-checkout set folder1 &&
+	git -C worktree sparse-checkout set deep/deeper1 &&
+	check_files repo a folder1 &&
+	check_files worktree a deep
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 04/15] clone: fix --sparse option with URLs
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                         ` (2 preceding siblings ...)
  2020-01-31 20:16       ` [PATCH v4 03/15] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 05/15] sparse-checkout: fix documentation typo for core.sparseCheckoutCone Jeff King via GitGitGadget
                         ` (11 subsequent siblings)
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The --sparse option was added to the clone builtin in d89f09c (clone:
add --sparse mode, 2019-11-21) and was tested with a local path clone
in t1091-sparse-checkout-builtin.sh. However, due to a difference in
how local paths are handled versus URLs, this mechanism does not work
with URLs.

Modify the test to use a "file://" URL, which would output this error
before the code change:

  Cloning into 'clone'...
  fatal: cannot change to 'file://.../repo': No such file or directory
  error: failed to initialize sparse-checkout

These errors are due to using a "-C <path>" option to call 'git -C
<path> sparse-checkout init' but the URL is being given instead of
the target directory.

Update that target directory to evaluate this correctly. I have also
manually tested that https:// URLs are handled correctly as well.

Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/clone.c                    | 2 +-
 t/t1091-sparse-checkout-builtin.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 4348d962c9..2caefc44fb 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1130,7 +1130,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (option_required_reference.nr || option_optional_reference.nr)
 		setup_reference();
 
-	if (option_sparse_checkout && git_sparse_checkout_init(repo))
+	if (option_sparse_checkout && git_sparse_checkout_init(dir))
 		return 1;
 
 	remote = remote_get(option_origin);
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 43d1f7520c..cf4a595c86 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -90,7 +90,7 @@ test_expect_success 'init with existing sparse-checkout' '
 '
 
 test_expect_success 'clone --sparse' '
-	git clone --sparse repo clone &&
+	git clone --sparse "file://$(pwd)/repo" clone &&
 	git -C clone sparse-checkout list >actual &&
 	cat >expect <<-\EOF &&
 	/*
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 05/15] sparse-checkout: fix documentation typo for core.sparseCheckoutCone
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                         ` (3 preceding siblings ...)
  2020-01-31 20:16       ` [PATCH v4 04/15] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Jeff King via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 06/15] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
                         ` (10 subsequent siblings)
  15 siblings, 0 replies; 82+ messages in thread
From: Jeff King via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Jeff King

From: Jeff King <peff@peff.net>

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-sparse-checkout.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
index 3b341cf0fc..4834fb434d 100644
--- a/Documentation/git-sparse-checkout.txt
+++ b/Documentation/git-sparse-checkout.txt
@@ -106,7 +106,7 @@ The full pattern set allows for arbitrary pattern matches and complicated
 inclusion/exclusion rules. These can result in O(N*M) pattern matches when
 updating the index, where N is the number of patterns and M is the number
 of paths in the index. To combat this performance issue, a more restricted
-pattern set is allowed when `core.spareCheckoutCone` is enabled.
+pattern set is allowed when `core.sparseCheckoutCone` is enabled.
 
 The accepted patterns in the cone pattern set are:
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 06/15] sparse-checkout: cone mode does not recognize "**"
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                         ` (4 preceding siblings ...)
  2020-01-31 20:16       ` [PATCH v4 05/15] sparse-checkout: fix documentation typo for core.sparseCheckoutCone Jeff King via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 07/15] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
                         ` (9 subsequent siblings)
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When core.sparseCheckoutCone is enabled, the 'git sparse-checkout set'
command creates a restricted set of possible patterns that are used
by a custom algorithm to quickly match those patterns.

If a user manually edits the sparse-checkout file, then they could
create patterns that do not match these expectations. The cone-mode
matching algorithm can return incorrect results. The solution is to
detect these incorrect patterns, warn that we do not recognize them,
and revert to the standard algorithm.

Check each pattern for the "**" substring, and revert to the old
logic if seen. While technically a "/<dir>/**" pattern matches
the meaning of "/<dir>/", it is not one that would be written by
the sparse-checkout builtin in cone mode. Attempting to accept that
pattern change complicates the logic and instead we punt and do
not accept any instance of "**".

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              |  7 +++++-
 t/t1091-sparse-checkout-builtin.sh | 34 ++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 22d08e61c2..40fed73a94 100644
--- a/dir.c
+++ b/dir.c
@@ -651,11 +651,16 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 		return;
 	}
 
+	if (strstr(given->pattern, "**")) {
+		/* Not a cone pattern. */
+		warning(_("unrecognized pattern: '%s'"), given->pattern);
+		goto clear_hashmaps;
+	}
+
 	if (given->patternlen > 2 &&
 	    !strcmp(given->pattern + given->patternlen - 2, "/*")) {
 		if (!(given->flags & PATTERN_FLAG_NEGATIVE)) {
 			/* Not a cone pattern. */
-			pl->use_cone_patterns = 0;
 			warning(_("unrecognized pattern: '%s'"), given->pattern);
 			goto clear_hashmaps;
 		}
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index cf4a595c86..e2e45dc7fd 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -305,4 +305,38 @@ test_expect_success 'different sparse-checkouts with worktrees' '
 	check_files worktree a deep
 '
 
+check_read_tree_errors () {
+	REPO=$1
+	FILES=$2
+	ERRORS=$3
+	git -C $REPO read-tree -mu HEAD 2>err &&
+	if test -z "$ERRORS"
+	then
+		test_must_be_empty err
+	else
+		test_i18ngrep "$ERRORS" err
+	fi &&
+	check_files $REPO $FILES
+}
+
+test_expect_success 'pattern-checks: /A/**' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/folder1/**
+	EOF
+	check_read_tree_errors repo "a folder1" "disabling cone pattern matching"
+'
+
+test_expect_success 'pattern-checks: /A/**/B/' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/deep/**/deepest
+	EOF
+	check_read_tree_errors repo "a deep" "disabling cone pattern matching" &&
+	check_files repo/deep "deeper1" &&
+	check_files repo/deep/deeper1 "deepest"
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 07/15] sparse-checkout: detect short patterns
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                         ` (5 preceding siblings ...)
  2020-01-31 20:16       ` [PATCH v4 06/15] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 08/15] sparse-checkout: warn on globs in cone patterns Derrick Stolee via GitGitGadget
                         ` (8 subsequent siblings)
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In cone mode, the shortest pattern the sparse-checkout command will
write into the sparse-checkout file is "/*". This is handled carefully
in add_pattern_to_hashsets(), so warn if any other pattern is this
short. This will assist future pattern checks by allowing us to assume
there are at least three characters in the pattern.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              | 3 ++-
 t/t1091-sparse-checkout-builtin.sh | 9 +++++++++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 40fed73a94..c2e585607e 100644
--- a/dir.c
+++ b/dir.c
@@ -651,7 +651,8 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 		return;
 	}
 
-	if (strstr(given->pattern, "**")) {
+	if (given->patternlen <= 2 ||
+	    strstr(given->pattern, "**")) {
 		/* Not a cone pattern. */
 		warning(_("unrecognized pattern: '%s'"), given->pattern);
 		goto clear_hashmaps;
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index e2e45dc7fd..2e57534799 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -339,4 +339,13 @@ test_expect_success 'pattern-checks: /A/**/B/' '
 	check_files repo/deep/deeper1 "deepest"
 '
 
+test_expect_success 'pattern-checks: too short' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/a
+	EOF
+	check_read_tree_errors repo "a" "disabling cone pattern matching"
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 08/15] sparse-checkout: warn on globs in cone patterns
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                         ` (6 preceding siblings ...)
  2020-01-31 20:16       ` [PATCH v4 07/15] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 09/15] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
                         ` (7 subsequent siblings)
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In cone mode, the sparse-checkout commmand will write patterns that
allow faster pattern matching. This matching only works if the patterns
in the sparse-checkout file are those written by that command. Users
can edit the sparse-checkout file and create patterns that cause the
cone mode matching to fail.

The cone mode patterns may end in "/*" but otherwise an un-escaped
asterisk or other glob character is invalid. Add checks to disable
cone mode when seeing these values.

A later change will properly handle escaped globs.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              | 36 +++++++++++++++++++++++++++
 t/t1091-sparse-checkout-builtin.sh | 39 ++++++++++++++++++++++++++++++
 2 files changed, 75 insertions(+)

diff --git a/dir.c b/dir.c
index c2e585607e..71d28331f3 100644
--- a/dir.c
+++ b/dir.c
@@ -635,6 +635,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 	struct pattern_entry *translated;
 	char *truncated;
 	char *data = NULL;
+	const char *prev, *cur, *next;
 
 	if (!pl->use_cone_patterns)
 		return;
@@ -652,12 +653,47 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 	}
 
 	if (given->patternlen <= 2 ||
+	    *given->pattern == '*' ||
 	    strstr(given->pattern, "**")) {
 		/* Not a cone pattern. */
 		warning(_("unrecognized pattern: '%s'"), given->pattern);
 		goto clear_hashmaps;
 	}
 
+	prev = given->pattern;
+	cur = given->pattern + 1;
+	next = given->pattern + 2;
+
+	while (*cur) {
+		/* Watch for glob characters '*', '\', '[', '?' */
+		if (!is_glob_special(*cur))
+			goto increment;
+
+		/* But only if *prev != '\\' */
+		if (*prev == '\\')
+			goto increment;
+
+		/* But allow the initial '\' */
+		if (*cur == '\\' &&
+		    is_glob_special(*next))
+			goto increment;
+
+		/* But a trailing '/' then '*' is fine */
+		if (*prev == '/' &&
+		    *cur == '*' &&
+		    *next == 0)
+			goto increment;
+
+		/* Not a cone pattern. */
+		warning(_("unrecognized pattern: '%s'"), given->pattern);
+		goto clear_hashmaps;
+
+	increment:
+		prev++;
+		cur++;
+		next++;
+	}
+
 	if (given->patternlen > 2 &&
 	    !strcmp(given->pattern + given->patternlen - 2, "/*")) {
 		if (!(given->flags & PATTERN_FLAG_NEGATIVE)) {
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 2e57534799..c732abeacd 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -348,4 +348,43 @@ test_expect_success 'pattern-checks: too short' '
 	check_read_tree_errors repo "a" "disabling cone pattern matching"
 '
 
+test_expect_success 'pattern-checks: trailing "*"' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/a*
+	EOF
+	check_read_tree_errors repo "a" "disabling cone pattern matching"
+'
+
+test_expect_success 'pattern-checks: starting "*"' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	*eep/
+	EOF
+	check_read_tree_errors repo "a deep" "disabling cone pattern matching"
+'
+
+test_expect_success 'pattern-checks: contained glob characters' '
+	for c in "[a]" "\\" "?" "*"
+	do
+		cat >repo/.git/info/sparse-checkout <<-EOF &&
+		/*
+		!/*/
+		something$c-else/
+		EOF
+		check_read_tree_errors repo "a" "disabling cone pattern matching"
+	done
+'
+
+test_expect_success 'pattern-checks: escaped "*"' '
+	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+	/*
+	!/*/
+	/does\*not\*exist/
+	EOF
+	check_read_tree_errors repo "a" ""
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 09/15] sparse-checkout: properly match escaped characters
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                         ` (7 preceding siblings ...)
  2020-01-31 20:16       ` [PATCH v4 08/15] sparse-checkout: warn on globs in cone patterns Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 10/15] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
                         ` (6 subsequent siblings)
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

In cone mode, the sparse-checkout feature uses hashset containment
queries to match paths. Make this algorithm respect escaped asterisk
(*) and backslash (\) characters.

Create dup_and_filter_pattern() method to convert a pattern by
removing escape characters and dropping an optional "/*" at the end.
This method is available in dir.h as we will use it in
builtin/sparse-checkout.c in a later change.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 dir.c                              | 35 +++++++++++++++++++++++++++---
 t/t1091-sparse-checkout-builtin.sh | 23 ++++++++++++++++----
 2 files changed, 51 insertions(+), 7 deletions(-)

diff --git a/dir.c b/dir.c
index 71d28331f3..7ac0920b71 100644
--- a/dir.c
+++ b/dir.c
@@ -630,6 +630,36 @@ int pl_hashmap_cmp(const void *unused_cmp_data,
 	return strncmp(ee1->pattern, ee2->pattern, min_len);
 }
 
+static char *dup_and_filter_pattern(const char *pattern)
+{
+	char *set, *read;
+	size_t count  = 0;
+	char *result = xstrdup(pattern);
+
+	set = result;
+	read = result;
+
+	while (*read) {
+		/* skip escape characters (once) */
+		if (*read == '\\')
+			read++;
+
+		*set = *read;
+
+		set++;
+		read++;
+		count++;
+	}
+	*set = 0;
+
+	if (count > 2 &&
+	    *(set - 1) == '*' &&
+	    *(set - 2) == '/')
+		*(set - 2) = 0;
+
+	return result;
+}
+
 static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern *given)
 {
 	struct pattern_entry *translated;
@@ -702,8 +732,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 			goto clear_hashmaps;
 		}
 
-		truncated = xstrdup(given->pattern);
-		truncated[given->patternlen - 2] = 0;
+		truncated = dup_and_filter_pattern(given->pattern);
 
 		translated = xmalloc(sizeof(struct pattern_entry));
 		translated->pattern = truncated;
@@ -737,7 +766,7 @@ static void add_pattern_to_hashsets(struct pattern_list *pl, struct path_pattern
 
 	translated = xmalloc(sizeof(struct pattern_entry));
 
-	translated->pattern = xstrdup(given->pattern);
+	translated->pattern = dup_and_filter_pattern(given->pattern);
 	translated->patternlen = given->patternlen;
 	hashmap_entry_init(&translated->ent,
 			   ignore_case ?
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index c732abeacd..9ea700896d 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -378,13 +378,28 @@ test_expect_success 'pattern-checks: contained glob characters' '
 	done
 '
 
-test_expect_success 'pattern-checks: escaped "*"' '
-	cat >repo/.git/info/sparse-checkout <<-\EOF &&
+test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
+	git clone repo escaped &&
+	TREEOID=$(git -C escaped rev-parse HEAD:folder1) &&
+	NEWTREE=$(git -C escaped mktree <<-EOF
+	$(git -C escaped ls-tree HEAD)
+	040000 tree $TREEOID	zbad\\dir
+	040000 tree $TREEOID	zdoes*exist
+	EOF
+	) &&
+	COMMIT=$(git -C escaped commit-tree $NEWTREE -p HEAD) &&
+	git -C escaped reset --hard $COMMIT &&
+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
+	git -C escaped sparse-checkout init --cone &&
+	cat >escaped/.git/info/sparse-checkout <<-\EOF &&
 	/*
 	!/*/
-	/does\*not\*exist/
+	/zbad\\dir/
+	!/zbad\\dir/*/
+	/zdoes\*not\*exist/
+	/zdoes\*exist/
 	EOF
-	check_read_tree_errors repo "a" ""
+	check_read_tree_errors escaped "a zbad\\dir zdoes*exist"
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 10/15] sparse-checkout: write escaped patterns in cone mode
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                         ` (8 preceding siblings ...)
  2020-01-31 20:16       ` [PATCH v4 09/15] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 11/15] sparse-checkout: unquote C-style strings over --stdin Derrick Stolee via GitGitGadget
                         ` (5 subsequent siblings)
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

If a user somehow creates a directory with an asterisk (*) or backslash
(\), then the "git sparse-checkout set" command will struggle to provide
the correct pattern in the sparse-checkout file. When not in cone mode,
the provided pattern is written directly into the sparse-checkout file.
However, in cone mode we expect a list of paths to directories and then
we convert those into patterns.

However, there is some care needed for the timing of these escapes. The
in-memory pattern list is used to update the working directory before
writing the patterns to disk. Thus, we need the command to have the
unescaped names in the hashsets for the cone comparisons, then escape
the patterns later.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/sparse-checkout.c          | 23 +++++++++++++++++++++--
 t/t1091-sparse-checkout-builtin.sh | 10 ++++++++--
 2 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 3cee8ab46e..cc86b8a014 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -13,6 +13,7 @@
 #include "resolve-undo.h"
 #include "unpack-trees.h"
 #include "wt-status.h"
+#include "quote.h"
 
 static const char *empty_base = "";
 
@@ -140,6 +141,22 @@ static int update_working_directory(struct pattern_list *pl)
 	return result;
 }
 
+static char *escaped_pattern(char *pattern)
+{
+	char *p = pattern;
+	struct strbuf final = STRBUF_INIT;
+
+	while (*p) {
+		if (*p == '*' || *p == '\\')
+			strbuf_addch(&final, '\\');
+
+		strbuf_addch(&final, *p);
+		p++;
+	}
+
+	return strbuf_detach(&final, NULL);
+}
+
 static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 {
 	int i;
@@ -164,10 +181,11 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 	fprintf(fp, "/*\n!/*/\n");
 
 	for (i = 0; i < sl.nr; i++) {
-		char *pattern = sl.items[i].string;
+		char *pattern = escaped_pattern(sl.items[i].string);
 
 		if (strlen(pattern))
 			fprintf(fp, "%s/\n!%s/*/\n", pattern, pattern);
+		free(pattern);
 	}
 
 	string_list_clear(&sl, 0);
@@ -185,8 +203,9 @@ static void write_cone_to_file(FILE *fp, struct pattern_list *pl)
 	string_list_remove_duplicates(&sl, 0);
 
 	for (i = 0; i < sl.nr; i++) {
-		char *pattern = sl.items[i].string;
+		char *pattern = escaped_pattern(sl.items[i].string);
 		fprintf(fp, "%s/\n", pattern);
+		free(pattern);
 	}
 }
 
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 9ea700896d..fb8718e64a 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -309,6 +309,9 @@ check_read_tree_errors () {
 	REPO=$1
 	FILES=$2
 	ERRORS=$3
+	git -C $REPO -c core.sparseCheckoutCone=false read-tree -mu HEAD 2>err &&
+	test_must_be_empty err &&
+	check_files $REPO "$FILES" &&
 	git -C $REPO read-tree -mu HEAD 2>err &&
 	if test -z "$ERRORS"
 	then
@@ -391,14 +394,17 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	git -C escaped reset --hard $COMMIT &&
 	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
 	git -C escaped sparse-checkout init --cone &&
-	cat >escaped/.git/info/sparse-checkout <<-\EOF &&
+	git -C escaped sparse-checkout set zbad\\dir/bogus "zdoes*not*exist" "zdoes*exist" &&
+	cat >expect <<-\EOF &&
 	/*
 	!/*/
 	/zbad\\dir/
 	!/zbad\\dir/*/
-	/zdoes\*not\*exist/
+	/zbad\\dir/bogus/
 	/zdoes\*exist/
+	/zdoes\*not\*exist/
 	EOF
+	test_cmp expect escaped/.git/info/sparse-checkout &&
 	check_read_tree_errors escaped "a zbad\\dir zdoes*exist"
 '
 
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 11/15] sparse-checkout: unquote C-style strings over --stdin
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                         ` (9 preceding siblings ...)
  2020-01-31 20:16       ` [PATCH v4 10/15] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 12/15] sparse-checkout: use C-style quotes in 'list' subcommand Derrick Stolee via GitGitGadget
                         ` (4 subsequent siblings)
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

If a user somehow creates a directory with an asterisk (*) or backslash
(\), then the "git sparse-checkout set" command will struggle to provide
the correct pattern in the sparse-checkout file. When not in cone mode,
the provided pattern is written directly into the sparse-checkout file.
However, in cone mode we expect a list of paths to directories and then
we convert those into patterns.

Even more specifically, the goal is to always allow the following from
the root of a repo:

  git ls-tree --name-only -d HEAD | git sparse-checkout set --stdin

The ls-tree command provides directory names with an unescaped asterisk.
It also quotes the directories that contain an escaped backslash. We
must remove these quotes, then keep the escaped backslashes.

Use unquote_c_style() when parsing lines from stdin. Command-line
arguments will be parsed as-is, assuming the user can do the correct
level of escaping from their environment to match the exact directory
names.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/sparse-checkout.c          | 15 ++++++++++++++-
 t/t1091-sparse-checkout-builtin.sh | 14 +++++++++++++-
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index cc86b8a014..6083aa10f2 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -442,8 +442,21 @@ static int sparse_checkout_set(int argc, const char **argv, const char *prefix)
 		pl.use_cone_patterns = 1;
 
 		if (set_opts.use_stdin) {
-			while (!strbuf_getline(&line, stdin))
+			struct strbuf unquoted = STRBUF_INIT;
+			while (!strbuf_getline(&line, stdin)) {
+				if (line.buf[0] == '"') {
+					strbuf_reset(&unquoted);
+					if (unquote_c_style(&unquoted, line.buf, NULL))
+						die(_("unable to unquote C-style string '%s'"),
+						line.buf);
+
+					strbuf_swap(&unquoted, &line);
+				}
+
 				strbuf_to_cone_pattern(&line, &pl);
+			}
+
+			strbuf_release(&unquoted);
 		} else {
 			for (i = 0; i < argc; i++) {
 				strbuf_setlen(&line, 0);
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index fb8718e64a..a46a310740 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -405,7 +405,19 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	/zdoes\*not\*exist/
 	EOF
 	test_cmp expect escaped/.git/info/sparse-checkout &&
-	check_read_tree_errors escaped "a zbad\\dir zdoes*exist"
+	check_read_tree_errors escaped "a zbad\\dir zdoes*exist" &&
+	git -C escaped ls-tree -d --name-only HEAD | git -C escaped sparse-checkout set --stdin &&
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/deep/
+	/folder1/
+	/folder2/
+	/zbad\\dir/
+	/zdoes\*exist/
+	EOF
+	test_cmp expect escaped/.git/info/sparse-checkout &&
+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist"
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 12/15] sparse-checkout: use C-style quotes in 'list' subcommand
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                         ` (10 preceding siblings ...)
  2020-01-31 20:16       ` [PATCH v4 11/15] sparse-checkout: unquote C-style strings over --stdin Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 13/15] sparse-checkout: escape all glob characters on write Derrick Stolee via GitGitGadget
                         ` (3 subsequent siblings)
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

When in cone mode, the 'git sparse-checkout list' subcommand lists
the directories included in the sparse cone. When these directories
contain odd characters, such as a backslash, then we need to use
C-style quotes similar to 'git ls-tree'.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/sparse-checkout.c          | 6 ++++--
 t/t1091-sparse-checkout-builtin.sh | 7 +++++--
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index 6083aa10f2..facdb6bda7 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -78,8 +78,10 @@ static int sparse_checkout_list(int argc, const char **argv)
 
 		string_list_sort(&sl);
 
-		for (i = 0; i < sl.nr; i++)
-			printf("%s\n", sl.items[i].string);
+		for (i = 0; i < sl.nr; i++) {
+			quote_c_style(sl.items[i].string, NULL, stdout, 0);
+			printf("\n");
+		}
 
 		return 0;
 	}
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index a46a310740..545e8d5ebe 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -406,7 +406,8 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	EOF
 	test_cmp expect escaped/.git/info/sparse-checkout &&
 	check_read_tree_errors escaped "a zbad\\dir zdoes*exist" &&
-	git -C escaped ls-tree -d --name-only HEAD | git -C escaped sparse-checkout set --stdin &&
+	git -C escaped ls-tree -d --name-only HEAD >list-expect &&
+	git -C escaped sparse-checkout set --stdin <list-expect &&
 	cat >expect <<-\EOF &&
 	/*
 	!/*/
@@ -417,7 +418,9 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	/zdoes\*exist/
 	EOF
 	test_cmp expect escaped/.git/info/sparse-checkout &&
-	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist"
+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
+	git -C escaped sparse-checkout list >list-actual &&
+	test_cmp list-expect list-actual
 '
 
 test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 13/15] sparse-checkout: escape all glob characters on write
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                         ` (11 preceding siblings ...)
  2020-01-31 20:16       ` [PATCH v4 12/15] sparse-checkout: use C-style quotes in 'list' subcommand Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 14/15] sparse-checkout: improve docs around 'set' in cone mode Derrick Stolee via GitGitGadget
                         ` (2 subsequent siblings)
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The sparse-checkout patterns allow special globs according to
fnmatch(3). When writing cone-mode patterns for paths containing
these characters, they must be escaped.

Use is_glob_special() to check which characters must be escaped
this way, and add a path to the tests that contains all glob
characters at once. Note that ']' is not special, since the
initial bracket '[' is escaped.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 builtin/sparse-checkout.c          |  2 +-
 t/t1091-sparse-checkout-builtin.sh | 13 ++++++++-----
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
index facdb6bda7..7aeb384362 100644
--- a/builtin/sparse-checkout.c
+++ b/builtin/sparse-checkout.c
@@ -149,7 +149,7 @@ static char *escaped_pattern(char *pattern)
 	struct strbuf final = STRBUF_INIT;
 
 	while (*p) {
-		if (*p == '*' || *p == '\\')
+		if (is_glob_special(*p))
 			strbuf_addch(&final, '\\');
 
 		strbuf_addch(&final, *p);
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 545e8d5ebe..37e9304ef3 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -381,20 +381,21 @@ test_expect_success 'pattern-checks: contained glob characters' '
 	done
 '
 
-test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
+test_expect_success BSLASHPSPEC 'pattern-checks: escaped characters' '
 	git clone repo escaped &&
 	TREEOID=$(git -C escaped rev-parse HEAD:folder1) &&
 	NEWTREE=$(git -C escaped mktree <<-EOF
 	$(git -C escaped ls-tree HEAD)
 	040000 tree $TREEOID	zbad\\dir
 	040000 tree $TREEOID	zdoes*exist
+	040000 tree $TREEOID	zglob[!a]?
 	EOF
 	) &&
 	COMMIT=$(git -C escaped commit-tree $NEWTREE -p HEAD) &&
 	git -C escaped reset --hard $COMMIT &&
-	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" zglob[!a]? &&
 	git -C escaped sparse-checkout init --cone &&
-	git -C escaped sparse-checkout set zbad\\dir/bogus "zdoes*not*exist" "zdoes*exist" &&
+	git -C escaped sparse-checkout set zbad\\dir/bogus "zdoes*not*exist" "zdoes*exist" "zglob[!a]?" &&
 	cat >expect <<-\EOF &&
 	/*
 	!/*/
@@ -403,9 +404,10 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	/zbad\\dir/bogus/
 	/zdoes\*exist/
 	/zdoes\*not\*exist/
+	/zglob\[!a]\?/
 	EOF
 	test_cmp expect escaped/.git/info/sparse-checkout &&
-	check_read_tree_errors escaped "a zbad\\dir zdoes*exist" &&
+	check_read_tree_errors escaped "a zbad\\dir zdoes*exist zglob[!a]?" &&
 	git -C escaped ls-tree -d --name-only HEAD >list-expect &&
 	git -C escaped sparse-checkout set --stdin <list-expect &&
 	cat >expect <<-\EOF &&
@@ -416,9 +418,10 @@ test_expect_success BSLASHPSPEC 'pattern-checks: escaped "*"' '
 	/folder2/
 	/zbad\\dir/
 	/zdoes\*exist/
+	/zglob\[!a]\?/
 	EOF
 	test_cmp expect escaped/.git/info/sparse-checkout &&
-	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" &&
+	check_files escaped "a deep folder1 folder2 zbad\\dir zdoes*exist" zglob[!a]? &&
 	git -C escaped sparse-checkout list >list-actual &&
 	test_cmp list-expect list-actual
 '
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 14/15] sparse-checkout: improve docs around 'set' in cone mode
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                         ` (12 preceding siblings ...)
  2020-01-31 20:16       ` [PATCH v4 13/15] sparse-checkout: escape all glob characters on write Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:16       ` [PATCH v4 15/15] sparse-checkout: fix cone mode behavior mismatch Derrick Stolee via GitGitGadget
  2020-01-31 20:36       ` [PATCH v4 00/15] Harden the sparse-checkout builtin Elijah Newren
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The existing documentation does not clarify how the 'set' subcommand
changes when core.sparseCheckoutCone is enabled. Correct this by
changing some language around the "A/B/C" example. Also include a
description of the input format matching the output of 'git ls-tree
--name-only'.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 Documentation/git-sparse-checkout.txt | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
index 4834fb434d..0914619881 100644
--- a/Documentation/git-sparse-checkout.txt
+++ b/Documentation/git-sparse-checkout.txt
@@ -50,6 +50,14 @@ To avoid interfering with other worktrees, it first enables the
 +
 When the `--stdin` option is provided, the patterns are read from
 standard in as a newline-delimited list instead of from the arguments.
++
+When `core.sparseCheckoutCone` is enabled, the input list is considered a
+list of directories instead of sparse-checkout patterns. The command writes
+patterns to the sparse-checkout file to include all files contained in those
+directories (recursively) as well as files that are siblings of ancestor
+directories. The input format matches the output of `git ls-tree --name-only`.
+This includes interpreting pathnames that begin with a double quote (") as
+C-style quoted strings.
 
 'disable'::
 	Disable the `core.sparseCheckout` config setting, and restore the
@@ -128,9 +136,12 @@ the following patterns:
 ----------------
 
 This says "include everything in root, but nothing two levels below root."
-If we then add the folder `A/B/C` as a recursive pattern, the folders `A` and
-`A/B` are added as parent patterns. The resulting sparse-checkout file is
-now
+
+When in cone mode, the `git sparse-checkout set` subcommand takes a list of
+directories instead of a list of sparse-checkout patterns. In this mode,
+the command `git sparse-checkout set A/B/C` sets the directory `A/B/C` as
+a recursive pattern, the directories `A` and `A/B` are added as parent
+patterns. The resulting sparse-checkout file is now
 
 ----------------
 /*
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 82+ messages in thread

* [PATCH v4 15/15] sparse-checkout: fix cone mode behavior mismatch
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                         ` (13 preceding siblings ...)
  2020-01-31 20:16       ` [PATCH v4 14/15] sparse-checkout: improve docs around 'set' in cone mode Derrick Stolee via GitGitGadget
@ 2020-01-31 20:16       ` Derrick Stolee via GitGitGadget
  2020-01-31 20:36       ` [PATCH v4 00/15] Harden the sparse-checkout builtin Elijah Newren
  15 siblings, 0 replies; 82+ messages in thread
From: Derrick Stolee via GitGitGadget @ 2020-01-31 20:16 UTC (permalink / raw)
  To: git; +Cc: me, peff, newren, Derrick Stolee, Derrick Stolee

From: Derrick Stolee <dstolee@microsoft.com>

The intention of the special "cone mode" in the sparse-checkout
feature is to always match the same patterns that are matched by the
same sparse-checkout file as when cone mode is disabled.

When a file path is given to "git sparse-checkout set" in cone mode,
then the cone mode improperly matches the file as a recursive path.
When setting the skip-worktree bits, files were not expecting the
MATCHED_RECURSIVE response, and hence these were left out of the
matched cone.

Fix this bug by checking for MATCHED_RECURSIVE in addition to MATCHED
and add a test that prevents regression.

Reported-by: Finn Bryant <finnbryant@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 t/t1091-sparse-checkout-builtin.sh | 12 ++++++++++++
 unpack-trees.c                     |  2 +-
 2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
index 37e9304ef3..7d982096fb 100755
--- a/t/t1091-sparse-checkout-builtin.sh
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -305,6 +305,18 @@ test_expect_success 'different sparse-checkouts with worktrees' '
 	check_files worktree a deep
 '
 
+test_expect_success 'set using filename keeps file on-disk' '
+	git -C repo sparse-checkout set a deep &&
+	cat >expect <<-\EOF &&
+	/*
+	!/*/
+	/a/
+	/deep/
+	EOF
+	test_cmp expect repo/.git/info/sparse-checkout &&
+	check_files repo a deep
+'
+
 check_read_tree_errors () {
 	REPO=$1
 	FILES=$2
diff --git a/unpack-trees.c b/unpack-trees.c
index 3789a22cf0..78425ce74b 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1416,7 +1416,7 @@ static int clear_ce_flags_1(struct index_state *istate,
 						name, &dtype, pl, istate);
 		if (ret == UNDECIDED)
 			ret = default_match;
-		if (ret == MATCHED)
+		if (ret == MATCHED || ret == MATCHED_RECURSIVE)
 			ce->ce_flags &= ~clear_mask;
 		cache++;
 		progress_nr++;
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 82+ messages in thread

* Re: [PATCH v4 00/15] Harden the sparse-checkout builtin
  2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
                         ` (14 preceding siblings ...)
  2020-01-31 20:16       ` [PATCH v4 15/15] sparse-checkout: fix cone mode behavior mismatch Derrick Stolee via GitGitGadget
@ 2020-01-31 20:36       ` Elijah Newren
  2020-02-03 14:09         ` Derrick Stolee
  15 siblings, 1 reply; 82+ messages in thread
From: Elijah Newren @ 2020-01-31 20:36 UTC (permalink / raw)
  To: Derrick Stolee via GitGitGadget
  Cc: Git Mailing List, Taylor Blau, Jeff King, Derrick Stolee

On Fri, Jan 31, 2020 at 12:16 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> This series is based on ds/sparse-list-in-cone-mode.
>
> This series attempts to clean up some rough edges in the sparse-checkout
> feature, especially around the cone mode.
>
> Unfortunately, after the v2.25.0 release, we noticed an issue with the "git
> clone --sparse" option when using a URL instead of a local path. This is
> fixed and properly tested here.
>
> Also, let's improve Git's response to these more complicated scenarios:
>
>  1. Running "git sparse-checkout init" in a worktree would complain because
>     the "info" dir doesn't exist.
>  2. Tracked paths that include "*" and "\" in their filenames.
>  3. If a user edits the sparse-checkout file to have non-cone pattern, such
>     as "**" anywhere or "*" in the wrong place, then we should respond
>     appropriately. That is: warn that the patterns are not cone-mode, then
>     revert to the old logic.
>
> Updates in V2:
>
>  * Added C-style quoting to the output of "git sparse-checkout list" in cone
>    mode.
>  * Improved documentation.
>  * Responded to most style feedback. Hopefully I didn't miss anything.
>  * I was lingering on this a little to see if I could also fix the issue
>    raised in [1], but I have not figured that one out, yet.
>
> Update in V3:
>
>  * Input now uses Peff's recommended pattern: unquote C-style strings over
>    stdin and otherwise do not un-escape input.

...and updates in V4 are?  (I looked over the range-diff which
definitely helps, but a summary would still be nice.)

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v4 00/15] Harden the sparse-checkout builtin
  2020-01-31 20:36       ` [PATCH v4 00/15] Harden the sparse-checkout builtin Elijah Newren
@ 2020-02-03 14:09         ` Derrick Stolee
  2020-02-08 23:32           ` Taylor Blau
  0 siblings, 1 reply; 82+ messages in thread
From: Derrick Stolee @ 2020-02-03 14:09 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Taylor Blau, Jeff King, Derrick Stolee,
	Finn Bryant

On 1/31/2020 3:36 PM, Elijah Newren wrote:
> On Fri, Jan 31, 2020 at 12:16 PM Derrick Stolee via GitGitGadget
>> Update in V3:
>>
>>  * Input now uses Peff's recommended pattern: unquote C-style strings over
>>    stdin and otherwise do not un-escape input.
> 
> ...and updates in V4 are?  (I looked over the range-diff which
> definitely helps, but a summary would still be nice.)

Sorry! I definitely should have double-checked before sending.

Updates in V4:

* Special-character checking now considers all glob characters
  ( '[', '*', '\\', '?' ) See Patches 8 and 13.

* Patch 10 is is now split into two (Patches 10 and 11), to properly
  escape patterns and to unquote C-style strings. 

* The file/directory path bug reported in [1] is fixed in Patch 15.

Thanks,
-Stolee

[1] https://lore.kernel.org/git/CADSBhNbbO=aq-Oo2MpzDMN2VAX4m6f9Jb-eCtVVX1NfWKE9zJw@mail.gmail.com/

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v4 00/15] Harden the sparse-checkout builtin
  2020-02-03 14:09         ` Derrick Stolee
@ 2020-02-08 23:32           ` Taylor Blau
  2020-02-09 17:27             ` Junio C Hamano
  0 siblings, 1 reply; 82+ messages in thread
From: Taylor Blau @ 2020-02-08 23:32 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Elijah Newren, Git Mailing List, Taylor Blau, Jeff King,
	Derrick Stolee, Finn Bryant

Hi Derrick,

On Mon, Feb 03, 2020 at 09:09:54AM -0500, Derrick Stolee wrote:
> On 1/31/2020 3:36 PM, Elijah Newren wrote:
> > On Fri, Jan 31, 2020 at 12:16 PM Derrick Stolee via GitGitGadget
> >> Update in V3:
> >>
> >>  * Input now uses Peff's recommended pattern: unquote C-style strings over
> >>    stdin and otherwise do not un-escape input.
> >
> > ...and updates in V4 are?  (I looked over the range-diff which
> > definitely helps, but a summary would still be nice.)
>
> Sorry! I definitely should have double-checked before sending.
>
> Updates in V4:
>
> * Special-character checking now considers all glob characters
>   ( '[', '*', '\\', '?' ) See Patches 8 and 13.
>
> * Patch 10 is is now split into two (Patches 10 and 11), to properly
>   escape patterns and to unquote C-style strings.
>
> * The file/directory path bug reported in [1] is fixed in Patch 15.

Thanks for including these. I haven't been super active in the earlier
rounds of review on this series, but I gave a thorough look to what you
have in v4, and it all looks good to me.

Please consider this:

  Reviewed-by: Taylor Blau <me@ttaylorr.com>

> Thanks,
> -Stolee
>
> [1] https://lore.kernel.org/git/CADSBhNbbO=aq-Oo2MpzDMN2VAX4m6f9Jb-eCtVVX1NfWKE9zJw@mail.gmail.com/

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 82+ messages in thread

* Re: [PATCH v4 00/15] Harden the sparse-checkout builtin
  2020-02-08 23:32           ` Taylor Blau
@ 2020-02-09 17:27             ` Junio C Hamano
  0 siblings, 0 replies; 82+ messages in thread
From: Junio C Hamano @ 2020-02-09 17:27 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Derrick Stolee, Elijah Newren, Git Mailing List, Jeff King,
	Derrick Stolee, Finn Bryant

Taylor Blau <me@ttaylorr.com> writes:

> On Mon, Feb 03, 2020 at 09:09:54AM -0500, Derrick Stolee wrote:
> ...
> Thanks for including these. I haven't been super active in the earlier
> rounds of review on this series, but I gave a thorough look to what you
> have in v4, and it all looks good to me.
>
> Please consider this:
>
>   Reviewed-by: Taylor Blau <me@ttaylorr.com>

Thanks, all.

^ permalink raw reply	[flat|nested] 82+ messages in thread

end of thread, other threads:[~2020-02-09 17:27 UTC | newest]

Thread overview: 82+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-14 19:25 [PATCH 0/8] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
2020-01-14 19:25 ` [PATCH 1/8] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
2020-01-16 21:40   ` Junio C Hamano
2020-01-14 19:25 ` [PATCH 2/8] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
2020-01-16 21:46   ` Junio C Hamano
2020-01-14 19:25 ` [PATCH 3/8] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
2020-01-14 19:30   ` Taylor Blau
2020-01-14 19:25 ` [PATCH 4/8] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
2020-01-14 21:16   ` Jeff King
2020-01-14 19:25 ` [PATCH 5/8] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
2020-01-14 19:26 ` [PATCH 6/8] sparse-checkout: warn on incorrect '*' in patterns Derrick Stolee via GitGitGadget
2020-01-14 19:26 ` [PATCH 7/8] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
2020-01-14 21:21   ` Jeff King
2020-01-14 22:08     ` Derrick Stolee
2020-01-14 19:26 ` [PATCH 8/8] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
2020-01-14 21:25   ` Jeff King
2020-01-14 22:11     ` Derrick Stolee
2020-01-14 22:48       ` Jeff King
2020-01-24 21:10         ` Derrick Stolee
2020-01-24 21:42           ` Jeff King
2020-01-28 15:03             ` Derrick Stolee
2020-01-14 19:34 ` [PATCH 0/8] Harden the sparse-checkout builtin Taylor Blau
2020-01-14 19:44   ` Derrick Stolee
2020-01-14 21:31     ` Jeff King
2020-01-15 19:16 ` Junio C Hamano
2020-01-15 20:32   ` Derrick Stolee
2020-01-24 21:19 ` [PATCH v2 00/12] " Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 01/12] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 02/12] t1091: improve here-docs Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 03/12] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 04/12] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 05/12] sparse-checkout: fix documentation typo for core.sparseCheckoutCone Jeff King via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 06/12] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 07/12] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 08/12] sparse-checkout: warn on incorrect '*' in patterns Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 09/12] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 10/12] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 11/12] sparse-checkout: use C-style quotes in 'list' subcommand Derrick Stolee via GitGitGadget
2020-01-24 21:19   ` [PATCH v2 12/12] sparse-checkout: improve docs around 'set' in cone mode Derrick Stolee via GitGitGadget
2020-01-28 18:26   ` [PATCH v3 00/12] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 01/12] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 02/12] t1091: improve here-docs Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 03/12] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 04/12] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 05/12] sparse-checkout: fix documentation typo for core.sparseCheckoutCone Jeff King via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 06/12] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 07/12] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 08/12] sparse-checkout: warn on incorrect '*' in patterns Derrick Stolee via GitGitGadget
2020-01-28 18:26     ` [PATCH v3 09/12] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
2020-01-29 10:03       ` Jeff King
2020-01-29 13:58         ` Derrick Stolee
2020-01-29 14:04           ` Derrick Stolee
2020-01-28 18:26     ` [PATCH v3 10/12] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
2020-01-29 10:17       ` Jeff King
2020-01-29 10:33         ` Jeff King
2020-01-29 14:16           ` Derrick Stolee
2020-01-29 14:39             ` Derrick Stolee
2020-01-30  7:29             ` Jeff King
2020-01-30 15:01               ` Derrick Stolee
2020-01-28 18:26     ` [PATCH v3 11/12] sparse-checkout: use C-style quotes in 'list' subcommand Derrick Stolee via GitGitGadget
2020-01-29 10:23       ` Jeff King
2020-01-28 18:26     ` [PATCH v3 12/12] sparse-checkout: improve docs around 'set' in cone mode Derrick Stolee via GitGitGadget
2020-01-31 20:16     ` [PATCH v4 00/15] Harden the sparse-checkout builtin Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 01/15] t1091: use check_files to reduce boilerplate Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 02/15] t1091: improve here-docs Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 03/15] sparse-checkout: create leading directories Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 04/15] clone: fix --sparse option with URLs Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 05/15] sparse-checkout: fix documentation typo for core.sparseCheckoutCone Jeff King via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 06/15] sparse-checkout: cone mode does not recognize "**" Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 07/15] sparse-checkout: detect short patterns Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 08/15] sparse-checkout: warn on globs in cone patterns Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 09/15] sparse-checkout: properly match escaped characters Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 10/15] sparse-checkout: write escaped patterns in cone mode Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 11/15] sparse-checkout: unquote C-style strings over --stdin Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 12/15] sparse-checkout: use C-style quotes in 'list' subcommand Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 13/15] sparse-checkout: escape all glob characters on write Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 14/15] sparse-checkout: improve docs around 'set' in cone mode Derrick Stolee via GitGitGadget
2020-01-31 20:16       ` [PATCH v4 15/15] sparse-checkout: fix cone mode behavior mismatch Derrick Stolee via GitGitGadget
2020-01-31 20:36       ` [PATCH v4 00/15] Harden the sparse-checkout builtin Elijah Newren
2020-02-03 14:09         ` Derrick Stolee
2020-02-08 23:32           ` Taylor Blau
2020-02-09 17:27             ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).