git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* BUG: git worktree add directory-path fails due to over-aggressive disallowing of worktrees with same suffix
@ 2019-02-11 23:50 Cameron Gunnin
  2019-02-12 22:44 ` Eric Sunshine
  0 siblings, 1 reply; 8+ messages in thread
From: Cameron Gunnin @ 2019-02-11 23:50 UTC (permalink / raw)
  To: git@vger.kernel.org

The bug:
  git init worktree-test-repo
  git --git-dir=worktree-test-repo/.git commit --allow-empty -m "first commit"
  git --git-dir=worktree-test-repo/.git branch branch1
  git --git-dir=worktree-test-repo/.git branch branch2
  mkdir unique-path-1 unique-path-2
  cd unique-path-1
  git --git-dir=../worktree-test-repo/.git worktree add subdir branch1
  cd ../unique-path-2
  git --git-dir=../worktree-test-repo/.git worktree add subdir branch2
  # FAILS WITH: fatal: 'subdir' is a missing but already registered worktree; use 'add -f' to override, or 'prune' or 'remove' to clear

Note: the following will succeed, implying that it is the suffix matching that is going awry:
  cd ..
  git --git-dir=worktree-test-repo/.git worktree add unique-path-2/subdir branch2

This appears to have been introduced by the following commit:
  commit cb56f55c16c128e18449c9417dc045f787c1b663
  Author: Eric Sunshine <sunshine@sunshineco.com>
  Date:   Tue Aug 28 17:20:22 2018 -0400

      worktree: disallow adding same path multiple times

The intention of the above commit was to avoid a bad case where a worktree at a given path was removed manually then re-created via 'git worktree add', leading to two entries in 'git worktree list' for the same directory. However, the bug shown here is a valid use case, as we're technically using two different paths.

The fix, I think, should be applied to builtin/worktree.c to the validate_worktree_add method. After finding a worktree that matches the suffix (via find_worktree), it should check that the absolute path of the found worktree is the same as the absolute path of the worktree being added, and allow the add when they are different. Or, perhaps there should be a way to invoke 'find_worktree' such that it only finds absolute path matches.

Thank you for your time,
- Cameron Gunnin


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: BUG: git worktree add directory-path fails due to over-aggressive disallowing of worktrees with same suffix
  2019-02-11 23:50 BUG: git worktree add directory-path fails due to over-aggressive disallowing of worktrees with same suffix Cameron Gunnin
@ 2019-02-12 22:44 ` Eric Sunshine
  2020-02-24  9:08   ` [PATCH 0/3] git-worktree: fix "add" being fooled by suffix matching Eric Sunshine
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Sunshine @ 2019-02-12 22:44 UTC (permalink / raw)
  To: Cameron Gunnin; +Cc: git@vger.kernel.org

On Mon, Feb 11, 2019 at 6:50 PM Cameron Gunnin
<cameron.gunnin@synopsys.com> wrote:
> The bug:
>   cd unique-path-1
>   git --git-dir=../worktree-test-repo/.git worktree add subdir branch1
>   cd ../unique-path-2
>   git --git-dir=../worktree-test-repo/.git worktree add subdir branch2
>   # FAILS WITH: fatal: 'subdir' is a missing but already registered worktree; use 'add -f' to override, or 'prune' or 'remove' to clear
>
> This appears to have been introduced by the following commit:
>       worktree: disallow adding same path multiple times
>
> The fix, I think, should be applied to builtin/worktree.c to the
> validate_worktree_add method. After finding a worktree that matches
> the suffix (via find_worktree), it should check that the absolute
> path of the found worktree is the same as the absolute path of the
> worktree being added, and allow the add when they are different. Or,
> perhaps there should be a way to invoke 'find_worktree' such that it
> only finds absolute path matches.

When crafting cb56f55c16 (worktree: disallow adding same path multiple
times, 2018-08-28), I flip-flopped between two implementations: (1)
using find_worktree(), and (2) manually scanning the worktree list
with absolute path comparison. The latter approach, would not have
suffered from this problem. The one ugly bit of the manual scan was
that it was a bit too cozy with the underlying worktree
implementation, so I eventually went with the find_worktree() approach
despite its obvious drawback of doing suffix matching (though, I
didn't feel particularly comfortable with the decision).

A likely reasonable approach to fixing it would probably be to
introduce new worktree API to find a worktree without magic suffix
matching (i.e. literal absolute path matching) which would avoid
having to imbue the higher-level "git worktree" command with that
low-level knowledge, and then take advantage of that new API.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 0/3] git-worktree: fix "add" being fooled by suffix matching
  2019-02-12 22:44 ` Eric Sunshine
@ 2020-02-24  9:08   ` Eric Sunshine
  2020-02-24  9:08     ` [PATCH 1/3] worktree: improve find_worktree() documentation Eric Sunshine
                       ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Eric Sunshine @ 2020-02-24  9:08 UTC (permalink / raw)
  To: git; +Cc: Duy Nguyen, Cameron Gunnin, Eric Sunshine

"git worktree add <path>" performs various checks before approving
<path> as a valid location for the new worktree. One such check is
asking whether <path> is already a registered worktree. Unfortunately,
this check can be fooled by the magic suffix matching performed by
find_worktree() as a convenience to minimize typing when identifying
worktrees on the command-line.

That the check could be fooled was a known issue, and a special-case
work-around for accidental matching against the main worktree was
already in place even when the validation check was first
implemented[1]. Since that time, an additional case of accidental suffix
matching has been reported[2], which was not covered by the existing
special case.

Rather than adding more band-aids to the validation to work around magic
matching performed by find_worktree(), this patch series fixes the
problem once and for all by locating a worktree deterministically based
only on pathname.

[1]: https://lore.kernel.org/git/CAPig+cQh8hxeoVjLHDKhAcZVQPpPT5v0AUY8gsL9=qfJ7z-L2A@mail.gmail.com/
[2]: https://lore.kernel.org/git/0308570E-AAA3-43B8-A592-F4DA9760DBED@synopsys.com/

Eric Sunshine (3):
  worktree: improve find_worktree() documentation
  worktree: add utility to find worktree by pathname
  worktree: don't allow "add" validation to be fooled by suffix matching

 builtin/worktree.c      |  9 +--------
 t/t2400-worktree-add.sh |  9 +++++++++
 worktree.c              | 16 ++++++++++------
 worktree.h              | 20 ++++++++++++++++++--
 4 files changed, 38 insertions(+), 16 deletions(-)

-- 
2.25.1.526.gf05a752211


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/3] worktree: improve find_worktree() documentation
  2020-02-24  9:08   ` [PATCH 0/3] git-worktree: fix "add" being fooled by suffix matching Eric Sunshine
@ 2020-02-24  9:08     ` Eric Sunshine
  2020-02-24  9:08     ` [PATCH 2/3] worktree: add utility to find worktree by pathname Eric Sunshine
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Eric Sunshine @ 2020-02-24  9:08 UTC (permalink / raw)
  To: git; +Cc: Duy Nguyen, Cameron Gunnin, Eric Sunshine

Do a better job of explaining that find_worktree()'s main purpose is to
locate a worktree based upon input from a user which may be some sort of
shorthand for identifying a worktree rather than an actual path. For
instance, one shorthand a user can use to identify a worktree is by
unique path suffix (i.e. given worktrees at paths "foo/bar" and
"foo/baz", the latter can be identified simply as "baz"). The actual
heuristics find_worktree() uses to select a worktree may be expanded in
the future (for instance, one day it may allow worktree selection by
<id> of the .git/worktrees/<id>/ administrative directory), thus the
documentation does not provide a precise description of how matching is
performed, instead leaving it open-ended to allow for future
enhancement.

While at it, drop mention of the non-NULL requirement of `prefix` since
NULL has long been allowed. For instance, prefix_filename() has
explicitly allowed NULL since 116fb64e43 (prefix_filename: drop length
parameter, 2017-03-20), and find_worktree() itself since e4da43b1f0
(prefix_filename: return newly allocated string, 2017-03-20).

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 worktree.h | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/worktree.h b/worktree.h
index caecc7a281..b8a851b92b 100644
--- a/worktree.h
+++ b/worktree.h
@@ -44,8 +44,18 @@ int submodule_uses_worktrees(const char *path);
 const char *get_worktree_git_dir(const struct worktree *wt);
 
 /*
- * Search a worktree that can be unambiguously identified by
- * "arg". "prefix" must not be NULL.
+ * Search for the worktree identified unambiguously by `arg` -- typically
+ * supplied by the user via the command-line -- which may be a pathname or some
+ * shorthand uniquely identifying a worktree, thus making it convenient for the
+ * user to specify a worktree with minimal typing. For instance, if the last
+ * component (say, "foo") of a worktree's pathname is unique among worktrees
+ * (say, "work/foo" and "work/bar"), it can be used to identify the worktree
+ * unambiguously.
+ *
+ * `prefix` should be the `prefix` handed to top-level Git commands along with
+ * `argc` and `argv`.
+ *
+ * Return the worktree identified by `arg`, or NULL if not found.
  */
 struct worktree *find_worktree(struct worktree **list,
 			       const char *prefix,
-- 
2.25.1.526.gf05a752211


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/3] worktree: add utility to find worktree by pathname
  2020-02-24  9:08   ` [PATCH 0/3] git-worktree: fix "add" being fooled by suffix matching Eric Sunshine
  2020-02-24  9:08     ` [PATCH 1/3] worktree: improve find_worktree() documentation Eric Sunshine
@ 2020-02-24  9:08     ` Eric Sunshine
  2020-02-24  9:08     ` [PATCH 3/3] worktree: don't allow "add" validation to be fooled by suffix matching Eric Sunshine
  2020-02-24 21:02     ` [PATCH 0/3] git-worktree: fix "add" being " Junio C Hamano
  3 siblings, 0 replies; 8+ messages in thread
From: Eric Sunshine @ 2020-02-24  9:08 UTC (permalink / raw)
  To: git; +Cc: Duy Nguyen, Cameron Gunnin, Eric Sunshine

find_worktree() employs heuristics to match user provided input -- which
may be a pathname or some sort of shorthand -- with an actual worktree.
Although this convenience allows a user to identify a worktree with
minimal typing, the black-box nature of these heuristics makes it
potentially difficult for callers which already know the exact path of a
worktree to be confident that the correct worktree will be returned for
any specific pathname (particularly a relative one), especially as the
heuristics are enhanced and updated.

Therefore, add a companion function, find_worktree_by_path(), which
deterministically identifies a worktree strictly by pathname with no
interpretation and no magic matching.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 worktree.c | 16 ++++++++++------
 worktree.h |  6 ++++++
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/worktree.c b/worktree.c
index 5b4793caa3..43c6685d4e 100644
--- a/worktree.c
+++ b/worktree.c
@@ -215,7 +215,6 @@ struct worktree *find_worktree(struct worktree **list,
 			       const char *arg)
 {
 	struct worktree *wt;
-	char *path;
 	char *to_free = NULL;
 
 	if ((wt = find_worktree_by_suffix(list, arg)))
@@ -223,11 +222,17 @@ struct worktree *find_worktree(struct worktree **list,
 
 	if (prefix)
 		arg = to_free = prefix_filename(prefix, arg);
-	path = real_pathdup(arg, 0);
-	if (!path) {
-		free(to_free);
+	wt = find_worktree_by_path(list, arg);
+	free(to_free);
+	return wt;
+}
+
+struct worktree *find_worktree_by_path(struct worktree **list, const char *p)
+{
+	char *path = real_pathdup(p, 0);
+
+	if (!path)
 		return NULL;
-	}
 	for (; *list; list++) {
 		const char *wt_path = real_path_if_valid((*list)->path);
 
@@ -235,7 +240,6 @@ struct worktree *find_worktree(struct worktree **list,
 			break;
 	}
 	free(path);
-	free(to_free);
 	return *list;
 }
 
diff --git a/worktree.h b/worktree.h
index b8a851b92b..d242a6e71c 100644
--- a/worktree.h
+++ b/worktree.h
@@ -61,6 +61,12 @@ struct worktree *find_worktree(struct worktree **list,
 			       const char *prefix,
 			       const char *arg);
 
+/*
+ * Return the worktree corresponding to `path`, or NULL if no such worktree
+ * exists.
+ */
+struct worktree *find_worktree_by_path(struct worktree **, const char *path);
+
 /*
  * Return true if the given worktree is the main one.
  */
-- 
2.25.1.526.gf05a752211


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 3/3] worktree: don't allow "add" validation to be fooled by suffix matching
  2020-02-24  9:08   ` [PATCH 0/3] git-worktree: fix "add" being fooled by suffix matching Eric Sunshine
  2020-02-24  9:08     ` [PATCH 1/3] worktree: improve find_worktree() documentation Eric Sunshine
  2020-02-24  9:08     ` [PATCH 2/3] worktree: add utility to find worktree by pathname Eric Sunshine
@ 2020-02-24  9:08     ` Eric Sunshine
  2020-02-24 21:06       ` Junio C Hamano
  2020-02-24 21:02     ` [PATCH 0/3] git-worktree: fix "add" being " Junio C Hamano
  3 siblings, 1 reply; 8+ messages in thread
From: Eric Sunshine @ 2020-02-24  9:08 UTC (permalink / raw)
  To: git; +Cc: Duy Nguyen, Cameron Gunnin, Eric Sunshine

"git worktree add <path>" performs various checks before approving
<path> as a valid location for the new worktree. Aside from ensuring
that <path> does not already exist, one of the questions it asks is
whether <path> is already a registered worktree. To perform this check,
it queries find_worktree() and disallows the "add" operation if
find_worktree() finds a match for <path>. As a convenience, however,
find_worktree() casts an overly wide net to allow users to identify
worktrees by shorthand in order to keep typing to a minimum. For
instance, it performs suffix matching which, given subtrees "foo/bar"
and "foo/baz", can correctly select the latter when asked only for
"baz".

"add" validation knows the exact path it is interrogating, so this sort
of heuristic-based matching is, at best, questionable for this use-case
and, at worst, may may accidentally interpret <path> as matching an
existing worktree and incorrectly report it as already registered even
when it isn't. (In fact, validate_worktree_add() already contains a
special case to avoid accidentally matching against the main worktree,
precisely due to this problem.)

Avoid the problem of potential accidental matching against an existing
worktree by instead taking advantage of find_worktree_by_path() which
matches paths deterministically, without applying any sort of magic
shorthand matching performed by find_worktree().

Reported-by: Cameron Gunnin <cameron.gunnin@synopsys.com>
Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
---
 builtin/worktree.c      | 9 +--------
 t/t2400-worktree-add.sh | 9 +++++++++
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/builtin/worktree.c b/builtin/worktree.c
index d6bc5263f1..24f22800f3 100644
--- a/builtin/worktree.c
+++ b/builtin/worktree.c
@@ -234,14 +234,7 @@ static void validate_worktree_add(const char *path, const struct add_opts *opts)
 		die(_("'%s' already exists"), path);
 
 	worktrees = get_worktrees(0);
-	/*
-	 * find_worktree()'s suffix matching may undesirably find the main
-	 * rather than a linked worktree (for instance, when the basenames
-	 * of the main worktree and the one being created are the same).
-	 * We're only interested in linked worktrees, so skip the main
-	 * worktree with +1.
-	 */
-	wt = find_worktree(worktrees + 1, NULL, path);
+	wt = find_worktree_by_path(worktrees, path);
 	if (!wt)
 		goto done;
 
diff --git a/t/t2400-worktree-add.sh b/t/t2400-worktree-add.sh
index b5ece19460..5a7495474a 100755
--- a/t/t2400-worktree-add.sh
+++ b/t/t2400-worktree-add.sh
@@ -570,6 +570,15 @@ test_expect_success '"add" an existing locked but missing worktree' '
 	git worktree add --force --force --detach gnoo
 '
 
+test_expect_success '"add" not tripped up by magic worktree matching"' '
+	# if worktree "sub1/bar" exists, "git worktree add bar" in distinct
+	# directory `sub2` should not mistakenly complain that `bar` is an
+	# already-registered worktree
+	mkdir sub1 sub2 &&
+	git -C sub1 --git-dir=../.git worktree add --detach bozo &&
+	git -C sub2 --git-dir=../.git worktree add --detach bozo
+'
+
 test_expect_success FUNNYNAMES 'sanitize generated worktree name' '
 	git worktree add --detach ".  weird*..?.lock.lock" &&
 	test -d .git/worktrees/---weird-.-
-- 
2.25.1.526.gf05a752211


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH 0/3] git-worktree: fix "add" being fooled by suffix matching
  2020-02-24  9:08   ` [PATCH 0/3] git-worktree: fix "add" being fooled by suffix matching Eric Sunshine
                       ` (2 preceding siblings ...)
  2020-02-24  9:08     ` [PATCH 3/3] worktree: don't allow "add" validation to be fooled by suffix matching Eric Sunshine
@ 2020-02-24 21:02     ` Junio C Hamano
  3 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2020-02-24 21:02 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git, Duy Nguyen, Cameron Gunnin

Eric Sunshine <sunshine@sunshineco.com> writes:

> Rather than adding more band-aids to the validation to work around magic
> matching performed by find_worktree(), this patch series fixes the
> problem once and for all by locating a worktree deterministically based
> only on pathname.

Makes sense.  I wonder if strange synonyms that make more than one
paths string refer to the same filesystem entity (read: case
insensitive or unicode smashing filesystems Windows and macintoshes
use) have similar effect on the codepath, and if so, if we want to
do anything about it (my personal answers are "I don't know but I
would not be surprised if it is the case" and "No, just tell users
not to hurt themselves and move on").

Will queue.  Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH 3/3] worktree: don't allow "add" validation to be fooled by suffix matching
  2020-02-24  9:08     ` [PATCH 3/3] worktree: don't allow "add" validation to be fooled by suffix matching Eric Sunshine
@ 2020-02-24 21:06       ` Junio C Hamano
  0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2020-02-24 21:06 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: git, Duy Nguyen, Cameron Gunnin

Eric Sunshine <sunshine@sunshineco.com> writes:

> Avoid the problem of potential accidental matching against an existing
> worktree by instead taking advantage of find_worktree_by_path() which
> matches paths deterministically, without applying any sort of magic
> shorthand matching performed by find_worktree().

Combination of 2/3 and 3/3 makes perfect sense as "fix once and for
all" solution.

Thanks.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-02-24 21:06 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-11 23:50 BUG: git worktree add directory-path fails due to over-aggressive disallowing of worktrees with same suffix Cameron Gunnin
2019-02-12 22:44 ` Eric Sunshine
2020-02-24  9:08   ` [PATCH 0/3] git-worktree: fix "add" being fooled by suffix matching Eric Sunshine
2020-02-24  9:08     ` [PATCH 1/3] worktree: improve find_worktree() documentation Eric Sunshine
2020-02-24  9:08     ` [PATCH 2/3] worktree: add utility to find worktree by pathname Eric Sunshine
2020-02-24  9:08     ` [PATCH 3/3] worktree: don't allow "add" validation to be fooled by suffix matching Eric Sunshine
2020-02-24 21:06       ` Junio C Hamano
2020-02-24 21:02     ` [PATCH 0/3] git-worktree: fix "add" being " Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).