git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] test: some testcases failed if cwd is on a symlink
@ 2012-07-24  8:00 Jiang Xin
  2012-07-24  8:24 ` Stefano Lattarini
                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Jiang Xin @ 2012-07-24  8:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Jiang Xin

Run command 'git rev-parse --git-dir' under subdir will return realpath
of '.git' directory. Some test scripts compare this realpath against
"$TRASH_DIRECTORY", they are not equal if current working directory is
on a symlink.

In this fix, get realpath of "$TRASH_DIRECTORY", store it in
"$TRASH_REALPATH" variable, and use it when necessary.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 t/t4035-diff-quiet.sh  |  8 +++++---
 t/t9903-bash-prompt.sh | 13 +++++++------
 2 个文件被修改,插入 12 行(+),删除 9 行(-)

diff --git a/t/t4035-diff-quiet.sh b/t/t4035-diff-quiet.sh
index 23141..5855 100755
--- a/t/t4035-diff-quiet.sh
+++ b/t/t4035-diff-quiet.sh
@@ -4,6 +4,8 @@ test_description='Return value of diffs'
 
 . ./test-lib.sh
 
+TRASH_REALPATH="$(cd "$TRASH_DIRECTORY"; pwd -P)"
+
 test_expect_success 'setup' '
 	echo 1 >a &&
 	git add . &&
@@ -102,7 +104,7 @@ test_expect_success 'git diff, one file outside repo' '
 
 test_expect_success 'git diff, both files outside repo' '
 	(
-		GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/test-outside" &&
+		GIT_CEILING_DIRECTORIES="$TRASH_REALPATH/test-outside" &&
 		export GIT_CEILING_DIRECTORIES &&
 		cd test-outside/non/git &&
 		test_expect_code 0 git diff --quiet a matching-file &&
@@ -120,7 +122,7 @@ test_expect_success 'git diff --ignore-space-at-eol, one file outside repo' '
 
 test_expect_success 'git diff --ignore-space-at-eol, both files outside repo' '
 	(
-		GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/test-outside" &&
+		GIT_CEILING_DIRECTORIES="$TRASH_REALPATH/test-outside" &&
 		export GIT_CEILING_DIRECTORIES &&
 		cd test-outside/non/git &&
 		test_expect_code 0 git diff --quiet --ignore-space-at-eol a trailing-space &&
@@ -139,7 +141,7 @@ test_expect_success 'git diff --ignore-all-space, one file outside repo' '
 
 test_expect_success 'git diff --ignore-all-space, both files outside repo' '
 	(
-		GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/test-outside" &&
+		GIT_CEILING_DIRECTORIES="$TRASH_REALPATH/test-outside" &&
 		export GIT_CEILING_DIRECTORIES &&
 		cd test-outside/non/git &&
 		test_expect_code 0 git diff --quiet --ignore-all-space a trailing-space &&
diff --git a/t/t9903-bash-prompt.sh b/t/t9903-bash-prompt.sh
index f17c1f..275bc 100755
--- a/t/t9903-bash-prompt.sh
+++ b/t/t9903-bash-prompt.sh
@@ -10,6 +10,7 @@ test_description='test git-specific bash prompt functions'
 . "$GIT_BUILD_DIR/contrib/completion/git-prompt.sh"
 
 actual="$TRASH_DIRECTORY/actual"
+TRASH_REALPATH="$(cd "$TRASH_DIRECTORY"; pwd -P)"
 
 test_expect_success 'setup for prompt tests' '
 	mkdir -p subdir/subsubdir &&
@@ -59,7 +60,7 @@ test_expect_success 'gitdir - .git directory in cwd' '
 '
 
 test_expect_success 'gitdir - .git directory in parent' '
-	echo "$TRASH_DIRECTORY/.git" > expected &&
+	echo "$TRASH_REALPATH/.git" > expected &&
 	(
 		cd subdir/subsubdir &&
 		__gitdir > "$actual"
@@ -77,7 +78,7 @@ test_expect_success 'gitdir - cwd is a .git directory' '
 '
 
 test_expect_success 'gitdir - parent is a .git directory' '
-	echo "$TRASH_DIRECTORY/.git" > expected &&
+	echo "$TRASH_REALPATH/.git" > expected &&
 	(
 		cd .git/refs/heads &&
 		__gitdir > "$actual"
@@ -115,7 +116,7 @@ test_expect_success 'gitdir - non-existing $GIT_DIR' '
 '
 
 test_expect_success 'gitdir - gitfile in cwd' '
-	echo "$TRASH_DIRECTORY/otherrepo/.git" > expected &&
+	echo "$TRASH_REALPATH/otherrepo/.git" > expected &&
 	echo "gitdir: $TRASH_DIRECTORY/otherrepo/.git" > subdir/.git &&
 	test_when_finished "rm -f subdir/.git" &&
 	(
@@ -126,7 +127,7 @@ test_expect_success 'gitdir - gitfile in cwd' '
 '
 
 test_expect_success 'gitdir - gitfile in parent' '
-	echo "$TRASH_DIRECTORY/otherrepo/.git" > expected &&
+	echo "$TRASH_REALPATH/otherrepo/.git" > expected &&
 	echo "gitdir: $TRASH_DIRECTORY/otherrepo/.git" > subdir/.git &&
 	test_when_finished "rm -f subdir/.git" &&
 	(
@@ -137,7 +138,7 @@ test_expect_success 'gitdir - gitfile in parent' '
 '
 
 test_expect_success SYMLINKS 'gitdir - resulting path avoids symlinks' '
-	echo "$TRASH_DIRECTORY/otherrepo/.git" > expected &&
+	echo "$TRASH_REALPATH/otherrepo/.git" > expected &&
 	mkdir otherrepo/dir &&
 	test_when_finished "rm -rf otherrepo/dir" &&
 	ln -s otherrepo/dir link &&
@@ -152,7 +153,7 @@ test_expect_success SYMLINKS 'gitdir - resulting path avoids symlinks' '
 test_expect_success 'gitdir - not a git repository' '
 	(
 		cd subdir/subsubdir &&
-		GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY" &&
+		GIT_CEILING_DIRECTORIES="$TRASH_REALPATH" &&
 		export GIT_CEILING_DIRECTORIES &&
 		test_must_fail __gitdir
 	)
-- 
1.7.12.rc0.16.gf4916ac

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] test: some testcases failed if cwd is on a symlink
  2012-07-24  8:00 [PATCH] test: some testcases failed if cwd is on a symlink Jiang Xin
@ 2012-07-24  8:24 ` Stefano Lattarini
  2012-07-24 10:59 ` Pete Wyckoff
  2012-07-24 14:47 ` Junio C Hamano
  2 siblings, 0 replies; 38+ messages in thread
From: Stefano Lattarini @ 2012-07-24  8:24 UTC (permalink / raw)
  To: Jiang Xin; +Cc: Junio C Hamano, Git List

Some grammatical nits about the commit message.  I hope this doesn't
come across as too picky/annoying ...  And you might want to wait for
a native to confirm whether these nits are actually all warranted.

On 07/24/2012 10:00 AM, Jiang Xin wrote:
> Run
>
s/Run/Running/

> command 'git rev-parse --git-dir' under subdir
>
s/under subdir/under a subdir/.  Or even better IMHO,
s/under subdir/in a subdir/

> will return realpath
>
s/realpath/the realpath/

> of '.git' directory.
>
s/of/of the/

> Some test scripts compare this realpath against
> "$TRASH_DIRECTORY", they are not equal
>
s/they are not/but they are not/

> if current working directory is on a symlink.
>
s/current/the current/

> In this fix, get realpath
>
s/realpath/the realpath/

> of "$TRASH_DIRECTORY", store it in
> "$TRASH_REALPATH" variable, and use it when necessary.
> 
> Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
> ---
>  t/t4035-diff-quiet.sh  |  8 +++++---
>  t/t9903-bash-prompt.sh | 13 +++++++------
>  2 个文件被修改,插入 12 行(+),删除 9 行(-)
> 
> diff --git a/t/t4035-diff-quiet.sh b/t/t4035-diff-quiet.sh
> index 23141..5855 100755
> --- a/t/t4035-diff-quiet.sh
> +++ b/t/t4035-diff-quiet.sh
> @@ -4,6 +4,8 @@ test_description='Return value of diffs'
>  
>  . ./test-lib.sh
>  
> +TRASH_REALPATH="$(cd "$TRASH_DIRECTORY"; pwd -P)"
> +
BTW, the outer quotes are not needed; this is enough:

    TRASH_REALPATH=$(cd "$TRASH_DIRECTORY"; pwd -P)

Regards,
  Stefano

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] test: some testcases failed if cwd is on a symlink
  2012-07-24  8:00 [PATCH] test: some testcases failed if cwd is on a symlink Jiang Xin
  2012-07-24  8:24 ` Stefano Lattarini
@ 2012-07-24 10:59 ` Pete Wyckoff
  2012-07-24 14:47 ` Junio C Hamano
  2 siblings, 0 replies; 38+ messages in thread
From: Pete Wyckoff @ 2012-07-24 10:59 UTC (permalink / raw)
  To: Jiang Xin; +Cc: Junio C Hamano, Git List

worldhello.net@gmail.com wrote on Tue, 24 Jul 2012 16:00 +0800:
> Run command 'git rev-parse --git-dir' under subdir will return realpath
> of '.git' directory. Some test scripts compare this realpath against
> "$TRASH_DIRECTORY", they are not equal if current working directory is
> on a symlink.
> 
> In this fix, get realpath of "$TRASH_DIRECTORY", store it in
> "$TRASH_REALPATH" variable, and use it when necessary.

We have the same problem with p4.  I fixed it in t98* in
23bd0c9 (git p4 test: use real_path to resolve p4 client
symlinks, 2012-06-27), but maybe an exported
TRASH_DIRECTORY_REAL_PATH might be generally useful.

> +TRASH_REALPATH="$(cd "$TRASH_DIRECTORY"; pwd -P)"

There's a portable test helper that does this already:

    path=$(test-path-utils real_path "$path")

		-- Pete

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] test: some testcases failed if cwd is on a symlink
  2012-07-24  8:00 [PATCH] test: some testcases failed if cwd is on a symlink Jiang Xin
  2012-07-24  8:24 ` Stefano Lattarini
  2012-07-24 10:59 ` Pete Wyckoff
@ 2012-07-24 14:47 ` Junio C Hamano
  2012-07-24 22:06   ` Jiang Xin
  2012-08-18 14:41   ` Michael Haggerty
  2 siblings, 2 replies; 38+ messages in thread
From: Junio C Hamano @ 2012-07-24 14:47 UTC (permalink / raw)
  To: Jiang Xin; +Cc: Git List

Jiang Xin <worldhello.net@gmail.com> writes:

> Run command 'git rev-parse --git-dir' under subdir will return realpath
> of '.git' directory. Some test scripts compare this realpath against
> "$TRASH_DIRECTORY", they are not equal if current working directory is
> on a symlink.
>
> In this fix, get realpath of "$TRASH_DIRECTORY", store it in
> "$TRASH_REALPATH" variable, and use it when necessary.

I wonder if running test in a real directory (in other words, "fix"
your cwd) may be a simpler, more robust and generally a better
solution, e.g. something silly like...

diff --git a/t/test-lib.sh b/t/test-lib.sh
index acda33d..7f6fb0a 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -15,6 +15,8 @@
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see http://www.gnu.org/licenses/ .
 
+cd "$(pwd -P)"
+
 # if --tee was passed, write the output not only to the terminal, but
 # additionally to the file test-results/$BASENAME.out, too.
 case "$GIT_TEST_TEE_STARTED, $* " in

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH] test: some testcases failed if cwd is on a symlink
  2012-07-24 14:47 ` Junio C Hamano
@ 2012-07-24 22:06   ` Jiang Xin
  2012-08-18 14:41   ` Michael Haggerty
  1 sibling, 0 replies; 38+ messages in thread
From: Jiang Xin @ 2012-07-24 22:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List

2012/7/24 Junio C Hamano <gitster@pobox.com>:
> I wonder if running test in a real directory (in other words, "fix"
> your cwd) may be a simpler, more robust and generally a better
> solution, e.g. something silly like...
>
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index acda33d..7f6fb0a 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -15,6 +15,8 @@
>  # You should have received a copy of the GNU General Public License
>  # along with this program.  If not, see http://www.gnu.org/licenses/ .
>
> +cd "$(pwd -P)"
> +

Yes, it's much simpler.

-- 
Jiang Xin

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] test: some testcases failed if cwd is on a symlink
  2012-07-24 14:47 ` Junio C Hamano
  2012-07-24 22:06   ` Jiang Xin
@ 2012-08-18 14:41   ` Michael Haggerty
  2012-08-18 20:36     ` Junio C Hamano
  1 sibling, 1 reply; 38+ messages in thread
From: Michael Haggerty @ 2012-08-18 14:41 UTC (permalink / raw)
  To: git

Junio C Hamano <gitster <at> pobox.com> writes:

> 
> Jiang Xin <worldhello.net <at> gmail.com> writes:
> 
> > Run command 'git rev-parse --git-dir' under subdir will return realpath
> > of '.git' directory. Some test scripts compare this realpath against
> > "$TRASH_DIRECTORY", they are not equal if current working directory is
> > on a symlink.
> >
> > In this fix, get realpath of "$TRASH_DIRECTORY", store it in
> > "$TRASH_REALPATH" variable, and use it when necessary.
> 
> I wonder if running test in a real directory (in other words, "fix"
> your cwd) may be a simpler, more robust and generally a better
> solution, e.g. something silly like...
> 
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index acda33d..7f6fb0a 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -15,6 +15,8 @@
>  # You should have received a copy of the GNU General Public License
>  # along with this program.  If not, see http://www.gnu.org/licenses/ .
> 
> +cd "$(pwd -P)"
> +
>  # if --tee was passed, write the output not only to the terminal, but
>  # additionally to the file test-results/$BASENAME.out, too.
>  case "$GIT_TEST_TEE_STARTED, $* " in

What is the status of this bug?  Today I wasted a bunch of time trying to track 
down a build breakage that was ultimately caused by this problem.  I was running 
the test suite on master with "--root=/dev/shm" (my usual setting), but this 
caused tests t4035 and t9903 to fail as described upthread.  (It turns out that 
on my system, /dev/shm is a symlink to /run/shm.)

For me, the failure is fixed by Jiang Xin's patch, but it is not fixed by 
Junio's.  In the case of t4035 in failing test "git diff --ignore-all-space, 
both files outside repo", right before "git diff" is called,

PWD=/run/shm/trash directory.t4035-diff-quiet/test-outside/non/git
GIT_CEILING_DIRECTORIES=/dev/shm/trash directory.t4035-diff-quiet/test-outside

I can work around the problem by using "--root=/run/shm".  But it would be good 
to get this problem fixed one way or the other to spare other people the same 
pain.

Michael

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] test: some testcases failed if cwd is on a symlink
  2012-08-18 14:41   ` Michael Haggerty
@ 2012-08-18 20:36     ` Junio C Hamano
  2012-08-19 13:57       ` Michael Haggerty
  0 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2012-08-18 20:36 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> I can work around the problem by using "--root=/run/shm". 

I do not necessarily think it is a work around.

A low-impact approach may be to update the part that parses --root
option to do

	root=$(...)
        root=$( cd "$root" && /bin/pwd )

or something.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] test: some testcases failed if cwd is on a symlink
  2012-08-18 20:36     ` Junio C Hamano
@ 2012-08-19 13:57       ` Michael Haggerty
  2012-08-19 16:43         ` Junio C Hamano
  2012-08-27  5:13         ` [PATCH v2] test: set the realpath of CWD as TRASH_DIRECTORY Jiang Xin
  0 siblings, 2 replies; 38+ messages in thread
From: Michael Haggerty @ 2012-08-19 13:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 08/18/2012 10:36 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> I can work around the problem by using "--root=/run/shm". 
> 
> I do not necessarily think it is a work around.

http://en.wiktionary.org/wiki/workaround:
2. (computing) A procedure or a temporary fix that bypasses a problem
   and allows the user to continue working until a better solution
   can be provided; a kluge.

For me that is exactly what it was.

> A low-impact approach may be to update the part that parses --root
> option to do
> 
> 	root=$(...)
>         root=$( cd "$root" && /bin/pwd )
> 
> or something.

I just verified that the combination of your two suggestions (i.e., the
patch below) fixes the problem for me.

Nevertheless, I'm not sure that this is the best solution.  The test
failures that occur without this change suggest to me that
GIT_CEILING_DIRECTORIES is implemented in a fragile way.

Michael

diff --git a/t/test-lib.sh b/t/test-lib.sh
index bb4f886..c7f320f 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -15,6 +15,8 @@
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see http://www.gnu.org/licenses/ .

+cd "$(pwd -P)"
+
 # if --tee was passed, write the output not only to the terminal, but
 # additionally to the file test-results/$BASENAME.out, too.
 case "$GIT_TEST_TEE_STARTED, $* " in
@@ -166,6 +168,7 @@ do
                shift ;; # was handled already
        --root=*)
                root=$(expr "z$1" : 'z[^=]*=\(.*\)')
+               root=$(cd "$root" && /bin/pwd)
                shift ;;
        *)
                echo "error: unknown test option '$1'" >&2; exit 1 ;;


-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH] test: some testcases failed if cwd is on a symlink
  2012-08-19 13:57       ` Michael Haggerty
@ 2012-08-19 16:43         ` Junio C Hamano
  2012-08-21  5:59           ` Michael Haggerty
  2012-08-27  5:13         ` [PATCH v2] test: set the realpath of CWD as TRASH_DIRECTORY Jiang Xin
  1 sibling, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2012-08-19 16:43 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> I just verified that the combination of your two suggestions (i.e., the
> patch below) fixes the problem for me.

Good to know.

The only remaining two worries from me are if everybody has working
pwd at that early point in the script (I think MINGW replaces pwd
with its own), and if the latter one should really be "/bin/pwd"
everywhere.  Saying "Give the true path to --root when you run it"
can sidestep the latter issue ;-)

> Nevertheless, I'm not sure that this is the best solution.  The test
> failures that occur without this change suggest to me that
> GIT_CEILING_DIRECTORIES is implemented in a fragile way.

Hrmph.  How would you improve it?  chdir() around twice and compare?

> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index bb4f886..c7f320f 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -15,6 +15,8 @@
>  # You should have received a copy of the GNU General Public License
>  # along with this program.  If not, see http://www.gnu.org/licenses/ .
>
> +cd "$(pwd -P)"
> +
>  # if --tee was passed, write the output not only to the terminal, but
>  # additionally to the file test-results/$BASENAME.out, too.
>  case "$GIT_TEST_TEE_STARTED, $* " in
> @@ -166,6 +168,7 @@ do
>                 shift ;; # was handled already
>         --root=*)
>                 root=$(expr "z$1" : 'z[^=]*=\(.*\)')
> +               root=$(cd "$root" && /bin/pwd)
>                 shift ;;
>         *)
>                 echo "error: unknown test option '$1'" >&2; exit 1 ;;

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] test: some testcases failed if cwd is on a symlink
  2012-08-19 16:43         ` Junio C Hamano
@ 2012-08-21  5:59           ` Michael Haggerty
  0 siblings, 0 replies; 38+ messages in thread
From: Michael Haggerty @ 2012-08-21  5:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 08/19/2012 06:43 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> I just verified that the combination of your two suggestions (i.e., the
>> patch below) fixes the problem for me.
> 
> Good to know.
> 
> The only remaining two worries from me are if everybody has working
> pwd at that early point in the script (I think MINGW replaces pwd
> with its own), and if the latter one should really be "/bin/pwd"
> everywhere.  Saying "Give the true path to --root when you run it"
> can sidestep the latter issue ;-)
> 
>> Nevertheless, I'm not sure that this is the best solution.  The test
>> failures that occur without this change suggest to me that
>> GIT_CEILING_DIRECTORIES is implemented in a fragile way.
> 
> Hrmph.  How would you improve it?  chdir() around twice and compare?

I believe that the old-school method is to stat(2) the two directories
and check whether their device IDs and inode numbers are identical.  But
I don't know whether that is portable to other allegedly
POSIX-compatible OSs, or even works with all modern filesystems (I think
there was just a thread about a FUSE filesystem that sometimes changes
inode numbers).

Another alternative is to write a function that knows how to convert an
arbitrary path into an absolute path, including converting relative
paths to absolute, resolving symlinks, eliminating redundant "/" and
".", resolving "..", and perhaps canonicalizing "/" vs. "\" and
who-knows-what-else on Windows.  It takes a bit of care to implement
this correctly, but it might be a useful function to have around.
Python's library function os.path.realpath() is an example [1].

Either approach would avoid chdir()ing around even temporarily, which
would anyway be bad form in git proper (as opposed to the test suite).
And it would avoid the need to chdir() permanently in the test suite,
which can have the effect of making directory names appear in unfamiliar
forms.

I'm afraid I don't have time to work on this now; I'm still trying to
finish the next iteration of the post-receive-email hook script replacement.

Michael

[1] (Python) source code here:

    http://hg.python.org/cpython/file/20985f52b65e/Lib/posixfile.py

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2] test: set the realpath of CWD as TRASH_DIRECTORY
  2012-08-19 13:57       ` Michael Haggerty
  2012-08-19 16:43         ` Junio C Hamano
@ 2012-08-27  5:13         ` Jiang Xin
  2012-08-27 16:15           ` Junio C Hamano
  1 sibling, 1 reply; 38+ messages in thread
From: Jiang Xin @ 2012-08-27  5:13 UTC (permalink / raw)
  To: Junio C Hamano, Michael Haggerty; +Cc: Git List, Jiang Xin

Some testcases will fail if current work directory is on a symlink.

    symlink$ sh ./t4035-diff-quiet.sh
    $ sh ./t4035-diff-quiet.sh --root=/symlink
    $ TEST_OUTPUT_DIRECTORY=/symlink sh ./t4035-diff-quiet.sh

This is because the realpath of ".git" directory will be returned when
running the command 'git rev-parse --git-dir' in a subdir of the work
tree, and the realpath may not equal to "$TRASH_DIRECTORY".

In this fix, "$TRASH_DIRECTORY" is determined right after the realpath
of CWD is resolved.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
---
 t/test-lib.sh | 9 +++++----
 1 个文件被修改,插入 5 行(+),删除 4 行(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 78c42..9a59ca8 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -531,17 +531,17 @@ fi
 test="trash directory.$(basename "$0" .sh)"
 test -n "$root" && test="$root/$test"
 case "$test" in
-/*) TRASH_DIRECTORY="$test" ;;
- *) TRASH_DIRECTORY="$TEST_OUTPUT_DIRECTORY/$test" ;;
+/*) ;;
+ *) test="$TEST_OUTPUT_DIRECTORY/$test" ;;
 esac
-test ! -z "$debug" || remove_trash=$TRASH_DIRECTORY
+test ! -z "$debug" || remove_trash=$test
 rm -fr "$test" || {
 	GIT_EXIT_OK=t
 	echo >&5 "FATAL: Cannot prepare test area"
 	exit 1
 }
 
-HOME="$TRASH_DIRECTORY"
+HOME="$test"
 export HOME
 
 if test -z "$TEST_NO_CREATE_REPO"; then
@@ -552,6 +552,7 @@ fi
 # Use -P to resolve symlinks in our working directory so that the cwd
 # in subprocesses like git equals our $PWD (for pathname comparisons).
 cd -P "$test" || exit 1
+TRASH_DIRECTORY="$(pwd)"
 
 this_test=${0##*/}
 this_test=${this_test%%-*}
-- 
1.7.12.92.gaa91cb5

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] test: set the realpath of CWD as TRASH_DIRECTORY
  2012-08-27  5:13         ` [PATCH v2] test: set the realpath of CWD as TRASH_DIRECTORY Jiang Xin
@ 2012-08-27 16:15           ` Junio C Hamano
  2012-08-29  4:14             ` Michael Haggerty
  0 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2012-08-27 16:15 UTC (permalink / raw)
  To: Jiang Xin; +Cc: Michael Haggerty, Git List

Jiang Xin <worldhello.net@gmail.com> writes:

> Some testcases will fail if current work directory is on a symlink.
>
>     symlink$ sh ./t4035-diff-quiet.sh
>     $ sh ./t4035-diff-quiet.sh --root=/symlink
>     $ TEST_OUTPUT_DIRECTORY=/symlink sh ./t4035-diff-quiet.sh
>
> This is because the realpath of ".git" directory will be returned when
> running the command 'git rev-parse --git-dir' in a subdir of the work
> tree, and the realpath may not equal to "$TRASH_DIRECTORY".
>
> In this fix, "$TRASH_DIRECTORY" is determined right after the realpath
> of CWD is resolved.
>
> Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
> Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
> Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
> ---

I think this is in line with what was discussed in the other thread
Michael brought this up.  Thanks for following it through.

Michael, this looks good to me; anything I missed?

> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 78c42..9a59ca8 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -531,17 +531,17 @@ fi
>  test="trash directory.$(basename "$0" .sh)"
>  test -n "$root" && test="$root/$test"
>  case "$test" in
> -/*) TRASH_DIRECTORY="$test" ;;
> - *) TRASH_DIRECTORY="$TEST_OUTPUT_DIRECTORY/$test" ;;
> +/*) ;;
> + *) test="$TEST_OUTPUT_DIRECTORY/$test" ;;
>  esac
> -test ! -z "$debug" || remove_trash=$TRASH_DIRECTORY
> +test ! -z "$debug" || remove_trash=$test
>  rm -fr "$test" || {
>  	GIT_EXIT_OK=t
>  	echo >&5 "FATAL: Cannot prepare test area"
>  	exit 1
>  }
>  
> -HOME="$TRASH_DIRECTORY"
> +HOME="$test"
>  export HOME
>  
>  if test -z "$TEST_NO_CREATE_REPO"; then
> @@ -552,6 +552,7 @@ fi
>  # Use -P to resolve symlinks in our working directory so that the cwd
>  # in subprocesses like git equals our $PWD (for pathname comparisons).
>  cd -P "$test" || exit 1
> +TRASH_DIRECTORY="$(pwd)"
>  
>  this_test=${0##*/}
>  this_test=${this_test%%-*}

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] test: set the realpath of CWD as TRASH_DIRECTORY
  2012-08-27 16:15           ` Junio C Hamano
@ 2012-08-29  4:14             ` Michael Haggerty
  2012-08-29  6:06               ` Junio C Hamano
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Haggerty @ 2012-08-29  4:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Git List

On 08/27/2012 06:15 PM, Junio C Hamano wrote:
> Jiang Xin <worldhello.net@gmail.com> writes:
> 
>> Some testcases will fail if current work directory is on a symlink.
>>
>>     symlink$ sh ./t4035-diff-quiet.sh
>>     $ sh ./t4035-diff-quiet.sh --root=/symlink
>>     $ TEST_OUTPUT_DIRECTORY=/symlink sh ./t4035-diff-quiet.sh
>>
>> This is because the realpath of ".git" directory will be returned when
>> running the command 'git rev-parse --git-dir' in a subdir of the work
>> tree, and the realpath may not equal to "$TRASH_DIRECTORY".
>>
>> In this fix, "$TRASH_DIRECTORY" is determined right after the realpath
>> of CWD is resolved.
>>
>> Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
>> Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
>> Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
>> ---
> 
> I think this is in line with what was discussed in the other thread
> Michael brought this up.  Thanks for following it through.
> 
> Michael, this looks good to me; anything I missed?

I can confirm that this patch eliminates the test failures that I was
seeing in t4035 and t9903.

But it also changes almost 600 *other* tests from "succeed even in the
presence of symlinks" to "never tested in the presence of symlinks", and
I think that is surrendering more ground than necessary.  I would rather
see one of the following approaches:

*If* the official policy is that GIT_CEILING_DIRECTORIES is not reliable
in the presence of symlinks, then (a) that limitation should be
mentioned in the documentation; (b) the affected tests should either be
skipped in the case of symlinked directories or they (alone!) should
take measures to work around the problem.

*If* the official policy is that GIT_CEILING_DIRECTORIES should work in
the presence of symlinks, then (a) we should add a failing test case
that specifically documents this bug; (b) other tests that fail as a
side effect of this bug even though they are trying to test something
else should either be skipped in the case of symlinked directories or
they (alone!) should take measures to work around the problem; (c) the
bug should be fixed as soon as possible.

In fact, we could even go further:

* The "cd -P" in test-lib.sh could be changed to "cd -L", to avoid
masking other problems related to symlinks.  If I make that change, I
get test failures in 10 files when using --root=/symlinkeddir, and not
all of them are obviously related to GIT_CEILING_DIRECTORIES.  Some of
these might simply be sloppiness in how the test scripts are written,
but others might be bugs in git proper.

* We could *intentionally* access the trash directories via a symlink on
systems that support symlinks (much like the trash directory names
intentionally include a space) to get *systematic* test coverage of
symlink issues, rather than occasional testing and mysterious failures
when somebody happens to have a symlink in his build path or root.

(But it is still the case that I don't have time to work on this.)

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] test: set the realpath of CWD as TRASH_DIRECTORY
  2012-08-29  4:14             ` Michael Haggerty
@ 2012-08-29  6:06               ` Junio C Hamano
  2012-08-29  8:15                 ` Michael Haggerty
  0 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2012-08-29  6:06 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jiang Xin, Git List

Michael Haggerty <mhagger@alum.mit.edu> writes:

> But it also changes almost 600 *other* tests from "succeed even in the
> presence of symlinks" to "never tested in the presence of symlinks", and
> I think that is surrendering more ground than necessary.

Ouch.  I did not know have 600+ tests that cares about CEILING.

> I would rather
> see one of the following approaches:
>
> *If* the official policy is that GIT_CEILING_DIRECTORIES is not reliable
> in the presence of symlinks, then (a) that limitation should be
> mentioned in the documentation; (b) the affected tests should either be
> skipped in the case of symlinked directories or they (alone!) should
> take measures to work around the problem.

What exactly is broken in CEILING?

I somehow thought that Jiang's patch was to make sure any variables
that contain pathnames (and make sure future paths we might grab out
of $(pwd)) are realpath without symlinks early in the test set-up,
and with that arrangement, no symlink gotcha should come into
picture, with or without CEILING.

Perhaps the setting of the CEILING has not been correctly converted
with the patch?

Or is there something fundamentally broken, even if we do not have
any symlinks involved, with CEILING check?

Puzzled..

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] test: set the realpath of CWD as TRASH_DIRECTORY
  2012-08-29  6:06               ` Junio C Hamano
@ 2012-08-29  8:15                 ` Michael Haggerty
  2012-08-29 16:33                   ` Junio C Hamano
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Haggerty @ 2012-08-29  8:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Git List, Lea Wiemann

On 08/29/2012 08:06 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> But it also changes almost 600 *other* tests from "succeed even in the
>> presence of symlinks" to "never tested in the presence of symlinks", and
>> I think that is surrendering more ground than necessary.
> 
> Ouch.  I did not know have 600+ tests that cares about CEILING.

No, I'm referring to the whole test suite, which is approximately 600
files :-)  Because the patch changes TEST_DIRECTORY, it affects the
environment of all tests that use that variable (one of which being via
GIT_CEILING_DIRECTORIES).

But really I shouldn't have made it sound like this patch is so
terrible, because it is just a logical continuation of the approach begun by

    1bd9c648 t/test-lib.sh: resolve symlinks in working directory, for
pathname comparisons (2008-05-31)

It is in fact the whole approach that I object to.

>> I would rather
>> see one of the following approaches:
>>
>> *If* the official policy is that GIT_CEILING_DIRECTORIES is not reliable
>> in the presence of symlinks, then (a) that limitation should be
>> mentioned in the documentation; (b) the affected tests should either be
>> skipped in the case of symlinked directories or they (alone!) should
>> take measures to work around the problem.
> 
> What exactly is broken in CEILING?

CEILING is used in setup_git_directory_gently_1() when trying to find
the GIT_DIR appropriate for the current directory.  The entries in
CEILING are compared textually to getcwd(), looking for the entry that
is the longest proper prefix of PWD.  This is then used to limit a loop
that is vaguely

    while (!is_git_directory())
            chdir("..");

The problem, as I understand it, is that when the tests are run with
root set to a path that includes a symlink, then test and
TRASH_DIRECTORY are set to the textual value of "$root/trash
directory.tXXXX", but then a little bit later test-lib.sh chdirs into
the trash directory using "cd -P $test" (thereby resolving symlinks).
So, taking my concrete example "--root=/dev/shm" where /dev/shm is a
symlink to /run/shm, we have

    test="/dev/shm/trash directory.tXXXX"
    TRASH_DIRECTORY="$test"
    ...
    cd -P "$test"

which results in PWD being "/run/shm/trash directory.tXXXX".

Then (for example) in test t4035, we have stuff like

    GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/test-outside" &&
    export GIT_CEILING_DIRECTORIES &&
    cd test-outside/non/git &&
    git diff

The problem is that because PWD and TRASH_DIRECTORY *look* different,
the code in setup_git_directory_gently_1() doesn't realize that the
value of GIT_CEILING_DIRECTORIES is a parent of PWD, so it doesn't abort
the search for GIT_DIR there, and this causes the test to fail.

The underlying problem is that setup_git_directory_gently_1() isn't
smart enough to realize that the directory in GIT_CEILING_DIRECTORIES is
in fact a parent of PWD even though it textually doesn't look like one.

The patch being discussed sets TEST_DIRECTORY *after* $test has been
canonicalized (through the use of "cd -P $test"), so that TEST_DIRECTORY
and later GIT_CEILING_DIRECTORIES start with "/run/shm/" instead of
"/dev/shm/".  This change makes setup_git_directory_gently_1()'s naive
textual prefix comparison succeed.

The same problem would occur in the real world whenever the user sets
GIT_CEILING_DIRECTORIES to a value that is *in fact* a parent of PWD but
due to symbolic links *textually* does not appear to be a parent.

So the first decision is whether this should be documented as a known
limitation of GIT_CEILING_DIRECTORIES, or whether it is a bug.  My
opinion is that it is a bug.

> I somehow thought that Jiang's patch was to make sure any variables
> that contain pathnames (and make sure future paths we might grab out
> of $(pwd)) are realpath without symlinks early in the test set-up,
> and with that arrangement, no symlink gotcha should come into
> picture, with or without CEILING.

Yes, this is correct.  But what you call a "gotcha" is actually a
real-world possibility.  Git might *really* be run in a PWD that
contains symlinks, or GIT_CEILING_DIRECTORIES might contain entries that
are symlinks.  So by resolving symlinks in PWD before running tests, we
are preventing the tests from ever encountering this situation, and
therefore failing to test whether git works correctly when PWD includes
a symlink.

> Perhaps the setting of the CEILING has not been correctly converted
> with the patch?

That's not the problem.

> Or is there something fundamentally broken, even if we do not have
> any symlinks involved, with CEILING check?

I believe that:

1. The handling of GIT_CEILING_DIRECTORIES in
setup_git_directory_gently_1() (i.e., in git itself, not how it is used
in the test suite) is correct if no symlinks are involved, but breaks in
the face of symlinks.

2. The proposed patch is a mistake because it masks the brokenness of
setup_git_directory_gently_1().

3. The old commit 1bd9c648 is an even bigger mistake because it might
mask other breakages throughout git in dealing with working directories
that contain symlinks.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] test: set the realpath of CWD as TRASH_DIRECTORY
  2012-08-29  8:15                 ` Michael Haggerty
@ 2012-08-29 16:33                   ` Junio C Hamano
  2012-08-30  4:37                     ` Michael Haggerty
  0 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2012-08-29 16:33 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jiang Xin, Git List, Lea Wiemann

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 08/29/2012 08:06 AM, Junio C Hamano wrote:
>> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
> It is in fact the whole approach that I object to.
> ...
>> What exactly is broken in CEILING?
>
> CEILING is used in setup_git_directory_gently_1() when trying to find
> the GIT_DIR appropriate for the current directory.  The entries in
> CEILING are compared textually to getcwd(), looking for the entry that
> is the longest proper prefix of PWD.  This is then used to limit a loop
> that is vaguely
>
>     while (!is_git_directory())
>             chdir("..");
>
> The problem, as I understand it, is that when the tests are run with
> root set to a path that includes a symlink, then test and
> TRASH_DIRECTORY are set to the textual value of "$root/trash
> directory.tXXXX", but then a little bit later test-lib.sh chdirs into
> the trash directory using "cd -P $test" (thereby resolving symlinks).
> So, taking my concrete example "--root=/dev/shm" where /dev/shm is a
> symlink to /run/shm, we have
>
>     test="/dev/shm/trash directory.tXXXX"
>     TRASH_DIRECTORY="$test"
>     ...
>     cd -P "$test"
>
> which results in PWD being "/run/shm/trash directory.tXXXX".

The components of CEILING should be adjusted to strip the symlink so
that it begins with "/run/shm/" before it is used; otherwise it
would not work.  As the current code does not do that at runtime (I
am describing the situation, not justifying), it won't match if
CEILING is built out of $TRASH_DIRECTORY.  In the above, setting of
$test and $TRASH is wrong; it does not match the realpath.

So "I somehow thought that Jiang's patch was to make sure any
variables that contain pathnames (and make sure future paths we
might grab out of $(pwd)) are realpath without symlinks early in the
test set-up," in my previous message was not correct.  The patch is
not doing that, and that is what breaks your setup.

My preference is to set things up in such a way that most of the
tests do not have to worry about these symlink aliases.  I know you
said you do not like the "whole approach", but I'd like to see our
tests run in a stable and reproducible testing environment.

We should have specific tests (on symlink enabled platforms) that
creates a directory and an alias to it via a symlink, goes into it
and checks the CEILING (and whatever else) behaviour.  We know that
the current code does not account for the alias introduced by
symlinks, and setup_git_directory_gently() may want to be patched to
fix _that_ specific test.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] test: set the realpath of CWD as TRASH_DIRECTORY
  2012-08-29 16:33                   ` Junio C Hamano
@ 2012-08-30  4:37                     ` Michael Haggerty
  2012-08-30  5:26                       ` Junio C Hamano
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Haggerty @ 2012-08-30  4:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Git List, Lea Wiemann

On 08/29/2012 06:33 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> On 08/29/2012 08:06 AM, Junio C Hamano wrote:
>>> Michael Haggerty <mhagger@alum.mit.edu> writes:
>>
>> It is in fact the whole approach that I object to.
>> ...
>>> What exactly is broken in CEILING?
>>
>> CEILING is used in setup_git_directory_gently_1() when trying to find
>> the GIT_DIR appropriate for the current directory.  The entries in
>> CEILING are compared textually to getcwd(), looking for the entry that
>> is the longest proper prefix of PWD.  This is then used to limit a loop
>> that is vaguely
>>
>>     while (!is_git_directory())
>>             chdir("..");
>>
>> The problem, as I understand it, is that when the tests are run with
>> root set to a path that includes a symlink, then test and
>> TRASH_DIRECTORY are set to the textual value of "$root/trash
>> directory.tXXXX", but then a little bit later test-lib.sh chdirs into
>> the trash directory using "cd -P $test" (thereby resolving symlinks).
>> So, taking my concrete example "--root=/dev/shm" where /dev/shm is a
>> symlink to /run/shm, we have
>>
>>     test="/dev/shm/trash directory.tXXXX"
>>     TRASH_DIRECTORY="$test"
>>     ...
>>     cd -P "$test"
>>
>> which results in PWD being "/run/shm/trash directory.tXXXX".
> 
> The components of CEILING should be adjusted to strip the symlink so
> that it begins with "/run/shm/" before it is used; otherwise it
> would not work.  As the current code does not do that at runtime (I
> am describing the situation, not justifying), it won't match if
> CEILING is built out of $TRASH_DIRECTORY.  In the above, setting of
> $test and $TRASH is wrong; it does not match the realpath.
> 
> So "I somehow thought that Jiang's patch was to make sure any
> variables that contain pathnames (and make sure future paths we
> might grab out of $(pwd)) are realpath without symlinks early in the
> test set-up," in my previous message was not correct.  The patch is
> not doing that, and that is what breaks your setup.

I've confused things by not being clear what I was describing.  The
description that you quoted above was the state *before* Jiang's patch.
 Jiang's patch makes sure that $TRASH_DIRECTORY is a realpath.  The
working directory was already a realpath before his patch (due to commit
1bd9c648).  There are some other variables that are not necessarily
realpaths, even after Jiang's patch; for example (via a casual look at
test-lib.sh): $remove_trash, $HOME, $test, $TEST_DIRECTORY,
$test_results_dir, $GIT_BUILD_DIR.  I haven't checked whether these
variables are used in ways that could cause other problems.

When Jiang's patch is applied the test suite runs to completion without
any failures on my system even when I pass --root=/dev/shm (a symlink).

> My preference is to set things up in such a way that most of the
> tests do not have to worry about these symlink aliases.  I know you
> said you do not like the "whole approach", but I'd like to see our
> tests run in a stable and reproducible testing environment.
> 
> We should have specific tests (on symlink enabled platforms) that
> creates a directory and an alias to it via a symlink, goes into it
> and checks the CEILING (and whatever else) behaviour.  We know that
> the current code does not account for the alias introduced by
> symlinks, and setup_git_directory_gently() may want to be patched to
> fix _that_ specific test.

Yes, stable and reproducible is good.  And explicit tests for
symlink-sensitive code would be good.  But how do we find the code that
needs explicit testing for symlink tolerance?

Therefore (in addition to specific tests) I think it would be good to
have an easy way to run the *whole* test suite in a symlink environment.
 For example, we could add a test suite option that *explicitly* makes
all tests run in a directory containing symlinks.  Or we could even do
that *all* of the time on systems that support symlinks [1]--that would
be a stable and reproducible testing environment that *does* detect many
symlink problems rather than hiding them.  (It seems unlikely the the
use of symlinks would *hide* other bugs.)  Something along the lines of

    mkdir -p "$test.dir"
    ln -s "$test.dir" "$test"

By the way, is the use of realpath(3) permissible in git code?
GIT_CEILING_DIRECTORIES handling could be fixed relatively easily by
using this function to canonicalize pathnames before comparison.

Michael

[1] This would be analogous to the inclusion of a space in "trash
directory.*", which I presume was done to detect space-handling problems
quickly.

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] test: set the realpath of CWD as TRASH_DIRECTORY
  2012-08-30  4:37                     ` Michael Haggerty
@ 2012-08-30  5:26                       ` Junio C Hamano
  2012-08-31  7:49                         ` Michael Haggerty
  0 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2012-08-30  5:26 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jiang Xin, Git List, Lea Wiemann

Michael Haggerty <mhagger@alum.mit.edu> writes:

> By the way, is the use of realpath(3) permissible in git code?
> GIT_CEILING_DIRECTORIES handling could be fixed relatively easily by
> using this function to canonicalize pathnames before comparison.

As long as we can add a compat/realpath.c (perhaps lift one from
glibc before they went GPLv3) for platforms that matter, I do not
see it as a huge problem.  How close is abspath.c::real_path() to
what you need?

> [1] This would be analogous to the inclusion of a space in "trash
> directory.*", which I presume was done to detect space-handling problems
> quickly.

Yeah, understood where you are coming from, and I think I can agree
with where you are trying to go.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] test: set the realpath of CWD as TRASH_DIRECTORY
  2012-08-30  5:26                       ` Junio C Hamano
@ 2012-08-31  7:49                         ` Michael Haggerty
  2012-09-26 19:34                           ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Michael Haggerty
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Haggerty @ 2012-08-31  7:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Git List, Lea Wiemann

On 08/30/2012 07:26 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> By the way, is the use of realpath(3) permissible in git code?
>> GIT_CEILING_DIRECTORIES handling could be fixed relatively easily by
>> using this function to canonicalize pathnames before comparison.
> 
> As long as we can add a compat/realpath.c (perhaps lift one from
> glibc before they went GPLv3) for platforms that matter, I do not
> see it as a huge problem.  How close is abspath.c::real_path() to
> what you need?

Cool, I didn't know about that function.  It's approximately what is
needed, except that it dies if fed an invalid path (unacceptable when
processing GIT_CEILING_DIRECTORIES) and it's buggy (try "test-path-utils
real_path ''" or "test-path-utils real_path '/foo'").  However, I'm
working on fixing it and splitting off a variant that returns NULL on
errors.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
  2012-08-31  7:49                         ` Michael Haggerty
@ 2012-09-26 19:34                           ` Michael Haggerty
  2012-09-26 19:34                             ` [PATCH 1/8] Introduce new static function real_path_internal() Michael Haggerty
                                               ` (8 more replies)
  0 siblings, 9 replies; 38+ messages in thread
From: Michael Haggerty @ 2012-09-26 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Lea Wiemann, git, Michael Haggerty

This series fixes longest_ancestor_length() so that it works even if
prefix_list contains entries that involve symlinks.  The basic goal of
the series is to call real_path() on each of the entries so that a
textual comparison of the potential prefix to the front of path
correctly decides whether the path is located inside of the entry.
But along the way some other things had to be changed:

* real_path() die()s if the path passed to it is invalid, whereas it
  is allowed for GIT_CEILING_DIRECTORIES to contain invalid paths.  So
  create a new function real_path_if_valid() that returns NULL for
  invalid paths.

* Changing longest_ancestor_length() to call real_path_if_valid()
  would make the former very difficult to test (because the tests
  would depend on the contents of the whole filesystem).  Therefore,
  rewrite longest_ancestor_length() in terms of functions
  string_list_split(), string_list_longest_prefix(), and
  real_path_if_valid() which are tested individually.

The net results of these changes are that:

1. t1504 used to have to canonicalize TRASH_DIRECTORY to make itself
   work even if the --root directory contains symlinks.  This
   canonicalization is no longer necessary (and has been removed).

2. t4035, which used to fail if the --root directory contained
   symlinks, now works correctly in that situation.

After this change, all tests pass if the --root directory does *not*
contain symlinks, but t9903 still fails if the --root directory
contains symlinks.  I haven't analyzed the cause of t9903's failure,
but it does not appear to be related to the GIT_CEILING_DIRECTORIES
feature.

On the mailing list I suggested *purposely* inserting symlinks into
the "trash directory.*" paths to test symlink handling more
systematically.  This patch series does *NOT* make that change.

Michael Haggerty (8):
  Introduce new static function real_path_internal()
  Introduce new function real_path_if_valid()
  longest_ancestor_length(): use string_list_split()
  longest_ancestor_length(): explicitly filter list before loop
  longest_ancestor_length(): always add a slash to the end of prefixes
  longest_ancestor_length(): use string_list_longest_prefix()
  longest_ancestor_length(): resolve symlinks before comparing paths
  t1504: stop resolving symlinks in GIT_CEILING_DIRECTORIES

 abspath.c               | 98 ++++++++++++++++++++++++++++++++++++++-----------
 cache.h                 |  1 +
 path.c                  | 54 ++++++++++++++++-----------
 t/t0060-path-utils.sh   | 64 --------------------------------
 t/t1504-ceiling-dirs.sh | 67 +++++++++++++++++----------------
 5 files changed, 144 insertions(+), 140 deletions(-)

-- 
1.7.11.3

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH 1/8] Introduce new static function real_path_internal()
  2012-09-26 19:34                           ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Michael Haggerty
@ 2012-09-26 19:34                             ` Michael Haggerty
  2012-09-27 21:27                               ` Junio C Hamano
  2012-09-26 19:34                             ` [PATCH 2/8] Introduce new function real_path_if_valid() Michael Haggerty
                                               ` (7 subsequent siblings)
  8 siblings, 1 reply; 38+ messages in thread
From: Michael Haggerty @ 2012-09-26 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Lea Wiemann, git, Michael Haggerty

It accepts a new parameter, die_on_error.  If die_on_error is false,
it simply cleans up after itself and returns NULL rather than dying.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 abspath.c | 93 ++++++++++++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 72 insertions(+), 21 deletions(-)

diff --git a/abspath.c b/abspath.c
index 05f2d79..a7ab8e9 100644
--- a/abspath.c
+++ b/abspath.c
@@ -15,15 +15,26 @@ int is_directory(const char *path)
 #define MAXDEPTH 5
 
 /*
- * Use this to get the real path, i.e. resolve links. If you want an
- * absolute path but don't mind links, use absolute_path.
+ * Return the real path (i.e., absolute path, with symlinks resolved
+ * and extra slashes removed) equivalent to the specified path.  (If
+ * you want an absolute path but don't mind links, use
+ * absolute_path().)  The return value is a pointer to a static
+ * buffer.
+ *
+ * The input and all intermediate paths must be shorter than MAX_PATH.
+ * The directory part of path (i.e., everything up to the last
+ * dir_sep) must denote a valid, existing directory, but the last
+ * component need not exist.  If die_on_error is set, then die with an
+ * informative error message if there is a problem.  Otherwise, return
+ * NULL on errors (without generating any output).
  *
  * If path is our buffer, then return path, as it's already what the
  * user wants.
  */
-const char *real_path(const char *path)
+static const char *real_path_internal(const char *path, int die_on_error)
 {
 	static char bufs[2][PATH_MAX + 1], *buf = bufs[0], *next_buf = bufs[1];
+	char *retval = NULL;
 	char cwd[1024] = "";
 	int buf_index = 1;
 
@@ -35,11 +46,19 @@ const char *real_path(const char *path)
 	if (path == buf || path == next_buf)
 		return path;
 
-	if (!*path)
-		die("The empty string is not a valid path");
+	if (!*path) {
+		if (die_on_error)
+			die("The empty string is not a valid path");
+		else
+			goto error_out;
+	}
 
-	if (strlcpy(buf, path, PATH_MAX) >= PATH_MAX)
-		die ("Too long path: %.*s", 60, path);
+	if (strlcpy(buf, path, PATH_MAX) >= PATH_MAX) {
+		if (die_on_error)
+			die("Too long path: %.*s", 60, path);
+		else
+			goto error_out;
+	}
 
 	while (depth--) {
 		if (!is_directory(buf)) {
@@ -54,20 +73,36 @@ const char *real_path(const char *path)
 		}
 
 		if (*buf) {
-			if (!*cwd && !getcwd(cwd, sizeof(cwd)))
-				die_errno ("Could not get current working directory");
+			if (!*cwd && !getcwd(cwd, sizeof(cwd))) {
+				if (die_on_error)
+					die_errno("Could not get current working directory");
+				else
+					goto error_out;
+			}
 
-			if (chdir(buf))
-				die_errno ("Could not switch to '%s'", buf);
+			if (chdir(buf)) {
+				if (die_on_error)
+					die_errno("Could not switch to '%s'", buf);
+				else
+					goto error_out;
+			}
+		}
+		if (!getcwd(buf, PATH_MAX)) {
+			if (die_on_error)
+				die_errno("Could not get current working directory");
+			else
+				goto error_out;
 		}
-		if (!getcwd(buf, PATH_MAX))
-			die_errno ("Could not get current working directory");
 
 		if (last_elem) {
 			size_t len = strlen(buf);
-			if (len + strlen(last_elem) + 2 > PATH_MAX)
-				die ("Too long path name: '%s/%s'",
-						buf, last_elem);
+			if (len + strlen(last_elem) + 2 > PATH_MAX) {
+				if (die_on_error)
+					die("Too long path name: '%s/%s'",
+					    buf, last_elem);
+				else
+					goto error_out;
+			}
 			if (len && !is_dir_sep(buf[len-1]))
 				buf[len++] = '/';
 			strcpy(buf + len, last_elem);
@@ -77,10 +112,18 @@ const char *real_path(const char *path)
 
 		if (!lstat(buf, &st) && S_ISLNK(st.st_mode)) {
 			ssize_t len = readlink(buf, next_buf, PATH_MAX);
-			if (len < 0)
-				die_errno ("Invalid symlink '%s'", buf);
-			if (PATH_MAX <= len)
-				die("symbolic link too long: %s", buf);
+			if (len < 0) {
+				if (die_on_error)
+					die_errno("Invalid symlink '%s'", buf);
+				else
+					goto error_out;
+			}
+			if (PATH_MAX <= len) {
+				if (die_on_error)
+					die("symbolic link too long: %s", buf);
+				else
+					goto error_out;
+			}
 			next_buf[len] = '\0';
 			buf = next_buf;
 			buf_index = 1 - buf_index;
@@ -89,10 +132,18 @@ const char *real_path(const char *path)
 			break;
 	}
 
+	retval = buf;
+error_out:
+	free(last_elem);
 	if (*cwd && chdir(cwd))
 		die_errno ("Could not change back to '%s'", cwd);
 
-	return buf;
+	return retval;
+}
+
+const char *real_path(const char *path)
+{
+	return real_path_internal(path, 1);
 }
 
 static const char *get_pwd_cwd(void)
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 2/8] Introduce new function real_path_if_valid()
  2012-09-26 19:34                           ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Michael Haggerty
  2012-09-26 19:34                             ` [PATCH 1/8] Introduce new static function real_path_internal() Michael Haggerty
@ 2012-09-26 19:34                             ` Michael Haggerty
  2012-09-26 19:34                             ` [PATCH 3/8] longest_ancestor_length(): use string_list_split() Michael Haggerty
                                               ` (6 subsequent siblings)
  8 siblings, 0 replies; 38+ messages in thread
From: Michael Haggerty @ 2012-09-26 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Lea Wiemann, git, Michael Haggerty

The function is like real_path(), except that it returns NULL on error
instead of dying.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 abspath.c | 5 +++++
 cache.h   | 1 +
 2 files changed, 6 insertions(+)

diff --git a/abspath.c b/abspath.c
index a7ab8e9..5748b91 100644
--- a/abspath.c
+++ b/abspath.c
@@ -146,6 +146,11 @@ const char *real_path(const char *path)
 	return real_path_internal(path, 1);
 }
 
+const char *real_path_if_valid(const char *path)
+{
+	return real_path_internal(path, 0);
+}
+
 static const char *get_pwd_cwd(void)
 {
 	static char cwd[PATH_MAX + 1];
diff --git a/cache.h b/cache.h
index a58df84..b0d75bc 100644
--- a/cache.h
+++ b/cache.h
@@ -714,6 +714,7 @@ static inline int is_absolute_path(const char *path)
 }
 int is_directory(const char *);
 const char *real_path(const char *path);
+const char *real_path_if_valid(const char *path);
 const char *absolute_path(const char *path);
 const char *relative_path(const char *abs, const char *base);
 int normalize_path_copy(char *dst, const char *src);
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 3/8] longest_ancestor_length(): use string_list_split()
  2012-09-26 19:34                           ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Michael Haggerty
  2012-09-26 19:34                             ` [PATCH 1/8] Introduce new static function real_path_internal() Michael Haggerty
  2012-09-26 19:34                             ` [PATCH 2/8] Introduce new function real_path_if_valid() Michael Haggerty
@ 2012-09-26 19:34                             ` Michael Haggerty
  2012-09-27 22:48                               ` Junio C Hamano
  2012-09-26 19:34                             ` [PATCH 4/8] longest_ancestor_length(): explicitly filter list before loop Michael Haggerty
                                               ` (5 subsequent siblings)
  8 siblings, 1 reply; 38+ messages in thread
From: Michael Haggerty @ 2012-09-26 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Lea Wiemann, git, Michael Haggerty


Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 path.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/path.c b/path.c
index cbbdf7d..969bc17 100644
--- a/path.c
+++ b/path.c
@@ -12,6 +12,7 @@
  */
 #include "cache.h"
 #include "strbuf.h"
+#include "string-list.h"
 
 static char bad_path[] = "/bad-path/";
 
@@ -582,19 +583,22 @@ int normalize_path_copy(char *dst, const char *src)
  */
 int longest_ancestor_length(const char *path, const char *prefix_list)
 {
+	struct string_list prefixes = STRING_LIST_INIT_DUP;
 	char buf[PATH_MAX+1];
-	const char *ceil, *colon;
-	int len, max_len = -1;
+	int i, max_len = -1;
 
 	if (prefix_list == NULL || !strcmp(path, "/"))
 		return -1;
 
-	for (colon = ceil = prefix_list; *colon; ceil = colon+1) {
-		for (colon = ceil; *colon && *colon != PATH_SEP; colon++);
-		len = colon - ceil;
+	string_list_split(&prefixes, prefix_list, PATH_SEP, -1);
+
+	for (i = 0; i < prefixes.nr; i++) {
+		const char *ceil = prefixes.items[i].string;
+		int len = strlen(ceil);
+
 		if (len == 0 || len > PATH_MAX || !is_absolute_path(ceil))
 			continue;
-		strlcpy(buf, ceil, len+1);
+		memcpy(buf, ceil, len+1);
 		if (normalize_path_copy(buf, buf) < 0)
 			continue;
 		len = strlen(buf);
@@ -608,6 +612,7 @@ int longest_ancestor_length(const char *path, const char *prefix_list)
 		}
 	}
 
+	string_list_clear(&prefixes, 0);
 	return max_len;
 }
 
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 4/8] longest_ancestor_length(): explicitly filter list before loop
  2012-09-26 19:34                           ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Michael Haggerty
                                               ` (2 preceding siblings ...)
  2012-09-26 19:34                             ` [PATCH 3/8] longest_ancestor_length(): use string_list_split() Michael Haggerty
@ 2012-09-26 19:34                             ` Michael Haggerty
  2012-09-27 22:48                               ` Junio C Hamano
  2012-09-26 19:34                             ` [PATCH 5/8] longest_ancestor_length(): always add a slash to the end of prefixes Michael Haggerty
                                               ` (4 subsequent siblings)
  8 siblings, 1 reply; 38+ messages in thread
From: Michael Haggerty @ 2012-09-26 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Lea Wiemann, git, Michael Haggerty

Separate the step of filtering and normalizing elements of the
prefixes list from the iteration that looks for the longest prefix.
This will help keep the function testable after we not only normalize
the paths, but also convert them into real paths.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 path.c | 32 +++++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/path.c b/path.c
index 969bc17..6da7029 100644
--- a/path.c
+++ b/path.c
@@ -568,6 +568,25 @@ int normalize_path_copy(char *dst, const char *src)
 	return 0;
 }
 
+static int normalize_path_callback(struct string_list_item *item, void *cb_data)
+{
+	char buf[PATH_MAX+1];
+	const char *ceil = item->string;
+	int len = strlen(ceil);
+
+	if (len == 0 || len > PATH_MAX || !is_absolute_path(ceil))
+		return 0;
+	memcpy(buf, ceil, len+1);
+	if (normalize_path_copy(buf, buf) < 0)
+		return 0;
+	len = strlen(buf);
+	if (len > 0 && buf[len-1] == '/')
+		buf[--len] = '\0';
+	free(item->string);
+	item->string = xstrdup(buf);
+	return 1;
+}
+
 /*
  * path = Canonical absolute path
  * prefix_list = Colon-separated list of absolute paths
@@ -584,28 +603,19 @@ int normalize_path_copy(char *dst, const char *src)
 int longest_ancestor_length(const char *path, const char *prefix_list)
 {
 	struct string_list prefixes = STRING_LIST_INIT_DUP;
-	char buf[PATH_MAX+1];
 	int i, max_len = -1;
 
 	if (prefix_list == NULL || !strcmp(path, "/"))
 		return -1;
 
 	string_list_split(&prefixes, prefix_list, PATH_SEP, -1);
+	filter_string_list(&prefixes, 0, normalize_path_callback, NULL);
 
 	for (i = 0; i < prefixes.nr; i++) {
 		const char *ceil = prefixes.items[i].string;
 		int len = strlen(ceil);
 
-		if (len == 0 || len > PATH_MAX || !is_absolute_path(ceil))
-			continue;
-		memcpy(buf, ceil, len+1);
-		if (normalize_path_copy(buf, buf) < 0)
-			continue;
-		len = strlen(buf);
-		if (len > 0 && buf[len-1] == '/')
-			buf[--len] = '\0';
-
-		if (!strncmp(path, buf, len) &&
+		if (!strncmp(path, ceil, len) &&
 		    path[len] == '/' &&
 		    len > max_len) {
 			max_len = len;
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 5/8] longest_ancestor_length(): always add a slash to the end of prefixes
  2012-09-26 19:34                           ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Michael Haggerty
                                               ` (3 preceding siblings ...)
  2012-09-26 19:34                             ` [PATCH 4/8] longest_ancestor_length(): explicitly filter list before loop Michael Haggerty
@ 2012-09-26 19:34                             ` Michael Haggerty
  2012-09-26 19:34                             ` [PATCH 6/8] longest_ancestor_length(): use string_list_longest_prefix() Michael Haggerty
                                               ` (3 subsequent siblings)
  8 siblings, 0 replies; 38+ messages in thread
From: Michael Haggerty @ 2012-09-26 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Lea Wiemann, git, Michael Haggerty

Previously we stripped off any slashes that were present.  Instead,
add a slash if it is missing.  This removes the need for an extra
check that path has a slash following the prefix and makes the
handling of the root directory more natural, making the way clear to
use string_list_longest_prefix().

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 path.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/path.c b/path.c
index 6da7029..90ef53d 100644
--- a/path.c
+++ b/path.c
@@ -570,7 +570,7 @@ int normalize_path_copy(char *dst, const char *src)
 
 static int normalize_path_callback(struct string_list_item *item, void *cb_data)
 {
-	char buf[PATH_MAX+1];
+	char buf[PATH_MAX+2];
 	const char *ceil = item->string;
 	int len = strlen(ceil);
 
@@ -580,8 +580,10 @@ static int normalize_path_callback(struct string_list_item *item, void *cb_data)
 	if (normalize_path_copy(buf, buf) < 0)
 		return 0;
 	len = strlen(buf);
-	if (len > 0 && buf[len-1] == '/')
-		buf[--len] = '\0';
+	if (len == 0 || buf[len-1] != '/') {
+		buf[len++] = '/';
+		buf[len++] = '\0';
+	}
 	free(item->string);
 	item->string = xstrdup(buf);
 	return 1;
@@ -616,9 +618,8 @@ int longest_ancestor_length(const char *path, const char *prefix_list)
 		int len = strlen(ceil);
 
 		if (!strncmp(path, ceil, len) &&
-		    path[len] == '/' &&
-		    len > max_len) {
-			max_len = len;
+		    len - 1 > max_len) {
+			max_len = len - 1;
 		}
 	}
 
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 6/8] longest_ancestor_length(): use string_list_longest_prefix()
  2012-09-26 19:34                           ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Michael Haggerty
                                               ` (4 preceding siblings ...)
  2012-09-26 19:34                             ` [PATCH 5/8] longest_ancestor_length(): always add a slash to the end of prefixes Michael Haggerty
@ 2012-09-26 19:34                             ` Michael Haggerty
  2012-09-26 19:34                             ` [PATCH 7/8] longest_ancestor_length(): resolve symlinks before comparing paths Michael Haggerty
                                               ` (2 subsequent siblings)
  8 siblings, 0 replies; 38+ messages in thread
From: Michael Haggerty @ 2012-09-26 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Lea Wiemann, git, Michael Haggerty

Use string_list_longest_prefix() in the implementation of
longest_ancestor_length(), instead of an equivalent loop.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 path.c | 15 ++++-----------
 1 file changed, 4 insertions(+), 11 deletions(-)

diff --git a/path.c b/path.c
index 90ef53d..5cace83 100644
--- a/path.c
+++ b/path.c
@@ -605,23 +605,16 @@ static int normalize_path_callback(struct string_list_item *item, void *cb_data)
 int longest_ancestor_length(const char *path, const char *prefix_list)
 {
 	struct string_list prefixes = STRING_LIST_INIT_DUP;
-	int i, max_len = -1;
+	int max_len;
+	const char *longest_prefix;
 
 	if (prefix_list == NULL || !strcmp(path, "/"))
 		return -1;
 
 	string_list_split(&prefixes, prefix_list, PATH_SEP, -1);
 	filter_string_list(&prefixes, 0, normalize_path_callback, NULL);
-
-	for (i = 0; i < prefixes.nr; i++) {
-		const char *ceil = prefixes.items[i].string;
-		int len = strlen(ceil);
-
-		if (!strncmp(path, ceil, len) &&
-		    len - 1 > max_len) {
-			max_len = len - 1;
-		}
-	}
+	longest_prefix = string_list_longest_prefix(&prefixes, path);
+	max_len = longest_prefix ? strlen(longest_prefix) - 1 : -1;
 
 	string_list_clear(&prefixes, 0);
 	return max_len;
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 7/8] longest_ancestor_length(): resolve symlinks before comparing paths
  2012-09-26 19:34                           ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Michael Haggerty
                                               ` (5 preceding siblings ...)
  2012-09-26 19:34                             ` [PATCH 6/8] longest_ancestor_length(): use string_list_longest_prefix() Michael Haggerty
@ 2012-09-26 19:34                             ` Michael Haggerty
  2012-09-27 22:51                               ` Junio C Hamano
  2012-09-26 19:34                             ` [PATCH 8/8] t1504: stop resolving symlinks in GIT_CEILING_DIRECTORIES Michael Haggerty
  2012-09-27 19:42                             ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Junio C Hamano
  8 siblings, 1 reply; 38+ messages in thread
From: Michael Haggerty @ 2012-09-26 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Lea Wiemann, git, Michael Haggerty

longest_ancestor_length() relies on a textual comparison of directory
parts to find the part of path that overlaps with one of the paths in
prefix_list.  But this doesn't work if any of the prefixes involves a
symbolic link, because the directories will look different even though
they might logically refer to the same directory.  So canonicalize the
paths listed in prefix_list using real_path_if_valid() before trying
to find matches.

path is already in canonical form, so doesn't need to be canonicalized
again.

This fixes some problems with using GIT_CEILING_DIRECTORIES that
contains paths involving symlinks, including t4035 if run with --root
set to a path involving symlinks.

Remove a number of tests of longest_ancestor_length().  It is awkward
to test longest_ancestor_length() now, because its new path
normalization behavior depends on the contents of the whole
filesystem.  But we can live without the tests, because
longest_ancestor_length() is now built of reusable components that are
themselves tested separately: string_list_split(),
string_list_longest_prefix(), and real_path_if_valid().

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 path.c                | 17 ++++++++------
 t/t0060-path-utils.sh | 64 ---------------------------------------------------
 2 files changed, 10 insertions(+), 71 deletions(-)

diff --git a/path.c b/path.c
index 5cace83..981bb06 100644
--- a/path.c
+++ b/path.c
@@ -570,22 +570,25 @@ int normalize_path_copy(char *dst, const char *src)
 
 static int normalize_path_callback(struct string_list_item *item, void *cb_data)
 {
-	char buf[PATH_MAX+2];
+	char *buf;
 	const char *ceil = item->string;
-	int len = strlen(ceil);
+	const char *realpath;
+	int len;
 
-	if (len == 0 || len > PATH_MAX || !is_absolute_path(ceil))
+	if (!*ceil || !is_absolute_path(ceil))
 		return 0;
-	memcpy(buf, ceil, len+1);
-	if (normalize_path_copy(buf, buf) < 0)
+	realpath = real_path_if_valid(ceil);
+	if (!realpath)
 		return 0;
-	len = strlen(buf);
+	len = strlen(realpath);
+	buf = xmalloc(len + 2); /* Leave space for possible trailing slash */
+	strcpy(buf, realpath);
 	if (len == 0 || buf[len-1] != '/') {
 		buf[len++] = '/';
 		buf[len++] = '\0';
 	}
 	free(item->string);
-	item->string = xstrdup(buf);
+	item->string = buf;
 	return 1;
 }
 
diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 4ef2345..c97bbf2 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -12,28 +12,6 @@ norm_path() {
 	"test \"\$(test-path-utils normalize_path_copy '$1')\" = '$2'"
 }
 
-# On Windows, we are using MSYS's bash, which mangles the paths.
-# Absolute paths are anchored at the MSYS installation directory,
-# which means that the path / accounts for this many characters:
-rootoff=$(test-path-utils normalize_path_copy / | wc -c)
-# Account for the trailing LF:
-if test $rootoff = 2; then
-	rootoff=	# we are on Unix
-else
-	rootoff=$(($rootoff-1))
-fi
-
-ancestor() {
-	# We do some math with the expected ancestor length.
-	expected=$3
-	if test -n "$rootoff" && test "x$expected" != x-1; then
-		expected=$(($expected+$rootoff))
-	fi
-	test_expect_success "longest ancestor: $1 $2 => $expected" \
-	"actual=\$(test-path-utils longest_ancestor_length '$1' '$2') &&
-	 test \"\$actual\" = '$expected'"
-}
-
 # Absolute path tests must be skipped on Windows because due to path mangling
 # the test program never sees a POSIX-style absolute path
 case $(uname -s) in
@@ -93,48 +71,6 @@ norm_path /d1/s1//../s2/../../d2 /d2 POSIX
 norm_path /d1/.../d2 /d1/.../d2 POSIX
 norm_path /d1/..././../d2 /d1/d2 POSIX
 
-ancestor / "" -1
-ancestor / / -1
-ancestor /foo "" -1
-ancestor /foo : -1
-ancestor /foo ::. -1
-ancestor /foo ::..:: -1
-ancestor /foo / 0
-ancestor /foo /fo -1
-ancestor /foo /foo -1
-ancestor /foo /foo/ -1
-ancestor /foo /bar -1
-ancestor /foo /bar/ -1
-ancestor /foo /foo/bar -1
-ancestor /foo /foo:/bar/ -1
-ancestor /foo /foo/:/bar/ -1
-ancestor /foo /foo::/bar/ -1
-ancestor /foo /:/foo:/bar/ 0
-ancestor /foo /foo:/:/bar/ 0
-ancestor /foo /:/bar/:/foo 0
-ancestor /foo/bar "" -1
-ancestor /foo/bar / 0
-ancestor /foo/bar /fo -1
-ancestor /foo/bar foo -1
-ancestor /foo/bar /foo 4
-ancestor /foo/bar /foo/ 4
-ancestor /foo/bar /foo/ba -1
-ancestor /foo/bar /:/fo 0
-ancestor /foo/bar /foo:/foo/ba 4
-ancestor /foo/bar /bar -1
-ancestor /foo/bar /bar/ -1
-ancestor /foo/bar /fo: -1
-ancestor /foo/bar :/fo -1
-ancestor /foo/bar /foo:/bar/ 4
-ancestor /foo/bar /:/foo:/bar/ 4
-ancestor /foo/bar /foo:/:/bar/ 4
-ancestor /foo/bar /:/bar/:/fo 0
-ancestor /foo/bar /:/bar/ 0
-ancestor /foo/bar .:/foo/. 4
-ancestor /foo/bar .:/foo/.:.: 4
-ancestor /foo/bar /foo/./:.:/bar 4
-ancestor /foo/bar .:/bar -1
-
 test_expect_success 'strip_path_suffix' '
 	test c:/msysgit = $(test-path-utils strip_path_suffix \
 		c:/msysgit/libexec//git-core libexec/git-core)
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH 8/8] t1504: stop resolving symlinks in GIT_CEILING_DIRECTORIES
  2012-09-26 19:34                           ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Michael Haggerty
                                               ` (6 preceding siblings ...)
  2012-09-26 19:34                             ` [PATCH 7/8] longest_ancestor_length(): resolve symlinks before comparing paths Michael Haggerty
@ 2012-09-26 19:34                             ` Michael Haggerty
  2012-09-27 19:42                             ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Junio C Hamano
  8 siblings, 0 replies; 38+ messages in thread
From: Michael Haggerty @ 2012-09-26 19:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Lea Wiemann, git, Michael Haggerty

This test used to explicitly resolve symlinks in the paths derived
from TRASH_DIRECTORY that were written to GIT_CEILING_DIRECTORIES,
because the code handling GIT_CEILING_DIRECTORIES was confused by
symlinks.  This is no longer necessary.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 t/t1504-ceiling-dirs.sh | 67 ++++++++++++++++++++++++-------------------------
 1 file changed, 33 insertions(+), 34 deletions(-)

diff --git a/t/t1504-ceiling-dirs.sh b/t/t1504-ceiling-dirs.sh
index cce87a5..a4a5202 100755
--- a/t/t1504-ceiling-dirs.sh
+++ b/t/t1504-ceiling-dirs.sh
@@ -14,8 +14,7 @@ test_fail() {
 	'
 }
 
-TRASH_ROOT="$PWD"
-ROOT_PARENT=$(dirname "$TRASH_ROOT")
+ROOT_PARENT=$(dirname "$TRASH_DIRECTORY")
 
 
 unset GIT_CEILING_DIRECTORIES
@@ -32,16 +31,16 @@ test_prefix ceil_at_parent ""
 GIT_CEILING_DIRECTORIES="$ROOT_PARENT/"
 test_prefix ceil_at_parent_slash ""
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY"
 test_prefix ceil_at_trash ""
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/"
 test_prefix ceil_at_trash_slash ""
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/sub"
 test_prefix ceil_at_sub ""
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub/"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/sub/"
 test_prefix ceil_at_sub_slash ""
 
 
@@ -56,55 +55,55 @@ export GIT_CEILING_DIRECTORIES
 GIT_CEILING_DIRECTORIES=""
 test_prefix subdir_ceil_empty "sub/dir/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY"
 test_fail subdir_ceil_at_trash
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/"
 test_fail subdir_ceil_at_trash_slash
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/sub"
 test_fail subdir_ceil_at_sub
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub/"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/sub/"
 test_fail subdir_ceil_at_sub_slash
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub/dir"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/sub/dir"
 test_prefix subdir_ceil_at_subdir "sub/dir/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub/dir/"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/sub/dir/"
 test_prefix subdir_ceil_at_subdir_slash "sub/dir/"
 
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/su"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/su"
 test_prefix subdir_ceil_at_su "sub/dir/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/su/"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/su/"
 test_prefix subdir_ceil_at_su_slash "sub/dir/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub/di"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/sub/di"
 test_prefix subdir_ceil_at_sub_di "sub/dir/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub/di"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/sub/di"
 test_prefix subdir_ceil_at_sub_di_slash "sub/dir/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/subdi"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/subdi"
 test_prefix subdir_ceil_at_subdi "sub/dir/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/subdi"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/subdi"
 test_prefix subdir_ceil_at_subdi_slash "sub/dir/"
 
 
-GIT_CEILING_DIRECTORIES="/foo:$TRASH_ROOT/sub"
+GIT_CEILING_DIRECTORIES="/foo:$TRASH_DIRECTORY/sub"
 test_fail second_of_two
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub:/bar"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/sub:/bar"
 test_fail first_of_two
 
-GIT_CEILING_DIRECTORIES="/foo:$TRASH_ROOT/sub:/bar"
+GIT_CEILING_DIRECTORIES="/foo:$TRASH_DIRECTORY/sub:/bar"
 test_fail second_of_three
 
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sub"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/sub"
 GIT_DIR=../../.git
 export GIT_DIR
 test_prefix git_dir_specified ""
@@ -123,41 +122,41 @@ export GIT_CEILING_DIRECTORIES
 GIT_CEILING_DIRECTORIES=""
 test_prefix sd_ceil_empty "s/d/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY"
 test_fail sd_ceil_at_trash
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/"
 test_fail sd_ceil_at_trash_slash
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/s"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/s"
 test_fail sd_ceil_at_s
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/s/"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/s/"
 test_fail sd_ceil_at_s_slash
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/s/d"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/s/d"
 test_prefix sd_ceil_at_sd "s/d/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/s/d/"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/s/d/"
 test_prefix sd_ceil_at_sd_slash "s/d/"
 
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/su"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/su"
 test_prefix sd_ceil_at_su "s/d/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/su/"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/su/"
 test_prefix sd_ceil_at_su_slash "s/d/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/s/di"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/s/di"
 test_prefix sd_ceil_at_s_di "s/d/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/s/di"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/s/di"
 test_prefix sd_ceil_at_s_di_slash "s/d/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sdi"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/sdi"
 test_prefix sd_ceil_at_sdi "s/d/"
 
-GIT_CEILING_DIRECTORIES="$TRASH_ROOT/sdi"
+GIT_CEILING_DIRECTORIES="$TRASH_DIRECTORY/sdi"
 test_prefix sd_ceil_at_sdi_slash "s/d/"
 
 
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
  2012-09-26 19:34                           ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Michael Haggerty
                                               ` (7 preceding siblings ...)
  2012-09-26 19:34                             ` [PATCH 8/8] t1504: stop resolving symlinks in GIT_CEILING_DIRECTORIES Michael Haggerty
@ 2012-09-27 19:42                             ` Junio C Hamano
  8 siblings, 0 replies; 38+ messages in thread
From: Junio C Hamano @ 2012-09-27 19:42 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jiang Xin, Lea Wiemann, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> This series fixes longest_ancestor_length() so that it works even if
> prefix_list contains entries that involve symlinks.  The basic goal of
> the series is to call real_path() on each of the entries so that a
> textual comparison of the potential prefix to the front of path
> correctly decides whether the path is located inside of the entry.
> But along the way some other things had to be changed:
>
> * real_path() die()s if the path passed to it is invalid, whereas it
>   is allowed for GIT_CEILING_DIRECTORIES to contain invalid paths.  So
>   create a new function real_path_if_valid() that returns NULL for
>   invalid paths.
>
> * Changing longest_ancestor_length() to call real_path_if_valid()
>   would make the former very difficult to test (because the tests
>   would depend on the contents of the whole filesystem).  Therefore,
>   rewrite longest_ancestor_length() in terms of functions
>   string_list_split(), string_list_longest_prefix(), and
>   real_path_if_valid() which are tested individually.
>
> The net results of these changes are that:
>
> 1. t1504 used to have to canonicalize TRASH_DIRECTORY to make itself
>    work even if the --root directory contains symlinks.  This
>    canonicalization is no longer necessary (and has been removed).
>
> 2. t4035, which used to fail if the --root directory contained
>    symlinks, now works correctly in that situation.
>
> After this change, all tests pass if the --root directory does *not*
> contain symlinks, but t9903 still fails if the --root directory
> contains symlinks.  I haven't analyzed the cause of t9903's failure,
> but it does not appear to be related to the GIT_CEILING_DIRECTORIES
> feature.

I haven't read the actual patches yet, but the all of the above
sounds sensible.

> On the mailing list I suggested *purposely* inserting symlinks into
> the "trash directory.*" paths to test symlink handling more
> systematically.  This patch series does *NOT* make that change.

And that may be a sensible follow-up step once the dust settles.

Thanks.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] Introduce new static function real_path_internal()
  2012-09-26 19:34                             ` [PATCH 1/8] Introduce new static function real_path_internal() Michael Haggerty
@ 2012-09-27 21:27                               ` Junio C Hamano
  2012-09-29  4:56                                 ` Michael Haggerty
  0 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2012-09-27 21:27 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jiang Xin, Lea Wiemann, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> @@ -54,20 +73,36 @@ const char *real_path(const char *path)
>  		}
>  
>  		if (*buf) {
> -			if (!*cwd && !getcwd(cwd, sizeof(cwd)))
> -				die_errno ("Could not get current working directory");
> +			if (!*cwd && !getcwd(cwd, sizeof(cwd))) {
> +				if (die_on_error)
> +					die_errno("Could not get current working directory");
> +				else
> +					goto error_out;
> +			}
>  
> -			if (chdir(buf))
> -				die_errno ("Could not switch to '%s'", buf);
> +			if (chdir(buf)) {
> +				if (die_on_error)
> +					die_errno("Could not switch to '%s'", buf);
> +				else
> +					goto error_out;
> +			}
> +		}

The patch makes sense, but while you are touching this code, I have
to wonder if there is an easy way to tell, before entering the loop,
if we have to chdir() around in the loop.  That would allow us to
hoist the getcwd() that is done only so that we can come back to
where we started outside the loop, making it clear why the call is
there, instead of cryptic "if (!*cwd &&" to ensure we do getcwd()
once and before doing any chdir().

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/8] longest_ancestor_length(): use string_list_split()
  2012-09-26 19:34                             ` [PATCH 3/8] longest_ancestor_length(): use string_list_split() Michael Haggerty
@ 2012-09-27 22:48                               ` Junio C Hamano
  2012-09-29  5:25                                 ` Michael Haggerty
  0 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2012-09-27 22:48 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jiang Xin, Lea Wiemann, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> -	for (colon = ceil = prefix_list; *colon; ceil = colon+1) {
> -		for (colon = ceil; *colon && *colon != PATH_SEP; colon++);
> -		len = colon - ceil;
> +	string_list_split(&prefixes, prefix_list, PATH_SEP, -1);
> +
> +	for (i = 0; i < prefixes.nr; i++) {
> +		const char *ceil = prefixes.items[i].string;
> +		int len = strlen(ceil);
> +

Much nicer than the yucky original ;-)

>  		if (len == 0 || len > PATH_MAX || !is_absolute_path(ceil))
>  			continue;
> -		strlcpy(buf, ceil, len+1);
> +		memcpy(buf, ceil, len+1);
>  		if (normalize_path_copy(buf, buf) < 0)
>  			continue;

Why do you need this memcpy in the first place?  Isn't ceil already
a NUL terminated string unlike the original code that points into a
part of the prefix_list string?  IOW, why not

	normalize_path_copy(buf, ceil);

or something?

Can normalize_path_copy() overflow buf[PATH_MAX+1] here (before or
after this patch)?

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 4/8] longest_ancestor_length(): explicitly filter list before loop
  2012-09-26 19:34                             ` [PATCH 4/8] longest_ancestor_length(): explicitly filter list before loop Michael Haggerty
@ 2012-09-27 22:48                               ` Junio C Hamano
  0 siblings, 0 replies; 38+ messages in thread
From: Junio C Hamano @ 2012-09-27 22:48 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jiang Xin, Lea Wiemann, git

Makes sense.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 7/8] longest_ancestor_length(): resolve symlinks before comparing paths
  2012-09-26 19:34                             ` [PATCH 7/8] longest_ancestor_length(): resolve symlinks before comparing paths Michael Haggerty
@ 2012-09-27 22:51                               ` Junio C Hamano
  2012-09-29  5:46                                 ` Michael Haggerty
  0 siblings, 1 reply; 38+ messages in thread
From: Junio C Hamano @ 2012-09-27 22:51 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jiang Xin, Lea Wiemann, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> longest_ancestor_length() relies on a textual comparison of directory
> parts to find the part of path that overlaps with one of the paths in
> prefix_list.  But this doesn't work if any of the prefixes involves a
> symbolic link, because the directories will look different even though
> they might logically refer to the same directory.  So canonicalize the
> paths listed in prefix_list using real_path_if_valid() before trying
> to find matches.
>
> path is already in canonical form, so doesn't need to be canonicalized
> again.
>
> This fixes some problems with using GIT_CEILING_DIRECTORIES that
> contains paths involving symlinks, including t4035 if run with --root
> set to a path involving symlinks.
>
> Remove a number of tests of longest_ancestor_length().  It is awkward
> to test longest_ancestor_length() now, because its new path
> normalization behavior depends on the contents of the whole
> filesystem.  But we can live without the tests, because
> longest_ancestor_length() is now built of reusable components that are
> themselves tested separately: string_list_split(),
> string_list_longest_prefix(), and real_path_if_valid().

Errr, components may be correct but the way to combine and construct
could go faulty, so...

> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
> ---
>  path.c                | 17 ++++++++------
>  t/t0060-path-utils.sh | 64 ---------------------------------------------------
>  2 files changed, 10 insertions(+), 71 deletions(-)
>
> diff --git a/path.c b/path.c
> index 5cace83..981bb06 100644
> --- a/path.c
> +++ b/path.c
> @@ -570,22 +570,25 @@ int normalize_path_copy(char *dst, const char *src)
>  
>  static int normalize_path_callback(struct string_list_item *item, void *cb_data)
>  {
> -	char buf[PATH_MAX+2];
> +	char *buf;
>  	const char *ceil = item->string;
> -	int len = strlen(ceil);
> +	const char *realpath;
> +	int len;
>  
> -	if (len == 0 || len > PATH_MAX || !is_absolute_path(ceil))
> +	if (!*ceil || !is_absolute_path(ceil))
>  		return 0;
> -	memcpy(buf, ceil, len+1);
> -	if (normalize_path_copy(buf, buf) < 0)
> +	realpath = real_path_if_valid(ceil);
> +	if (!realpath)
>  		return 0;
> -	len = strlen(buf);
> +	len = strlen(realpath);
> +	buf = xmalloc(len + 2); /* Leave space for possible trailing slash */
> +	strcpy(buf, realpath);
>  	if (len == 0 || buf[len-1] != '/') {
>  		buf[len++] = '/';
>  		buf[len++] = '\0';
>  	}

Nice.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] Introduce new static function real_path_internal()
  2012-09-27 21:27                               ` Junio C Hamano
@ 2012-09-29  4:56                                 ` Michael Haggerty
  2012-09-29  5:40                                   ` Junio C Hamano
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Haggerty @ 2012-09-29  4:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Lea Wiemann, git

On 09/27/2012 11:27 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> @@ -54,20 +73,36 @@ const char *real_path(const char *path)
>>  		}
>>  
>>  		if (*buf) {
>> -			if (!*cwd && !getcwd(cwd, sizeof(cwd)))
>> -				die_errno ("Could not get current working directory");
>> +			if (!*cwd && !getcwd(cwd, sizeof(cwd))) {
>> +				if (die_on_error)
>> +					die_errno("Could not get current working directory");
>> +				else
>> +					goto error_out;
>> +			}
>>  
>> -			if (chdir(buf))
>> -				die_errno ("Could not switch to '%s'", buf);
>> +			if (chdir(buf)) {
>> +				if (die_on_error)
>> +					die_errno("Could not switch to '%s'", buf);
>> +				else
>> +					goto error_out;
>> +			}
>> +		}
> 
> The patch makes sense, but while you are touching this code, I have
> to wonder if there is an easy way to tell, before entering the loop,
> if we have to chdir() around in the loop.  That would allow us to
> hoist the getcwd() that is done only so that we can come back to
> where we started outside the loop, making it clear why the call is
> there, instead of cryptic "if (!*cwd &&" to ensure we do getcwd()
> once and before doing any chdir().

I don't see an easy way to predict, before entering the loop, whether
chdir() will be needed.  For example, compare

    touch filename
    ln -s filename foo
    ./test-path-utils real_path foo

with

    touch filename
    ln -s $(pwd)/filename foo
    ./test-path-utils real_path foo

In the first case no chdir() is needed, whereas in the second case a
chdir() is needed but only on the second loop iteration.

All I can offer you is a palliative like the one below.

Michael

diff --git a/abspath.c b/abspath.c
index 5748b91..40cdc46 100644
--- a/abspath.c
+++ b/abspath.c
@@ -35,7 +35,14 @@ static const char *real_path_internal(const char
*path, int die_on_error)
 {
        static char bufs[2][PATH_MAX + 1], *buf = bufs[0], *next_buf =
bufs[1];
        char *retval = NULL;
+
+       /*
+        * If we have to temporarily chdir(), store the original CWD
+        * here so that we can chdir() back to it at the end of the
+        * function:
+        */
        char cwd[1024] = "";
+
        int buf_index = 1;

        int depth = MAXDEPTH;

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/8] longest_ancestor_length(): use string_list_split()
  2012-09-27 22:48                               ` Junio C Hamano
@ 2012-09-29  5:25                                 ` Michael Haggerty
  2012-09-29  5:43                                   ` Junio C Hamano
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Haggerty @ 2012-09-29  5:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Lea Wiemann, git

On 09/28/2012 12:48 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> -	for (colon = ceil = prefix_list; *colon; ceil = colon+1) {
>> -		for (colon = ceil; *colon && *colon != PATH_SEP; colon++);
>> -		len = colon - ceil;
>> +	string_list_split(&prefixes, prefix_list, PATH_SEP, -1);
>> +
>> +	for (i = 0; i < prefixes.nr; i++) {
>> +		const char *ceil = prefixes.items[i].string;
>> +		int len = strlen(ceil);
>> +
> 
> Much nicer than the yucky original ;-)

If your winky-smiley implies irony, then I would like to object.  Even
though the original is not difficult to understand, it is more difficult
to review than the new version because one has to think about off-by-one
errors etc.  The new version has a bit more boilerplate, but a quick
read suffices both to understand it and to see that it is correct.
Though of course I admit that the improvement is small.

But the main point of this change is to move towards using more testable
parts, so it is a step forward regardless of whether it is more readable.

>>  		if (len == 0 || len > PATH_MAX || !is_absolute_path(ceil))
>>  			continue;
>> -		strlcpy(buf, ceil, len+1);
>> +		memcpy(buf, ceil, len+1);
>>  		if (normalize_path_copy(buf, buf) < 0)
>>  			continue;
> 
> Why do you need this memcpy in the first place?  Isn't ceil already
> a NUL terminated string unlike the original code that points into a
> part of the prefix_list string?  IOW, why not
> 
> 	normalize_path_copy(buf, ceil);
> 
> or something?

Good point.  I will fix this in v2.

> Can normalize_path_copy() overflow buf[PATH_MAX+1] here (before or
> after this patch)?

normalize_path_copy() can only shrink paths, not grow them.  So the
length check on ceil guarantees that the result of normalize_path_copy()
will fit in buf.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 1/8] Introduce new static function real_path_internal()
  2012-09-29  4:56                                 ` Michael Haggerty
@ 2012-09-29  5:40                                   ` Junio C Hamano
  0 siblings, 0 replies; 38+ messages in thread
From: Junio C Hamano @ 2012-09-29  5:40 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jiang Xin, Lea Wiemann, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

>> The patch makes sense, but while you are touching this code, I have
>> to wonder if there is an easy way to tell, before entering the loop,
>> if we have to chdir() around in the loop.  That would allow us to
>> hoist the getcwd() that is done only so that we can come back to
>> where we started outside the loop, making it clear why the call is
>> there, instead of cryptic "if (!*cwd &&" to ensure we do getcwd()
>> once and before doing any chdir().
>
> I don't see an easy way to predict, before entering the loop, whether
> chdir() will be needed.  For example, compare
>
>     touch filename
>     ln -s filename foo
>     ./test-path-utils real_path foo
>
> with
>
>     touch filename
>     ln -s $(pwd)/filename foo
>     ./test-path-utils real_path foo
>
> In the first case no chdir() is needed, whereas in the second case a
> chdir() is needed but only on the second loop iteration.

Thanks for an example, and it is perfectly OK if we really have to
have that "only fires once and only if needed" logic in the loop.

> All I can offer you is a palliative like the one below.

I think that is an improvement; or we could rename it not cwd[] but
some other name that includes the word "original" in it.

Thanks.

> Michael
>
> diff --git a/abspath.c b/abspath.c
> index 5748b91..40cdc46 100644
> --- a/abspath.c
> +++ b/abspath.c
> @@ -35,7 +35,14 @@ static const char *real_path_internal(const char
> *path, int die_on_error)
>  {
>         static char bufs[2][PATH_MAX + 1], *buf = bufs[0], *next_buf =
> bufs[1];
>         char *retval = NULL;
> +
> +       /*
> +        * If we have to temporarily chdir(), store the original CWD
> +        * here so that we can chdir() back to it at the end of the
> +        * function:
> +        */
>         char cwd[1024] = "";
> +
>         int buf_index = 1;
>
>         int depth = MAXDEPTH;

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 3/8] longest_ancestor_length(): use string_list_split()
  2012-09-29  5:25                                 ` Michael Haggerty
@ 2012-09-29  5:43                                   ` Junio C Hamano
  0 siblings, 0 replies; 38+ messages in thread
From: Junio C Hamano @ 2012-09-29  5:43 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jiang Xin, Lea Wiemann, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 09/28/2012 12:48 AM, Junio C Hamano wrote:
>> Michael Haggerty <mhagger@alum.mit.edu> writes:
>> 
>>> -	for (colon = ceil = prefix_list; *colon; ceil = colon+1) {
>>> -		for (colon = ceil; *colon && *colon != PATH_SEP; colon++);
>>> -		len = colon - ceil;
>>> +	string_list_split(&prefixes, prefix_list, PATH_SEP, -1);
>>> +
>>> +	for (i = 0; i < prefixes.nr; i++) {
>>> +		const char *ceil = prefixes.items[i].string;
>>> +		int len = strlen(ceil);
>>> +
>> 
>> Much nicer than the yucky original ;-)
>
> If your winky-smiley implies irony, then I would like to object.

No irony.  The original was hard to read, especially with the for
loop that does in-line strchr() on a single line.  The updated one
is much easier to read.

> normalize_path_copy() can only shrink paths, not grow them.  So the
> length check on ceil guarantees that the result of normalize_path_copy()
> will fit in buf.

OK.

Thanks.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH 7/8] longest_ancestor_length(): resolve symlinks before comparing paths
  2012-09-27 22:51                               ` Junio C Hamano
@ 2012-09-29  5:46                                 ` Michael Haggerty
  0 siblings, 0 replies; 38+ messages in thread
From: Michael Haggerty @ 2012-09-29  5:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jiang Xin, Lea Wiemann, git

On 09/28/2012 12:51 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> longest_ancestor_length() relies on a textual comparison of directory
>> parts to find the part of path that overlaps with one of the paths in
>> prefix_list.  But this doesn't work if any of the prefixes involves a
>> symbolic link, because the directories will look different even though
>> they might logically refer to the same directory.  So canonicalize the
>> paths listed in prefix_list using real_path_if_valid() before trying
>> to find matches.
>>
>> path is already in canonical form, so doesn't need to be canonicalized
>> again.
>>
>> This fixes some problems with using GIT_CEILING_DIRECTORIES that
>> contains paths involving symlinks, including t4035 if run with --root
>> set to a path involving symlinks.
>>
>> Remove a number of tests of longest_ancestor_length().  It is awkward
>> to test longest_ancestor_length() now, because its new path
>> normalization behavior depends on the contents of the whole
>> filesystem.  But we can live without the tests, because
>> longest_ancestor_length() is now built of reusable components that are
>> themselves tested separately: string_list_split(),
>> string_list_longest_prefix(), and real_path_if_valid().
> 
> Errr, components may be correct but the way to combine and construct
> could go faulty, so...

I don't see a realistic alternative.  Testing real_path() is itself is
already quite awkward (see t0060), so testing longest_ancestor_length()
would be even more so.  Of course, the GIT_CEILING_DIRECTORIES tests
indirectly test longest_ancestor_length(), though not systematically.

If you have a better suggestion, please let me know.

>> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
>> ---
>>  path.c                | 17 ++++++++------
>>  t/t0060-path-utils.sh | 64 ---------------------------------------------------
>>  2 files changed, 10 insertions(+), 71 deletions(-)
>>
>> diff --git a/path.c b/path.c
>> index 5cace83..981bb06 100644
>> --- a/path.c
>> +++ b/path.c
>> @@ -570,22 +570,25 @@ int normalize_path_copy(char *dst, const char *src)
>>  
>>  static int normalize_path_callback(struct string_list_item *item, void *cb_data)
>>  {
>> -	char buf[PATH_MAX+2];
>> +	char *buf;
>>  	const char *ceil = item->string;
>> -	int len = strlen(ceil);
>> +	const char *realpath;
>> +	int len;
>>  
>> -	if (len == 0 || len > PATH_MAX || !is_absolute_path(ceil))
>> +	if (!*ceil || !is_absolute_path(ceil))
>>  		return 0;
>> -	memcpy(buf, ceil, len+1);
>> -	if (normalize_path_copy(buf, buf) < 0)
>> +	realpath = real_path_if_valid(ceil);
>> +	if (!realpath)
>>  		return 0;
>> -	len = strlen(buf);
>> +	len = strlen(realpath);
>> +	buf = xmalloc(len + 2); /* Leave space for possible trailing slash */
>> +	strcpy(buf, realpath);
>>  	if (len == 0 || buf[len-1] != '/') {
>>  		buf[len++] = '/';
>>  		buf[len++] = '\0';
>>  	}
> 
> Nice.

I just noticed that the second "len++" in the final "if" is misleading.
 I will fix that in v2.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2012-09-29  5:46 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-24  8:00 [PATCH] test: some testcases failed if cwd is on a symlink Jiang Xin
2012-07-24  8:24 ` Stefano Lattarini
2012-07-24 10:59 ` Pete Wyckoff
2012-07-24 14:47 ` Junio C Hamano
2012-07-24 22:06   ` Jiang Xin
2012-08-18 14:41   ` Michael Haggerty
2012-08-18 20:36     ` Junio C Hamano
2012-08-19 13:57       ` Michael Haggerty
2012-08-19 16:43         ` Junio C Hamano
2012-08-21  5:59           ` Michael Haggerty
2012-08-27  5:13         ` [PATCH v2] test: set the realpath of CWD as TRASH_DIRECTORY Jiang Xin
2012-08-27 16:15           ` Junio C Hamano
2012-08-29  4:14             ` Michael Haggerty
2012-08-29  6:06               ` Junio C Hamano
2012-08-29  8:15                 ` Michael Haggerty
2012-08-29 16:33                   ` Junio C Hamano
2012-08-30  4:37                     ` Michael Haggerty
2012-08-30  5:26                       ` Junio C Hamano
2012-08-31  7:49                         ` Michael Haggerty
2012-09-26 19:34                           ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Michael Haggerty
2012-09-26 19:34                             ` [PATCH 1/8] Introduce new static function real_path_internal() Michael Haggerty
2012-09-27 21:27                               ` Junio C Hamano
2012-09-29  4:56                                 ` Michael Haggerty
2012-09-29  5:40                                   ` Junio C Hamano
2012-09-26 19:34                             ` [PATCH 2/8] Introduce new function real_path_if_valid() Michael Haggerty
2012-09-26 19:34                             ` [PATCH 3/8] longest_ancestor_length(): use string_list_split() Michael Haggerty
2012-09-27 22:48                               ` Junio C Hamano
2012-09-29  5:25                                 ` Michael Haggerty
2012-09-29  5:43                                   ` Junio C Hamano
2012-09-26 19:34                             ` [PATCH 4/8] longest_ancestor_length(): explicitly filter list before loop Michael Haggerty
2012-09-27 22:48                               ` Junio C Hamano
2012-09-26 19:34                             ` [PATCH 5/8] longest_ancestor_length(): always add a slash to the end of prefixes Michael Haggerty
2012-09-26 19:34                             ` [PATCH 6/8] longest_ancestor_length(): use string_list_longest_prefix() Michael Haggerty
2012-09-26 19:34                             ` [PATCH 7/8] longest_ancestor_length(): resolve symlinks before comparing paths Michael Haggerty
2012-09-27 22:51                               ` Junio C Hamano
2012-09-29  5:46                                 ` Michael Haggerty
2012-09-26 19:34                             ` [PATCH 8/8] t1504: stop resolving symlinks in GIT_CEILING_DIRECTORIES Michael Haggerty
2012-09-27 19:42                             ` [PATCH 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).