git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* Canonicalize the git-svn path & url accessors
@ 2012-07-28  9:38 Michael G. Schwern
  2012-07-28  9:38 ` [PATCH 1/7] Move the canonicalization functions to Git::SVN::Utils Michael G. Schwern
                   ` (6 more replies)
  0 siblings, 7 replies; 39+ messages in thread
From: Michael G. Schwern @ 2012-07-28  9:38 UTC (permalink / raw)
  To: git, gitster; +Cc: robbat2, bwalton, normalperson, jrnieder, schwern

This patch turns on canonicalization in the Git::SVN and Git::SVN::Ra
path and url accessors.

It also makes the canonicalizers use the SVN API when available.

All patches pass with SVN 1.6.  Next patch series will fix SVN 1.7.

This should be placed on top of the previous patch series which added
path and url accessors.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 1/7] Move the canonicalization functions to Git::SVN::Utils
  2012-07-28  9:38 Canonicalize the git-svn path & url accessors Michael G. Schwern
@ 2012-07-28  9:38 ` Michael G. Schwern
  2012-07-28  9:38 ` [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available Michael G. Schwern
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 39+ messages in thread
From: Michael G. Schwern @ 2012-07-28  9:38 UTC (permalink / raw)
  To: git, gitster; +Cc: robbat2, bwalton, normalperson, jrnieder, schwern

From: "Michael G. Schwern" <schwern@pobox.com>

So they can be used by others.

I'd like to test them, but they're going to become SVN API wrappers shortly
and those aren't predictable.

No functional change.
---
 git-svn.perl          | 33 +++++++-------------------------
 perl/Git/SVN/Utils.pm | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 58 insertions(+), 27 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index de1ddd1..a857484 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -29,7 +29,13 @@ use Git::SVN::Prompt;
 use Git::SVN::Log;
 use Git::SVN::Migration;
 
-use Git::SVN::Utils qw(fatal can_compress);
+use Git::SVN::Utils qw(
+	fatal
+	can_compress
+	canonicalize_path
+	canonicalize_url
+);
+
 use Git qw(
 	git_cmd_try
 	command
@@ -1256,31 +1262,6 @@ sub cmd_mkdirs {
 	$gs->mkemptydirs($_revision);
 }
 
-sub canonicalize_path {
-	my ($path) = @_;
-	my $dot_slash_added = 0;
-	if (substr($path, 0, 1) ne "/") {
-		$path = "./" . $path;
-		$dot_slash_added = 1;
-	}
-	# File::Spec->canonpath doesn't collapse x/../y into y (for a
-	# good reason), so let's do this manually.
-	$path =~ s#/+#/#g;
-	$path =~ s#/\.(?:/|$)#/#g;
-	$path =~ s#/[^/]+/\.\.##g;
-	$path =~ s#/$##g;
-	$path =~ s#^\./## if $dot_slash_added;
-	$path =~ s#^/##;
-	$path =~ s#^\.$##;
-	return $path;
-}
-
-sub canonicalize_url {
-	my ($url) = @_;
-	$url =~ s#^([^:]+://[^/]*/)(.*)$#$1 . canonicalize_path($2)#e;
-	return $url;
-}
-
 # get_svnprops(PATH)
 # ------------------
 # Helper for cmd_propget and cmd_proplist below.
diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index 3d0bfa4..c842d00 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -5,7 +5,12 @@ use warnings;
 
 use base qw(Exporter);
 
-our @EXPORT_OK = qw(fatal can_compress);
+our @EXPORT_OK = qw(
+	fatal
+	can_compress
+	canonicalize_path
+	canonicalize_url
+);
 
 
 =head1 NAME
@@ -56,4 +61,49 @@ sub can_compress {
 }
 
 
+=head3 canonicalize_path
+
+    my $canoncalized_path = canonicalize_path($path);
+
+Converts $path into a canonical form which is safe to pass to the SVN
+API as a file path.
+
+=cut
+
+sub canonicalize_path {
+	my ($path) = @_;
+	my $dot_slash_added = 0;
+	if (substr($path, 0, 1) ne "/") {
+		$path = "./" . $path;
+		$dot_slash_added = 1;
+	}
+	# File::Spec->canonpath doesn't collapse x/../y into y (for a
+	# good reason), so let's do this manually.
+	$path =~ s#/+#/#g;
+	$path =~ s#/\.(?:/|$)#/#g;
+	$path =~ s#/[^/]+/\.\.##g;
+	$path =~ s#/$##g;
+	$path =~ s#^\./## if $dot_slash_added;
+	$path =~ s#^/##;
+	$path =~ s#^\.$##;
+	return $path;
+}
+
+
+=head3 canonicalize_url
+
+    my $canonicalized_url = canonicalize_url($url);
+
+Converts $url into a canonical form which is safe to pass to the SVN
+API as a URL.
+
+=cut
+
+sub canonicalize_url {
+	my ($url) = @_;
+	$url =~ s#^([^:]+://[^/]*/)(.*)$#$1 . canonicalize_path($2)#e;
+	return $url;
+}
+
+
 1;
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.
  2012-07-28  9:38 Canonicalize the git-svn path & url accessors Michael G. Schwern
  2012-07-28  9:38 ` [PATCH 1/7] Move the canonicalization functions to Git::SVN::Utils Michael G. Schwern
@ 2012-07-28  9:38 ` Michael G. Schwern
  2012-07-28 13:50   ` Jonathan Nieder
  2012-07-28  9:38 ` [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths Michael G. Schwern
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 39+ messages in thread
From: Michael G. Schwern @ 2012-07-28  9:38 UTC (permalink / raw)
  To: git, gitster; +Cc: robbat2, bwalton, normalperson, jrnieder, schwern

From: "Michael G. Schwern" <schwern@pobox.com>

No change on SVN 1.6.  The tests all pass with SVN 1.6 if
canonicalize_url() does nothing, so tests passing doesn't have
much meaning.

The tests are so messed up right now with SVN 1.7 it isn't really
useful to check.  They will be useful later.
---
 perl/Git/SVN/Utils.pm | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index c842d00..9d5d3c5 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -3,6 +3,8 @@ package Git::SVN::Utils;
 use strict;
 use warnings;
 
+use SVN::Core;
+
 use base qw(Exporter);
 
 our @EXPORT_OK = qw(
@@ -100,6 +102,20 @@ API as a URL.
 =cut
 
 sub canonicalize_url {
+	my $url = shift;
+
+	# The 1.7 way to do it
+	if ( defined &SVN::_Core::svn_uri_canonicalize ) {
+		return SVN::_Core::svn_uri_canonicalize($url);
+	}
+	# There wasn't a 1.6 way to do it, so we do it ourself.
+	else {
+		return _canonicalize_url_ourselves($url);
+	}
+}
+
+
+sub _canonicalize_url_ourselves {
 	my ($url) = @_;
 	$url =~ s#^([^:]+://[^/]*/)(.*)$#$1 . canonicalize_path($2)#e;
 	return $url;
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.
  2012-07-28  9:38 Canonicalize the git-svn path & url accessors Michael G. Schwern
  2012-07-28  9:38 ` [PATCH 1/7] Move the canonicalization functions to Git::SVN::Utils Michael G. Schwern
  2012-07-28  9:38 ` [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available Michael G. Schwern
@ 2012-07-28  9:38 ` Michael G. Schwern
  2012-07-30 19:51   ` Eric Wong
  2012-07-28  9:38 ` [PATCH 4/7] Add join_paths() to safely concatenate paths Michael G. Schwern
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 39+ messages in thread
From: Michael G. Schwern @ 2012-07-28  9:38 UTC (permalink / raw)
  To: git, gitster; +Cc: robbat2, bwalton, normalperson, jrnieder, schwern

From: "Michael G. Schwern" <schwern@pobox.com>

The SVN API functions will not accept ../foo but their canonicalization
functions will not collapse it.  So we'll have to do it ourselves.

_collapse_dotdot() works better than the existing regex did.

This will be used shortly when canonicalize_path() starts using the
SVN API.
---
 perl/Git/SVN/Utils.pm             | 14 +++++++++++++-
 t/Git-SVN/Utils/collapse_dotdot.t | 23 +++++++++++++++++++++++
 2 files changed, 36 insertions(+), 1 deletion(-)
 create mode 100644 t/Git-SVN/Utils/collapse_dotdot.t

diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index 9d5d3c5..7314e52 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -72,6 +72,18 @@ API as a file path.
 
 =cut
 
+# Turn foo/../bar into bar
+sub _collapse_dotdot {
+	my $path = shift;
+
+	1 while $path =~ s{/[^/]+/+\.\.}{};
+	1 while $path =~ s{[^/]+/+\.\./}{};
+	1 while $path =~ s{[^/]+/+\.\.}{};
+
+	return $path;
+}
+
+
 sub canonicalize_path {
 	my ($path) = @_;
 	my $dot_slash_added = 0;
@@ -83,7 +95,7 @@ sub canonicalize_path {
 	# good reason), so let's do this manually.
 	$path =~ s#/+#/#g;
 	$path =~ s#/\.(?:/|$)#/#g;
-	$path =~ s#/[^/]+/\.\.##g;
+	$path = _collapse_dotdot($path);
 	$path =~ s#/$##g;
 	$path =~ s#^\./## if $dot_slash_added;
 	$path =~ s#^/##;
diff --git a/t/Git-SVN/Utils/collapse_dotdot.t b/t/Git-SVN/Utils/collapse_dotdot.t
new file mode 100644
index 0000000..1da1cce
--- /dev/null
+++ b/t/Git-SVN/Utils/collapse_dotdot.t
@@ -0,0 +1,23 @@
+#!/usr/bin/env perl
+
+use strict;
+use warnings;
+
+use Test::More 'no_plan';
+
+use Git::SVN::Utils;
+my $collapse_dotdot = \&Git::SVN::Utils::_collapse_dotdot;
+
+my %tests = (
+	"foo/bar/baz"			=> "foo/bar/baz",
+	".."				=> "..",
+	"foo/.."			=> "",
+	"/foo/bar/../../baz"		=> "/baz",
+	"deeply/.././deeply/nested"	=> "./deeply/nested",
+);
+
+for my $arg (keys %tests) {
+	my $want = $tests{$arg};
+
+	is $collapse_dotdot->($arg), $want, "_collapse_dotdot('$arg') => $want";
+}
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 4/7] Add join_paths() to safely concatenate paths.
  2012-07-28  9:38 Canonicalize the git-svn path & url accessors Michael G. Schwern
                   ` (2 preceding siblings ...)
  2012-07-28  9:38 ` [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths Michael G. Schwern
@ 2012-07-28  9:38 ` Michael G. Schwern
  2012-09-26 20:51   ` Jonathan Nieder
  2012-07-28  9:38 ` [PATCH 5/7] Remove irrelevant comment Michael G. Schwern
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 39+ messages in thread
From: Michael G. Schwern @ 2012-07-28  9:38 UTC (permalink / raw)
  To: git, gitster; +Cc: robbat2, bwalton, normalperson, jrnieder, schwern

From: "Michael G. Schwern" <schwern@pobox.com>

Otherwise you might wind up with things like...

    my $path1 = undef;
    my $path2 = 'foo';
    my $path = $path1 . '/' . $path2;

creating '/foo'.  Or this...

    my $path1 = 'foo/';
    my $path2 = 'bar';
    my $path = $path1 . '/' . $path2;

creating 'foo//bar'.

Could have used File::Spec, but I'm shying away from it due to SVN
1.7's pickiness about paths.  Felt it would be better to have our own
we can control completely.
---
 git-svn.perl                 |  3 ++-
 perl/Git/SVN.pm              | 10 ++++++----
 perl/Git/SVN/Utils.pm        | 32 ++++++++++++++++++++++++++++++++
 t/Git-SVN/Utils/join_paths.t | 32 ++++++++++++++++++++++++++++++++
 4 files changed, 72 insertions(+), 5 deletions(-)
 create mode 100644 t/Git-SVN/Utils/join_paths.t

diff --git a/git-svn.perl b/git-svn.perl
index a857484..6e3e240 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -34,6 +34,7 @@ use Git::SVN::Utils qw(
 	can_compress
 	canonicalize_path
 	canonicalize_url
+	join_paths
 );
 
 use Git qw(
@@ -1275,7 +1276,7 @@ sub get_svnprops {
 	$path = $cmd_dir_prefix . $path;
 	fatal("No such file or directory: $path") unless -e $path;
 	my $is_dir = -d $path ? 1 : 0;
-	$path = $gs->{path} . '/' . $path;
+	$path = join_paths($gs->{path}, $path);
 
 	# canonicalize the path (otherwise libsvn will abort or fail to
 	# find the file)
diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
index 7913d8f..b0ed3ea 100644
--- a/perl/Git/SVN.pm
+++ b/perl/Git/SVN.pm
@@ -23,7 +23,11 @@ use Git qw(
     command_output_pipe
     command_close_pipe
 );
-use Git::SVN::Utils qw(fatal can_compress);
+use Git::SVN::Utils qw(
+	fatal
+	can_compress
+	join_paths
+);
 
 my $can_use_yaml;
 BEGIN {
@@ -316,9 +320,7 @@ sub init_remote_config {
 			}
 			my $old_path = $self->path;
 			$url =~ s!^\Q$min_url\E(/|$)!!;
-			if (length $old_path) {
-				$url .= "/$old_path";
-			}
+			$url = join_paths($url, $old_path);
 			$self->path($url);
 			$url = $min_url;
 		}
diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index 7314e52..deade07 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -12,6 +12,7 @@ our @EXPORT_OK = qw(
 	can_compress
 	canonicalize_path
 	canonicalize_url
+	join_paths
 );
 
 
@@ -134,4 +135,35 @@ sub _canonicalize_url_ourselves {
 }
 
 
+=head3 join_paths
+
+    my $new_path = join_paths(@paths);
+
+Appends @paths together into a single path.  Any empty paths are ignored.
+
+=cut
+
+sub join_paths {
+	my @paths = @_;
+
+	@paths = grep { defined $_ && length $_ } @paths;
+
+	return '' unless @paths;
+	return $paths[0] if @paths == 1;
+
+	my $new_path = shift @paths;
+	$new_path =~ s{/+$}{};
+
+	my $last_path = pop @paths;
+	$last_path =~ s{^/+}{};
+
+	for my $path (@paths) {
+		$path =~ s{^/+}{};
+		$path =~ s{/+$}{};
+		$new_path .= "/$path";
+	}
+
+	return $new_path .= "/$last_path";
+}
+
 1;
diff --git a/t/Git-SVN/Utils/join_paths.t b/t/Git-SVN/Utils/join_paths.t
new file mode 100644
index 0000000..d4488e7
--- /dev/null
+++ b/t/Git-SVN/Utils/join_paths.t
@@ -0,0 +1,32 @@
+#!/usr/bin/env perl
+
+use strict;
+use warnings;
+
+use Test::More 'no_plan';
+
+use Git::SVN::Utils qw(
+	join_paths
+);
+
+# A reference cannot be a hash key, so we use an array.
+my @tests = (
+	[]					=> '',
+	["/x.com", "bar"]			=> '/x.com/bar',
+	["x.com", ""]				=> 'x.com',
+	["/x.com/foo/", undef, "bar"]		=> '/x.com/foo/bar',
+	["x.com/foo/", "/bar/baz/"]		=> 'x.com/foo/bar/baz/',
+	["foo", "bar"]				=> 'foo/bar',
+	["/foo/bar", "baz", "/biff"]		=> '/foo/bar/baz/biff',
+	["", undef, "."]			=> '.',
+	[]					=> '',
+
+);
+
+while(@tests) {
+	my($have, $want) = splice @tests, 0, 2;
+
+	my $args = join ", ", map { qq['$_'] } map { defined($_) ? $_ : 'undef' } @$have;
+	my $name = "join_paths($args) eq '$want'";
+	is join_paths(@$have), $want, $name;
+}
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 5/7] Remove irrelevant comment.
  2012-07-28  9:38 Canonicalize the git-svn path & url accessors Michael G. Schwern
                   ` (3 preceding siblings ...)
  2012-07-28  9:38 ` [PATCH 4/7] Add join_paths() to safely concatenate paths Michael G. Schwern
@ 2012-07-28  9:38 ` Michael G. Schwern
  2012-07-28  9:38 ` [PATCH 6/7] Switch path canonicalization to use the SVN API Michael G. Schwern
  2012-07-28  9:38 ` [PATCH 7/7] Make Git::SVN and Git::SVN::Ra canonicalize paths and urls Michael G. Schwern
  6 siblings, 0 replies; 39+ messages in thread
From: Michael G. Schwern @ 2012-07-28  9:38 UTC (permalink / raw)
  To: git, gitster; +Cc: robbat2, bwalton, normalperson, jrnieder, schwern

From: "Michael G. Schwern" <schwern@pobox.com>

The code doesn't use File::Spec.
---
 perl/Git/SVN/Utils.pm | 2 --
 1 file changed, 2 deletions(-)

diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index deade07..6c8ae53 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -92,8 +92,6 @@ sub canonicalize_path {
 		$path = "./" . $path;
 		$dot_slash_added = 1;
 	}
-	# File::Spec->canonpath doesn't collapse x/../y into y (for a
-	# good reason), so let's do this manually.
 	$path =~ s#/+#/#g;
 	$path =~ s#/\.(?:/|$)#/#g;
 	$path = _collapse_dotdot($path);
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 6/7] Switch path canonicalization to use the SVN API.
  2012-07-28  9:38 Canonicalize the git-svn path & url accessors Michael G. Schwern
                   ` (4 preceding siblings ...)
  2012-07-28  9:38 ` [PATCH 5/7] Remove irrelevant comment Michael G. Schwern
@ 2012-07-28  9:38 ` Michael G. Schwern
  2012-07-28 13:55   ` Jonathan Nieder
  2012-07-28  9:38 ` [PATCH 7/7] Make Git::SVN and Git::SVN::Ra canonicalize paths and urls Michael G. Schwern
  6 siblings, 1 reply; 39+ messages in thread
From: Michael G. Schwern @ 2012-07-28  9:38 UTC (permalink / raw)
  To: git, gitster; +Cc: robbat2, bwalton, normalperson, jrnieder, schwern

From: "Michael G. Schwern" <schwern@pobox.com>

All tests pass with SVN 1.6.  SVN 1.7 remains broken, not worrying
about it yet.

SVN changed its path canonicalization API between 1.6 and 1.7.
http://svnbook.red-bean.com/en/1.6/svn.developer.usingapi.html#svn.developer.usingapi.urlpath
http://svnbook.red-bean.com/en/1.7/svn.developer.usingapi.html#svn.developer.usingapi.urlpath

The SVN API does not accept foo/.. but it also doesn't canonicalize
it.  We have to do it ourselves.
---
 perl/Git/SVN/Utils.pm | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index 6c8ae53..7ae6fac 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -86,6 +86,27 @@ sub _collapse_dotdot {
 
 
 sub canonicalize_path {
+	my $path = shift;
+
+	# The 1.7 way to do it
+	if ( defined &SVN::_Core::svn_dirent_canonicalize ) {
+		$path = _collapse_dotdot($path);
+		return SVN::_Core::svn_dirent_canonicalize($path);
+	}
+	# The 1.6 way to do it
+	elsif ( defined &SVN::_Core::svn_path_canonicalize ) {
+		$path = _collapse_dotdot($path);
+		return SVN::_Core::svn_path_canonicalize($path);
+	}
+	# No SVN API canonicalization is available, do it ourselves
+	else {
+		$path = _canonicalize_path_ourselves($path);
+		return $path;
+	}
+}
+
+
+sub _canonicalize_path_ourselves {
 	my ($path) = @_;
 	my $dot_slash_added = 0;
 	if (substr($path, 0, 1) ne "/") {
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 7/7] Make Git::SVN and Git::SVN::Ra canonicalize paths and urls.
  2012-07-28  9:38 Canonicalize the git-svn path & url accessors Michael G. Schwern
                   ` (5 preceding siblings ...)
  2012-07-28  9:38 ` [PATCH 6/7] Switch path canonicalization to use the SVN API Michael G. Schwern
@ 2012-07-28  9:38 ` Michael G. Schwern
  2012-07-28 14:11   ` Jonathan Nieder
  6 siblings, 1 reply; 39+ messages in thread
From: Michael G. Schwern @ 2012-07-28  9:38 UTC (permalink / raw)
  To: git, gitster; +Cc: robbat2, bwalton, normalperson, jrnieder, schwern

From: "Michael G. Schwern" <schwern@pobox.com>

This canonicalizes paths and urls as early as possible so we don't
have to remember to do it at the point of use.  It will fix a swath
of SVN 1.7 problems in one go.

Its ok to double canonicalize things.

SVN 1.7 still fails, still not worrying about that.
---
 perl/Git/SVN.pm    | 6 ++++--
 perl/Git/SVN/Ra.pm | 6 +++++-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
index b0ed3ea..798f6c4 100644
--- a/perl/Git/SVN.pm
+++ b/perl/Git/SVN.pm
@@ -27,6 +27,8 @@ use Git::SVN::Utils qw(
 	fatal
 	can_compress
 	join_paths
+	canonicalize_path
+	canonicalize_url
 );
 
 my $can_use_yaml;
@@ -2305,7 +2307,7 @@ sub path {
 
     if( @_ ) {
         my $path = shift;
-        $self->{path} = $path;
+        $self->{path} = canonicalize_path($path);
         return;
     }
 
@@ -2318,7 +2320,7 @@ sub url {
 
     if( @_ ) {
         my $url = shift;
-        $self->{url} = $url;
+        $self->{url} = canonicalize_url($url);
         return;
     }
 
diff --git a/perl/Git/SVN/Ra.pm b/perl/Git/SVN/Ra.pm
index 27dcdd5..ef7b3dd 100644
--- a/perl/Git/SVN/Ra.pm
+++ b/perl/Git/SVN/Ra.pm
@@ -3,6 +3,10 @@ use vars qw/@ISA $config_dir $_ignore_refs_regex $_log_window_size/;
 use strict;
 use warnings;
 use SVN::Client;
+use Git::SVN::Utils qw(
+	canonicalize_url
+);
+
 use SVN::Ra;
 BEGIN {
 	@ISA = qw(SVN::Ra);
@@ -138,7 +142,7 @@ sub url {
 
     if( @_ ) {
         my $url = shift;
-        $self->{url} = $url;
+        $self->{url} = canonicalize_url($url);
         return;
     }
 
-- 
1.7.11.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.
  2012-07-28  9:38 ` [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available Michael G. Schwern
@ 2012-07-28 13:50   ` Jonathan Nieder
  2012-07-28 19:01     ` Michael G Schwern
  0 siblings, 1 reply; 39+ messages in thread
From: Jonathan Nieder @ 2012-07-28 13:50 UTC (permalink / raw)
  To: Michael G. Schwern; +Cc: git, gitster, robbat2, bwalton, normalperson

Hi,

Michael G. Schwern wrote:

> --- a/perl/Git/SVN/Utils.pm
> +++ b/perl/Git/SVN/Utils.pm
[...]
> @@ -100,6 +102,20 @@ API as a URL.
>  =cut
>  
>  sub canonicalize_url {
> +	my $url = shift;
> +
> +	# The 1.7 way to do it
> +	if ( defined &SVN::_Core::svn_uri_canonicalize ) {
> +		return SVN::_Core::svn_uri_canonicalize($url);
> +	}
> +	# There wasn't a 1.6 way to do it, so we do it ourself.
> +	else {
> +		return _canonicalize_url_ourselves($url);
> +	}
> +}
> +
> +
> +sub _canonicalize_url_ourselves {
>  	my ($url) = @_;
>  	$url =~ s#^([^:]+://[^/]*/)(.*)$#$1 . canonicalize_path($2)#e;

Leaves me a bit nervous.

What effect should we expect this change to have?  Is our emulation
of svn_uri_canonicalize already perfect and this change just a little
futureproofing in case svn_uri_canonicalize gets even better, or is
this a trap waiting to happen when new callers of canonicalize_url
start relying on, e.g., %-encoding of special characters?

If I am reading Subversion r873487 correctly, in ancient times,
svn_path_canonicalize() did the appropriate tweaking for URIs.  Today
its implementation is comforting:

	const char *
	svn_path_canonicalize(const char *path, apr_pool_t *pool)
	{
	  if (svn_path_is_url(path))
	    return svn_uri_canonicalize(path, pool);
	  else
	    return svn_dirent_canonicalize(path, pool);
	}

It might be easier to rely on that on pre-1.7 systems.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6/7] Switch path canonicalization to use the SVN API.
  2012-07-28  9:38 ` [PATCH 6/7] Switch path canonicalization to use the SVN API Michael G. Schwern
@ 2012-07-28 13:55   ` Jonathan Nieder
  2012-07-28 19:07     ` Michael G Schwern
  0 siblings, 1 reply; 39+ messages in thread
From: Jonathan Nieder @ 2012-07-28 13:55 UTC (permalink / raw)
  To: Michael G. Schwern; +Cc: git, gitster, robbat2, bwalton, normalperson

Michael G. Schwern wrote:

> --- a/perl/Git/SVN/Utils.pm
> +++ b/perl/Git/SVN/Utils.pm
> @@ -86,6 +86,27 @@ sub _collapse_dotdot {
>  
>  
>  sub canonicalize_path {
> +	my $path = shift;
> +
> +	# The 1.7 way to do it
> +	if ( defined &SVN::_Core::svn_dirent_canonicalize ) {
> +		$path = _collapse_dotdot($path);
> +		return SVN::_Core::svn_dirent_canonicalize($path);
> +	}
> +	# The 1.6 way to do it
> +	elsif ( defined &SVN::_Core::svn_path_canonicalize ) {
> +		$path = _collapse_dotdot($path);
> +		return SVN::_Core::svn_path_canonicalize($path);
> +	}
> +	# No SVN API canonicalization is available, do it ourselves
> +	else {

When would this "else" case trip?  Would it be safe to make it
return an error message, or even to do something like the following?

	sub canonicalize_path {
		my $path = shift;
		$path = _collapse_dotdot($path);

		# Subversion 1.7 split svn_path_canonicalize() into
		# svn_dirent_canonicalize() and svn_uri_canonicalize().
		if (!defined &SVN::_Core::svn_dirent_canonicalize) {
			return SVN::_Core::svn_path_canonicalize($path);
		}

		return SVN::_Core::svn_dirent_canonicalize($path);
	}

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 7/7] Make Git::SVN and Git::SVN::Ra canonicalize paths and urls.
  2012-07-28  9:38 ` [PATCH 7/7] Make Git::SVN and Git::SVN::Ra canonicalize paths and urls Michael G. Schwern
@ 2012-07-28 14:11   ` Jonathan Nieder
  2012-07-28 19:15     ` Michael G Schwern
  0 siblings, 1 reply; 39+ messages in thread
From: Jonathan Nieder @ 2012-07-28 14:11 UTC (permalink / raw)
  To: Michael G. Schwern; +Cc: git, gitster, robbat2, bwalton, normalperson

Michael G. Schwern wrote:

> This canonicalizes paths and urls as early as possible so we don't
> have to remember to do it at the point of use.

Yay!  Am I correct in imagining this makes the following sequence of
commands[1] no longer trip an assertion failure in svn_path_join[2]
with SVN 1.6?

	git svn init -Thttp://trac-hacks.org/svn/tagsplugin/trunk \
		-thttp://trac-hacks.org/svn/tagsplugin/tags \
		-bhttp://trac-hacks.org/svn/tagsplugin/branches
	git svn fetch

[1] http://bugs.debian.org/616168
[2] 
  $ git svn fetch
  W: Ignoring error from SVN, path probably does not exist: (160013): Filesystem has no item: File not found: revision 100, path '/tagsplugin'
  W: Do not be alarmed at the above message git-svn is just searching aggressively for old history.
  This may take a while on large repositories
  perl: /build/buildd-subversion_1.6.17dfsg-4-i386-MgNPeW/subversion-1.6.17dfsg/subversion/libsvn_subr/path.c:115: svn_path_join: Assertion `svn_path_is_canonical(component, pool)' failed.
  error: git-svn died of signal 6

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.
  2012-07-28 13:50   ` Jonathan Nieder
@ 2012-07-28 19:01     ` Michael G Schwern
  2012-07-28 19:30       ` Jonathan Nieder
  2012-10-14 11:42       ` [PATCH/RFC 0/2] " Jonathan Nieder
  0 siblings, 2 replies; 39+ messages in thread
From: Michael G Schwern @ 2012-07-28 19:01 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, gitster, robbat2, bwalton, normalperson

On 2012.7.28 6:50 AM, Jonathan Nieder wrote:
>> --- a/perl/Git/SVN/Utils.pm
>> +++ b/perl/Git/SVN/Utils.pm
> [...]
>> @@ -100,6 +102,20 @@ API as a URL.
>>  =cut
>>  
>>  sub canonicalize_url {
>> +	my $url = shift;
>> +
>> +	# The 1.7 way to do it
>> +	if ( defined &SVN::_Core::svn_uri_canonicalize ) {
>> +		return SVN::_Core::svn_uri_canonicalize($url);
>> +	}
>> +	# There wasn't a 1.6 way to do it, so we do it ourself.
>> +	else {
>> +		return _canonicalize_url_ourselves($url);
>> +	}
>> +}
>> +
>> +
>> +sub _canonicalize_url_ourselves {
>>  	my ($url) = @_;
>>  	$url =~ s#^([^:]+://[^/]*/)(.*)$#$1 . canonicalize_path($2)#e;
> 
> Leaves me a bit nervous.

As it should, SVN dumped a mess on us.


> What effect should we expect this change to have?  Is our emulation
> of svn_uri_canonicalize already perfect and this change just a little
> futureproofing in case svn_uri_canonicalize gets even better, or is
> this a trap waiting to happen when new callers of canonicalize_url
> start relying on, e.g., %-encoding of special characters?

This change is *just* about sliding in the SVN API call and seeing if git-svn
still works with SVN 1.6.  It should have no effect on SVN 1.6.  These patches
are a very slow and careful refactoring doing just one thing at a time.  Every
time I tried to do too many things at once, tests broke and I had to tease the
patch apart.

At this point in the patch series the code is not ready for canonicalization.
 Until 3/8 in the next patch series, canonicalize_url() basically does nothing
on SVN 1.6 so the code has never had to deal with the problem.  3/8 deals with
improving _canonicalize_url_ourselves() to work more like
svn_uri_canonicalize() and thus "turns on" canonicalization for SVN 1.6 and
deals with the breakage.


> If I am reading Subversion r873487 correctly, in ancient times,
> svn_path_canonicalize() did the appropriate tweaking for URIs.  Today
> its implementation is comforting:
> 
> 	const char *
> 	svn_path_canonicalize(const char *path, apr_pool_t *pool)
> 	{
> 	  if (svn_path_is_url(path))
> 	    return svn_uri_canonicalize(path, pool);
> 	  else
> 	    return svn_dirent_canonicalize(path, pool);
> 	}
> 
> It might be easier to rely on that on pre-1.7 systems.

I didn't know about that.  I don't know what your SVN backwards compat
requirements are, but if that behavior goes back far enough in SVN to satisfy
you folks, then canonicalize_url() should fall back to
SVN::_Core::svn_path_canonicalize().  But try it at the end of the patch
series.  The code has to be prepared for canonicalization first.  Then how it
actually does it can be improved.


-- 
Defender of Lexical Encapsulation

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6/7] Switch path canonicalization to use the SVN API.
  2012-07-28 13:55   ` Jonathan Nieder
@ 2012-07-28 19:07     ` Michael G Schwern
  2012-07-30 20:04       ` Eric Wong
  2012-10-05  7:04       ` [PATCH] git-svn: keep leading slash when canonicalizing paths (fallback case) Jonathan Nieder
  0 siblings, 2 replies; 39+ messages in thread
From: Michael G Schwern @ 2012-07-28 19:07 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, gitster, robbat2, bwalton, normalperson

On 2012.7.28 6:55 AM, Jonathan Nieder wrote:
> Michael G. Schwern wrote:
>> --- a/perl/Git/SVN/Utils.pm
>> +++ b/perl/Git/SVN/Utils.pm
>> @@ -86,6 +86,27 @@ sub _collapse_dotdot {
>>  
>>  
>>  sub canonicalize_path {
>> +	my $path = shift;
>> +
>> +	# The 1.7 way to do it
>> +	if ( defined &SVN::_Core::svn_dirent_canonicalize ) {
>> +		$path = _collapse_dotdot($path);
>> +		return SVN::_Core::svn_dirent_canonicalize($path);
>> +	}
>> +	# The 1.6 way to do it
>> +	elsif ( defined &SVN::_Core::svn_path_canonicalize ) {
>> +		$path = _collapse_dotdot($path);
>> +		return SVN::_Core::svn_path_canonicalize($path);
>> +	}
>> +	# No SVN API canonicalization is available, do it ourselves
>> +	else {
> 
> When would this "else" case trip?

When svn_path_canonicalize() does not exist in the SVN API, presumably because
their SVN is too old.


> Would it be safe to make it
> return an error message, or even to do something like the following?

I don't know what your SVN backwards compat requirements are, or when
svn_path_canonicalize() appears in the API, so I left it as is.  git-svn's
home rolled path canonicalization worked and its no work to leave it working.
 No reason to break it IMO.


> 	sub canonicalize_path {
> 		my $path = shift;
> 		$path = _collapse_dotdot($path);
> 
> 		# Subversion 1.7 split svn_path_canonicalize() into
> 		# svn_dirent_canonicalize() and svn_uri_canonicalize().
> 		if (!defined &SVN::_Core::svn_dirent_canonicalize) {
> 			return SVN::_Core::svn_path_canonicalize($path);
> 		}
> 
> 		return SVN::_Core::svn_dirent_canonicalize($path);
> 	}

As a side note...
"If they don't have Mars bar, get me a Twix.  Else get me a Mars bar."
"If they have a Mars bar, get me one.  Else get me a Twix."


-- 
Look at me talking when there's science to do.
When I look out there it makes me glad I'm not you.
I've experiments to be run.
There is research to be done
On the people who are still alive.
    -- Jonathan Coulton, "Still Alive"

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 7/7] Make Git::SVN and Git::SVN::Ra canonicalize paths and urls.
  2012-07-28 14:11   ` Jonathan Nieder
@ 2012-07-28 19:15     ` Michael G Schwern
  0 siblings, 0 replies; 39+ messages in thread
From: Michael G Schwern @ 2012-07-28 19:15 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, gitster, robbat2, bwalton, normalperson

On 2012.7.28 7:11 AM, Jonathan Nieder wrote:
> Yay!  Am I correct in imagining this makes the following sequence of
> commands[1] no longer trip an assertion failure in svn_path_join[2]
> with SVN 1.6?
> 
> 	git svn init -Thttp://trac-hacks.org/svn/tagsplugin/trunk \
> 		-thttp://trac-hacks.org/svn/tagsplugin/tags \
> 		-bhttp://trac-hacks.org/svn/tagsplugin/branches
> 	git svn fetch

Works For Me™!

  ...
  Checked out HEAD:
    http://trac-hacks.org/svn/tagsplugin/trunk r11776


-- 
Insulting our readers is part of our business model.
        http://somethingpositive.net/sp07122005.shtml

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.
  2012-07-28 19:01     ` Michael G Schwern
@ 2012-07-28 19:30       ` Jonathan Nieder
  2012-07-28 19:51         ` Michael G Schwern
  2012-10-14 11:42       ` [PATCH/RFC 0/2] " Jonathan Nieder
  1 sibling, 1 reply; 39+ messages in thread
From: Jonathan Nieder @ 2012-07-28 19:30 UTC (permalink / raw)
  To: Michael G Schwern; +Cc: git, gitster, robbat2, bwalton, normalperson

Michael G Schwern wrote:
> On 2012.7.28 6:50 AM, Jonathan Nieder wrote:

>> If I am reading Subversion r873487 correctly, in ancient times,
>> svn_path_canonicalize() did the appropriate tweaking for URIs.  Today
>> its implementation is comforting:
>> 
>> 	const char *
>> 	svn_path_canonicalize(const char *path, apr_pool_t *pool)
>> 	{
>> 	  if (svn_path_is_url(path))
>> 	    return svn_uri_canonicalize(path, pool);
>> 	  else
>> 	    return svn_dirent_canonicalize(path, pool);
>> 	}
[...]
> I didn't know about that.  I don't know what your SVN backwards compat
> requirements are, but if that behavior goes back far enough in SVN to satisfy
> you folks, then canonicalize_url() should fall back to
> SVN::_Core::svn_path_canonicalize().

svn_path_canonicalize() has been usable for this kind of thing since
SVN 1.1, possibly earlier.

>                                      But try it at the end of the patch
> series.  The code has to be prepared for canonicalization first.  Then how it
> actually does it can be improved.

Since this part of the series is not tested with SVN 1.7, this is
basically adding dead code, right?  That could be avoided by
reordering the changes to keep "canonicalize_url" as-is until later in
the series when the switchover is safe.

Thanks.  Will play around with this code more.

Jonathan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.
  2012-07-28 19:30       ` Jonathan Nieder
@ 2012-07-28 19:51         ` Michael G Schwern
  2012-07-28 19:57           ` Jonathan Nieder
  0 siblings, 1 reply; 39+ messages in thread
From: Michael G Schwern @ 2012-07-28 19:51 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, gitster, robbat2, bwalton, normalperson

On 2012.7.28 12:30 PM, Jonathan Nieder wrote:
>> I didn't know about that.  I don't know what your SVN backwards compat
>> requirements are, but if that behavior goes back far enough in SVN to satisfy
>> you folks, then canonicalize_url() should fall back to
>> SVN::_Core::svn_path_canonicalize().
> 
> svn_path_canonicalize() has been usable for this kind of thing since
> SVN 1.1, possibly earlier.

Great!  Then _canonicalize_url_ourselves() can probably be replaced with that.
 Just take my advice and do it after 1.7 is working and the code is ready for
canonicalization.


>>                                      But try it at the end of the patch
>> series.  The code has to be prepared for canonicalization first.  Then how it
>> actually does it can be improved.
> 
> Since this part of the series is not tested with SVN 1.7, this is
> basically adding dead code, right?  That could be avoided by
> reordering the changes to keep "canonicalize_url" as-is until later in
> the series when the switchover is safe.

I would suggest that worrying whether a few lines of code are introduced now
or 10 patches later in the same branch which is all going to be merged in one
go (and retesting the patches after it) is not the most important thing.  The
code needs humans looking over it and deciding if canonicalizations were
missed or applied inappropriately.  Or hey, work on that path and url object
idea that makes a lot of real code mess go away.


-- 
ROCKS FALL! EVERYONE DIES!
	http://www.somethingpositive.net/sp05032002.shtml

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.
  2012-07-28 19:51         ` Michael G Schwern
@ 2012-07-28 19:57           ` Jonathan Nieder
  2012-07-28 20:02             ` Jonathan Nieder
  0 siblings, 1 reply; 39+ messages in thread
From: Jonathan Nieder @ 2012-07-28 19:57 UTC (permalink / raw)
  To: Michael G Schwern; +Cc: git, gitster, robbat2, bwalton, normalperson

Michael G Schwern wrote:
> On 2012.7.28 12:30 PM, Jonathan Nieder wrote:

>> Since this part of the series is not tested with SVN 1.7, this is
>> basically adding dead code, right?  That could be avoided by
>> reordering the changes to keep "canonicalize_url" as-is until later in
>> the series when the switchover is safe.
>
> I would suggest that worrying whether a few lines of code are introduced now
> or 10 patches later in the same branch which is all going to be merged in one
> go (and retesting the patches after it) is not the most important thing.  The
> code needs humans looking over it and deciding if canonicalizations were
> missed or applied inappropriately.  Or hey, work on that path and url object
> idea that makes a lot of real code mess go away.

In that case they should be one patch, I'd think.

The advantage of introducing changes gradually is that (1) the changes
can be examined and tested one at a time, and (2) if later a change
proves to be problematic, it can be isolated, understood, and fixed
more easily.  The strategy you are suggesting would have neither of
those advantages.

Jonathan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.
  2012-07-28 19:57           ` Jonathan Nieder
@ 2012-07-28 20:02             ` Jonathan Nieder
  2012-07-28 20:24               ` Michael G Schwern
  0 siblings, 1 reply; 39+ messages in thread
From: Jonathan Nieder @ 2012-07-28 20:02 UTC (permalink / raw)
  To: Michael G Schwern; +Cc: git, gitster, robbat2, bwalton, normalperson

Jonathan Nieder wrote:
> Michael G Schwern wrote:

>> I would suggest that worrying whether a few lines of code are introduced now
>> or 10 patches later in the same branch which is all going to be merged in one
>> go (and retesting the patches after it) is not the most important thing.
[...]
> In that case they should be one patch, I'd think.
>
> The advantage of introducing changes gradually is that (1) the changes
> can be examined and tested one at a time, and (2) if later a change
> proves to be problematic, it can be isolated, understood, and fixed
> more easily.  The strategy you are suggesting would have neither of
> those advantages.

(To avoid confusion: by "The strategy you are suggesting" I mean
introducing dead code first and activating it later, not the path and
url object idea.  The path and url object approach would be very
nice. :))

Sorry for the lack of clarity.
Jonathan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.
  2012-07-28 20:02             ` Jonathan Nieder
@ 2012-07-28 20:24               ` Michael G Schwern
  0 siblings, 0 replies; 39+ messages in thread
From: Michael G Schwern @ 2012-07-28 20:24 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, gitster, robbat2, bwalton, normalperson

On 2012.7.28 1:02 PM, Jonathan Nieder wrote:
> Jonathan Nieder wrote:
>> Michael G Schwern wrote:
> 
>>> I would suggest that worrying whether a few lines of code are introduced now
>>> or 10 patches later in the same branch which is all going to be merged in one
>>> go (and retesting the patches after it) is not the most important thing.
> [...]
>> In that case they should be one patch, I'd think.
>>
>> The advantage of introducing changes gradually is that (1) the changes
>> can be examined and tested one at a time, and (2) if later a change
>> proves to be problematic, it can be isolated, understood, and fixed
>> more easily.  The strategy you are suggesting would have neither of
>> those advantages.
> 
> (To avoid confusion: by "The strategy you are suggesting" I mean
> introducing dead code first and activating it later, not the path and
> url object idea.  The path and url object approach would be very
> nice. :))

If this is all a topic branch then it doesn't matter much whether a couple
lines of code is introduced at patch 8 of a branch or patch 13.  Sure, it
matters a little, but...
https://secure.wikimedia.org/wikipedia/en/wiki/Opportunity_cost

If it *isn't* going in a topic branch, if its not visible as a collected work
in history, if its going to be rebased on top of master, then yeah I can see
why you're so concerned.


-- 
Alligator sandwich, and make it snappy!

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.
  2012-07-28  9:38 ` [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths Michael G. Schwern
@ 2012-07-30 19:51   ` Eric Wong
  2012-07-30 20:46     ` Michael G Schwern
  0 siblings, 1 reply; 39+ messages in thread
From: Eric Wong @ 2012-07-30 19:51 UTC (permalink / raw)
  To: Michael G. Schwern; +Cc: git, gitster, robbat2, bwalton, jrnieder

"Michael G. Schwern" <schwern@pobox.com> wrote:
> From: "Michael G. Schwern" <schwern@pobox.com>
> 
> The SVN API functions will not accept ../foo but their canonicalization
> functions will not collapse it.  So we'll have to do it ourselves.
> 
> _collapse_dotdot() works better than the existing regex did.

I don't dispute it's better, but it's worth explaining in the commit
message to reviewers why something is "better".

> This will be used shortly when canonicalize_path() starts using the
> SVN API.
> ---

> +# Turn foo/../bar into bar
> +sub _collapse_dotdot {
> +	my $path = shift;
> +
> +	1 while $path =~ s{/[^/]+/+\.\.}{};
> +	1 while $path =~ s{[^/]+/+\.\./}{};
> +	1 while $path =~ s{[^/]+/+\.\.}{};

This is a bug that's gone unnoticed[1] for over 5 years now,
but I've just noticed this doesn' handle "foo/..bar"  or "foo/...bar"
cases correctly.

>  sub canonicalize_path {
>  	my ($path) = @_;
>  	my $dot_slash_added = 0;
> @@ -83,7 +95,7 @@ sub canonicalize_path {
>  	# good reason), so let's do this manually.
>  	$path =~ s#/+#/#g;
>  	$path =~ s#/\.(?:/|$)#/#g;
> -	$path =~ s#/[^/]+/\.\.##g;
> +	$path = _collapse_dotdot($path);

[1] - I doubt anybody uses paths like these, though...

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6/7] Switch path canonicalization to use the SVN API.
  2012-07-28 19:07     ` Michael G Schwern
@ 2012-07-30 20:04       ` Eric Wong
  2012-08-02 21:51         ` Eric Wong
  2012-10-05  7:04       ` [PATCH] git-svn: keep leading slash when canonicalizing paths (fallback case) Jonathan Nieder
  1 sibling, 1 reply; 39+ messages in thread
From: Eric Wong @ 2012-07-30 20:04 UTC (permalink / raw)
  To: Michael G Schwern; +Cc: Jonathan Nieder, git, gitster, robbat2, bwalton

Michael G Schwern <schwern@pobox.com> wrote:
> On 2012.7.28 6:55 AM, Jonathan Nieder wrote:
> > Michael G. Schwern wrote:
> >> --- a/perl/Git/SVN/Utils.pm
> >> +++ b/perl/Git/SVN/Utils.pm
> >> @@ -86,6 +86,27 @@ sub _collapse_dotdot {
> >>  
> >>  
> >>  sub canonicalize_path {
> >> +	my $path = shift;
> >> +
> >> +	# The 1.7 way to do it
> >> +	if ( defined &SVN::_Core::svn_dirent_canonicalize ) {
> >> +		$path = _collapse_dotdot($path);
> >> +		return SVN::_Core::svn_dirent_canonicalize($path);
> >> +	}
> >> +	# The 1.6 way to do it
> >> +	elsif ( defined &SVN::_Core::svn_path_canonicalize ) {
> >> +		$path = _collapse_dotdot($path);
> >> +		return SVN::_Core::svn_path_canonicalize($path);
> >> +	}
> >> +	# No SVN API canonicalization is available, do it ourselves
> >> +	else {
> > 
> > When would this "else" case trip?
> 
> When svn_path_canonicalize() does not exist in the SVN API, presumably because
> their SVN is too old.
> 
> 
> > Would it be safe to make it
> > return an error message, or even to do something like the following?
> 
> I don't know what your SVN backwards compat requirements are, or when
> svn_path_canonicalize() appears in the API, so I left it as is.  git-svn's
> home rolled path canonicalization worked and its no work to leave it working.
>  No reason to break it IMO.

I agree there's no reason to break something on older SVN.

git-svn should work with whatever SVN is in CentOS 5.x and similar
distros (SVN 1.4.2).  As long as an active "long-term" distro supports
a version of SVN, I think we should support that if it's not too
difficult.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.
  2012-07-30 19:51   ` Eric Wong
@ 2012-07-30 20:46     ` Michael G Schwern
  2012-09-26 19:45       ` Jonathan Nieder
  0 siblings, 1 reply; 39+ messages in thread
From: Michael G Schwern @ 2012-07-30 20:46 UTC (permalink / raw)
  To: Eric Wong; +Cc: git, gitster, robbat2, bwalton, jrnieder

On 2012.7.30 12:51 PM, Eric Wong wrote:
>> The SVN API functions will not accept ../foo but their canonicalization
>> functions will not collapse it.  So we'll have to do it ourselves.
>>
>> _collapse_dotdot() works better than the existing regex did.
> 
> I don't dispute it's better, but it's worth explaining in the commit
> message to reviewers why something is "better".

Yeah.  I figured the tests covered that.


>> +# Turn foo/../bar into bar
>> +sub _collapse_dotdot {
>> +	my $path = shift;
>> +
>> +	1 while $path =~ s{/[^/]+/+\.\.}{};
>> +	1 while $path =~ s{[^/]+/+\.\./}{};
>> +	1 while $path =~ s{[^/]+/+\.\.}{};
> 
> This is a bug that's gone unnoticed[1] for over 5 years now,
> but I've just noticed this doesn' handle "foo/..bar"  or "foo/...bar"
> cases correctly.

Good catch.  Woo unit tests!  :)  You could add them as TODO tests.

A more accurate way to do it would be to split the path, collapse using the
resulting list, and rejoin it.


> [1] - I doubt anybody uses paths like these, though...

Not for an svnroot or branch name, no.


-- 
Hating the web since 1994.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6/7] Switch path canonicalization to use the SVN API.
  2012-07-30 20:04       ` Eric Wong
@ 2012-08-02 21:51         ` Eric Wong
  2012-08-02 23:18           ` Michael G Schwern
  0 siblings, 1 reply; 39+ messages in thread
From: Eric Wong @ 2012-08-02 21:51 UTC (permalink / raw)
  To: Michael G Schwern; +Cc: Jonathan Nieder, git, gitster, robbat2, bwalton

Eric Wong <normalperson@yhbt.net> wrote:
> Michael G Schwern <schwern@pobox.com> wrote:
> > On 2012.7.28 6:55 AM, Jonathan Nieder wrote:
> > > Michael G. Schwern wrote:
> > >> --- a/perl/Git/SVN/Utils.pm
> > >> +++ b/perl/Git/SVN/Utils.pm
> > >> @@ -86,6 +86,27 @@ sub _collapse_dotdot {
> > >>  
> > >>  
> > >>  sub canonicalize_path {
> > >> +	my $path = shift;
> > >> +
> > >> +	# The 1.7 way to do it
> > >> +	if ( defined &SVN::_Core::svn_dirent_canonicalize ) {
> > >> +		$path = _collapse_dotdot($path);
> > >> +		return SVN::_Core::svn_dirent_canonicalize($path);
> > >> +	}
> > >> +	# The 1.6 way to do it
> > >> +	elsif ( defined &SVN::_Core::svn_path_canonicalize ) {
> > >> +		$path = _collapse_dotdot($path);
> > >> +		return SVN::_Core::svn_path_canonicalize($path);
> > >> +	}
> > >> +	# No SVN API canonicalization is available, do it ourselves
> > >> +	else {
> > > 
> > > When would this "else" case trip?
> > 
> > When svn_path_canonicalize() does not exist in the SVN API, presumably because
> > their SVN is too old.

svn_path_canonicalize() may be accessible in some versions of SVN,
but it'll return undef.

I'm squashing the change below to have it fall back to
_canonicalize_path_ourselves in the case svn_path_canonicalize()
is present but unusable.

> > > Would it be safe to make it
> > > return an error message, or even to do something like the following?
> > 
> > I don't know what your SVN backwards compat requirements are, or when
> > svn_path_canonicalize() appears in the API, so I left it as is.  git-svn's
> > home rolled path canonicalization worked and its no work to leave it working.
> >  No reason to break it IMO.
> 
> I agree there's no reason to break something on older SVN.
> 
> git-svn should work with whatever SVN is in CentOS 5.x and similar
> distros (SVN 1.4.2).  As long as an active "long-term" distro supports
> a version of SVN, I think we should support that if it's not too
> difficult.

I've tested the following on an old CentOS 5.2 chroot with SVN 1.4.2:

diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index b7727db..4bb4dde 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -88,22 +88,25 @@ sub _collapse_dotdot {
 
 sub canonicalize_path {
 	my $path = shift;
+	my $rv;
 
 	# The 1.7 way to do it
 	if ( defined &SVN::_Core::svn_dirent_canonicalize ) {
 		$path = _collapse_dotdot($path);
-		return SVN::_Core::svn_dirent_canonicalize($path);
+		$rv = SVN::_Core::svn_dirent_canonicalize($path);
 	}
 	# The 1.6 way to do it
+	# This can return undef on subversion-perl-1.4.2-2.el5 (CentOS 5.2)
 	elsif ( defined &SVN::_Core::svn_path_canonicalize ) {
 		$path = _collapse_dotdot($path);
-		return SVN::_Core::svn_path_canonicalize($path);
-	}
-	# No SVN API canonicalization is available, do it ourselves
-	else {
-		$path = _canonicalize_path_ourselves($path);
-		return $path;
+		$rv = SVN::_Core::svn_path_canonicalize($path);
 	}
+
+	return $rv if defined $rv;
+
+	# No SVN API canonicalization is available, or the SVN API
+	# didn't return a successful result, do it ourselves
+	return _canonicalize_path_ourselves($path);
 }
 
 
-- 
Eric Wong

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 6/7] Switch path canonicalization to use the SVN API.
  2012-08-02 21:51         ` Eric Wong
@ 2012-08-02 23:18           ` Michael G Schwern
  0 siblings, 0 replies; 39+ messages in thread
From: Michael G Schwern @ 2012-08-02 23:18 UTC (permalink / raw)
  To: Eric Wong; +Cc: Jonathan Nieder, git, gitster, robbat2, bwalton

On 2012.8.2 2:51 PM, Eric Wong wrote:
> svn_path_canonicalize() may be accessible in some versions of SVN,
> but it'll return undef.

Yuck!  Good catch!


> I've tested the following on an old CentOS 5.2 chroot with SVN 1.4.2:

Looks good to me.


-- 
Alligator sandwich, and make it snappy!

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.
  2012-07-30 20:46     ` Michael G Schwern
@ 2012-09-26 19:45       ` Jonathan Nieder
  2012-09-26 20:58         ` Eric Wong
  0 siblings, 1 reply; 39+ messages in thread
From: Jonathan Nieder @ 2012-09-26 19:45 UTC (permalink / raw)
  To: Eric Wong; +Cc: Michael G Schwern, git, gitster, robbat2, bwalton

Hi,

Michael G Schwern wrote:
> On 2012.7.30 12:51 PM, Eric Wong wrote:
>> Michael G Schwern wrote:

>>> _collapse_dotdot() works better than the existing regex did.
>>
>> I don't dispute it's better, but it's worth explaining in the commit
>> message to reviewers why something is "better".
>
> Yeah.  I figured the tests covered that.

Now I'm tripping up on the same thing.  Eric, did you ever find out
what the motivation for this patch was?  Is SVN 1.7 more persnickety
about runs of multiple slashes in a row or something, or is it more
of an aesthetic thing?

Puzzled,
Jonathan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 4/7] Add join_paths() to safely concatenate paths.
  2012-07-28  9:38 ` [PATCH 4/7] Add join_paths() to safely concatenate paths Michael G. Schwern
@ 2012-09-26 20:51   ` Jonathan Nieder
  0 siblings, 0 replies; 39+ messages in thread
From: Jonathan Nieder @ 2012-09-26 20:51 UTC (permalink / raw)
  To: normalperson; +Cc: Michael G. Schwern, git, gitster, robbat2, bwalton

Hi,

Michael G. Schwern wrote:

> Otherwise you might wind up with things like...
>
>     my $path1 = undef;
>     my $path2 = 'foo';
>     my $path = $path1 . '/' . $path2;
>
> creating '/foo'.  Or this...
>
>     my $path1 = 'foo/';
>     my $path2 = 'bar';
>     my $path = $path1 . '/' . $path2;
>
> creating 'foo//bar'.

I'm still puzzled by this one, too.  I don't understand the
motivation.  Is this to make joining paths less fragile, by preserving
the property that join_paths($a, $b) names the directory you would get
to by first chdir-ing into $a and then into $b?

It would be easier to understand as two patches: first, one that
extracts join_paths without any functional change, and then one that
changes its implementation with an explanation for what positive
functional effect that would have.

> --- a/git-svn.perl
> +++ b/git-svn.perl
[...]
> @@ -1275,7 +1276,7 @@ sub get_svnprops {
>  	$path = $cmd_dir_prefix . $path;
>  	fatal("No such file or directory: $path") unless -e $path;
>  	my $is_dir = -d $path ? 1 : 0;
> -	$path = $gs->{path} . '/' . $path;
> +	$path = join_paths($gs->{path}, $path);
>  
>  	# canonicalize the path (otherwise libsvn will abort or fail to
>  	# find the file)

This can't be for the //-collapsing effect since the path is about
to be canonicalized.  It can't be for the initial-/ effect since
that is stripped away by canonicalization, too.

So no functional effect here, good or bad.

[...]
> --- a/perl/Git/SVN.pm
> +++ b/perl/Git/SVN.pm
[...]
> @@ -316,9 +320,7 @@ sub init_remote_config {
>  			}
>  			my $old_path = $self->path;
>  			$url =~ s!^\Q$min_url\E(/|$)!!;
> -			if (length $old_path) {
> -				$url .= "/$old_path";
> -			}
> +			$url = join_paths($url, $old_path);
>  			$self->path($url);

This is probably not for the normal //-collapsing effect because
$url already has its trailing / stripped off.  Maybe it is for
cases where $old_path has leading slashes or $min_url has multiple
trailing ones?

In the end it shouldn't make a difference, once a later patch teaches
Git::SVN->path to canonicalize.

Is the functional change in this patch for aesthetic reasons, or is
there some other component (perhaps in a later patch) that relies on
it?

Thanks again for your help,
Jonathan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.
  2012-09-26 19:45       ` Jonathan Nieder
@ 2012-09-26 20:58         ` Eric Wong
  2012-09-26 21:38           ` Jonathan Nieder
  0 siblings, 1 reply; 39+ messages in thread
From: Eric Wong @ 2012-09-26 20:58 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Michael G Schwern, git, gitster, robbat2, bwalton

Jonathan Nieder <jrnieder@gmail.com> wrote:
> Hi,
> 
> Michael G Schwern wrote:
> > On 2012.7.30 12:51 PM, Eric Wong wrote:
> >> Michael G Schwern wrote:
> 
> >>> _collapse_dotdot() works better than the existing regex did.
> >>
> >> I don't dispute it's better, but it's worth explaining in the commit
> >> message to reviewers why something is "better".
> >
> > Yeah.  I figured the tests covered that.
> 
> Now I'm tripping up on the same thing.  Eric, did you ever find out
> what the motivation for this patch was?  Is SVN 1.7 more persnickety
> about runs of multiple slashes in a row or something, or is it more
> of an aesthetic thing?

I'm not sure about this case specifically, but SVN has (and will likely
become) more persnickety over time.  I haven't had a chance to check SVN
itself, but I think being defensive and giving it prettier paths will
be safer in the future.

That said, I'd favor an implementation that split on m{/+} and
collapsed as Michael mentioned.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.
  2012-09-26 20:58         ` Eric Wong
@ 2012-09-26 21:38           ` Jonathan Nieder
  2012-09-26 21:54             ` Eric Wong
  0 siblings, 1 reply; 39+ messages in thread
From: Jonathan Nieder @ 2012-09-26 21:38 UTC (permalink / raw)
  To: Eric Wong; +Cc: Michael G Schwern, git, gitster, robbat2, bwalton

Eric Wong wrote:

> That said, I'd favor an implementation that split on m{/+} and
> collapsed as Michael mentioned.

Sounds sensible.  Is canonicalize_path responsible for collapsing
runs of slashes?  What should _collapse_dotdot do to
"c:/.." or "http://www.example.com/.."?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.
  2012-09-26 21:38           ` Jonathan Nieder
@ 2012-09-26 21:54             ` Eric Wong
  2012-09-26 22:43               ` Jonathan Nieder
  0 siblings, 1 reply; 39+ messages in thread
From: Eric Wong @ 2012-09-26 21:54 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Michael G Schwern, git, gitster, robbat2, bwalton

Jonathan Nieder <jrnieder@gmail.com> wrote:
> Eric Wong wrote:
> > That said, I'd favor an implementation that split on m{/+} and
> > collapsed as Michael mentioned.
> 
> Sounds sensible.  Is canonicalize_path responsible for collapsing
> runs of slashes?  What should _collapse_dotdot do to
> "c:/.." or "http://www.example.com/.."?

It should probably just return the root path ("c:/" and
"http://www.example.com/" respectively).

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.
  2012-09-26 21:54             ` Eric Wong
@ 2012-09-26 22:43               ` Jonathan Nieder
  2012-09-27  0:15                 ` Eric Wong
  0 siblings, 1 reply; 39+ messages in thread
From: Jonathan Nieder @ 2012-09-26 22:43 UTC (permalink / raw)
  To: Eric Wong; +Cc: Michael G Schwern, git, gitster, robbat2, bwalton

Eric Wong wrote:

> It should probably just return the root path ("c:/" and
> "http://www.example.com/" respectively).

That means recognizing drive letters and URLs.  Hm.

Subversion commands seem to use svn_client_args_to_target_array2
to canonicalize arguments.  It does something like the following:

 1. split at @PEG revision
 2. check if it looks like a URL (svn_path_is_url).  If so:

    i.   urlencode characters with high bit set (svn_path_uri_from_iri)
    ii.  urlencode some other special characters (svn_path_uri_autoescape)
         (that is: [ "<>\\^`{|}])
    iii. (on Windows) convert backslashes to forward slashes
    iv.  complain if there are any '..' (svn_path_is_backpath_present)
    v.   make url scheme and hostname lowercase, strip default portnumber,
	 strip trailing '/', collapse '/'-es including urlencoded %2F,
	 strip '.' and '%2E' components, make drive letter in file:///
         URLs uppercase, urlencode uri-special characters
         (svn_uri_canonicalize)

   Otherwise:

    i.   canonicalize case, handle '..' and '.' components, and make
	 path separators into '/'
         (apr_filepath_merge(APR_FILEPATH_TRUENAME))
    ii.  strip trailing '/', collapse '/'-es except in UNC paths,
         strip '.' components, make drive letter uppercase
         (svn_dirent_canonicalize)
    iii. deal with "truepath collisions" (unwanted case
         canonicalizations)
    iv.  reject .svn and _svn directories

Maybe we can use apr_filepath_merge() to avoid reinventing the wheel?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.
  2012-09-26 22:43               ` Jonathan Nieder
@ 2012-09-27  0:15                 ` Eric Wong
  2012-09-27  2:11                   ` Jonathan Nieder
  0 siblings, 1 reply; 39+ messages in thread
From: Eric Wong @ 2012-09-27  0:15 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Michael G Schwern, git, gitster, robbat2, bwalton

Jonathan Nieder <jrnieder@gmail.com> wrote:
> Maybe we can use apr_filepath_merge() to avoid reinventing the wheel?

Ideally, yes.  Is there an easy way to access that from Perl? (and
for the older versions of SVN folks people are running).

Perhaps we can expose equivalent functionality in git via
git-rev-parse instead?

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths.
  2012-09-27  0:15                 ` Eric Wong
@ 2012-09-27  2:11                   ` Jonathan Nieder
  0 siblings, 0 replies; 39+ messages in thread
From: Jonathan Nieder @ 2012-09-27  2:11 UTC (permalink / raw)
  To: Eric Wong; +Cc: Michael G Schwern, git, gitster, robbat2, bwalton

Eric Wong wrote:

> Ideally, yes.  Is there an easy way to access that from Perl? (and
> for the older versions of SVN folks people are running).

Subversion's swig bindings only wrap a few apr functions and do not
depend on fuller apr bindings.

Something like svn_dirent_is_under_root() could be useful.  It uses
whatever base path you choose.  I haven't tried using base_path=""
yet.  New in Subversion 1.7, but that would be ok --- an imperfect
fallback for older Subversion versions would be fine.

Unfortunately, from swig/core.i: "SWIG can't digest these functions
yet, so ignore them for now. TODO: make them work."

svn_client_args_to_target_array2() is exposed and would probably be
perfect.  I can't seem to find how to create an apr_array_header_t
to pass as its first argument, alas.

> Perhaps we can expose equivalent functionality in git via
> git-rev-parse instead?

It might be possible to add a git-svn--helper command that links to
libsvn or apr, but ick.  The trouble is it's not clear at all how
Subversion's perl API was *designed* to be used.

Hm.
Jonathan

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH] git-svn: keep leading slash when canonicalizing paths (fallback case)
  2012-07-28 19:07     ` Michael G Schwern
  2012-07-30 20:04       ` Eric Wong
@ 2012-10-05  7:04       ` Jonathan Nieder
  2012-10-05 23:12         ` Eric Wong
  1 sibling, 1 reply; 39+ messages in thread
From: Jonathan Nieder @ 2012-10-05  7:04 UTC (permalink / raw)
  To: Eric Wong; +Cc: Michael G Schwern, git, gitster, robbat2, bwalton

Subversion's svn_dirent_canonicalize() and svn_path_canonicalize()
APIs keep a leading slash in the return value if one was present on
the argument, which can be useful since it allows relative and
absolute paths to be distinguished.

When git-svn's canonicalize_path() learned to use these functions if
available, its semantics changed in the corresponding way.  Some new
callers rely on the leading slash --- for example, if the slash is
stripped out then _canonicalize_url_ourselves() will transform
"proto://host/path/to/resource" to "proto://hostpath/to/resource".

Unfortunately the fallback _canonicalize_path_ourselves(), used when
the appropriate SVN APIs are not usable, still follows the old
semantics, so if that code path is exercised then it breaks.  Fix it
to follow the new convention.

Noticed by forcing the fallback on and running tests.  Without this
patch, t9101.4 fails:

 Bad URL passed to RA layer: Unable to open an ra_local session to \
 URL: Local URL 'file://homejrnsrcgit-scratch/t/trash%20directory.\
 t9101-git-svn-props/svnrepo' contains unsupported hostname at \
 /home/jrn/src/git-scratch/perl/blib/lib/Git/SVN.pm line 148

With it, the git-svn tests pass again.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Hi Eric,

Michael G Schwern wrote:
> On 2012.7.28 6:55 AM, Jonathan Nieder wrote:

>> When would this "else" case trip?
>
> When svn_path_canonicalize() does not exist in the SVN API, presumably because
> their SVN is too old.

I accidentally tested this "else" branch by making the other cases
false.  t9101.4 failed as described above, or in other words,
canonicalize_url_ourselves() stripped out a few too many slashes.  For
reference:

| sub _canonicalize_url_ourselves {
|         my ($url) = @_;
|         if ($url =~ m#^([^:]+)://([^/]*)(.*)$#) {
|                 my ($scheme, $domain, $uri) = ($1, $2, _canonicalize_url_path(canonicalize_path($3)));
|                 $url = "$scheme://$domain$uri";
|         }
|         $url;
| }

When $url is http://host/path/to/resource,

	$1 = "http", $2 = "host", $3 = "/path/to/resource"
	canonicalize_path($3) = "path/to/resource" <--- (??)
	_canonicalize_url_path(ditto) = "path/to/resource"
	$url = "http://hostpath/to/resource"

How about this patch?

 perl/Git/SVN/Utils.pm |    1 -
 1 file changed, 1 deletion(-)

diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index 4bb4dde8..8b8cf375 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -122,7 +122,6 @@ sub _canonicalize_path_ourselves {
 	$path = _collapse_dotdot($path);
 	$path =~ s#/$##g;
 	$path =~ s#^\./## if $dot_slash_added;
-	$path =~ s#^/##;
 	$path =~ s#^\.$##;
 	return $path;
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH] git-svn: keep leading slash when canonicalizing paths (fallback case)
  2012-10-05  7:04       ` [PATCH] git-svn: keep leading slash when canonicalizing paths (fallback case) Jonathan Nieder
@ 2012-10-05 23:12         ` Eric Wong
  2012-10-06  5:36           ` Junio C Hamano
  0 siblings, 1 reply; 39+ messages in thread
From: Eric Wong @ 2012-10-05 23:12 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Michael G Schwern, git, gitster, robbat2, bwalton

Jonathan Nieder <jrnieder@gmail.com> wrote:
> Noticed by forcing the fallback on and running tests.  Without this
> patch, t9101.4 fails:
> 
>  Bad URL passed to RA layer: Unable to open an ra_local session to \
>  URL: Local URL 'file://homejrnsrcgit-scratch/t/trash%20directory.\
>  t9101-git-svn-props/svnrepo' contains unsupported hostname at \
>  /home/jrn/src/git-scratch/perl/blib/lib/Git/SVN.pm line 148
> 
> With it, the git-svn tests pass again.
> 
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

Thanks for noticing this.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
and pushed to my master at git://bogomips.org/git-svn

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH] git-svn: keep leading slash when canonicalizing paths (fallback case)
  2012-10-05 23:12         ` Eric Wong
@ 2012-10-06  5:36           ` Junio C Hamano
  0 siblings, 0 replies; 39+ messages in thread
From: Junio C Hamano @ 2012-10-06  5:36 UTC (permalink / raw)
  To: Eric Wong; +Cc: Jonathan Nieder, Michael G Schwern, git, robbat2, bwalton

Eric Wong <normalperson@yhbt.net> writes:

> Jonathan Nieder <jrnieder@gmail.com> wrote:
>> Noticed by forcing the fallback on and running tests.  Without this
>> patch, t9101.4 fails:
>> 
>>  Bad URL passed to RA layer: Unable to open an ra_local session to \
>>  URL: Local URL 'file://homejrnsrcgit-scratch/t/trash%20directory.\
>>  t9101-git-svn-props/svnrepo' contains unsupported hostname at \
>>  /home/jrn/src/git-scratch/perl/blib/lib/Git/SVN.pm line 148
>> 
>> With it, the git-svn tests pass again.
>> 
>> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
>
> Thanks for noticing this.
> Signed-off-by: Eric Wong <normalperson@yhbt.net>
> and pushed to my master at git://bogomips.org/git-svn

Will pull before 1.8.0-rc1.  Thanks.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH/RFC 0/2] Re: [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available.
  2012-07-28 19:01     ` Michael G Schwern
  2012-07-28 19:30       ` Jonathan Nieder
@ 2012-10-14 11:42       ` Jonathan Nieder
  2012-10-14 11:45         ` [PATCH 1/2] git svn: do not overescape URLs (fallback case) Jonathan Nieder
  2012-10-14 11:48         ` [PATCH 2/2] git svn: canonicalize_url(): use svn_path_canonicalize when available Jonathan Nieder
  1 sibling, 2 replies; 39+ messages in thread
From: Jonathan Nieder @ 2012-10-14 11:42 UTC (permalink / raw)
  To: Eric Wong; +Cc: Michael G Schwern, git, gitster, robbat2, bwalton

Hi Eric,

Michael G Schwern wrote:
> On 2012.7.28 6:50 AM, Jonathan Nieder wrote:
>> Michael G Schwern wrote:

>>> --- a/perl/Git/SVN/Utils.pm
>>> +++ b/perl/Git/SVN/Utils.pm
>> [...]
>>> @@ -100,6 +102,20 @@ API as a URL.
>>>  =cut
>>>  
>>>  sub canonicalize_url {
>>> +	my $url = shift;
>>> +
>>> +	# The 1.7 way to do it
>>> +	if ( defined &SVN::_Core::svn_uri_canonicalize ) {
>>> +		return SVN::_Core::svn_uri_canonicalize($url);
>>> +	}
>>> +	# There wasn't a 1.6 way to do it, so we do it ourself.
>>> +	else {
>>> +		return _canonicalize_url_ourselves($url);
[...]
>> Leaves me a bit nervous.
>
> As it should, SVN dumped a mess on us.

Here's a pair of patches that address some of the bugs I was alluding
to.  Patch 1 makes _canonicalize_url_ourselves() match Subversion's
own canonicalization behavior more closely, though even with that
patch it still does not meet Subversion's requirements perfectly
(e.g., "%ab" is not canonicalized to "%AB").  Patch 2 makes that not
matter by using svn_path_canonicalize() when possible, which is the
standard way to do this kind of thing.

Sorry for the lack of clarity before.

Jonathan Nieder (2):
  git svn: do not overescape URLs (fallback case)
  Git::SVN::Utils::canonicalize_url: use svn_path_canonicalize when
    available

 perl/Git/SVN/Utils.pm | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 1/2] git svn: do not overescape URLs (fallback case)
  2012-10-14 11:42       ` [PATCH/RFC 0/2] " Jonathan Nieder
@ 2012-10-14 11:45         ` Jonathan Nieder
  2012-10-14 11:48         ` [PATCH 2/2] git svn: canonicalize_url(): use svn_path_canonicalize when available Jonathan Nieder
  1 sibling, 0 replies; 39+ messages in thread
From: Jonathan Nieder @ 2012-10-14 11:45 UTC (permalink / raw)
  To: Eric Wong; +Cc: Michael G Schwern, git, gitster, robbat2, bwalton

Subversion's canonical URLs are intended to make URL comparison easy
and therefore have strict rules about what characters are special
enough to urlencode and what characters should be left alone.

When in the fallback codepath because unable to use libsvn's own
canonicalization function for some reason, escape special characters
in URIs according to the svn_uri__char_validity[] table in
subversion/libsvn_subr/path.c (r935829).  The libsvn versions that
trigger this code path are not likely to be strict enough to care, but
it's nicer to be consistent.

Noticed by using SVN 1.6.17 perl bindings, which do not provide
SVN::_Core::svn_uri_canonicalize (triggering the fallback code),
with libsvn 1.7.5, whose do_switch is fussy enough to care:

  Committing to file:///home/jrn/src/git/t/trash%20directory.\
  t9118-git-svn-funky-branch-names/svnrepo/pr%20ject/branches\
  /more%20fun%20plugin%21 ...
  svn: E235000: In file '[...]/subversion/libsvn_subr/dirent_uri.c' \
  line 2291: assertion failed (svn_uri_is_canonical(url, pool))
  error: git-svn died of signal 6
  not ok - 3 test dcommit to funky branch

After this change, the '!' in 'more%20fun%20plugin!' is not urlencoded
and t9118 passes again.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 perl/Git/SVN/Utils.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index 8b8cf375..3d1a0933 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -155,7 +155,7 @@ sub _canonicalize_url_path {
 
 	my @parts;
 	foreach my $part (split m{/+}, $uri_path) {
-		$part =~ s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
+		$part =~ s/([^!\$%&'()*+,.\/\w:=\@_`~-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
 		push @parts, $part;
 	}
 
-- 
1.8.0.rc2

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 2/2] git svn: canonicalize_url(): use svn_path_canonicalize when available
  2012-10-14 11:42       ` [PATCH/RFC 0/2] " Jonathan Nieder
  2012-10-14 11:45         ` [PATCH 1/2] git svn: do not overescape URLs (fallback case) Jonathan Nieder
@ 2012-10-14 11:48         ` Jonathan Nieder
  2012-10-23 22:58           ` Eric Wong
  1 sibling, 1 reply; 39+ messages in thread
From: Jonathan Nieder @ 2012-10-14 11:48 UTC (permalink / raw)
  To: Eric Wong; +Cc: Michael G Schwern, git, gitster, robbat2, bwalton

Until Subversion 1.7 (more precisely r873487), the standard way to
canonicalize a URI was to call svn_path_canonicalize().  Use it.

This saves "git svn" from having to rely on our imperfect
reimplementation of the same.  If the function doesn't exist or
returns undef, though, it can use the fallback code, which we keep to
be conservative.  Since svn_path_canonicalize() was added before
Subversion 1.1, hopefully that doesn't happen often.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 perl/Git/SVN/Utils.pm | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index 3d1a0933..40f7c799 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -138,15 +138,19 @@ API as a URL.
 
 sub canonicalize_url {
 	my $url = shift;
+	my $rv;
 
 	# The 1.7 way to do it
 	if ( defined &SVN::_Core::svn_uri_canonicalize ) {
-		return SVN::_Core::svn_uri_canonicalize($url);
+		$rv = SVN::_Core::svn_uri_canonicalize($url);
 	}
-	# There wasn't a 1.6 way to do it, so we do it ourself.
-	else {
-		return _canonicalize_url_ourselves($url);
+	# The 1.6 way to do it
+	elsif ( defined &SVN::_Core::svn_path_canonicalize ) {
+		$rv = SVN::_Core::svn_path_canonicalize($url);
 	}
+	return $rv if defined $rv;
+
+	return _canonicalize_url_ourselves($url);
 }
 
 
-- 
1.8.0.rc2

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/2] git svn: canonicalize_url(): use svn_path_canonicalize when available
  2012-10-14 11:48         ` [PATCH 2/2] git svn: canonicalize_url(): use svn_path_canonicalize when available Jonathan Nieder
@ 2012-10-23 22:58           ` Eric Wong
  0 siblings, 0 replies; 39+ messages in thread
From: Eric Wong @ 2012-10-23 22:58 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Michael G Schwern, git, gitster, robbat2, bwalton

Jonathan Nieder <jrnieder@gmail.com> wrote:
> Until Subversion 1.7 (more precisely r873487), the standard way to
> canonicalize a URI was to call svn_path_canonicalize().  Use it.
> 
> This saves "git svn" from having to rely on our imperfect
> reimplementation of the same.  If the function doesn't exist or
> returns undef, though, it can use the fallback code, which we keep to
> be conservative.  Since svn_path_canonicalize() was added before
> Subversion 1.1, hopefully that doesn't happen often.

Hi Jonathan, this fails for me using http (but not file:// or svn://).
1/2 of this RFC looks fine, though.

subversion 1.6.12dfsg-6, apache2-mpm-prefork 2.2.16-6+squeeze8
(Debian squeeze)

$ SVN_HTTPD_PORT=12345 sh t9118-git-svn-funky-branch-names.sh -v
Initialized empty Git repository in /home/ew/git-core/t/trash directory.t9118-git-svn-funky-branch-names/.git/
expecting success: 
	mkdir project project/trunk project/branches project/tags &&
	echo foo > project/trunk/foo &&
	svn_cmd import -m "$test_description" project "$svnrepo/pr ject" &&
	rm -rf project &&
	svn_cmd cp -m "fun" "$svnrepo/pr ject/trunk" \
	                "$svnrepo/pr ject/branches/fun plugin" &&
	svn_cmd cp -m "more fun!" "$svnrepo/pr ject/branches/fun plugin" \
	                      "$svnrepo/pr ject/branches/more fun plugin!" &&
	svn_cmd cp -m "scary" "$svnrepo/pr ject/branches/fun plugin" \
	              "$svnrepo/pr ject/branches/$scary_uri" &&
	svn_cmd cp -m "leading dot" "$svnrepo/pr ject/trunk" \
			"$svnrepo/pr ject/branches/.leading_dot" &&
	svn_cmd cp -m "trailing dot" "$svnrepo/pr ject/trunk" \
			"$svnrepo/pr ject/branches/trailing_dot." &&
	svn_cmd cp -m "trailing .lock" "$svnrepo/pr ject/trunk" \
			"$svnrepo/pr ject/branches/trailing_dotlock.lock" &&
	svn_cmd cp -m "reflog" "$svnrepo/pr ject/trunk" \
			"$svnrepo/pr ject/branches/not-a@{0}reflog@" &&
	start_httpd
	
Adding         project/trunk
Adding         project/trunk/foo
Adding         project/branches
Adding         project/tags

Committed revision 1.

Committed revision 2.

Committed revision 3.

Committed revision 4.

Committed revision 5.

Committed revision 6.

Committed revision 7.

Committed revision 8.
ok 1 - setup svnrepo

expecting success: 
	git svn clone -s "$svnrepo/pr ject" project &&
	(
		cd project &&
		git rev-parse "refs/remotes/fun%20plugin" &&
		git rev-parse "refs/remotes/more%20fun%20plugin!" &&
		git rev-parse "refs/remotes/$scary_ref" &&
		git rev-parse "refs/remotes/%2Eleading_dot" &&
		git rev-parse "refs/remotes/trailing_dot%2E" &&
		git rev-parse "refs/remotes/trailing_dotlock%2Elock" &&
		git rev-parse "refs/remotes/$non_reflog"
	)
	
Initialized empty Git repository in /home/ew/git-core/t/trash directory.t9118-git-svn-funky-branch-names/project/.git/
Bad URL passed to RA layer: URL 'http://127.0.0.1:12345/pr ject' is malformed or the scheme or host or path is missing at /home/ew/git-core/perl/blib/lib/Git/SVN.pm line 310

not ok - 2 test clone with funky branch names
#	
#		git svn clone -s "$svnrepo/pr ject" project &&
#		(
#			cd project &&
#			git rev-parse "refs/remotes/fun%20plugin" &&
#			git rev-parse "refs/remotes/more%20fun%20plugin!" &&
#			git rev-parse "refs/remotes/$scary_ref" &&
#			git rev-parse "refs/remotes/%2Eleading_dot" &&
#			git rev-parse "refs/remotes/trailing_dot%2E" &&
#			git rev-parse "refs/remotes/trailing_dotlock%2Elock" &&
#			git rev-parse "refs/remotes/$non_reflog"
#		)
#		

expecting success: 
	(
		cd project &&
		git reset --hard 'refs/remotes/more%20fun%20plugin!' &&
		echo hello >> foo &&
		git commit -m 'hello' -- foo &&
		git svn dcommit
	)
	
fatal: ambiguous argument 'refs/remotes/more%20fun%20plugin!': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
not ok - 3 test dcommit to funky branch
#	
#		(
#			cd project &&
#			git reset --hard 'refs/remotes/more%20fun%20plugin!' &&
#			echo hello >> foo &&
#			git commit -m 'hello' -- foo &&
#			git svn dcommit
#		)
#		

expecting success: 
	(
		cd project &&
		git reset --hard "refs/remotes/$scary_ref" &&
		echo urls are scary >> foo &&
		git commit -m "eep" -- foo &&
		git svn dcommit
	)
	
fatal: ambiguous argument 'refs/remotes/Abo-Uebernahme%20(Bug%20#994)': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
not ok - 4 test dcommit to scary branch
#	
#		(
#			cd project &&
#			git reset --hard "refs/remotes/$scary_ref" &&
#			echo urls are scary >> foo &&
#			git commit -m "eep" -- foo &&
#			git svn dcommit
#		)
#		

expecting success: 
	(
		cd project &&
		git reset --hard "refs/remotes/trailing_dotlock%2Elock" &&
		echo who names branches like this anyway? >> foo &&
		git commit -m "bar" -- foo &&
		git svn dcommit
	)
	
fatal: ambiguous argument 'refs/remotes/trailing_dotlock%2Elock': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
not ok - 5 test dcommit to trailing_dotlock branch
#	
#		(
#			cd project &&
#			git reset --hard "refs/remotes/trailing_dotlock%2Elock" &&
#			echo who names branches like this anyway? >> foo &&
#			git commit -m "bar" -- foo &&
#			git svn dcommit
#		)
#		

# failed 4 among 5 test(s)
1..5

^ permalink raw reply	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2012-10-23 22:58 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-28  9:38 Canonicalize the git-svn path & url accessors Michael G. Schwern
2012-07-28  9:38 ` [PATCH 1/7] Move the canonicalization functions to Git::SVN::Utils Michael G. Schwern
2012-07-28  9:38 ` [PATCH 2/7] Change canonicalize_url() to use the SVN 1.7 API when available Michael G. Schwern
2012-07-28 13:50   ` Jonathan Nieder
2012-07-28 19:01     ` Michael G Schwern
2012-07-28 19:30       ` Jonathan Nieder
2012-07-28 19:51         ` Michael G Schwern
2012-07-28 19:57           ` Jonathan Nieder
2012-07-28 20:02             ` Jonathan Nieder
2012-07-28 20:24               ` Michael G Schwern
2012-10-14 11:42       ` [PATCH/RFC 0/2] " Jonathan Nieder
2012-10-14 11:45         ` [PATCH 1/2] git svn: do not overescape URLs (fallback case) Jonathan Nieder
2012-10-14 11:48         ` [PATCH 2/2] git svn: canonicalize_url(): use svn_path_canonicalize when available Jonathan Nieder
2012-10-23 22:58           ` Eric Wong
2012-07-28  9:38 ` [PATCH 3/7] Extract, test and enhance the logic to collapse ../foo paths Michael G. Schwern
2012-07-30 19:51   ` Eric Wong
2012-07-30 20:46     ` Michael G Schwern
2012-09-26 19:45       ` Jonathan Nieder
2012-09-26 20:58         ` Eric Wong
2012-09-26 21:38           ` Jonathan Nieder
2012-09-26 21:54             ` Eric Wong
2012-09-26 22:43               ` Jonathan Nieder
2012-09-27  0:15                 ` Eric Wong
2012-09-27  2:11                   ` Jonathan Nieder
2012-07-28  9:38 ` [PATCH 4/7] Add join_paths() to safely concatenate paths Michael G. Schwern
2012-09-26 20:51   ` Jonathan Nieder
2012-07-28  9:38 ` [PATCH 5/7] Remove irrelevant comment Michael G. Schwern
2012-07-28  9:38 ` [PATCH 6/7] Switch path canonicalization to use the SVN API Michael G. Schwern
2012-07-28 13:55   ` Jonathan Nieder
2012-07-28 19:07     ` Michael G Schwern
2012-07-30 20:04       ` Eric Wong
2012-08-02 21:51         ` Eric Wong
2012-08-02 23:18           ` Michael G Schwern
2012-10-05  7:04       ` [PATCH] git-svn: keep leading slash when canonicalizing paths (fallback case) Jonathan Nieder
2012-10-05 23:12         ` Eric Wong
2012-10-06  5:36           ` Junio C Hamano
2012-07-28  9:38 ` [PATCH 7/7] Make Git::SVN and Git::SVN::Ra canonicalize paths and urls Michael G. Schwern
2012-07-28 14:11   ` Jonathan Nieder
2012-07-28 19:15     ` Michael G Schwern

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).