git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
* [PATCH] git-svn: Destroy the cache when we fail to read it
@ 2011-08-22  2:17 Jason Gross
  2011-08-22  4:04 ` Jason Gross
                   ` (3 more replies)
  0 siblings, 4 replies; 26+ messages in thread
From: Jason Gross @ 2011-08-22  2:17 UTC (permalink / raw)
  To: git; +Cc: Jason Gross, Jonathan Nieder

Previously, we would fail fatally when trying to fetch changes with
mergeinfo on a 32 bit machine, when the repository previously had
fetched changes with mergeinfo on a 64 bit machine.

This fixes bug 618875 (which is also 587650, 635097).  Much of the code
was written by Jonathan Nieder <jrnieder@gmail.com> with suggestions
from Steffen Mueller <smueller@cpan.org> (see
http://lists.debian.org/debian-perl/2011/05/msg00023.html and
http://lists.debian.org/debian-perl/2011/05/msg00026.html).

Signed-off-by: Jason Gross <jgross@mit.edu>
Cc: Jonathan Nieder <jrnieder@gmail.com>
---
 git-svn.perl |   59 +++++++++++++++++++++++++++++++++++----------------------
 1 files changed, 36 insertions(+), 23 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index 89f83fd..78ccdc8 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -1680,7 +1680,7 @@ use vars qw/$default_repo_id $default_ref_id $_no_metadata $_follow_parent
             $_use_svnsync_props $no_reuse_existing $_minimize_url
 	    $_use_log_author $_add_author_from $_localtime/;
 use Carp qw/croak/;
-use File::Path qw/mkpath/;
+use File::Path qw/mkpath rmtree/;
 use File::Copy qw/copy/;
 use IPC::Open3;
 use Memoize;  # core since 5.8.0, Jul 2002
@@ -3198,28 +3198,41 @@ sub has_no_changes {
 		$memoized = 1;
 
 		my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";
-		mkpath([$cache_path]) unless -d $cache_path;
-
-		tie my %lookup_svn_merge_cache => 'Memoize::Storable',
-		    "$cache_path/lookup_svn_merge.db", 'nstore';
-		memoize 'lookup_svn_merge',
-			SCALAR_CACHE => 'FAULT',
-			LIST_CACHE => ['HASH' => \%lookup_svn_merge_cache],
-		;
-
-		tie my %check_cherry_pick_cache => 'Memoize::Storable',
-		    "$cache_path/check_cherry_pick.db", 'nstore';
-		memoize 'check_cherry_pick',
-			SCALAR_CACHE => 'FAULT',
-			LIST_CACHE => ['HASH' => \%check_cherry_pick_cache],
-		;
-
-		tie my %has_no_changes_cache => 'Memoize::Storable',
-		    "$cache_path/has_no_changes.db", 'nstore';
-		memoize 'has_no_changes',
-			SCALAR_CACHE => ['HASH' => \%has_no_changes_cache],
-			LIST_CACHE => 'FAULT',
-		;
+		my $do_memoization = sub {
+			mkpath([$cache_path]) unless -d $cache_path;
+
+			tie my %lookup_svn_merge_cache => 'Memoize::Storable',
+			    "$cache_path/lookup_svn_merge.db", 'nstore';
+			memoize 'lookup_svn_merge',
+				SCALAR_CACHE => 'FAULT',
+				LIST_CACHE => ['HASH' => \%lookup_svn_merge_cache],
+			;
+
+			tie my %check_cherry_pick_cache => 'Memoize::Storable',
+			    "$cache_path/check_cherry_pick.db", 'nstore';
+			memoize 'check_cherry_pick',
+				SCALAR_CACHE => 'FAULT',
+				LIST_CACHE => ['HASH' => \%check_cherry_pick_cache],
+			;
+
+			tie my %has_no_changes_cache => 'Memoize::Storable',
+			    "$cache_path/has_no_changes.db", 'nstore';
+			memoize 'has_no_changes',
+				SCALAR_CACHE => ['HASH' => \%has_no_changes_cache],
+				LIST_CACHE => 'FAULT',
+			;
+		};
+
+		if (not eval {
+			$do_memoization->();
+			1;
+		}) {
+			my $err = $@ || "Zombie error"; # "Zombie error" to catch clobbered $@ in buggy destructors
+			die $err unless -d $cache_path;
+			print STDERR "Discarding cache and trying again ($@)\n";
+			rmtree([$cache_path]);
+			$do_memoization->();
+		}
 	}
 
 	sub unmemoize_svn_mergeinfo_functions {
-- 
1.7.2.3

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] git-svn: Destroy the cache when we fail to read it
  2011-08-22  2:17 [PATCH] git-svn: Destroy the cache when we fail to read it Jason Gross
@ 2011-08-22  4:04 ` Jason Gross
  2011-08-22  4:10 ` [PATCH] Add tests for handling corrupted caches Jason Gross
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 26+ messages in thread
From: Jason Gross @ 2011-08-22  4:04 UTC (permalink / raw)
  To: git; +Cc: Jason Gross, Jonathan Nieder

Oops, the bug numbers I gave were debian bug ids, not git bug ids.
The "This fixes bug ..." sentence should either be removed from the
commit message before this commit is merged, or the numbers should be
replaced with the appropriate git bug numbers.

-Jason

On Sun, Aug 21, 2011 at 10:17 PM, Jason Gross <jgross@mit.edu> wrote:
>
> Previously, we would fail fatally when trying to fetch changes with
> mergeinfo on a 32 bit machine, when the repository previously had
> fetched changes with mergeinfo on a 64 bit machine.
>
> This fixes bug 618875 (which is also 587650, 635097).  Much of the code
> was written by Jonathan Nieder <jrnieder@gmail.com> with suggestions
> from Steffen Mueller <smueller@cpan.org> (see
> http://lists.debian.org/debian-perl/2011/05/msg00023.html and
> http://lists.debian.org/debian-perl/2011/05/msg00026.html).
>
> Signed-off-by: Jason Gross <jgross@mit.edu>
> Cc: Jonathan Nieder <jrnieder@gmail.com>
> ---
>  git-svn.perl |   59 +++++++++++++++++++++++++++++++++++----------------------
>  1 files changed, 36 insertions(+), 23 deletions(-)
>
> diff --git a/git-svn.perl b/git-svn.perl
> index 89f83fd..78ccdc8 100755
> --- a/git-svn.perl
> +++ b/git-svn.perl
> @@ -1680,7 +1680,7 @@ use vars qw/$default_repo_id $default_ref_id $_no_metadata $_follow_parent
>             $_use_svnsync_props $no_reuse_existing $_minimize_url
>            $_use_log_author $_add_author_from $_localtime/;
>  use Carp qw/croak/;
> -use File::Path qw/mkpath/;
> +use File::Path qw/mkpath rmtree/;
>  use File::Copy qw/copy/;
>  use IPC::Open3;
>  use Memoize;  # core since 5.8.0, Jul 2002
> @@ -3198,28 +3198,41 @@ sub has_no_changes {
>                $memoized = 1;
>
>                my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";
> -               mkpath([$cache_path]) unless -d $cache_path;
> -
> -               tie my %lookup_svn_merge_cache => 'Memoize::Storable',
> -                   "$cache_path/lookup_svn_merge.db", 'nstore';
> -               memoize 'lookup_svn_merge',
> -                       SCALAR_CACHE => 'FAULT',
> -                       LIST_CACHE => ['HASH' => \%lookup_svn_merge_cache],
> -               ;
> -
> -               tie my %check_cherry_pick_cache => 'Memoize::Storable',
> -                   "$cache_path/check_cherry_pick.db", 'nstore';
> -               memoize 'check_cherry_pick',
> -                       SCALAR_CACHE => 'FAULT',
> -                       LIST_CACHE => ['HASH' => \%check_cherry_pick_cache],
> -               ;
> -
> -               tie my %has_no_changes_cache => 'Memoize::Storable',
> -                   "$cache_path/has_no_changes.db", 'nstore';
> -               memoize 'has_no_changes',
> -                       SCALAR_CACHE => ['HASH' => \%has_no_changes_cache],
> -                       LIST_CACHE => 'FAULT',
> -               ;
> +               my $do_memoization = sub {
> +                       mkpath([$cache_path]) unless -d $cache_path;
> +
> +                       tie my %lookup_svn_merge_cache => 'Memoize::Storable',
> +                           "$cache_path/lookup_svn_merge.db", 'nstore';
> +                       memoize 'lookup_svn_merge',
> +                               SCALAR_CACHE => 'FAULT',
> +                               LIST_CACHE => ['HASH' => \%lookup_svn_merge_cache],
> +                       ;
> +
> +                       tie my %check_cherry_pick_cache => 'Memoize::Storable',
> +                           "$cache_path/check_cherry_pick.db", 'nstore';
> +                       memoize 'check_cherry_pick',
> +                               SCALAR_CACHE => 'FAULT',
> +                               LIST_CACHE => ['HASH' => \%check_cherry_pick_cache],
> +                       ;
> +
> +                       tie my %has_no_changes_cache => 'Memoize::Storable',
> +                           "$cache_path/has_no_changes.db", 'nstore';
> +                       memoize 'has_no_changes',
> +                               SCALAR_CACHE => ['HASH' => \%has_no_changes_cache],
> +                               LIST_CACHE => 'FAULT',
> +                       ;
> +               };
> +
> +               if (not eval {
> +                       $do_memoization->();
> +                       1;
> +               }) {
> +                       my $err = $@ || "Zombie error"; # "Zombie error" to catch clobbered $@ in buggy destructors
> +                       die $err unless -d $cache_path;
> +                       print STDERR "Discarding cache and trying again ($@)\n";
> +                       rmtree([$cache_path]);
> +                       $do_memoization->();
> +               }
>        }
>
>        sub unmemoize_svn_mergeinfo_functions {
> --
> 1.7.2.3
>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH] Add tests for handling corrupted caches
  2011-08-22  2:17 [PATCH] git-svn: Destroy the cache when we fail to read it Jason Gross
  2011-08-22  4:04 ` Jason Gross
@ 2011-08-22  4:10 ` Jason Gross
  2011-08-23  2:27 ` [PATCH] git-svn: Destroy the cache when we fail to read it Jonathan Nieder
  2011-08-23  8:15 ` Eric Wong
  3 siblings, 0 replies; 26+ messages in thread
From: Jason Gross @ 2011-08-22  4:10 UTC (permalink / raw)
  To: git; +Cc: Jonathan Nieder, Jason Gross

There might be a more straightforward way to generate an svn repository
with mergeinfo, but I'm not very familiar with svn.

Signed-off-by: Jason Gross <jgross@mit.edu>
---
 t/t9160-git-svn-caches.sh |   59 +++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 59 insertions(+), 0 deletions(-)
 create mode 100755 t/t9160-git-svn-caches.sh

diff --git a/t/t9160-git-svn-caches.sh b/t/t9160-git-svn-caches.sh
new file mode 100755
index 0000000..bba437a
--- /dev/null
+++ b/t/t9160-git-svn-caches.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+
+test_description='git svn handling of .git/svn/.caches'
+
+. ./lib-git-svn.sh
+
+say 'define NO_SVN_TESTS to skip git svn tests'
+
+test_expect_success 'initialize source svn repo' '
+	svn_cmd mkdir -m x "$svnrepo"/trunk "$svnrepo"/branches &&
+	svn_cmd co "$svnrepo" "$SVN_TREE"
+'
+
+test_expect_success 'make a change that gives mergeinfo' '
+	(
+		cd "$SVN_TREE" &&
+		touch trunk/foo &&
+		svn_cmd add trunk/foo &&
+		svn_cmd commit -m "foo" &&
+		svn_cmd up &&
+		svn_cmd cp trunk branches/1
+		svn_cmd commit -m "make branch" &&
+		svn_cmd up &&
+		touch branches/1/bar &&
+		svn_cmd add branches/1/bar &&
+		svn_cmd commit -m bar &&
+		svn_cmd up &&
+		cd trunk &&
+		svn_cmd merge ../branches/1 &&
+		svn_cmd commit -m merge &&
+		svn_cmd up
+	)
+'
+
+test_expect_success 'clone svn repo and corrupt .caches' '
+	git svn init "$svnrepo" -T trunk -b branches &&
+	git svn fetch &&
+	echo corrupted > .git/svn/.caches/lookup_svn_merge.db
+'
+
+test_expect_success 'make another change that generates mergeinfo' '
+	(
+		cd "$SVN_TREE" &&
+		touch branches/1/baz &&
+		svn_cmd commit -m baz &&
+		svn_cmd up &&
+		cd trunk &&
+		svn_cmd merge ../branches/1 &&
+		svn_cmd commit -m merge2 &&
+		svn_cmd up
+	) &&
+	rm -rf "$SVN_TREE"
+'
+
+test_expect_success 'fetch change with mergeinfo with a corrupted cache' '
+	git svn fetch
+'
+
+test_done
-- 
1.7.6.558.g4b2f

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] git-svn: Destroy the cache when we fail to read it
  2011-08-22  2:17 [PATCH] git-svn: Destroy the cache when we fail to read it Jason Gross
  2011-08-22  4:04 ` Jason Gross
  2011-08-22  4:10 ` [PATCH] Add tests for handling corrupted caches Jason Gross
@ 2011-08-23  2:27 ` Jonathan Nieder
  2011-08-23  2:36   ` Jonathan Nieder
  2011-08-23  5:51   ` Jason Gross
  2011-08-23  8:15 ` Eric Wong
  3 siblings, 2 replies; 26+ messages in thread
From: Jonathan Nieder @ 2011-08-23  2:27 UTC (permalink / raw)
  To: Jason Gross; +Cc: git, Eric Wong

(+cc: Eric Wong)
Hi Jason,

Jason Gross wrote:

[This patch teaches "git svn" to invalidate caches when they
 fail to load, for example because the endianness or size of
 some type changed, which is common in the perl 5.6 -> 5.8
 upgrade.]

> http://lists.debian.org/debian-perl/2011/05/msg00023.html and
> http://lists.debian.org/debian-perl/2011/05/msg00026.html).
[...]
> --- a/git-svn.perl
> +++ b/git-svn.perl
> @@ -1680,7 +1680,7 @@ use vars qw/$default_repo_id $default_ref_id $_no_metadata $_follow_parent
>              $_use_svnsync_props $no_reuse_existing $_minimize_url
>  	    $_use_log_author $_add_author_from $_localtime/;
>  use Carp qw/croak/;
> -use File::Path qw/mkpath/;
> +use File::Path qw/mkpath rmtree/;
>  use File::Copy qw/copy/;
>  use IPC::Open3;
>  use Memoize;  # core since 5.8.0, Jul 2002
> @@ -3198,28 +3198,41 @@ sub has_no_changes {
>  		$memoized = 1;
>  
>  		my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";
> -		mkpath([$cache_path]) unless -d $cache_path;
> -
> -		tie my %lookup_svn_merge_cache => 'Memoize::Storable',
[...]
> -		;
> +		my $do_memoization = sub {
> +			mkpath([$cache_path]) unless -d $cache_path;
[...]
> +			;
> +		};
> +
> +		if (not eval {
> +			$do_memoization->();
> +			1;
> +		}) {
> +			my $err = $@ || "Zombie error"; # "Zombie error" to catch clobbered $@ in buggy destructors
> +			die $err unless -d $cache_path;
> +			print STDERR "Discarding cache and trying again ($@)\n";
> +			rmtree([$cache_path]);
> +			$do_memoization->();
> +		}

Thank you thank you thank you.

Okay, time for nitpicks:

 1) Would it be possible to lift this do_memoization() to a toplevel sub?
    I suspect that could make the code a little easier to read.

 2) Is it important to discard the cache for _all_ errors, instead
    of just corruption and "is not compatible" errors?  Rebuilding the
    cache is not cheap, and I am afraid of effects like repeatedly
    discarding the cache only to rebuild it again due to a typo in
    git-svn.perl or an out-of-memory condition.

 3) The line with "Zombie error" is very long --- I guess putting the
    comment on the line before would help.

 4) The series would be clearer imho as a single patch that includes
    both the fix and tests.

Eric, what do you think?

Thanks again, :)
Jonathan

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] git-svn: Destroy the cache when we fail to read it
  2011-08-23  2:27 ` [PATCH] git-svn: Destroy the cache when we fail to read it Jonathan Nieder
@ 2011-08-23  2:36   ` Jonathan Nieder
  2011-08-23  5:51   ` Jason Gross
  1 sibling, 0 replies; 26+ messages in thread
From: Jonathan Nieder @ 2011-08-23  2:36 UTC (permalink / raw)
  To: Jason Gross; +Cc: git, Eric Wong

Jonathan Nieder wrote:

> [This patch teaches "git svn" to invalidate caches when they
>  fail to load, for example because the endianness or size of
>  some type changed, which is common in the perl 5.6 -> 5.8
>  upgrade.]

Erm, actually it showed up in the perl 5.10 -> 5.12 upgrade in Debian,
due to use of -Duse64bitint when compiling the latter and not the
former[1].

To summarize:

 - "$Storable::interwork_56_64bit = 1;" can be used to work around
   that particular brand of breakage;

 - The interwork_* option does not take care of similar cases in which
   a git-svn repository is generated on one machine and read by
   another[2], though.

Hence my happiness at the arrival of Jason's patch.

[1] http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=32;bug=618875
[2] http://bugs.debian.org/cgi-bin/bugreport.cgi?msg=20;bug=587650

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] git-svn: Destroy the cache when we fail to read it
  2011-08-23  2:27 ` [PATCH] git-svn: Destroy the cache when we fail to read it Jonathan Nieder
  2011-08-23  2:36   ` Jonathan Nieder
@ 2011-08-23  5:51   ` Jason Gross
  1 sibling, 0 replies; 26+ messages in thread
From: Jason Gross @ 2011-08-23  5:51 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Eric Wong

1) Sure.  I'm not sure whether I should make $cache_path toplevel, or
a parameter of do_memoization.  I'm thinking the latter, just in case
$ENV{GIT_DIR} changes between calls to
memoize_svn_mergeinfo_functions?

2) I'm not sure if there's a clean/robust way to detect cache
corruption; if it's completely corrupted, Storable complains about not
finding the right magic number, but I could imagine things being
corrupted in more subtle ways that give rise to other Storable errors.
As for your worry: I don't know the code well enough to have git-svn
recover from some other error in memoization; if there's an error,
that means that it can't read the cache correctly (either in tieing
the hashes, or in memoizing the functions).  The other alternative
that I see is to die.  Is this preferable to repeatedly discarding the
cache?

3) Will fix.

4) Will do.

I'll send a new patch in once I hear back from you about 1 and 2.

-Jason

On Mon, Aug 22, 2011 at 10:27 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>
> (+cc: Eric Wong)
> Hi Jason,
>
> Jason Gross wrote:
>
> [This patch teaches "git svn" to invalidate caches when they
>  fail to load, for example because the endianness or size of
>  some type changed, which is common in the perl 5.6 -> 5.8
>  upgrade.]
>
> > http://lists.debian.org/debian-perl/2011/05/msg00023.html and
> > http://lists.debian.org/debian-perl/2011/05/msg00026.html).
> [...]
> > --- a/git-svn.perl
> > +++ b/git-svn.perl
> > @@ -1680,7 +1680,7 @@ use vars qw/$default_repo_id $default_ref_id $_no_metadata $_follow_parent
> >              $_use_svnsync_props $no_reuse_existing $_minimize_url
> >           $_use_log_author $_add_author_from $_localtime/;
> >  use Carp qw/croak/;
> > -use File::Path qw/mkpath/;
> > +use File::Path qw/mkpath rmtree/;
> >  use File::Copy qw/copy/;
> >  use IPC::Open3;
> >  use Memoize;  # core since 5.8.0, Jul 2002
> > @@ -3198,28 +3198,41 @@ sub has_no_changes {
> >               $memoized = 1;
> >
> >               my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";
> > -             mkpath([$cache_path]) unless -d $cache_path;
> > -
> > -             tie my %lookup_svn_merge_cache => 'Memoize::Storable',
> [...]
> > -             ;
> > +             my $do_memoization = sub {
> > +                     mkpath([$cache_path]) unless -d $cache_path;
> [...]
> > +                     ;
> > +             };
> > +
> > +             if (not eval {
> > +                     $do_memoization->();
> > +                     1;
> > +             }) {
> > +                     my $err = $@ || "Zombie error"; # "Zombie error" to catch clobbered $@ in buggy destructors
> > +                     die $err unless -d $cache_path;
> > +                     print STDERR "Discarding cache and trying again ($@)\n";
> > +                     rmtree([$cache_path]);
> > +                     $do_memoization->();
> > +             }
>
> Thank you thank you thank you.
>
> Okay, time for nitpicks:
>
>  1) Would it be possible to lift this do_memoization() to a toplevel sub?
>    I suspect that could make the code a little easier to read.
>
>  2) Is it important to discard the cache for _all_ errors, instead
>    of just corruption and "is not compatible" errors?  Rebuilding the
>    cache is not cheap, and I am afraid of effects like repeatedly
>    discarding the cache only to rebuild it again due to a typo in
>    git-svn.perl or an out-of-memory condition.
>
>  3) The line with "Zombie error" is very long --- I guess putting the
>    comment on the line before would help.
>
>  4) The series would be clearer imho as a single patch that includes
>    both the fix and tests.
>
> Eric, what do you think?
>
> Thanks again, :)
> Jonathan

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] git-svn: Destroy the cache when we fail to read it
  2011-08-22  2:17 [PATCH] git-svn: Destroy the cache when we fail to read it Jason Gross
                   ` (2 preceding siblings ...)
  2011-08-23  2:27 ` [PATCH] git-svn: Destroy the cache when we fail to read it Jonathan Nieder
@ 2011-08-23  8:15 ` Eric Wong
  2011-08-23 17:05   ` Junio C Hamano
  3 siblings, 1 reply; 26+ messages in thread
From: Eric Wong @ 2011-08-23  8:15 UTC (permalink / raw)
  To: Jason Gross; +Cc: git, Jonathan Nieder

Jason Gross <jgross@MIT.EDU> wrote:
>  		my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";

Can we make the caches sharable by encoding variables like
$Config{use64bitint} and $Storable::VERSION into $cache_path?

Something like this (untested):

	use Config;

	my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";
	$cache_path .= "$Config{use64bitint}.$Storable::VERSION/";

We'll blow everybody's cache away once during the git-svn upgrade, but
in the future people will be able to share the same working tree between
different versions of perl/Storable/whatnot without needing extra code
to detect croaks, different build options, and nuking each other's
caches.

-- 
Eric Wong

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] git-svn: Destroy the cache when we fail to read it
  2011-08-23  8:15 ` Eric Wong
@ 2011-08-23 17:05   ` Junio C Hamano
  2011-08-23 19:58     ` Eric Wong
  2012-05-27 19:25     ` [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible Jonathan Nieder
  0 siblings, 2 replies; 26+ messages in thread
From: Junio C Hamano @ 2011-08-23 17:05 UTC (permalink / raw)
  To: Eric Wong; +Cc: Jason Gross, git, Jonathan Nieder

Eric Wong <normalperson@yhbt.net> writes:

> Jason Gross <jgross@MIT.EDU> wrote:
>>  		my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";
>
> Can we make the caches sharable by encoding variables like
> $Config{use64bitint} and $Storable::VERSION into $cache_path?
>
> Something like this (untested):
>
> 	use Config;
>
> 	my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";
> 	$cache_path .= "$Config{use64bitint}.$Storable::VERSION/";
>
> We'll blow everybody's cache away once during the git-svn upgrade, but
> in the future people will be able to share the same working tree between
> different versions of perl/Storable/whatnot without needing extra code
> to detect croaks, different build options, and nuking each other's
> caches.

Meaning multiple directories and people share with those with systems with
similar characteristics?  It certainly is better than silently getting
confused or barfing when reading 32LE data on 64BE box, and presumably it
would be easy to implement, but is that the best we can do with memoize?

I am wondering if memoize can be told to use a platform independent
serializer backend that is reasonably efficient.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] git-svn: Destroy the cache when we fail to read it
  2011-08-23 17:05   ` Junio C Hamano
@ 2011-08-23 19:58     ` Eric Wong
  2012-05-27 19:25     ` [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible Jonathan Nieder
  1 sibling, 0 replies; 26+ messages in thread
From: Eric Wong @ 2011-08-23 19:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jason Gross, git, Jonathan Nieder

Junio C Hamano <gitster@pobox.com> wrote:
> Eric Wong <normalperson@yhbt.net> writes:
> > Jason Gross <jgross@MIT.EDU> wrote:
> >>  		my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";
> >
> > Can we make the caches sharable by encoding variables like
> > $Config{use64bitint} and $Storable::VERSION into $cache_path?
> >
> > Something like this (untested):
> >
> > 	use Config;
> >
> > 	my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";
> > 	$cache_path .= "$Config{use64bitint}.$Storable::VERSION/";
> >
> > We'll blow everybody's cache away once during the git-svn upgrade, but
> > in the future people will be able to share the same working tree between
> > different versions of perl/Storable/whatnot without needing extra code
> > to detect croaks, different build options, and nuking each other's
> > caches.
> 
> Meaning multiple directories and people share with those with systems with
> similar characteristics?

Yes.

> It certainly is better than silently getting confused or barfing when
> reading 32LE data on 64BE box, and presumably it would be easy to
> implement, but is that the best we can do with memoize?
> 
> I am wondering if memoize can be told to use a platform independent
> serializer backend that is reasonably efficient.

Storable nstore only guarantees endian-neutrality.  JSON/YAML/XML are
all platform-neutral options but (AFAIK) none are standard parts of all
Perl versions we support.  There's Data::Dumper + eval which should
work, but a Memoize::* interface needs to be implemented/imported.

-- 
Eric Wong

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible
  2011-08-23 17:05   ` Junio C Hamano
  2011-08-23 19:58     ` Eric Wong
@ 2012-05-27 19:25     ` Jonathan Nieder
  2012-05-27 19:48       ` [RFC/PATCH 2/1] fixup! " Jonathan Nieder
  2012-05-27 20:14       ` [RFC/PATCH] " Eric Wong
  1 sibling, 2 replies; 26+ messages in thread
From: Jonathan Nieder @ 2012-05-27 19:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Wong, Jason Gross, git

In August, 2011, Junio C Hamano wrote:
> Eric Wong <normalperson@yhbt.net> writes:

>> Something like this (untested):
>>
>> 	use Config;
>>
>> 	my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";
>> 	$cache_path .= "$Config{use64bitint}.$Storable::VERSION/";
[...]
> I am wondering if memoize can be told to use a platform independent
> serializer backend that is reasonably efficient.

Here is a rough sketch.  It uses YAML::Any.

Other choices were considered:

 - thawing data from Data::Dumper involves "eval".  Doing that
   without creating a security risk is fussy

 - the JSON API only works on scalars in memory and doesn't provide
   a standard way to serialize straight to disk

YAML::Any is reasonably fast and has a pleasant API.  In most backends,
LoadFile() seems to read the entire file into a scalar anyway and
convert it as a second step, but having an interface that allows the
deserialization to happen on the fly without a temporary is still a
comfort.

YAML::Any is not a core perl module.  This patch uses it when and only
when it is available.  Installations without that module fall back to
using Storable with all its quirks, keeping their cache files in

	.git/svn/.caches/*.db

Installations with YAML peacefully coexist by keeping a new set of
cache files in

	.git/svn/.caches/*.yaml.

In most cases, switching between is a one-time thing, so it doesn't
seem worth the effort to copy over the existing caches.

My personal motivation is the ability to upgrade or downgrade perl or
change perl's use64bitint compile-time parameter without having to
invalidate caches or (worse, what users unaware of .git/svn/.caches
would probably do) reclone repositories.  The Storable format is not
stable enough for forward-compatibility.

Seems to work.  I haven't run any benchmarks.  

Thoughts?  Improvements?

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 git-svn.perl |  126 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 120 insertions(+), 6 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index c84842ff..96b6046a 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -2039,6 +2039,101 @@ sub gc_directory {
 	}
 }
 
+package Git::SVN::Memoize::YAML;
+use warnings;
+use strict;
+use YAML::Any ();
+
+# based on Memoize::Storable.
+
+sub TIEHASH {
+	my $package = shift;
+	my $filename = shift;
+	my $truehash = (-e $filename) ? YAML::Any::LoadFile($filename) : {};
+	my $self = {FILENAME => $filename, H => $truehash};
+	bless $self => $package;
+}
+
+sub STORE {
+	my $self = shift;
+	$self->{H}{$_[0]} = $_[1];
+}
+
+sub FETCH {
+	my $self = shift;
+	$self->{H}{$_[0]};
+}
+
+sub EXISTS {
+	my $self = shift;
+	exists $self->{H}{$_[0]};
+}
+
+sub DESTROY {
+	my $self = shift;
+	YAML::Any::DumpFile($self->{FILENAME}, $self->{H});
+}
+
+sub SCALAR {
+	my $self = shift;
+	scalar(%{$self->{H}});
+}
+
+sub FIRSTKEY {
+	'Fake hash from Memoize::YAML';
+}
+
+sub NEXTKEY {
+	undef;
+}
+
+1;
+
+=head1 NAME
+
+Git::SVN::Memoize::YAML - store Memoized data in YAML format
+
+=head1 SYNOPSIS
+
+    use Memoize;
+    use Git::SVN::Memoize::YAML;
+
+    tie my %cache => 'Memoize::YAML', $filename;
+    memoize('slow_function', SCALAR_CACHE => [HASH => \%cache]);
+    slow_function(arguments);
+
+=head1 DESCRIPTION
+
+This module provides a class that can be used to tie a hash to a
+YAML file.  The file is read when the hash is initialized and
+rewritten when the hash is destroyed.
+
+The intent is to allow L<Memoize> to back its cache with a file in
+YAML format, just like L<Memoize::Storable> allows L<Memoize> to
+back its cache with a file in Storable format.  Unlike the Storable
+format, the YAML format is platform-independent and fairly stable.
+
+=head1 DIAGNOSTICS
+
+See L<YAML::Any>.
+
+=head1 DEPENDENCIES
+
+L<YAML::Any> from CPAN.
+
+=head1 INCOMPATIBILITIES
+
+None reported.
+
+=head1 BUGS
+
+The entire cache is read into a Perl hash when loading the file,
+so this is not very scalable.
+
+Error handling is awkward.
+=cut
+
+
 package Git::SVN;
 use strict;
 use warnings;
@@ -2056,6 +2151,10 @@ use Time::Local;
 use Memoize;  # core since 5.8.0, Jul 2002
 use Memoize::Storable;
 use POSIX qw(:signal_h);
+my $can_use_yaml;
+BEGIN {
+	$can_use_yaml = eval { require YAML::Any; 1};
+}
 
 my ($_gc_nr, $_gc_period);
 
@@ -3578,6 +3677,17 @@ sub has_no_changes {
 		command_oneline("rev-parse", "$commit~1^{tree}"));
 }
 
+sub tie_for_persistent_memoization {
+	my $hash = shift;
+	my $path = shift;
+
+	if ($can_use_yaml) {
+		tie %$hash => 'Git::SVN::Memoize::YAML', "$path.yaml";
+	} else {
+		tie %$hash => 'Memoize::Storable', "$path.db", 'nstore';
+	}
+}
+
 # The GIT_DIR environment variable is not always set until after the command
 # line arguments are processed, so we can't memoize in a BEGIN block.
 {
@@ -3590,22 +3700,26 @@ sub has_no_changes {
 		my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";
 		mkpath([$cache_path]) unless -d $cache_path;
 
-		tie my %lookup_svn_merge_cache => 'Memoize::Storable',
-		    "$cache_path/lookup_svn_merge.db", 'nstore';
+		my %lookup_svn_merge_cache;
+		my %check_cherry_pick_cache;
+		my %has_no_changes_cache;
+
+		tie_for_persistent_memoization(\%lookup_svn_merge_cache,
+		    "$cache_path/lookup_svn_merge");
 		memoize 'lookup_svn_merge',
 			SCALAR_CACHE => 'FAULT',
 			LIST_CACHE => ['HASH' => \%lookup_svn_merge_cache],
 		;
 
-		tie my %check_cherry_pick_cache => 'Memoize::Storable',
-		    "$cache_path/check_cherry_pick.db", 'nstore';
+		tie_for_persistent_memoization(\%check_cherry_pick_cache,
+		    "$cache_path/check_cherry_pick");
 		memoize 'check_cherry_pick',
 			SCALAR_CACHE => 'FAULT',
 			LIST_CACHE => ['HASH' => \%check_cherry_pick_cache],
 		;
 
-		tie my %has_no_changes_cache => 'Memoize::Storable',
-		    "$cache_path/has_no_changes.db", 'nstore';
+		tie_for_persistent_memoization(\%has_no_changes_cache,
+		    "$cache_path/has_no_changes");
 		memoize 'has_no_changes',
 			SCALAR_CACHE => ['HASH' => \%has_no_changes_cache],
 			LIST_CACHE => 'FAULT',
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC/PATCH 2/1] fixup! git-svn: use YAML format for mergeinfo cache when possible
  2012-05-27 19:25     ` [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible Jonathan Nieder
@ 2012-05-27 19:48       ` Jonathan Nieder
  2012-05-27 20:14       ` [RFC/PATCH] " Eric Wong
  1 sibling, 0 replies; 26+ messages in thread
From: Jonathan Nieder @ 2012-05-27 19:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Wong, Jason Gross, git

Jonathan Nieder wrote:

> YAML::Any is not a core perl module.  This patch uses it when and only
> when it is available.

And here's a patch for squashing in that makes that true.  Sorry for
the confusion.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 git-svn.perl |   12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git c/git-svn.perl i/git-svn.perl
index 96b6046a..1b4ef68f 100755
--- c/git-svn.perl
+++ i/git-svn.perl
@@ -2042,7 +2042,11 @@ sub gc_directory {
 package Git::SVN::Memoize::YAML;
 use warnings;
 use strict;
-use YAML::Any ();
+my $usable;
+BEGIN {
+	$Git::SVN::Memoize::YAML::usable = eval { require YAML::Any; 1 };
+	require YAML::Any if $usable;
+}
 
 # based on Memoize::Storable.
 
@@ -2151,10 +2155,6 @@ use Time::Local;
 use Memoize;  # core since 5.8.0, Jul 2002
 use Memoize::Storable;
 use POSIX qw(:signal_h);
-my $can_use_yaml;
-BEGIN {
-	$can_use_yaml = eval { require YAML::Any; 1};
-}
 
 my ($_gc_nr, $_gc_period);
 
@@ -3681,7 +3681,7 @@ sub tie_for_persistent_memoization {
 	my $hash = shift;
 	my $path = shift;
 
-	if ($can_use_yaml) {
+	if ($Git::SVN::Memoize::YAML::usable) {
 		tie %$hash => 'Git::SVN::Memoize::YAML', "$path.yaml";
 	} else {
 		tie %$hash => 'Memoize::Storable', "$path.db", 'nstore';

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible
  2012-05-27 19:25     ` [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible Jonathan Nieder
  2012-05-27 19:48       ` [RFC/PATCH 2/1] fixup! " Jonathan Nieder
@ 2012-05-27 20:14       ` Eric Wong
  2012-05-28  0:39         ` Jonathan Nieder
  2012-06-09 22:20         ` [PATCH v2 0/3] " Jonathan Nieder
  1 sibling, 2 replies; 26+ messages in thread
From: Eric Wong @ 2012-05-27 20:14 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, Jason Gross, git

Jonathan Nieder <jrnieder@gmail.com> wrote:
> My personal motivation is the ability to upgrade or downgrade perl or
> change perl's use64bitint compile-time parameter without having to
> invalidate caches or (worse, what users unaware of .git/svn/.caches
> would probably do) reclone repositories.  The Storable format is not
> stable enough for forward-compatibility.
> 
> Seems to work.  I haven't run any benchmarks.  
> 
> Thoughts?  Improvements?
> 
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

Seems reasonable, a few comments below.

>  git-svn.perl |  126 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---

> +package Git::SVN::Memoize::YAML;

Can we use this as an opportunity to start splitting git-svn.perl into
multiple .pm files?  Thanks.

> +	my $truehash = (-e $filename) ? YAML::Any::LoadFile($filename) : {};

> +	YAML::Any::DumpFile($self->{FILENAME}, $self->{H});

These should die on errors, right?

> +=head1 BUGS

> +Error handling is awkward.

How so?

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible
  2012-05-27 20:14       ` [RFC/PATCH] " Eric Wong
@ 2012-05-28  0:39         ` Jonathan Nieder
  2012-05-28  6:57           ` [RFC/PATCH 0/2] git-svn: give SVN::Git::Fetcher its own file Jonathan Nieder
  2012-05-29  0:25           ` [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible Eric Wong
  2012-06-09 22:20         ` [PATCH v2 0/3] " Jonathan Nieder
  1 sibling, 2 replies; 26+ messages in thread
From: Jonathan Nieder @ 2012-05-28  0:39 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Jason Gross, git

Eric Wong wrote:
> Jonathan Nieder <jrnieder@gmail.com> wrote:

>> +package Git::SVN::Memoize::YAML;
>
> Can we use this as an opportunity to start splitting git-svn.perl into
> multiple .pm files?

Not a bad idea.  I've included an example patch to sanity-check the
approach below.

>> +	my $truehash = (-e $filename) ? YAML::Any::LoadFile($filename) : {};
>
>> +	YAML::Any::DumpFile($self->{FILENAME}, $self->{H});
>
> These should die on errors, right?

At least in YAML::Old, they use Carp::croak.  Maybe something like

	local @CARP_NOT = qw(YAML::Any);

to blame the caller for the error would bring sanity.

>> +=head1 BUGS
>
>> +Error handling is awkward.
>
> How so?

I mostly meant that it's not obvious what the state of %hash is at the
point marked with (*) below:

	if (not eval {
		tie my %hash => 'Foo::Bar', @params;
		1;
	}) {
		my $err = $@ ||
			# a destructor might have clobbered $@
			"Zombie error";
		die $err if worth_dying($err);

		(*) ... try to recover ...
	}

That's not specific to Memoize::YAML, though.  It probably is not
awkward for wizards who know the details. :)

-- >8 --
Subject: git-svn: move Git::SVN::Prompt into its own file

git-svn.perl is very long (around 6500 lines) and although it is
nicely split into modules, some new readers do not even notice --- it
is too distracting to see all this functionality collected in a single
file.

Splitting it into multiple files would make it easier for people
to read individual modules straight through and to experiment with
components separately.

Let's start with Git::SVN::Prompt.  For simplicity, we install this as
a module in the standard search path, just like the existing Git and
Git::I18N modules.  In the process, add a manpage explaining its
interface and that it is not likely to be useful for other projects to
avoid confusion.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Thanks for a thoughtful review.
Jonathan

 git-svn.perl           |  145 +---------------------------------
 perl/Git/SVN/Prompt.pm |  202 ++++++++++++++++++++++++++++++++++++++++++++++++
 perl/Makefile.PL       |    1 +
 3 files changed, 204 insertions(+), 144 deletions(-)
 create mode 100644 perl/Git/SVN/Prompt.pm

diff --git a/git-svn.perl b/git-svn.perl
index 1b4ef68f..ef60b874 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -80,6 +80,7 @@ use File::Find;
 use Getopt::Long qw/:config gnu_getopt no_ignore_case auto_abbrev/;
 use IPC::Open3;
 use Git;
+use Git::SVN::Prompt qw//;
 use Memoize;  # core since 5.8.0, Jul 2002
 
 BEGIN {
@@ -4441,150 +4442,6 @@ sub remove_username {
 	$_[0] =~ s{^([^:]*://)[^@]+@}{$1};
 }
 
-package Git::SVN::Prompt;
-use strict;
-use warnings;
-require SVN::Core;
-use vars qw/$_no_auth_cache $_username/;
-
-sub simple {
-	my ($cred, $realm, $default_username, $may_save, $pool) = @_;
-	$may_save = undef if $_no_auth_cache;
-	$default_username = $_username if defined $_username;
-	if (defined $default_username && length $default_username) {
-		if (defined $realm && length $realm) {
-			print STDERR "Authentication realm: $realm\n";
-			STDERR->flush;
-		}
-		$cred->username($default_username);
-	} else {
-		username($cred, $realm, $may_save, $pool);
-	}
-	$cred->password(_read_password("Password for '" .
-	                               $cred->username . "': ", $realm));
-	$cred->may_save($may_save);
-	$SVN::_Core::SVN_NO_ERROR;
-}
-
-sub ssl_server_trust {
-	my ($cred, $realm, $failures, $cert_info, $may_save, $pool) = @_;
-	$may_save = undef if $_no_auth_cache;
-	print STDERR "Error validating server certificate for '$realm':\n";
-	{
-		no warnings 'once';
-		# All variables SVN::Auth::SSL::* are used only once,
-		# so we're shutting up Perl warnings about this.
-		if ($failures & $SVN::Auth::SSL::UNKNOWNCA) {
-			print STDERR " - The certificate is not issued ",
-			    "by a trusted authority. Use the\n",
-			    "   fingerprint to validate ",
-			    "the certificate manually!\n";
-		}
-		if ($failures & $SVN::Auth::SSL::CNMISMATCH) {
-			print STDERR " - The certificate hostname ",
-			    "does not match.\n";
-		}
-		if ($failures & $SVN::Auth::SSL::NOTYETVALID) {
-			print STDERR " - The certificate is not yet valid.\n";
-		}
-		if ($failures & $SVN::Auth::SSL::EXPIRED) {
-			print STDERR " - The certificate has expired.\n";
-		}
-		if ($failures & $SVN::Auth::SSL::OTHER) {
-			print STDERR " - The certificate has ",
-			    "an unknown error.\n";
-		}
-	} # no warnings 'once'
-	printf STDERR
-	        "Certificate information:\n".
-	        " - Hostname: %s\n".
-	        " - Valid: from %s until %s\n".
-	        " - Issuer: %s\n".
-	        " - Fingerprint: %s\n",
-	        map $cert_info->$_, qw(hostname valid_from valid_until
-	                               issuer_dname fingerprint);
-	my $choice;
-prompt:
-	print STDERR $may_save ?
-	      "(R)eject, accept (t)emporarily or accept (p)ermanently? " :
-	      "(R)eject or accept (t)emporarily? ";
-	STDERR->flush;
-	$choice = lc(substr(<STDIN> || 'R', 0, 1));
-	if ($choice =~ /^t$/i) {
-		$cred->may_save(undef);
-	} elsif ($choice =~ /^r$/i) {
-		return -1;
-	} elsif ($may_save && $choice =~ /^p$/i) {
-		$cred->may_save($may_save);
-	} else {
-		goto prompt;
-	}
-	$cred->accepted_failures($failures);
-	$SVN::_Core::SVN_NO_ERROR;
-}
-
-sub ssl_client_cert {
-	my ($cred, $realm, $may_save, $pool) = @_;
-	$may_save = undef if $_no_auth_cache;
-	print STDERR "Client certificate filename: ";
-	STDERR->flush;
-	chomp(my $filename = <STDIN>);
-	$cred->cert_file($filename);
-	$cred->may_save($may_save);
-	$SVN::_Core::SVN_NO_ERROR;
-}
-
-sub ssl_client_cert_pw {
-	my ($cred, $realm, $may_save, $pool) = @_;
-	$may_save = undef if $_no_auth_cache;
-	$cred->password(_read_password("Password: ", $realm));
-	$cred->may_save($may_save);
-	$SVN::_Core::SVN_NO_ERROR;
-}
-
-sub username {
-	my ($cred, $realm, $may_save, $pool) = @_;
-	$may_save = undef if $_no_auth_cache;
-	if (defined $realm && length $realm) {
-		print STDERR "Authentication realm: $realm\n";
-	}
-	my $username;
-	if (defined $_username) {
-		$username = $_username;
-	} else {
-		print STDERR "Username: ";
-		STDERR->flush;
-		chomp($username = <STDIN>);
-	}
-	$cred->username($username);
-	$cred->may_save($may_save);
-	$SVN::_Core::SVN_NO_ERROR;
-}
-
-sub _read_password {
-	my ($prompt, $realm) = @_;
-	my $password = '';
-	if (exists $ENV{GIT_ASKPASS}) {
-		open(PH, "-|", $ENV{GIT_ASKPASS}, $prompt);
-		$password = <PH>;
-		$password =~ s/[\012\015]//; # \n\r
-		close(PH);
-	} else {
-		print STDERR $prompt;
-		STDERR->flush;
-		require Term::ReadKey;
-		Term::ReadKey::ReadMode('noecho');
-		while (defined(my $key = Term::ReadKey::ReadKey(0))) {
-			last if $key =~ /[\012\015]/; # \n\r
-			$password .= $key;
-		}
-		Term::ReadKey::ReadMode('restore');
-		print STDERR "\n";
-		STDERR->flush;
-	}
-	$password;
-}
-
 package SVN::Git::Fetcher;
 use vars qw/@ISA $_ignore_regex $_preserve_empty_dirs $_placeholder_filename
             @deleted_gpath %added_placeholder $repo_id/;
diff --git a/perl/Git/SVN/Prompt.pm b/perl/Git/SVN/Prompt.pm
new file mode 100644
index 00000000..3a6f8af0
--- /dev/null
+++ b/perl/Git/SVN/Prompt.pm
@@ -0,0 +1,202 @@
+package Git::SVN::Prompt;
+use strict;
+use warnings;
+require SVN::Core;
+use vars qw/$_no_auth_cache $_username/;
+
+sub simple {
+	my ($cred, $realm, $default_username, $may_save, $pool) = @_;
+	$may_save = undef if $_no_auth_cache;
+	$default_username = $_username if defined $_username;
+	if (defined $default_username && length $default_username) {
+		if (defined $realm && length $realm) {
+			print STDERR "Authentication realm: $realm\n";
+			STDERR->flush;
+		}
+		$cred->username($default_username);
+	} else {
+		username($cred, $realm, $may_save, $pool);
+	}
+	$cred->password(_read_password("Password for '" .
+	                               $cred->username . "': ", $realm));
+	$cred->may_save($may_save);
+	$SVN::_Core::SVN_NO_ERROR;
+}
+
+sub ssl_server_trust {
+	my ($cred, $realm, $failures, $cert_info, $may_save, $pool) = @_;
+	$may_save = undef if $_no_auth_cache;
+	print STDERR "Error validating server certificate for '$realm':\n";
+	{
+		no warnings 'once';
+		# All variables SVN::Auth::SSL::* are used only once,
+		# so we're shutting up Perl warnings about this.
+		if ($failures & $SVN::Auth::SSL::UNKNOWNCA) {
+			print STDERR " - The certificate is not issued ",
+			    "by a trusted authority. Use the\n",
+			    "   fingerprint to validate ",
+			    "the certificate manually!\n";
+		}
+		if ($failures & $SVN::Auth::SSL::CNMISMATCH) {
+			print STDERR " - The certificate hostname ",
+			    "does not match.\n";
+		}
+		if ($failures & $SVN::Auth::SSL::NOTYETVALID) {
+			print STDERR " - The certificate is not yet valid.\n";
+		}
+		if ($failures & $SVN::Auth::SSL::EXPIRED) {
+			print STDERR " - The certificate has expired.\n";
+		}
+		if ($failures & $SVN::Auth::SSL::OTHER) {
+			print STDERR " - The certificate has ",
+			    "an unknown error.\n";
+		}
+	} # no warnings 'once'
+	printf STDERR
+	        "Certificate information:\n".
+	        " - Hostname: %s\n".
+	        " - Valid: from %s until %s\n".
+	        " - Issuer: %s\n".
+	        " - Fingerprint: %s\n",
+	        map $cert_info->$_, qw(hostname valid_from valid_until
+	                               issuer_dname fingerprint);
+	my $choice;
+prompt:
+	print STDERR $may_save ?
+	      "(R)eject, accept (t)emporarily or accept (p)ermanently? " :
+	      "(R)eject or accept (t)emporarily? ";
+	STDERR->flush;
+	$choice = lc(substr(<STDIN> || 'R', 0, 1));
+	if ($choice =~ /^t$/i) {
+		$cred->may_save(undef);
+	} elsif ($choice =~ /^r$/i) {
+		return -1;
+	} elsif ($may_save && $choice =~ /^p$/i) {
+		$cred->may_save($may_save);
+	} else {
+		goto prompt;
+	}
+	$cred->accepted_failures($failures);
+	$SVN::_Core::SVN_NO_ERROR;
+}
+
+sub ssl_client_cert {
+	my ($cred, $realm, $may_save, $pool) = @_;
+	$may_save = undef if $_no_auth_cache;
+	print STDERR "Client certificate filename: ";
+	STDERR->flush;
+	chomp(my $filename = <STDIN>);
+	$cred->cert_file($filename);
+	$cred->may_save($may_save);
+	$SVN::_Core::SVN_NO_ERROR;
+}
+
+sub ssl_client_cert_pw {
+	my ($cred, $realm, $may_save, $pool) = @_;
+	$may_save = undef if $_no_auth_cache;
+	$cred->password(_read_password("Password: ", $realm));
+	$cred->may_save($may_save);
+	$SVN::_Core::SVN_NO_ERROR;
+}
+
+sub username {
+	my ($cred, $realm, $may_save, $pool) = @_;
+	$may_save = undef if $_no_auth_cache;
+	if (defined $realm && length $realm) {
+		print STDERR "Authentication realm: $realm\n";
+	}
+	my $username;
+	if (defined $_username) {
+		$username = $_username;
+	} else {
+		print STDERR "Username: ";
+		STDERR->flush;
+		chomp($username = <STDIN>);
+	}
+	$cred->username($username);
+	$cred->may_save($may_save);
+	$SVN::_Core::SVN_NO_ERROR;
+}
+
+sub _read_password {
+	my ($prompt, $realm) = @_;
+	my $password = '';
+	if (exists $ENV{GIT_ASKPASS}) {
+		open(PH, "-|", $ENV{GIT_ASKPASS}, $prompt);
+		$password = <PH>;
+		$password =~ s/[\012\015]//; # \n\r
+		close(PH);
+	} else {
+		print STDERR $prompt;
+		STDERR->flush;
+		require Term::ReadKey;
+		Term::ReadKey::ReadMode('noecho');
+		while (defined(my $key = Term::ReadKey::ReadKey(0))) {
+			last if $key =~ /[\012\015]/; # \n\r
+			$password .= $key;
+		}
+		Term::ReadKey::ReadMode('restore');
+		print STDERR "\n";
+		STDERR->flush;
+	}
+	$password;
+}
+
+1;
+__END__
+
+Git::SVN::Prompt - authentication callbacks for git-svn
+
+=head1 SYNOPSIS
+
+    use Git::SVN::Prompt qw(simple ssl_client_cert ssl_client_cert_pw
+                            ssl_server_trust username);
+    use SVN::Client ();
+
+    my $cached_simple = SVN::Client::get_simple_provider();
+    my $git_simple = SVN::Client::get_simple_prompt_provider(\&simple, 2);
+    my $cached_ssl = SVN::Client::get_ssl_server_trust_file_provider();
+    my $git_ssl = SVN::Client::get_ssl_server_trust_prompt_provider(
+        \&ssl_server_trust);
+    my $cached_cert = SVN::Client::get_ssl_client_cert_file_provider();
+    my $git_cert = SVN::Client::get_ssl_client_cert_prompt_provider(
+        \&ssl_client_cert, 2);
+    my $cached_cert_pw = SVN::Client::get_ssl_client_cert_pw_file_provider();
+    my $git_cert_pw = SVN::Client::get_ssl_client_cert_pw_prompt_provider(
+        \&ssl_client_cert_pw, 2);
+    my $cached_username = SVN::Client::get_username_provider();
+    my $git_username = SVN::Client::get_username_prompt_provider(
+        \&username, 2);
+
+    my $ctx = new SVN::Client(
+        auth => [
+            $cached_simple, $git_simple,
+            $cached_ssl, $git_ssl,
+            $cached_cert, $git_cert,
+            $cached_cert_pw, $git_cert_pw,
+            $cached_username, $git_username
+        ]);
+
+=head1 DESCRIPTION
+
+This module is an implementation detail of the "git svn" command.
+It implements git-svn's authentication policy.  Do not use it unless
+you are developing git-svn.
+
+The interface will change as git-svn evolves.
+
+=head1 DEPENDENCIES
+
+L<SVN::Core>.
+
+=head1 SEE ALSO
+
+L<SVN::Client>.
+
+=head1 INCOMPATIBILITIES
+
+None reported.
+
+=head1 BUGS
+
+None.
diff --git a/perl/Makefile.PL b/perl/Makefile.PL
index 456d45bf..4d8e31d2 100644
--- a/perl/Makefile.PL
+++ b/perl/Makefile.PL
@@ -27,6 +27,7 @@ MAKE_FRAG
 my %pm = (
 	'Git.pm' => '$(INST_LIBDIR)/Git.pm',
 	'Git/I18N.pm' => '$(INST_LIBDIR)/Git/I18N.pm',
+	'Git/SVN/Prompt.pm' => '$(INST_LIBDIR)/Git/SVN/Prompt.pm',
 );
 
 # We come with our own bundled Error.pm. It's not in the set of default
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [RFC/PATCH 0/2] git-svn: give SVN::Git::Fetcher its own file
  2012-05-28  0:39         ` Jonathan Nieder
@ 2012-05-28  6:57           ` Jonathan Nieder
  2012-05-28  7:00             ` [PATCH 1/2] git-svn: rename SVN::Git::* packages to Git::SVN::* Jonathan Nieder
  2012-05-28  7:03             ` [PATCH 2/2] git-svn: make Git::SVN::Fetcher a separate file Jonathan Nieder
  2012-05-29  0:25           ` [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible Eric Wong
  1 sibling, 2 replies; 26+ messages in thread
From: Jonathan Nieder @ 2012-05-28  6:57 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Jason Gross, git

Jonathan Nieder wrote:
> Eric Wong wrote:

>> Can we use this as an opportunity to start splitting git-svn.perl into
>> multiple .pm files?
>
> Not a bad idea.  I've included an example patch to sanity-check the
> approach below.
[...]
> Let's start with Git::SVN::Prompt.

And here's a more interesting one: SVN::Git::Fetcher.

Jonathan Nieder (2):
  git-svn: rename SVN::Git::* packages to Git::SVN::*
  git-svn: make Git::SVN::Fetcher a separate file

 git-svn.perl            |  551 ++-----------------------------------------
 perl/Git/SVN/Fetcher.pm |  602 +++++++++++++++++++++++++++++++++++++++++++++++
 perl/Makefile.PL        |    1 +
 3 files changed, 627 insertions(+), 527 deletions(-)
 create mode 100644 perl/Git/SVN/Fetcher.pm

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/2] git-svn: rename SVN::Git::* packages to Git::SVN::*
  2012-05-28  6:57           ` [RFC/PATCH 0/2] git-svn: give SVN::Git::Fetcher its own file Jonathan Nieder
@ 2012-05-28  7:00             ` Jonathan Nieder
  2012-05-28  7:03             ` [PATCH 2/2] git-svn: make Git::SVN::Fetcher a separate file Jonathan Nieder
  1 sibling, 0 replies; 26+ messages in thread
From: Jonathan Nieder @ 2012-05-28  7:00 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Jason Gross, git

Using names in the Git:: namespace means these cannot conflict with a
hypothetical binding teaching Subversion to interact with git
repositories.

Currently the packages are private to git-svn.perl so the choice of
name isn't likely to make much difference.  This change is mainly
meant as preparation for splitting out the packages in question as
modules on the public search path.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 git-svn.perl |   50 +++++++++++++++++++++++++-------------------------
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index ef60b874..d0bcf3f1 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -68,8 +68,8 @@ sub _req_svn {
 }
 my $can_compress = eval { require Compress::Zlib; 1};
 push @Git::SVN::Ra::ISA, 'SVN::Ra';
-push @SVN::Git::Editor::ISA, 'SVN::Delta::Editor';
-push @SVN::Git::Fetcher::ISA, 'SVN::Delta::Editor';
+push @Git::SVN::Editor::ISA, 'SVN::Delta::Editor';
+push @Git::SVN::Fetcher::ISA, 'SVN::Delta::Editor';
 use Carp qw/croak/;
 use Digest::MD5;
 use IO::File qw//;
@@ -89,7 +89,7 @@ BEGIN {
 	foreach (qw/command command_oneline command_noisy command_output_pipe
 	            command_input_pipe command_close_pipe
 	            command_bidi_pipe command_close_bidi_pipe/) {
-		for my $package ( qw(SVN::Git::Editor SVN::Git::Fetcher
+		for my $package ( qw(Git::SVN::Editor Git::SVN::Fetcher
 			Git::SVN::Migration Git::SVN::Log Git::SVN),
 			__PACKAGE__) {
 			*{"${package}::$_"} = \&{"Git::$_"};
@@ -111,12 +111,12 @@ my ($_stdin, $_help, $_edit,
 	$_prefix, $_no_checkout, $_url, $_verbose,
 	$_git_format, $_commit_url, $_tag, $_merge_info, $_interactive);
 $Git::SVN::_follow_parent = 1;
-$SVN::Git::Fetcher::_placeholder_filename = ".gitignore";
+$Git::SVN::Fetcher::_placeholder_filename = ".gitignore";
 $_q ||= 0;
 my %remote_opts = ( 'username=s' => \$Git::SVN::Prompt::_username,
                     'config-dir=s' => \$Git::SVN::Ra::config_dir,
                     'no-auth-cache' => \$Git::SVN::Prompt::_no_auth_cache,
-                    'ignore-paths=s' => \$SVN::Git::Fetcher::_ignore_regex,
+                    'ignore-paths=s' => \$Git::SVN::Fetcher::_ignore_regex,
                     'ignore-refs=s' => \$Git::SVN::Ra::_ignore_refs_regex );
 my %fc_opts = ( 'follow-parent|follow!' => \$Git::SVN::_follow_parent,
 		'authors-file|A=s' => \$_authors,
@@ -149,10 +149,10 @@ my %init_opts = ( 'template=s' => \$_template, 'shared:s' => \$_shared,
 		  'rewrite-uuid=s' => sub { $icv{rewriteUUID} = $_[1] },
                   %remote_opts );
 my %cmt_opts = ( 'edit|e' => \$_edit,
-		'rmdir' => \$SVN::Git::Editor::_rmdir,
-		'find-copies-harder' => \$SVN::Git::Editor::_find_copies_harder,
-		'l=i' => \$SVN::Git::Editor::_rename_limit,
-		'copy-similarity|C=i'=> \$SVN::Git::Editor::_cp_similarity
+		'rmdir' => \$Git::SVN::Editor::_rmdir,
+		'find-copies-harder' => \$Git::SVN::Editor::_find_copies_harder,
+		'l=i' => \$Git::SVN::Editor::_rename_limit,
+		'copy-similarity|C=i'=> \$Git::SVN::Editor::_cp_similarity
 );
 
 my %cmd = (
@@ -164,9 +164,9 @@ my %cmd = (
 	clone => [ \&cmd_clone, "Initialize and fetch revisions",
 			{ 'revision|r=s' => \$_revision,
 			  'preserve-empty-dirs' =>
-				\$SVN::Git::Fetcher::_preserve_empty_dirs,
+				\$Git::SVN::Fetcher::_preserve_empty_dirs,
 			  'placeholder-filename=s' =>
-				\$SVN::Git::Fetcher::_placeholder_filename,
+				\$Git::SVN::Fetcher::_placeholder_filename,
 			   %fc_opts, %init_opts } ],
 	init => [ \&cmd_init, "Initialize a repo for tracking" .
 			  " (requires URL argument)",
@@ -464,15 +464,15 @@ sub do_git_init_db {
 		command_noisy('config', "$pfx.$i", $icv{$i});
 		$set = $i;
 	}
-	my $ignore_paths_regex = \$SVN::Git::Fetcher::_ignore_regex;
+	my $ignore_paths_regex = \$Git::SVN::Fetcher::_ignore_regex;
 	command_noisy('config', "$pfx.ignore-paths", $$ignore_paths_regex)
 		if defined $$ignore_paths_regex;
 	my $ignore_refs_regex = \$Git::SVN::Ra::_ignore_refs_regex;
 	command_noisy('config', "$pfx.ignore-refs", $$ignore_refs_regex)
 		if defined $$ignore_refs_regex;
 
-	if (defined $SVN::Git::Fetcher::_preserve_empty_dirs) {
-		my $fname = \$SVN::Git::Fetcher::_placeholder_filename;
+	if (defined $Git::SVN::Fetcher::_preserve_empty_dirs) {
+		my $fname = \$Git::SVN::Fetcher::_placeholder_filename;
 		command_noisy('config', "$pfx.preserve-empty-dirs", 'true');
 		command_noisy('config', "$pfx.placeholder-filename", $$fname);
 	}
@@ -942,7 +942,7 @@ sub cmd_dcommit {
 			                },
 					mergeinfo => $_merge_info,
 			                svn_path => '');
-			if (!SVN::Git::Editor->new(\%ed_opts)->apply_diff) {
+			if (!Git::SVN::Editor->new(\%ed_opts)->apply_diff) {
 				print "No changes\n$d~1 == $d\n";
 			} elsif ($parents->{$d} && @{$parents->{$d}}) {
 				$gs->{inject_parents_dcommit}->{$cmt_rev} =
@@ -1066,8 +1066,8 @@ sub cmd_branch {
 		            " with the --destination argument.\n";
 		}
 		foreach my $g (@{$allglobs}) {
-			# SVN::Git::Editor could probably be moved to Git.pm..
-			my $re = SVN::Git::Editor::glob2pat($g->{path}->{left});
+			# Git::SVN::Editor could probably be moved to Git.pm..
+			my $re = Git::SVN::Editor::glob2pat($g->{path}->{left});
 			if ($_branch_dest =~ /$re/) {
 				$glob = $g;
 				last;
@@ -1425,7 +1425,7 @@ sub cmd_commit_diff {
 	                tree_b => $tb,
 	                editor_cb => sub { print "Committed r$_[0]\n" },
 	                svn_path => $svn_path );
-	if (!SVN::Git::Editor->new(\%ed_opts)->apply_diff) {
+	if (!Git::SVN::Editor->new(\%ed_opts)->apply_diff) {
 		print "No changes\n$ta == $tb\n";
 	}
 }
@@ -3256,7 +3256,7 @@ sub find_parent_branch {
 			# at the moment), so we can't rely on it
 			$self->{last_rev} = $r0;
 			$self->{last_commit} = $parent;
-			$ed = SVN::Git::Fetcher->new($self, $gs->{path});
+			$ed = Git::SVN::Fetcher->new($self, $gs->{path});
 			$gs->ra->gs_do_switch($r0, $rev, $gs,
 					      $self->full_url, $ed)
 			  or die "SVN connection failed somewhere...\n";
@@ -3274,7 +3274,7 @@ sub find_parent_branch {
 		} else {
 			print STDERR "Following parent with do_update\n"
 			             unless $::_q > 1;
-			$ed = SVN::Git::Fetcher->new($self);
+			$ed = Git::SVN::Fetcher->new($self);
 			$self->ra->gs_do_update($rev, $rev, $self, $ed)
 			  or die "SVN connection failed somewhere...\n";
 		}
@@ -3297,7 +3297,7 @@ sub do_fetch {
 			push @{$log_entry->{parents}}, $lc;
 			return $log_entry;
 		}
-		$ed = SVN::Git::Fetcher->new($self);
+		$ed = Git::SVN::Fetcher->new($self);
 		$last_rev = $self->{last_rev};
 		$ed->{c} = $lc;
 		@parents = ($lc);
@@ -3306,7 +3306,7 @@ sub do_fetch {
 		if (my $log_entry = $self->find_parent_branch($paths, $rev)) {
 			return $log_entry;
 		}
-		$ed = SVN::Git::Fetcher->new($self);
+		$ed = Git::SVN::Fetcher->new($self);
 	}
 	unless ($self->ra->gs_do_update($last_rev, $rev, $self, $ed)) {
 		die "SVN connection failed somewhere...\n";
@@ -4026,7 +4026,7 @@ sub set_tree {
 	                editor_cb => sub {
 			       $self->set_tree_cb($log_entry, $tree, @_) },
 	                svn_path => $self->{path} );
-	if (!SVN::Git::Editor->new(\%ed_opts)->apply_diff) {
+	if (!Git::SVN::Editor->new(\%ed_opts)->apply_diff) {
 		print "No changes\nr$self->{last_rev} = $tree\n";
 	}
 }
@@ -4442,7 +4442,7 @@ sub remove_username {
 	$_[0] =~ s{^([^:]*://)[^@]+@}{$1};
 }
 
-package SVN::Git::Fetcher;
+package Git::SVN::Fetcher;
 use vars qw/@ISA $_ignore_regex $_preserve_empty_dirs $_placeholder_filename
             @deleted_gpath %added_placeholder $repo_id/;
 use strict;
@@ -4945,7 +4945,7 @@ sub stash_placeholder_list {
 	}
 }
 
-package SVN::Git::Editor;
+package Git::SVN::Editor;
 use vars qw/@ISA $_rmdir $_cp_similarity $_find_copies_harder $_rename_limit/;
 use strict;
 use warnings;
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/2] git-svn: make Git::SVN::Fetcher a separate file
  2012-05-28  6:57           ` [RFC/PATCH 0/2] git-svn: give SVN::Git::Fetcher its own file Jonathan Nieder
  2012-05-28  7:00             ` [PATCH 1/2] git-svn: rename SVN::Git::* packages to Git::SVN::* Jonathan Nieder
@ 2012-05-28  7:03             ` Jonathan Nieder
  1 sibling, 0 replies; 26+ messages in thread
From: Jonathan Nieder @ 2012-05-28  7:03 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Jason Gross, git

This patch removes a chunk of code (the Git::SVN::Fetcher consumer of
libsvn's tree delta protocol) from git-svn.perl and documents its
interface so the hurried reader does not have to read that code right
away.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 git-svn.perl            |  507 +--------------------------------------
 perl/Git/SVN/Fetcher.pm |  602 +++++++++++++++++++++++++++++++++++++++++++++++
 perl/Makefile.PL        |    1 +
 3 files changed, 605 insertions(+), 505 deletions(-)
 create mode 100644 perl/Git/SVN/Fetcher.pm

diff --git a/git-svn.perl b/git-svn.perl
index d0bcf3f1..35073dea 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -69,7 +69,6 @@ sub _req_svn {
 my $can_compress = eval { require Compress::Zlib; 1};
 push @Git::SVN::Ra::ISA, 'SVN::Ra';
 push @Git::SVN::Editor::ISA, 'SVN::Delta::Editor';
-push @Git::SVN::Fetcher::ISA, 'SVN::Delta::Editor';
 use Carp qw/croak/;
 use Digest::MD5;
 use IO::File qw//;
@@ -80,6 +79,7 @@ use File::Find;
 use Getopt::Long qw/:config gnu_getopt no_ignore_case auto_abbrev/;
 use IPC::Open3;
 use Git;
+use Git::SVN::Fetcher qw//;
 use Git::SVN::Prompt qw//;
 use Memoize;  # core since 5.8.0, Jul 2002
 
@@ -89,7 +89,7 @@ BEGIN {
 	foreach (qw/command command_oneline command_noisy command_output_pipe
 	            command_input_pipe command_close_pipe
 	            command_bidi_pipe command_close_bidi_pipe/) {
-		for my $package ( qw(Git::SVN::Editor Git::SVN::Fetcher
+		for my $package ( qw(Git::SVN::Editor
 			Git::SVN::Migration Git::SVN::Log Git::SVN),
 			__PACKAGE__) {
 			*{"${package}::$_"} = \&{"Git::$_"};
@@ -4442,509 +4442,6 @@ sub remove_username {
 	$_[0] =~ s{^([^:]*://)[^@]+@}{$1};
 }
 
-package Git::SVN::Fetcher;
-use vars qw/@ISA $_ignore_regex $_preserve_empty_dirs $_placeholder_filename
-            @deleted_gpath %added_placeholder $repo_id/;
-use strict;
-use warnings;
-use Carp qw/croak/;
-use File::Basename qw/dirname/;
-use IO::File qw//;
-
-# file baton members: path, mode_a, mode_b, pool, fh, blob, base
-sub new {
-	my ($class, $git_svn, $switch_path) = @_;
-	my $self = SVN::Delta::Editor->new;
-	bless $self, $class;
-	if (exists $git_svn->{last_commit}) {
-		$self->{c} = $git_svn->{last_commit};
-		$self->{empty_symlinks} =
-		                  _mark_empty_symlinks($git_svn, $switch_path);
-	}
-
-	# some options are read globally, but can be overridden locally
-	# per [svn-remote "..."] section.  Command-line options will *NOT*
-	# override options set in an [svn-remote "..."] section
-	$repo_id = $git_svn->{repo_id};
-	my $k = "svn-remote.$repo_id.ignore-paths";
-	my $v = eval { command_oneline('config', '--get', $k) };
-	$self->{ignore_regex} = $v;
-
-	$k = "svn-remote.$repo_id.preserve-empty-dirs";
-	$v = eval { command_oneline('config', '--get', '--bool', $k) };
-	if ($v && $v eq 'true') {
-		$_preserve_empty_dirs = 1;
-		$k = "svn-remote.$repo_id.placeholder-filename";
-		$v = eval { command_oneline('config', '--get', $k) };
-		$_placeholder_filename = $v;
-	}
-
-	# Load the list of placeholder files added during previous invocations.
-	$k = "svn-remote.$repo_id.added-placeholder";
-	$v = eval { command_oneline('config', '--get-all', $k) };
-	if ($_preserve_empty_dirs && $v) {
-		# command() prints errors to stderr, so we only call it if
-		# command_oneline() succeeded.
-		my @v = command('config', '--get-all', $k);
-		$added_placeholder{ dirname($_) } = $_ foreach @v;
-	}
-
-	$self->{empty} = {};
-	$self->{dir_prop} = {};
-	$self->{file_prop} = {};
-	$self->{absent_dir} = {};
-	$self->{absent_file} = {};
-	$self->{gii} = $git_svn->tmp_index_do(sub { Git::IndexInfo->new });
-	$self->{pathnameencoding} = Git::config('svn.pathnameencoding');
-	$self;
-}
-
-# this uses the Ra object, so it must be called before do_{switch,update},
-# not inside them (when the Git::SVN::Fetcher object is passed) to
-# do_{switch,update}
-sub _mark_empty_symlinks {
-	my ($git_svn, $switch_path) = @_;
-	my $bool = Git::config_bool('svn.brokenSymlinkWorkaround');
-	return {} if (!defined($bool)) || (defined($bool) && ! $bool);
-
-	my %ret;
-	my ($rev, $cmt) = $git_svn->last_rev_commit;
-	return {} unless ($rev && $cmt);
-
-	# allow the warning to be printed for each revision we fetch to
-	# ensure the user sees it.  The user can also disable the workaround
-	# on the repository even while git svn is running and the next
-	# revision fetched will skip this expensive function.
-	my $printed_warning;
-	chomp(my $empty_blob = `git hash-object -t blob --stdin < /dev/null`);
-	my ($ls, $ctx) = command_output_pipe(qw/ls-tree -r -z/, $cmt);
-	local $/ = "\0";
-	my $pfx = defined($switch_path) ? $switch_path : $git_svn->{path};
-	$pfx .= '/' if length($pfx);
-	while (<$ls>) {
-		chomp;
-		s/\A100644 blob $empty_blob\t//o or next;
-		unless ($printed_warning) {
-			print STDERR "Scanning for empty symlinks, ",
-			             "this may take a while if you have ",
-				     "many empty files\n",
-				     "You may disable this with `",
-				     "git config svn.brokenSymlinkWorkaround ",
-				     "false'.\n",
-				     "This may be done in a different ",
-				     "terminal without restarting ",
-				     "git svn\n";
-			$printed_warning = 1;
-		}
-		my $path = $_;
-		my (undef, $props) =
-		               $git_svn->ra->get_file($pfx.$path, $rev, undef);
-		if ($props->{'svn:special'}) {
-			$ret{$path} = 1;
-		}
-	}
-	command_close_pipe($ls, $ctx);
-	\%ret;
-}
-
-# returns true if a given path is inside a ".git" directory
-sub in_dot_git {
-	$_[0] =~ m{(?:^|/)\.git(?:/|$)};
-}
-
-# return value: 0 -- don't ignore, 1 -- ignore
-sub is_path_ignored {
-	my ($self, $path) = @_;
-	return 1 if in_dot_git($path);
-	return 1 if defined($self->{ignore_regex}) &&
-	            $path =~ m!$self->{ignore_regex}!;
-	return 0 unless defined($_ignore_regex);
-	return 1 if $path =~ m!$_ignore_regex!o;
-	return 0;
-}
-
-sub set_path_strip {
-	my ($self, $path) = @_;
-	$self->{path_strip} = qr/^\Q$path\E(\/|$)/ if length $path;
-}
-
-sub open_root {
-	{ path => '' };
-}
-
-sub open_directory {
-	my ($self, $path, $pb, $rev) = @_;
-	{ path => $path };
-}
-
-sub git_path {
-	my ($self, $path) = @_;
-	if (my $enc = $self->{pathnameencoding}) {
-		require Encode;
-		Encode::from_to($path, 'UTF-8', $enc);
-	}
-	if ($self->{path_strip}) {
-		$path =~ s!$self->{path_strip}!! or
-		  die "Failed to strip path '$path' ($self->{path_strip})\n";
-	}
-	$path;
-}
-
-sub delete_entry {
-	my ($self, $path, $rev, $pb) = @_;
-	return undef if $self->is_path_ignored($path);
-
-	my $gpath = $self->git_path($path);
-	return undef if ($gpath eq '');
-
-	# remove entire directories.
-	my ($tree) = (command('ls-tree', '-z', $self->{c}, "./$gpath")
-	                 =~ /\A040000 tree ([a-f\d]{40})\t\Q$gpath\E\0/);
-	if ($tree) {
-		my ($ls, $ctx) = command_output_pipe(qw/ls-tree
-		                                     -r --name-only -z/,
-				                     $tree);
-		local $/ = "\0";
-		while (<$ls>) {
-			chomp;
-			my $rmpath = "$gpath/$_";
-			$self->{gii}->remove($rmpath);
-			print "\tD\t$rmpath\n" unless $::_q;
-		}
-		print "\tD\t$gpath/\n" unless $::_q;
-		command_close_pipe($ls, $ctx);
-	} else {
-		$self->{gii}->remove($gpath);
-		print "\tD\t$gpath\n" unless $::_q;
-	}
-	# Don't add to @deleted_gpath if we're deleting a placeholder file.
-	push @deleted_gpath, $gpath unless $added_placeholder{dirname($path)};
-	$self->{empty}->{$path} = 0;
-	undef;
-}
-
-sub open_file {
-	my ($self, $path, $pb, $rev) = @_;
-	my ($mode, $blob);
-
-	goto out if $self->is_path_ignored($path);
-
-	my $gpath = $self->git_path($path);
-	($mode, $blob) = (command('ls-tree', '-z', $self->{c}, "./$gpath")
-	                     =~ /\A(\d{6}) blob ([a-f\d]{40})\t\Q$gpath\E\0/);
-	unless (defined $mode && defined $blob) {
-		die "$path was not found in commit $self->{c} (r$rev)\n";
-	}
-	if ($mode eq '100644' && $self->{empty_symlinks}->{$path}) {
-		$mode = '120000';
-	}
-out:
-	{ path => $path, mode_a => $mode, mode_b => $mode, blob => $blob,
-	  pool => SVN::Pool->new, action => 'M' };
-}
-
-sub add_file {
-	my ($self, $path, $pb, $cp_path, $cp_rev) = @_;
-	my $mode;
-
-	if (!$self->is_path_ignored($path)) {
-		my ($dir, $file) = ($path =~ m#^(.*?)/?([^/]+)$#);
-		delete $self->{empty}->{$dir};
-		$mode = '100644';
-
-		if ($added_placeholder{$dir}) {
-			# Remove our placeholder file, if we created one.
-			delete_entry($self, $added_placeholder{$dir})
-				unless $path eq $added_placeholder{$dir};
-			delete $added_placeholder{$dir}
-		}
-	}
-
-	{ path => $path, mode_a => $mode, mode_b => $mode,
-	  pool => SVN::Pool->new, action => 'A' };
-}
-
-sub add_directory {
-	my ($self, $path, $cp_path, $cp_rev) = @_;
-	goto out if $self->is_path_ignored($path);
-	my $gpath = $self->git_path($path);
-	if ($gpath eq '') {
-		my ($ls, $ctx) = command_output_pipe(qw/ls-tree
-		                                     -r --name-only -z/,
-				                     $self->{c});
-		local $/ = "\0";
-		while (<$ls>) {
-			chomp;
-			$self->{gii}->remove($_);
-			print "\tD\t$_\n" unless $::_q;
-			push @deleted_gpath, $gpath;
-		}
-		command_close_pipe($ls, $ctx);
-		$self->{empty}->{$path} = 0;
-	}
-	my ($dir, $file) = ($path =~ m#^(.*?)/?([^/]+)$#);
-	delete $self->{empty}->{$dir};
-	$self->{empty}->{$path} = 1;
-
-	if ($added_placeholder{$dir}) {
-		# Remove our placeholder file, if we created one.
-		delete_entry($self, $added_placeholder{$dir});
-		delete $added_placeholder{$dir}
-	}
-
-out:
-	{ path => $path };
-}
-
-sub change_dir_prop {
-	my ($self, $db, $prop, $value) = @_;
-	return undef if $self->is_path_ignored($db->{path});
-	$self->{dir_prop}->{$db->{path}} ||= {};
-	$self->{dir_prop}->{$db->{path}}->{$prop} = $value;
-	undef;
-}
-
-sub absent_directory {
-	my ($self, $path, $pb) = @_;
-	return undef if $self->is_path_ignored($path);
-	$self->{absent_dir}->{$pb->{path}} ||= [];
-	push @{$self->{absent_dir}->{$pb->{path}}}, $path;
-	undef;
-}
-
-sub absent_file {
-	my ($self, $path, $pb) = @_;
-	return undef if $self->is_path_ignored($path);
-	$self->{absent_file}->{$pb->{path}} ||= [];
-	push @{$self->{absent_file}->{$pb->{path}}}, $path;
-	undef;
-}
-
-sub change_file_prop {
-	my ($self, $fb, $prop, $value) = @_;
-	return undef if $self->is_path_ignored($fb->{path});
-	if ($prop eq 'svn:executable') {
-		if ($fb->{mode_b} != 120000) {
-			$fb->{mode_b} = defined $value ? 100755 : 100644;
-		}
-	} elsif ($prop eq 'svn:special') {
-		$fb->{mode_b} = defined $value ? 120000 : 100644;
-	} else {
-		$self->{file_prop}->{$fb->{path}} ||= {};
-		$self->{file_prop}->{$fb->{path}}->{$prop} = $value;
-	}
-	undef;
-}
-
-sub apply_textdelta {
-	my ($self, $fb, $exp) = @_;
-	return undef if $self->is_path_ignored($fb->{path});
-	my $fh = $::_repository->temp_acquire('svn_delta');
-	# $fh gets auto-closed() by SVN::TxDelta::apply(),
-	# (but $base does not,) so dup() it for reading in close_file
-	open my $dup, '<&', $fh or croak $!;
-	my $base = $::_repository->temp_acquire('git_blob');
-
-	if ($fb->{blob}) {
-		my ($base_is_link, $size);
-
-		if ($fb->{mode_a} eq '120000' &&
-		    ! $self->{empty_symlinks}->{$fb->{path}}) {
-			print $base 'link ' or die "print $!\n";
-			$base_is_link = 1;
-		}
-	retry:
-		$size = $::_repository->cat_blob($fb->{blob}, $base);
-		die "Failed to read object $fb->{blob}" if ($size < 0);
-
-		if (defined $exp) {
-			seek $base, 0, 0 or croak $!;
-			my $got = ::md5sum($base);
-			if ($got ne $exp) {
-				my $err = "Checksum mismatch: ".
-				       "$fb->{path} $fb->{blob}\n" .
-				       "expected: $exp\n" .
-				       "     got: $got\n";
-				if ($base_is_link) {
-					warn $err,
-					     "Retrying... (possibly ",
-					     "a bad symlink from SVN)\n";
-					$::_repository->temp_reset($base);
-					$base_is_link = 0;
-					goto retry;
-				}
-				die $err;
-			}
-		}
-	}
-	seek $base, 0, 0 or croak $!;
-	$fb->{fh} = $fh;
-	$fb->{base} = $base;
-	[ SVN::TxDelta::apply($base, $dup, undef, $fb->{path}, $fb->{pool}) ];
-}
-
-sub close_file {
-	my ($self, $fb, $exp) = @_;
-	return undef if $self->is_path_ignored($fb->{path});
-
-	my $hash;
-	my $path = $self->git_path($fb->{path});
-	if (my $fh = $fb->{fh}) {
-		if (defined $exp) {
-			seek($fh, 0, 0) or croak $!;
-			my $got = ::md5sum($fh);
-			if ($got ne $exp) {
-				die "Checksum mismatch: $path\n",
-				    "expected: $exp\n    got: $got\n";
-			}
-		}
-		if ($fb->{mode_b} == 120000) {
-			sysseek($fh, 0, 0) or croak $!;
-			my $rd = sysread($fh, my $buf, 5);
-
-			if (!defined $rd) {
-				croak "sysread: $!\n";
-			} elsif ($rd == 0) {
-				warn "$path has mode 120000",
-				     " but it points to nothing\n",
-				     "converting to an empty file with mode",
-				     " 100644\n";
-				$fb->{mode_b} = '100644';
-			} elsif ($buf ne 'link ') {
-				warn "$path has mode 120000",
-				     " but is not a link\n";
-			} else {
-				my $tmp_fh = $::_repository->temp_acquire(
-					'svn_hash');
-				my $res;
-				while ($res = sysread($fh, my $str, 1024)) {
-					my $out = syswrite($tmp_fh, $str, $res);
-					defined($out) && $out == $res
-						or croak("write ",
-							Git::temp_path($tmp_fh),
-							": $!\n");
-				}
-				defined $res or croak $!;
-
-				($fh, $tmp_fh) = ($tmp_fh, $fh);
-				Git::temp_release($tmp_fh, 1);
-			}
-		}
-
-		$hash = $::_repository->hash_and_insert_object(
-				Git::temp_path($fh));
-		$hash =~ /^[a-f\d]{40}$/ or die "not a sha1: $hash\n";
-
-		Git::temp_release($fb->{base}, 1);
-		Git::temp_release($fh, 1);
-	} else {
-		$hash = $fb->{blob} or die "no blob information\n";
-	}
-	$fb->{pool}->clear;
-	$self->{gii}->update($fb->{mode_b}, $hash, $path) or croak $!;
-	print "\t$fb->{action}\t$path\n" if $fb->{action} && ! $::_q;
-	undef;
-}
-
-sub abort_edit {
-	my $self = shift;
-	$self->{nr} = $self->{gii}->{nr};
-	delete $self->{gii};
-	$self->SUPER::abort_edit(@_);
-}
-
-sub close_edit {
-	my $self = shift;
-
-	if ($_preserve_empty_dirs) {
-		my @empty_dirs;
-
-		# Any entry flagged as empty that also has an associated
-		# dir_prop represents a newly created empty directory.
-		foreach my $i (keys %{$self->{empty}}) {
-			push @empty_dirs, $i if exists $self->{dir_prop}->{$i};
-		}
-
-		# Search for directories that have become empty due subsequent
-		# file deletes.
-		push @empty_dirs, $self->find_empty_directories();
-
-		# Finally, add a placeholder file to each empty directory.
-		$self->add_placeholder_file($_) foreach (@empty_dirs);
-
-		$self->stash_placeholder_list();
-	}
-
-	$self->{git_commit_ok} = 1;
-	$self->{nr} = $self->{gii}->{nr};
-	delete $self->{gii};
-	$self->SUPER::close_edit(@_);
-}
-
-sub find_empty_directories {
-	my ($self) = @_;
-	my @empty_dirs;
-	my %dirs = map { dirname($_) => 1 } @deleted_gpath;
-
-	foreach my $dir (sort keys %dirs) {
-		next if $dir eq ".";
-
-		# If there have been any additions to this directory, there is
-		# no reason to check if it is empty.
-		my $skip_added = 0;
-		foreach my $t (qw/dir_prop file_prop/) {
-			foreach my $path (keys %{ $self->{$t} }) {
-				if (exists $self->{$t}->{dirname($path)}) {
-					$skip_added = 1;
-					last;
-				}
-			}
-			last if $skip_added;
-		}
-		next if $skip_added;
-
-		# Use `git ls-tree` to get the filenames of this directory
-		# that existed prior to this particular commit.
-		my $ls = command('ls-tree', '-z', '--name-only',
-				 $self->{c}, "$dir/");
-		my %files = map { $_ => 1 } split(/\0/, $ls);
-
-		# Remove the filenames that were deleted during this commit.
-		delete $files{$_} foreach (@deleted_gpath);
-
-		# Report the directory if there are no filenames left.
-		push @empty_dirs, $dir unless (scalar %files);
-	}
-	@empty_dirs;
-}
-
-sub add_placeholder_file {
-	my ($self, $dir) = @_;
-	my $path = "$dir/$_placeholder_filename";
-	my $gpath = $self->git_path($path);
-
-	my $fh = $::_repository->temp_acquire($gpath);
-	my $hash = $::_repository->hash_and_insert_object(Git::temp_path($fh));
-	Git::temp_release($fh, 1);
-	$self->{gii}->update('100644', $hash, $gpath) or croak $!;
-
-	# The directory should no longer be considered empty.
-	delete $self->{empty}->{$dir} if exists $self->{empty}->{$dir};
-
-	# Keep track of any placeholder files we create.
-	$added_placeholder{$dir} = $path;
-}
-
-sub stash_placeholder_list {
-	my ($self) = @_;
-	my $k = "svn-remote.$repo_id.added-placeholder";
-	my $v = eval { command_oneline('config', '--get-all', $k) };
-	command_noisy('config', '--unset-all', $k) if $v;
-	foreach (values %added_placeholder) {
-		command_noisy('config', '--add', $k, $_);
-	}
-}
-
 package Git::SVN::Editor;
 use vars qw/@ISA $_rmdir $_cp_similarity $_find_copies_harder $_rename_limit/;
 use strict;
diff --git a/perl/Git/SVN/Fetcher.pm b/perl/Git/SVN/Fetcher.pm
new file mode 100644
index 00000000..4e9c77d7
--- /dev/null
+++ b/perl/Git/SVN/Fetcher.pm
@@ -0,0 +1,602 @@
+package Git::SVN::Fetcher;
+use vars qw/@ISA $_ignore_regex $_preserve_empty_dirs $_placeholder_filename
+            @deleted_gpath %added_placeholder $repo_id/;
+use strict;
+use warnings;
+use SVN::Delta;
+use Carp qw/croak/;
+use File::Basename qw/dirname/;
+use IO::File qw//;
+use Git qw/command command_oneline command_noisy command_output_pipe
+           command_input_pipe command_close_pipe
+           command_bidi_pipe command_close_bidi_pipe/;
+BEGIN {
+	@ISA = qw(SVN::Delta::Editor);
+}
+
+# file baton members: path, mode_a, mode_b, pool, fh, blob, base
+sub new {
+	my ($class, $git_svn, $switch_path) = @_;
+	my $self = SVN::Delta::Editor->new;
+	bless $self, $class;
+	if (exists $git_svn->{last_commit}) {
+		$self->{c} = $git_svn->{last_commit};
+		$self->{empty_symlinks} =
+		                  _mark_empty_symlinks($git_svn, $switch_path);
+	}
+
+	# some options are read globally, but can be overridden locally
+	# per [svn-remote "..."] section.  Command-line options will *NOT*
+	# override options set in an [svn-remote "..."] section
+	$repo_id = $git_svn->{repo_id};
+	my $k = "svn-remote.$repo_id.ignore-paths";
+	my $v = eval { command_oneline('config', '--get', $k) };
+	$self->{ignore_regex} = $v;
+
+	$k = "svn-remote.$repo_id.preserve-empty-dirs";
+	$v = eval { command_oneline('config', '--get', '--bool', $k) };
+	if ($v && $v eq 'true') {
+		$_preserve_empty_dirs = 1;
+		$k = "svn-remote.$repo_id.placeholder-filename";
+		$v = eval { command_oneline('config', '--get', $k) };
+		$_placeholder_filename = $v;
+	}
+
+	# Load the list of placeholder files added during previous invocations.
+	$k = "svn-remote.$repo_id.added-placeholder";
+	$v = eval { command_oneline('config', '--get-all', $k) };
+	if ($_preserve_empty_dirs && $v) {
+		# command() prints errors to stderr, so we only call it if
+		# command_oneline() succeeded.
+		my @v = command('config', '--get-all', $k);
+		$added_placeholder{ dirname($_) } = $_ foreach @v;
+	}
+
+	$self->{empty} = {};
+	$self->{dir_prop} = {};
+	$self->{file_prop} = {};
+	$self->{absent_dir} = {};
+	$self->{absent_file} = {};
+	$self->{gii} = $git_svn->tmp_index_do(sub { Git::IndexInfo->new });
+	$self->{pathnameencoding} = Git::config('svn.pathnameencoding');
+	$self;
+}
+
+# this uses the Ra object, so it must be called before do_{switch,update},
+# not inside them (when the Git::SVN::Fetcher object is passed) to
+# do_{switch,update}
+sub _mark_empty_symlinks {
+	my ($git_svn, $switch_path) = @_;
+	my $bool = Git::config_bool('svn.brokenSymlinkWorkaround');
+	return {} if (!defined($bool)) || (defined($bool) && ! $bool);
+
+	my %ret;
+	my ($rev, $cmt) = $git_svn->last_rev_commit;
+	return {} unless ($rev && $cmt);
+
+	# allow the warning to be printed for each revision we fetch to
+	# ensure the user sees it.  The user can also disable the workaround
+	# on the repository even while git svn is running and the next
+	# revision fetched will skip this expensive function.
+	my $printed_warning;
+	chomp(my $empty_blob = `git hash-object -t blob --stdin < /dev/null`);
+	my ($ls, $ctx) = command_output_pipe(qw/ls-tree -r -z/, $cmt);
+	local $/ = "\0";
+	my $pfx = defined($switch_path) ? $switch_path : $git_svn->{path};
+	$pfx .= '/' if length($pfx);
+	while (<$ls>) {
+		chomp;
+		s/\A100644 blob $empty_blob\t//o or next;
+		unless ($printed_warning) {
+			print STDERR "Scanning for empty symlinks, ",
+			             "this may take a while if you have ",
+				     "many empty files\n",
+				     "You may disable this with `",
+				     "git config svn.brokenSymlinkWorkaround ",
+				     "false'.\n",
+				     "This may be done in a different ",
+				     "terminal without restarting ",
+				     "git svn\n";
+			$printed_warning = 1;
+		}
+		my $path = $_;
+		my (undef, $props) =
+		               $git_svn->ra->get_file($pfx.$path, $rev, undef);
+		if ($props->{'svn:special'}) {
+			$ret{$path} = 1;
+		}
+	}
+	command_close_pipe($ls, $ctx);
+	\%ret;
+}
+
+# returns true if a given path is inside a ".git" directory
+sub in_dot_git {
+	$_[0] =~ m{(?:^|/)\.git(?:/|$)};
+}
+
+# return value: 0 -- don't ignore, 1 -- ignore
+sub is_path_ignored {
+	my ($self, $path) = @_;
+	return 1 if in_dot_git($path);
+	return 1 if defined($self->{ignore_regex}) &&
+	            $path =~ m!$self->{ignore_regex}!;
+	return 0 unless defined($_ignore_regex);
+	return 1 if $path =~ m!$_ignore_regex!o;
+	return 0;
+}
+
+sub set_path_strip {
+	my ($self, $path) = @_;
+	$self->{path_strip} = qr/^\Q$path\E(\/|$)/ if length $path;
+}
+
+sub open_root {
+	{ path => '' };
+}
+
+sub open_directory {
+	my ($self, $path, $pb, $rev) = @_;
+	{ path => $path };
+}
+
+sub git_path {
+	my ($self, $path) = @_;
+	if (my $enc = $self->{pathnameencoding}) {
+		require Encode;
+		Encode::from_to($path, 'UTF-8', $enc);
+	}
+	if ($self->{path_strip}) {
+		$path =~ s!$self->{path_strip}!! or
+		  die "Failed to strip path '$path' ($self->{path_strip})\n";
+	}
+	$path;
+}
+
+sub delete_entry {
+	my ($self, $path, $rev, $pb) = @_;
+	return undef if $self->is_path_ignored($path);
+
+	my $gpath = $self->git_path($path);
+	return undef if ($gpath eq '');
+
+	# remove entire directories.
+	my ($tree) = (command('ls-tree', '-z', $self->{c}, "./$gpath")
+	                 =~ /\A040000 tree ([a-f\d]{40})\t\Q$gpath\E\0/);
+	if ($tree) {
+		my ($ls, $ctx) = command_output_pipe(qw/ls-tree
+		                                     -r --name-only -z/,
+				                     $tree);
+		local $/ = "\0";
+		while (<$ls>) {
+			chomp;
+			my $rmpath = "$gpath/$_";
+			$self->{gii}->remove($rmpath);
+			print "\tD\t$rmpath\n" unless $::_q;
+		}
+		print "\tD\t$gpath/\n" unless $::_q;
+		command_close_pipe($ls, $ctx);
+	} else {
+		$self->{gii}->remove($gpath);
+		print "\tD\t$gpath\n" unless $::_q;
+	}
+	# Don't add to @deleted_gpath if we're deleting a placeholder file.
+	push @deleted_gpath, $gpath unless $added_placeholder{dirname($path)};
+	$self->{empty}->{$path} = 0;
+	undef;
+}
+
+sub open_file {
+	my ($self, $path, $pb, $rev) = @_;
+	my ($mode, $blob);
+
+	goto out if $self->is_path_ignored($path);
+
+	my $gpath = $self->git_path($path);
+	($mode, $blob) = (command('ls-tree', '-z', $self->{c}, "./$gpath")
+	                     =~ /\A(\d{6}) blob ([a-f\d]{40})\t\Q$gpath\E\0/);
+	unless (defined $mode && defined $blob) {
+		die "$path was not found in commit $self->{c} (r$rev)\n";
+	}
+	if ($mode eq '100644' && $self->{empty_symlinks}->{$path}) {
+		$mode = '120000';
+	}
+out:
+	{ path => $path, mode_a => $mode, mode_b => $mode, blob => $blob,
+	  pool => SVN::Pool->new, action => 'M' };
+}
+
+sub add_file {
+	my ($self, $path, $pb, $cp_path, $cp_rev) = @_;
+	my $mode;
+
+	if (!$self->is_path_ignored($path)) {
+		my ($dir, $file) = ($path =~ m#^(.*?)/?([^/]+)$#);
+		delete $self->{empty}->{$dir};
+		$mode = '100644';
+
+		if ($added_placeholder{$dir}) {
+			# Remove our placeholder file, if we created one.
+			delete_entry($self, $added_placeholder{$dir})
+				unless $path eq $added_placeholder{$dir};
+			delete $added_placeholder{$dir}
+		}
+	}
+
+	{ path => $path, mode_a => $mode, mode_b => $mode,
+	  pool => SVN::Pool->new, action => 'A' };
+}
+
+sub add_directory {
+	my ($self, $path, $cp_path, $cp_rev) = @_;
+	goto out if $self->is_path_ignored($path);
+	my $gpath = $self->git_path($path);
+	if ($gpath eq '') {
+		my ($ls, $ctx) = command_output_pipe(qw/ls-tree
+		                                     -r --name-only -z/,
+				                     $self->{c});
+		local $/ = "\0";
+		while (<$ls>) {
+			chomp;
+			$self->{gii}->remove($_);
+			print "\tD\t$_\n" unless $::_q;
+			push @deleted_gpath, $gpath;
+		}
+		command_close_pipe($ls, $ctx);
+		$self->{empty}->{$path} = 0;
+	}
+	my ($dir, $file) = ($path =~ m#^(.*?)/?([^/]+)$#);
+	delete $self->{empty}->{$dir};
+	$self->{empty}->{$path} = 1;
+
+	if ($added_placeholder{$dir}) {
+		# Remove our placeholder file, if we created one.
+		delete_entry($self, $added_placeholder{$dir});
+		delete $added_placeholder{$dir}
+	}
+
+out:
+	{ path => $path };
+}
+
+sub change_dir_prop {
+	my ($self, $db, $prop, $value) = @_;
+	return undef if $self->is_path_ignored($db->{path});
+	$self->{dir_prop}->{$db->{path}} ||= {};
+	$self->{dir_prop}->{$db->{path}}->{$prop} = $value;
+	undef;
+}
+
+sub absent_directory {
+	my ($self, $path, $pb) = @_;
+	return undef if $self->is_path_ignored($path);
+	$self->{absent_dir}->{$pb->{path}} ||= [];
+	push @{$self->{absent_dir}->{$pb->{path}}}, $path;
+	undef;
+}
+
+sub absent_file {
+	my ($self, $path, $pb) = @_;
+	return undef if $self->is_path_ignored($path);
+	$self->{absent_file}->{$pb->{path}} ||= [];
+	push @{$self->{absent_file}->{$pb->{path}}}, $path;
+	undef;
+}
+
+sub change_file_prop {
+	my ($self, $fb, $prop, $value) = @_;
+	return undef if $self->is_path_ignored($fb->{path});
+	if ($prop eq 'svn:executable') {
+		if ($fb->{mode_b} != 120000) {
+			$fb->{mode_b} = defined $value ? 100755 : 100644;
+		}
+	} elsif ($prop eq 'svn:special') {
+		$fb->{mode_b} = defined $value ? 120000 : 100644;
+	} else {
+		$self->{file_prop}->{$fb->{path}} ||= {};
+		$self->{file_prop}->{$fb->{path}}->{$prop} = $value;
+	}
+	undef;
+}
+
+sub apply_textdelta {
+	my ($self, $fb, $exp) = @_;
+	return undef if $self->is_path_ignored($fb->{path});
+	my $fh = $::_repository->temp_acquire('svn_delta');
+	# $fh gets auto-closed() by SVN::TxDelta::apply(),
+	# (but $base does not,) so dup() it for reading in close_file
+	open my $dup, '<&', $fh or croak $!;
+	my $base = $::_repository->temp_acquire('git_blob');
+
+	if ($fb->{blob}) {
+		my ($base_is_link, $size);
+
+		if ($fb->{mode_a} eq '120000' &&
+		    ! $self->{empty_symlinks}->{$fb->{path}}) {
+			print $base 'link ' or die "print $!\n";
+			$base_is_link = 1;
+		}
+	retry:
+		$size = $::_repository->cat_blob($fb->{blob}, $base);
+		die "Failed to read object $fb->{blob}" if ($size < 0);
+
+		if (defined $exp) {
+			seek $base, 0, 0 or croak $!;
+			my $got = ::md5sum($base);
+			if ($got ne $exp) {
+				my $err = "Checksum mismatch: ".
+				       "$fb->{path} $fb->{blob}\n" .
+				       "expected: $exp\n" .
+				       "     got: $got\n";
+				if ($base_is_link) {
+					warn $err,
+					     "Retrying... (possibly ",
+					     "a bad symlink from SVN)\n";
+					$::_repository->temp_reset($base);
+					$base_is_link = 0;
+					goto retry;
+				}
+				die $err;
+			}
+		}
+	}
+	seek $base, 0, 0 or croak $!;
+	$fb->{fh} = $fh;
+	$fb->{base} = $base;
+	[ SVN::TxDelta::apply($base, $dup, undef, $fb->{path}, $fb->{pool}) ];
+}
+
+sub close_file {
+	my ($self, $fb, $exp) = @_;
+	return undef if $self->is_path_ignored($fb->{path});
+
+	my $hash;
+	my $path = $self->git_path($fb->{path});
+	if (my $fh = $fb->{fh}) {
+		if (defined $exp) {
+			seek($fh, 0, 0) or croak $!;
+			my $got = ::md5sum($fh);
+			if ($got ne $exp) {
+				die "Checksum mismatch: $path\n",
+				    "expected: $exp\n    got: $got\n";
+			}
+		}
+		if ($fb->{mode_b} == 120000) {
+			sysseek($fh, 0, 0) or croak $!;
+			my $rd = sysread($fh, my $buf, 5);
+
+			if (!defined $rd) {
+				croak "sysread: $!\n";
+			} elsif ($rd == 0) {
+				warn "$path has mode 120000",
+				     " but it points to nothing\n",
+				     "converting to an empty file with mode",
+				     " 100644\n";
+				$fb->{mode_b} = '100644';
+			} elsif ($buf ne 'link ') {
+				warn "$path has mode 120000",
+				     " but is not a link\n";
+			} else {
+				my $tmp_fh = $::_repository->temp_acquire(
+					'svn_hash');
+				my $res;
+				while ($res = sysread($fh, my $str, 1024)) {
+					my $out = syswrite($tmp_fh, $str, $res);
+					defined($out) && $out == $res
+						or croak("write ",
+							Git::temp_path($tmp_fh),
+							": $!\n");
+				}
+				defined $res or croak $!;
+
+				($fh, $tmp_fh) = ($tmp_fh, $fh);
+				Git::temp_release($tmp_fh, 1);
+			}
+		}
+
+		$hash = $::_repository->hash_and_insert_object(
+				Git::temp_path($fh));
+		$hash =~ /^[a-f\d]{40}$/ or die "not a sha1: $hash\n";
+
+		Git::temp_release($fb->{base}, 1);
+		Git::temp_release($fh, 1);
+	} else {
+		$hash = $fb->{blob} or die "no blob information\n";
+	}
+	$fb->{pool}->clear;
+	$self->{gii}->update($fb->{mode_b}, $hash, $path) or croak $!;
+	print "\t$fb->{action}\t$path\n" if $fb->{action} && ! $::_q;
+	undef;
+}
+
+sub abort_edit {
+	my $self = shift;
+	$self->{nr} = $self->{gii}->{nr};
+	delete $self->{gii};
+	$self->SUPER::abort_edit(@_);
+}
+
+sub close_edit {
+	my $self = shift;
+
+	if ($_preserve_empty_dirs) {
+		my @empty_dirs;
+
+		# Any entry flagged as empty that also has an associated
+		# dir_prop represents a newly created empty directory.
+		foreach my $i (keys %{$self->{empty}}) {
+			push @empty_dirs, $i if exists $self->{dir_prop}->{$i};
+		}
+
+		# Search for directories that have become empty due subsequent
+		# file deletes.
+		push @empty_dirs, $self->find_empty_directories();
+
+		# Finally, add a placeholder file to each empty directory.
+		$self->add_placeholder_file($_) foreach (@empty_dirs);
+
+		$self->stash_placeholder_list();
+	}
+
+	$self->{git_commit_ok} = 1;
+	$self->{nr} = $self->{gii}->{nr};
+	delete $self->{gii};
+	$self->SUPER::close_edit(@_);
+}
+
+sub find_empty_directories {
+	my ($self) = @_;
+	my @empty_dirs;
+	my %dirs = map { dirname($_) => 1 } @deleted_gpath;
+
+	foreach my $dir (sort keys %dirs) {
+		next if $dir eq ".";
+
+		# If there have been any additions to this directory, there is
+		# no reason to check if it is empty.
+		my $skip_added = 0;
+		foreach my $t (qw/dir_prop file_prop/) {
+			foreach my $path (keys %{ $self->{$t} }) {
+				if (exists $self->{$t}->{dirname($path)}) {
+					$skip_added = 1;
+					last;
+				}
+			}
+			last if $skip_added;
+		}
+		next if $skip_added;
+
+		# Use `git ls-tree` to get the filenames of this directory
+		# that existed prior to this particular commit.
+		my $ls = command('ls-tree', '-z', '--name-only',
+				 $self->{c}, "$dir/");
+		my %files = map { $_ => 1 } split(/\0/, $ls);
+
+		# Remove the filenames that were deleted during this commit.
+		delete $files{$_} foreach (@deleted_gpath);
+
+		# Report the directory if there are no filenames left.
+		push @empty_dirs, $dir unless (scalar %files);
+	}
+	@empty_dirs;
+}
+
+sub add_placeholder_file {
+	my ($self, $dir) = @_;
+	my $path = "$dir/$_placeholder_filename";
+	my $gpath = $self->git_path($path);
+
+	my $fh = $::_repository->temp_acquire($gpath);
+	my $hash = $::_repository->hash_and_insert_object(Git::temp_path($fh));
+	Git::temp_release($fh, 1);
+	$self->{gii}->update('100644', $hash, $gpath) or croak $!;
+
+	# The directory should no longer be considered empty.
+	delete $self->{empty}->{$dir} if exists $self->{empty}->{$dir};
+
+	# Keep track of any placeholder files we create.
+	$added_placeholder{$dir} = $path;
+}
+
+sub stash_placeholder_list {
+	my ($self) = @_;
+	my $k = "svn-remote.$repo_id.added-placeholder";
+	my $v = eval { command_oneline('config', '--get-all', $k) };
+	command_noisy('config', '--unset-all', $k) if $v;
+	foreach (values %added_placeholder) {
+		command_noisy('config', '--add', $k, $_);
+	}
+}
+
+1;
+__END__
+
+Git::SVN::Fetcher - tree delta consumer for "git svn fetch"
+
+=head1 SYNOPSIS
+
+    use SVN::Core;
+    use SVN::Ra;
+    use Git::SVN;
+    use Git::SVN::Fetcher;
+    use Git;
+
+    my $gs = Git::SVN->find_by_url($url);
+    my $ra = SVN::Ra->new(url => $url);
+    my $editor = Git::SVN::Fetcher->new($gs);
+    my $reporter = $ra->do_update($SVN::Core::INVALID_REVNUM, '',
+                                  1, $editor);
+    $reporter->set_path('', $old_rev, 0);
+    $reporter->finish_report;
+    my $tree = $gs->tmp_index_do(sub { command_oneline('write-tree') });
+
+    foreach my $path (keys %{$editor->{dir_prop}) {
+        my $props = $editor->{dir_prop}{$path};
+        foreach my $prop (keys %$props) {
+            print "property $prop at $path changed to $props->{$prop}\n";
+        }
+    }
+    foreach my $path (keys %{$editor->{empty}) {
+        my $action = $editor->{empty}{$path} ? 'added' : 'removed';
+        print "empty directory $path $action\n";
+    }
+    foreach my $path (keys %{$editor->{file_prop}) { ... }
+    foreach my $parent (keys %{$editor->{absent_dir}}) {
+        my @children = @{$editor->{abstent_dir}{$parent}};
+        print "cannot fetch directory $parent/$_: not authorized?\n"
+            foreach @children;
+    }
+    foreach my $parent (keys %{$editor->{absent_file}) { ... }
+
+=head1 DESCRIPTION
+
+This is a subclass of C<SVN::Delta::Editor>, which means it implements
+callbacks to act as a consumer of Subversion tree deltas.  This
+particular implementation of those callbacks is meant to store
+information about the resulting content which B<git svn fetch> could
+use to populate new commits and new entries for F<unhandled.log>.
+More specifically:
+
+=over
+
+=item * Additions, removals, and modifications of files are propagated
+to git-svn's index file F<$GIT_DIR/svn/$refname/index> using
+B<git update-index>.
+
+=item * Changes in Subversion path properties are recorded in the
+C<dir_prop> and C<file_prop> fields (which are hashes).
+
+=item * Addition and removal of empty directories are indicated by
+entries with value 1 and 0 respectively in the C<empty> hash.
+
+=item * Paths that are present but cannot be conveyed (presumably due
+to permissions) are recorded in the C<absent_file> and
+C<absent_dirs> hashes.  For each key, the corresponding value is
+a list of paths under that directory that were present but
+could not be conveyed.
+
+=back
+
+The interface is unstable.  Do not use this module unless you are
+developing git-svn.
+
+=head1 DEPENDENCIES
+
+L<SVN::Delta> from the Subversion perl bindings,
+the core L<Carp>, L<File::Basename>, and L<IO::File> modules,
+and git's L<Git> helper module.
+
+C<Git::SVN::Fetcher> has not been tested using callers other than
+B<git-svn> itself.
+
+=head1 SEE ALSO
+
+L<SVN::Delta>.
+
+=head1 INCOMPATIBILITIES
+
+None reported.
+
+=head1 BUGS
+
+None.
diff --git a/perl/Makefile.PL b/perl/Makefile.PL
index 4d8e31d2..424890a1 100644
--- a/perl/Makefile.PL
+++ b/perl/Makefile.PL
@@ -27,6 +27,7 @@ MAKE_FRAG
 my %pm = (
 	'Git.pm' => '$(INST_LIBDIR)/Git.pm',
 	'Git/I18N.pm' => '$(INST_LIBDIR)/Git/I18N.pm',
+	'Git/SVN/Fetcher.pm' => '$(INST_LIBDIR)/Git/SVN/Fetcher.pm',
 	'Git/SVN/Prompt.pm' => '$(INST_LIBDIR)/Git/SVN/Prompt.pm',
 );
 
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible
  2012-05-28  0:39         ` Jonathan Nieder
  2012-05-28  6:57           ` [RFC/PATCH 0/2] git-svn: give SVN::Git::Fetcher its own file Jonathan Nieder
@ 2012-05-29  0:25           ` Eric Wong
  2012-05-29 20:48             ` Junio C Hamano
  1 sibling, 1 reply; 26+ messages in thread
From: Eric Wong @ 2012-05-29  0:25 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, Jason Gross, git

Jonathan Nieder <jrnieder@gmail.com> wrote:
> Eric Wong wrote:
> > Jonathan Nieder <jrnieder@gmail.com> wrote:
> 
> >> +package Git::SVN::Memoize::YAML;
> >
> > Can we use this as an opportunity to start splitting git-svn.perl into
> > multiple .pm files?
> 
> Not a bad idea.  I've included an example patch to sanity-check the
> approach below.

Thank you.  I've signed-off and pushed that and your other two out to
"master" on git://bogomips.org/git-svn

> >> +	my $truehash = (-e $filename) ? YAML::Any::LoadFile($filename) : {};
> >
> >> +	YAML::Any::DumpFile($self->{FILENAME}, $self->{H});
> >
> > These should die on errors, right?
> 
> At least in YAML::Old, they use Carp::croak.  Maybe something like
> 
> 	local @CARP_NOT = qw(YAML::Any);
> 
> to blame the caller for the error would bring sanity.

Interesting, didn't know about @CARP_NOT.  It's barely documented in
on my Debian stable install(s), but it seems to be available since 5.8.
We should be able to use it without breaking compatibility.

> >> +=head1 BUGS
> >
> >> +Error handling is awkward.
> >
> > How so?
> 
> I mostly meant that it's not obvious what the state of %hash is at the
> point marked with (*) below:
> 
> 	if (not eval {
> 		tie my %hash => 'Foo::Bar', @params;
> 		1;
> 	}) {
> 		my $err = $@ ||
> 			# a destructor might have clobbered $@
> 			"Zombie error";
> 		die $err if worth_dying($err);
> 
> 		(*) ... try to recover ...
> 	}
> 
> That's not specific to Memoize::YAML, though.  It probably is not
> awkward for wizards who know the details. :)

OK, so we're no better or worse off than we were before :)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible
  2012-05-29  0:25           ` [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible Eric Wong
@ 2012-05-29 20:48             ` Junio C Hamano
  0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2012-05-29 20:48 UTC (permalink / raw)
  To: Eric Wong; +Cc: Jonathan Nieder, Jason Gross, git

Eric Wong <normalperson@yhbt.net> writes:

> Jonathan Nieder <jrnieder@gmail.com> wrote:
>> Eric Wong wrote:
>> > Jonathan Nieder <jrnieder@gmail.com> wrote:
>> 
>> >> +package Git::SVN::Memoize::YAML;
>> >
>> > Can we use this as an opportunity to start splitting git-svn.perl into
>> > multiple .pm files?
>> 
>> Not a bad idea.  I've included an example patch to sanity-check the
>> approach below.
>
> Thank you.  I've signed-off and pushed that and your other two out to
> "master" on git://bogomips.org/git-svn

OK, I'll pull it before tagging -rc1.

Thanks.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2 0/3] git-svn: use YAML format for mergeinfo cache when possible
  2012-05-27 20:14       ` [RFC/PATCH] " Eric Wong
  2012-05-28  0:39         ` Jonathan Nieder
@ 2012-06-09 22:20         ` Jonathan Nieder
  2012-06-09 22:25           ` [PATCH 1/3] git-svn: make Git::SVN::Editor a separate file Jonathan Nieder
                             ` (3 more replies)
  1 sibling, 4 replies; 26+ messages in thread
From: Jonathan Nieder @ 2012-06-09 22:20 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Jason Gross, git

Eric Wong wrote:

> Seems reasonable, a few comments below.

Thanks again.

[...]
> These should die on errors, right?

Yep, they do.  I didn't bother doing the CARP_NOT thing --- let's wait
for a bug report or rainy day and then handle errors properly (by
invalidating the cache when appropriate).

Patches 1-2 are independent from patch 3.  I'm sending them this way
for no particular reason.

Thoughts of all kinds welcome, as usual.

Jonathan Nieder (3):
  git-svn: make Git::SVN::Editor a separate file
  git-svn: make Git::SVN::RA a separate file
  git-svn: use YAML format for mergeinfo cache when possible

 git-svn.perl                 | 1115 ++----------------------------------------
 perl/Git/SVN/Editor.pm       |  536 ++++++++++++++++++++
 perl/Git/SVN/Fetcher.pm      |    3 +-
 perl/Git/SVN/Memoize/YAML.pm |   93 ++++
 perl/Git/SVN/Ra.pm           |  658 +++++++++++++++++++++++++
 perl/Makefile.PL             |    3 +
 6 files changed, 1320 insertions(+), 1088 deletions(-)
 create mode 100644 perl/Git/SVN/Editor.pm
 create mode 100644 perl/Git/SVN/Memoize/YAML.pm
 create mode 100644 perl/Git/SVN/Ra.pm

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH 1/3] git-svn: make Git::SVN::Editor a separate file
  2012-06-09 22:20         ` [PATCH v2 0/3] " Jonathan Nieder
@ 2012-06-09 22:25           ` Jonathan Nieder
  2012-06-09 22:28           ` [PATCH 2/3] git-svn: make Git::SVN::RA " Jonathan Nieder
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 26+ messages in thread
From: Jonathan Nieder @ 2012-06-09 22:25 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Jason Gross, git

This makes the git-svn script shorter and less scary for beginners to
read through for the first time.  Take the opportunity to explain the
purpose and basic interface of the Git::SVN::Editor class while at it.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 git-svn.perl            |  472 +----------------------------------------
 perl/Git/SVN/Editor.pm  |  536 +++++++++++++++++++++++++++++++++++++++++++++++
 perl/Git/SVN/Fetcher.pm |    3 +-
 perl/Makefile.PL        |    1 +
 4 files changed, 541 insertions(+), 471 deletions(-)
 create mode 100644 perl/Git/SVN/Editor.pm

diff --git a/git-svn.perl b/git-svn.perl
index 7870cc15..7cec4416 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -68,7 +68,6 @@ sub _req_svn {
 }
 my $can_compress = eval { require Compress::Zlib; 1};
 push @Git::SVN::Ra::ISA, 'SVN::Ra';
-push @Git::SVN::Editor::ISA, 'SVN::Delta::Editor';
 use Carp qw/croak/;
 use Digest::MD5;
 use IO::File qw//;
@@ -79,6 +78,7 @@ use File::Find;
 use Getopt::Long qw/:config gnu_getopt no_ignore_case auto_abbrev/;
 use IPC::Open3;
 use Git;
+use Git::SVN::Editor qw//;
 use Git::SVN::Fetcher qw//;
 use Git::SVN::Prompt qw//;
 use Memoize;  # core since 5.8.0, Jul 2002
@@ -89,8 +89,7 @@ BEGIN {
 	foreach (qw/command command_oneline command_noisy command_output_pipe
 	            command_input_pipe command_close_pipe
 	            command_bidi_pipe command_close_bidi_pipe/) {
-		for my $package ( qw(Git::SVN::Editor
-			Git::SVN::Migration Git::SVN::Log Git::SVN),
+		for my $package ( qw(Git::SVN::Migration Git::SVN::Log Git::SVN),
 			__PACKAGE__) {
 			*{"${package}::$_"} = \&{"Git::$_"};
 		}
@@ -1066,7 +1065,6 @@ sub cmd_branch {
 		            " with the --destination argument.\n";
 		}
 		foreach my $g (@{$allglobs}) {
-			# Git::SVN::Editor could probably be moved to Git.pm..
 			my $re = Git::SVN::Editor::glob2pat($g->{path}->{left});
 			if ($_branch_dest =~ /$re/) {
 				$glob = $g;
@@ -4328,472 +4326,6 @@ sub remove_username {
 	$_[0] =~ s{^([^:]*://)[^@]+@}{$1};
 }
 
-package Git::SVN::Editor;
-use vars qw/@ISA $_rmdir $_cp_similarity $_find_copies_harder $_rename_limit/;
-use strict;
-use warnings;
-use Carp qw/croak/;
-use IO::File;
-
-sub new {
-	my ($class, $opts) = @_;
-	foreach (qw/svn_path r ra tree_a tree_b log editor_cb/) {
-		die "$_ required!\n" unless (defined $opts->{$_});
-	}
-
-	my $pool = SVN::Pool->new;
-	my $mods = generate_diff($opts->{tree_a}, $opts->{tree_b});
-	my $types = check_diff_paths($opts->{ra}, $opts->{svn_path},
-	                             $opts->{r}, $mods);
-
-	# $opts->{ra} functions should not be used after this:
-	my @ce  = $opts->{ra}->get_commit_editor($opts->{log},
-	                                        $opts->{editor_cb}, $pool);
-	my $self = SVN::Delta::Editor->new(@ce, $pool);
-	bless $self, $class;
-	foreach (qw/svn_path r tree_a tree_b/) {
-		$self->{$_} = $opts->{$_};
-	}
-	$self->{url} = $opts->{ra}->{url};
-	$self->{mods} = $mods;
-	$self->{types} = $types;
-	$self->{pool} = $pool;
-	$self->{bat} = { '' => $self->open_root($self->{r}, $self->{pool}) };
-	$self->{rm} = { };
-	$self->{path_prefix} = length $self->{svn_path} ?
-	                       "$self->{svn_path}/" : '';
-	$self->{config} = $opts->{config};
-	$self->{mergeinfo} = $opts->{mergeinfo};
-	return $self;
-}
-
-sub generate_diff {
-	my ($tree_a, $tree_b) = @_;
-	my @diff_tree = qw(diff-tree -z -r);
-	if ($_cp_similarity) {
-		push @diff_tree, "-C$_cp_similarity";
-	} else {
-		push @diff_tree, '-C';
-	}
-	push @diff_tree, '--find-copies-harder' if $_find_copies_harder;
-	push @diff_tree, "-l$_rename_limit" if defined $_rename_limit;
-	push @diff_tree, $tree_a, $tree_b;
-	my ($diff_fh, $ctx) = command_output_pipe(@diff_tree);
-	local $/ = "\0";
-	my $state = 'meta';
-	my @mods;
-	while (<$diff_fh>) {
-		chomp $_; # this gets rid of the trailing "\0"
-		if ($state eq 'meta' && /^:(\d{6})\s(\d{6})\s
-					($::sha1)\s($::sha1)\s
-					([MTCRAD])\d*$/xo) {
-			push @mods, {	mode_a => $1, mode_b => $2,
-					sha1_a => $3, sha1_b => $4,
-					chg => $5 };
-			if ($5 =~ /^(?:C|R)$/) {
-				$state = 'file_a';
-			} else {
-				$state = 'file_b';
-			}
-		} elsif ($state eq 'file_a') {
-			my $x = $mods[$#mods] or croak "Empty array\n";
-			if ($x->{chg} !~ /^(?:C|R)$/) {
-				croak "Error parsing $_, $x->{chg}\n";
-			}
-			$x->{file_a} = $_;
-			$state = 'file_b';
-		} elsif ($state eq 'file_b') {
-			my $x = $mods[$#mods] or croak "Empty array\n";
-			if (exists $x->{file_a} && $x->{chg} !~ /^(?:C|R)$/) {
-				croak "Error parsing $_, $x->{chg}\n";
-			}
-			if (!exists $x->{file_a} && $x->{chg} =~ /^(?:C|R)$/) {
-				croak "Error parsing $_, $x->{chg}\n";
-			}
-			$x->{file_b} = $_;
-			$state = 'meta';
-		} else {
-			croak "Error parsing $_\n";
-		}
-	}
-	command_close_pipe($diff_fh, $ctx);
-	\@mods;
-}
-
-sub check_diff_paths {
-	my ($ra, $pfx, $rev, $mods) = @_;
-	my %types;
-	$pfx .= '/' if length $pfx;
-
-	sub type_diff_paths {
-		my ($ra, $types, $path, $rev) = @_;
-		my @p = split m#/+#, $path;
-		my $c = shift @p;
-		unless (defined $types->{$c}) {
-			$types->{$c} = $ra->check_path($c, $rev);
-		}
-		while (@p) {
-			$c .= '/' . shift @p;
-			next if defined $types->{$c};
-			$types->{$c} = $ra->check_path($c, $rev);
-		}
-	}
-
-	foreach my $m (@$mods) {
-		foreach my $f (qw/file_a file_b/) {
-			next unless defined $m->{$f};
-			my ($dir) = ($m->{$f} =~ m#^(.*?)/?(?:[^/]+)$#);
-			if (length $pfx.$dir && ! defined $types{$dir}) {
-				type_diff_paths($ra, \%types, $pfx.$dir, $rev);
-			}
-		}
-	}
-	\%types;
-}
-
-sub split_path {
-	return ($_[0] =~ m#^(.*?)/?([^/]+)$#);
-}
-
-sub repo_path {
-	my ($self, $path) = @_;
-	if (my $enc = $self->{pathnameencoding}) {
-		require Encode;
-		Encode::from_to($path, $enc, 'UTF-8');
-	}
-	$self->{path_prefix}.(defined $path ? $path : '');
-}
-
-sub url_path {
-	my ($self, $path) = @_;
-	if ($self->{url} =~ m#^https?://#) {
-		$path =~ s!([^~a-zA-Z0-9_./-])!uc sprintf("%%%02x",ord($1))!eg;
-	}
-	$self->{url} . '/' . $self->repo_path($path);
-}
-
-sub rmdirs {
-	my ($self) = @_;
-	my $rm = $self->{rm};
-	delete $rm->{''}; # we never delete the url we're tracking
-	return unless %$rm;
-
-	foreach (keys %$rm) {
-		my @d = split m#/#, $_;
-		my $c = shift @d;
-		$rm->{$c} = 1;
-		while (@d) {
-			$c .= '/' . shift @d;
-			$rm->{$c} = 1;
-		}
-	}
-	delete $rm->{$self->{svn_path}};
-	delete $rm->{''}; # we never delete the url we're tracking
-	return unless %$rm;
-
-	my ($fh, $ctx) = command_output_pipe(qw/ls-tree --name-only -r -z/,
-	                                     $self->{tree_b});
-	local $/ = "\0";
-	while (<$fh>) {
-		chomp;
-		my @dn = split m#/#, $_;
-		while (pop @dn) {
-			delete $rm->{join '/', @dn};
-		}
-		unless (%$rm) {
-			close $fh;
-			return;
-		}
-	}
-	command_close_pipe($fh, $ctx);
-
-	my ($r, $p, $bat) = ($self->{r}, $self->{pool}, $self->{bat});
-	foreach my $d (sort { $b =~ tr#/#/# <=> $a =~ tr#/#/# } keys %$rm) {
-		$self->close_directory($bat->{$d}, $p);
-		my ($dn) = ($d =~ m#^(.*?)/?(?:[^/]+)$#);
-		print "\tD+\t$d/\n" unless $::_q;
-		$self->SUPER::delete_entry($d, $r, $bat->{$dn}, $p);
-		delete $bat->{$d};
-	}
-}
-
-sub open_or_add_dir {
-	my ($self, $full_path, $baton, $deletions) = @_;
-	my $t = $self->{types}->{$full_path};
-	if (!defined $t) {
-		die "$full_path not known in r$self->{r} or we have a bug!\n";
-	}
-	{
-		no warnings 'once';
-		# SVN::Node::none and SVN::Node::file are used only once,
-		# so we're shutting up Perl's warnings about them.
-		if ($t == $SVN::Node::none || defined($deletions->{$full_path})) {
-			return $self->add_directory($full_path, $baton,
-			    undef, -1, $self->{pool});
-		} elsif ($t == $SVN::Node::dir) {
-			return $self->open_directory($full_path, $baton,
-			    $self->{r}, $self->{pool});
-		} # no warnings 'once'
-		print STDERR "$full_path already exists in repository at ",
-		    "r$self->{r} and it is not a directory (",
-		    ($t == $SVN::Node::file ? 'file' : 'unknown'),"/$t)\n";
-	} # no warnings 'once'
-	exit 1;
-}
-
-sub ensure_path {
-	my ($self, $path, $deletions) = @_;
-	my $bat = $self->{bat};
-	my $repo_path = $self->repo_path($path);
-	return $bat->{''} unless (length $repo_path);
-
-	my @p = split m#/+#, $repo_path;
-	my $c = shift @p;
-	$bat->{$c} ||= $self->open_or_add_dir($c, $bat->{''}, $deletions);
-	while (@p) {
-		my $c0 = $c;
-		$c .= '/' . shift @p;
-		$bat->{$c} ||= $self->open_or_add_dir($c, $bat->{$c0}, $deletions);
-	}
-	return $bat->{$c};
-}
-
-# Subroutine to convert a globbing pattern to a regular expression.
-# From perl cookbook.
-sub glob2pat {
-	my $globstr = shift;
-	my %patmap = ('*' => '.*', '?' => '.', '[' => '[', ']' => ']');
-	$globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
-	return '^' . $globstr . '$';
-}
-
-sub check_autoprop {
-	my ($self, $pattern, $properties, $file, $fbat) = @_;
-	# Convert the globbing pattern to a regular expression.
-	my $regex = glob2pat($pattern);
-	# Check if the pattern matches the file name.
-	if($file =~ m/($regex)/) {
-		# Parse the list of properties to set.
-		my @props = split(/;/, $properties);
-		foreach my $prop (@props) {
-			# Parse 'name=value' syntax and set the property.
-			if ($prop =~ /([^=]+)=(.*)/) {
-				my ($n,$v) = ($1,$2);
-				for ($n, $v) {
-					s/^\s+//; s/\s+$//;
-				}
-				$self->change_file_prop($fbat, $n, $v);
-			}
-		}
-	}
-}
-
-sub apply_autoprops {
-	my ($self, $file, $fbat) = @_;
-	my $conf_t = ${$self->{config}}{'config'};
-	no warnings 'once';
-	# Check [miscellany]/enable-auto-props in svn configuration.
-	if (SVN::_Core::svn_config_get_bool(
-		$conf_t,
-		$SVN::_Core::SVN_CONFIG_SECTION_MISCELLANY,
-		$SVN::_Core::SVN_CONFIG_OPTION_ENABLE_AUTO_PROPS,
-		0)) {
-		# Auto-props are enabled.  Enumerate them to look for matches.
-		my $callback = sub {
-			$self->check_autoprop($_[0], $_[1], $file, $fbat);
-		};
-		SVN::_Core::svn_config_enumerate(
-			$conf_t,
-			$SVN::_Core::SVN_CONFIG_SECTION_AUTO_PROPS,
-			$callback);
-	}
-}
-
-sub A {
-	my ($self, $m, $deletions) = @_;
-	my ($dir, $file) = split_path($m->{file_b});
-	my $pbat = $self->ensure_path($dir, $deletions);
-	my $fbat = $self->add_file($self->repo_path($m->{file_b}), $pbat,
-					undef, -1);
-	print "\tA\t$m->{file_b}\n" unless $::_q;
-	$self->apply_autoprops($file, $fbat);
-	$self->chg_file($fbat, $m);
-	$self->close_file($fbat,undef,$self->{pool});
-}
-
-sub C {
-	my ($self, $m, $deletions) = @_;
-	my ($dir, $file) = split_path($m->{file_b});
-	my $pbat = $self->ensure_path($dir, $deletions);
-	my $fbat = $self->add_file($self->repo_path($m->{file_b}), $pbat,
-				$self->url_path($m->{file_a}), $self->{r});
-	print "\tC\t$m->{file_a} => $m->{file_b}\n" unless $::_q;
-	$self->chg_file($fbat, $m);
-	$self->close_file($fbat,undef,$self->{pool});
-}
-
-sub delete_entry {
-	my ($self, $path, $pbat) = @_;
-	my $rpath = $self->repo_path($path);
-	my ($dir, $file) = split_path($rpath);
-	$self->{rm}->{$dir} = 1;
-	$self->SUPER::delete_entry($rpath, $self->{r}, $pbat, $self->{pool});
-}
-
-sub R {
-	my ($self, $m, $deletions) = @_;
-	my ($dir, $file) = split_path($m->{file_b});
-	my $pbat = $self->ensure_path($dir, $deletions);
-	my $fbat = $self->add_file($self->repo_path($m->{file_b}), $pbat,
-				$self->url_path($m->{file_a}), $self->{r});
-	print "\tR\t$m->{file_a} => $m->{file_b}\n" unless $::_q;
-	$self->apply_autoprops($file, $fbat);
-	$self->chg_file($fbat, $m);
-	$self->close_file($fbat,undef,$self->{pool});
-
-	($dir, $file) = split_path($m->{file_a});
-	$pbat = $self->ensure_path($dir, $deletions);
-	$self->delete_entry($m->{file_a}, $pbat);
-}
-
-sub M {
-	my ($self, $m, $deletions) = @_;
-	my ($dir, $file) = split_path($m->{file_b});
-	my $pbat = $self->ensure_path($dir, $deletions);
-	my $fbat = $self->open_file($self->repo_path($m->{file_b}),
-				$pbat,$self->{r},$self->{pool});
-	print "\t$m->{chg}\t$m->{file_b}\n" unless $::_q;
-	$self->chg_file($fbat, $m);
-	$self->close_file($fbat,undef,$self->{pool});
-}
-
-sub T { shift->M(@_) }
-
-sub change_file_prop {
-	my ($self, $fbat, $pname, $pval) = @_;
-	$self->SUPER::change_file_prop($fbat, $pname, $pval, $self->{pool});
-}
-
-sub change_dir_prop {
-	my ($self, $pbat, $pname, $pval) = @_;
-	$self->SUPER::change_dir_prop($pbat, $pname, $pval, $self->{pool});
-}
-
-sub _chg_file_get_blob ($$$$) {
-	my ($self, $fbat, $m, $which) = @_;
-	my $fh = $::_repository->temp_acquire("git_blob_$which");
-	if ($m->{"mode_$which"} =~ /^120/) {
-		print $fh 'link ' or croak $!;
-		$self->change_file_prop($fbat,'svn:special','*');
-	} elsif ($m->{mode_a} =~ /^120/ && $m->{"mode_$which"} !~ /^120/) {
-		$self->change_file_prop($fbat,'svn:special',undef);
-	}
-	my $blob = $m->{"sha1_$which"};
-	return ($fh,) if ($blob =~ /^0{40}$/);
-	my $size = $::_repository->cat_blob($blob, $fh);
-	croak "Failed to read object $blob" if ($size < 0);
-	$fh->flush == 0 or croak $!;
-	seek $fh, 0, 0 or croak $!;
-
-	my $exp = ::md5sum($fh);
-	seek $fh, 0, 0 or croak $!;
-	return ($fh, $exp);
-}
-
-sub chg_file {
-	my ($self, $fbat, $m) = @_;
-	if ($m->{mode_b} =~ /755$/ && $m->{mode_a} !~ /755$/) {
-		$self->change_file_prop($fbat,'svn:executable','*');
-	} elsif ($m->{mode_b} !~ /755$/ && $m->{mode_a} =~ /755$/) {
-		$self->change_file_prop($fbat,'svn:executable',undef);
-	}
-	my ($fh_a, $exp_a) = _chg_file_get_blob $self, $fbat, $m, 'a';
-	my ($fh_b, $exp_b) = _chg_file_get_blob $self, $fbat, $m, 'b';
-	my $pool = SVN::Pool->new;
-	my $atd = $self->apply_textdelta($fbat, $exp_a, $pool);
-	if (-s $fh_a) {
-		my $txstream = SVN::TxDelta::new ($fh_a, $fh_b, $pool);
-		my $res = SVN::TxDelta::send_txstream($txstream, @$atd, $pool);
-		if (defined $res) {
-			die "Unexpected result from send_txstream: $res\n",
-			    "(SVN::Core::VERSION: $SVN::Core::VERSION)\n";
-		}
-	} else {
-		my $got = SVN::TxDelta::send_stream($fh_b, @$atd, $pool);
-		die "Checksum mismatch\nexpected: $exp_b\ngot: $got\n"
-		    if ($got ne $exp_b);
-	}
-	Git::temp_release($fh_b, 1);
-	Git::temp_release($fh_a, 1);
-	$pool->clear;
-}
-
-sub D {
-	my ($self, $m, $deletions) = @_;
-	my ($dir, $file) = split_path($m->{file_b});
-	my $pbat = $self->ensure_path($dir, $deletions);
-	print "\tD\t$m->{file_b}\n" unless $::_q;
-	$self->delete_entry($m->{file_b}, $pbat);
-}
-
-sub close_edit {
-	my ($self) = @_;
-	my ($p,$bat) = ($self->{pool}, $self->{bat});
-	foreach (sort { $b =~ tr#/#/# <=> $a =~ tr#/#/# } keys %$bat) {
-		next if $_ eq '';
-		$self->close_directory($bat->{$_}, $p);
-	}
-	$self->close_directory($bat->{''}, $p);
-	$self->SUPER::close_edit($p);
-	$p->clear;
-}
-
-sub abort_edit {
-	my ($self) = @_;
-	$self->SUPER::abort_edit($self->{pool});
-}
-
-sub DESTROY {
-	my $self = shift;
-	$self->SUPER::DESTROY(@_);
-	$self->{pool}->clear;
-}
-
-# this drives the editor
-sub apply_diff {
-	my ($self) = @_;
-	my $mods = $self->{mods};
-	my %o = ( D => 0, C => 1, R => 2, A => 3, M => 4, T => 5 );
-	my %deletions;
-
-	foreach my $m (@$mods) {
-		if ($m->{chg} eq "D") {
-			$deletions{$m->{file_b}} = 1;
-		}
-	}
-
-	foreach my $m (sort { $o{$a->{chg}} <=> $o{$b->{chg}} } @$mods) {
-		my $f = $m->{chg};
-		if (defined $o{$f}) {
-			$self->$f($m, \%deletions);
-		} else {
-			fatal("Invalid change type: $f");
-		}
-	}
-
-	if (defined($self->{mergeinfo})) {
-		$self->change_dir_prop($self->{bat}{''}, "svn:mergeinfo",
-			               $self->{mergeinfo});
-	}
-	$self->rmdirs if $_rmdir;
-	if (@$mods == 0 && !defined($self->{mergeinfo})) {
-		$self->abort_edit;
-	} else {
-		$self->close_edit;
-	}
-	return scalar @$mods;
-}
-
 package Git::SVN::Ra;
 use vars qw/@ISA $config_dir $_ignore_refs_regex $_log_window_size/;
 use strict;
diff --git a/perl/Git/SVN/Editor.pm b/perl/Git/SVN/Editor.pm
new file mode 100644
index 00000000..755092fd
--- /dev/null
+++ b/perl/Git/SVN/Editor.pm
@@ -0,0 +1,536 @@
+package Git::SVN::Editor;
+use vars qw/@ISA $_rmdir $_cp_similarity $_find_copies_harder $_rename_limit/;
+use strict;
+use warnings;
+use SVN::Core;
+use SVN::Delta;
+use Carp qw/croak/;
+use IO::File;
+use Git qw/command command_oneline command_noisy command_output_pipe
+           command_input_pipe command_close_pipe
+           command_bidi_pipe command_close_bidi_pipe/;
+BEGIN {
+	@ISA = qw(SVN::Delta::Editor);
+}
+
+sub new {
+	my ($class, $opts) = @_;
+	foreach (qw/svn_path r ra tree_a tree_b log editor_cb/) {
+		die "$_ required!\n" unless (defined $opts->{$_});
+	}
+
+	my $pool = SVN::Pool->new;
+	my $mods = generate_diff($opts->{tree_a}, $opts->{tree_b});
+	my $types = check_diff_paths($opts->{ra}, $opts->{svn_path},
+	                             $opts->{r}, $mods);
+
+	# $opts->{ra} functions should not be used after this:
+	my @ce  = $opts->{ra}->get_commit_editor($opts->{log},
+	                                        $opts->{editor_cb}, $pool);
+	my $self = SVN::Delta::Editor->new(@ce, $pool);
+	bless $self, $class;
+	foreach (qw/svn_path r tree_a tree_b/) {
+		$self->{$_} = $opts->{$_};
+	}
+	$self->{url} = $opts->{ra}->{url};
+	$self->{mods} = $mods;
+	$self->{types} = $types;
+	$self->{pool} = $pool;
+	$self->{bat} = { '' => $self->open_root($self->{r}, $self->{pool}) };
+	$self->{rm} = { };
+	$self->{path_prefix} = length $self->{svn_path} ?
+	                       "$self->{svn_path}/" : '';
+	$self->{config} = $opts->{config};
+	$self->{mergeinfo} = $opts->{mergeinfo};
+	return $self;
+}
+
+sub generate_diff {
+	my ($tree_a, $tree_b) = @_;
+	my @diff_tree = qw(diff-tree -z -r);
+	if ($_cp_similarity) {
+		push @diff_tree, "-C$_cp_similarity";
+	} else {
+		push @diff_tree, '-C';
+	}
+	push @diff_tree, '--find-copies-harder' if $_find_copies_harder;
+	push @diff_tree, "-l$_rename_limit" if defined $_rename_limit;
+	push @diff_tree, $tree_a, $tree_b;
+	my ($diff_fh, $ctx) = command_output_pipe(@diff_tree);
+	local $/ = "\0";
+	my $state = 'meta';
+	my @mods;
+	while (<$diff_fh>) {
+		chomp $_; # this gets rid of the trailing "\0"
+		if ($state eq 'meta' && /^:(\d{6})\s(\d{6})\s
+					($::sha1)\s($::sha1)\s
+					([MTCRAD])\d*$/xo) {
+			push @mods, {	mode_a => $1, mode_b => $2,
+					sha1_a => $3, sha1_b => $4,
+					chg => $5 };
+			if ($5 =~ /^(?:C|R)$/) {
+				$state = 'file_a';
+			} else {
+				$state = 'file_b';
+			}
+		} elsif ($state eq 'file_a') {
+			my $x = $mods[$#mods] or croak "Empty array\n";
+			if ($x->{chg} !~ /^(?:C|R)$/) {
+				croak "Error parsing $_, $x->{chg}\n";
+			}
+			$x->{file_a} = $_;
+			$state = 'file_b';
+		} elsif ($state eq 'file_b') {
+			my $x = $mods[$#mods] or croak "Empty array\n";
+			if (exists $x->{file_a} && $x->{chg} !~ /^(?:C|R)$/) {
+				croak "Error parsing $_, $x->{chg}\n";
+			}
+			if (!exists $x->{file_a} && $x->{chg} =~ /^(?:C|R)$/) {
+				croak "Error parsing $_, $x->{chg}\n";
+			}
+			$x->{file_b} = $_;
+			$state = 'meta';
+		} else {
+			croak "Error parsing $_\n";
+		}
+	}
+	command_close_pipe($diff_fh, $ctx);
+	\@mods;
+}
+
+sub check_diff_paths {
+	my ($ra, $pfx, $rev, $mods) = @_;
+	my %types;
+	$pfx .= '/' if length $pfx;
+
+	sub type_diff_paths {
+		my ($ra, $types, $path, $rev) = @_;
+		my @p = split m#/+#, $path;
+		my $c = shift @p;
+		unless (defined $types->{$c}) {
+			$types->{$c} = $ra->check_path($c, $rev);
+		}
+		while (@p) {
+			$c .= '/' . shift @p;
+			next if defined $types->{$c};
+			$types->{$c} = $ra->check_path($c, $rev);
+		}
+	}
+
+	foreach my $m (@$mods) {
+		foreach my $f (qw/file_a file_b/) {
+			next unless defined $m->{$f};
+			my ($dir) = ($m->{$f} =~ m#^(.*?)/?(?:[^/]+)$#);
+			if (length $pfx.$dir && ! defined $types{$dir}) {
+				type_diff_paths($ra, \%types, $pfx.$dir, $rev);
+			}
+		}
+	}
+	\%types;
+}
+
+sub split_path {
+	return ($_[0] =~ m#^(.*?)/?([^/]+)$#);
+}
+
+sub repo_path {
+	my ($self, $path) = @_;
+	if (my $enc = $self->{pathnameencoding}) {
+		require Encode;
+		Encode::from_to($path, $enc, 'UTF-8');
+	}
+	$self->{path_prefix}.(defined $path ? $path : '');
+}
+
+sub url_path {
+	my ($self, $path) = @_;
+	if ($self->{url} =~ m#^https?://#) {
+		$path =~ s!([^~a-zA-Z0-9_./-])!uc sprintf("%%%02x",ord($1))!eg;
+	}
+	$self->{url} . '/' . $self->repo_path($path);
+}
+
+sub rmdirs {
+	my ($self) = @_;
+	my $rm = $self->{rm};
+	delete $rm->{''}; # we never delete the url we're tracking
+	return unless %$rm;
+
+	foreach (keys %$rm) {
+		my @d = split m#/#, $_;
+		my $c = shift @d;
+		$rm->{$c} = 1;
+		while (@d) {
+			$c .= '/' . shift @d;
+			$rm->{$c} = 1;
+		}
+	}
+	delete $rm->{$self->{svn_path}};
+	delete $rm->{''}; # we never delete the url we're tracking
+	return unless %$rm;
+
+	my ($fh, $ctx) = command_output_pipe(qw/ls-tree --name-only -r -z/,
+	                                     $self->{tree_b});
+	local $/ = "\0";
+	while (<$fh>) {
+		chomp;
+		my @dn = split m#/#, $_;
+		while (pop @dn) {
+			delete $rm->{join '/', @dn};
+		}
+		unless (%$rm) {
+			close $fh;
+			return;
+		}
+	}
+	command_close_pipe($fh, $ctx);
+
+	my ($r, $p, $bat) = ($self->{r}, $self->{pool}, $self->{bat});
+	foreach my $d (sort { $b =~ tr#/#/# <=> $a =~ tr#/#/# } keys %$rm) {
+		$self->close_directory($bat->{$d}, $p);
+		my ($dn) = ($d =~ m#^(.*?)/?(?:[^/]+)$#);
+		print "\tD+\t$d/\n" unless $::_q;
+		$self->SUPER::delete_entry($d, $r, $bat->{$dn}, $p);
+		delete $bat->{$d};
+	}
+}
+
+sub open_or_add_dir {
+	my ($self, $full_path, $baton, $deletions) = @_;
+	my $t = $self->{types}->{$full_path};
+	if (!defined $t) {
+		die "$full_path not known in r$self->{r} or we have a bug!\n";
+	}
+	{
+		no warnings 'once';
+		# SVN::Node::none and SVN::Node::file are used only once,
+		# so we're shutting up Perl's warnings about them.
+		if ($t == $SVN::Node::none || defined($deletions->{$full_path})) {
+			return $self->add_directory($full_path, $baton,
+			    undef, -1, $self->{pool});
+		} elsif ($t == $SVN::Node::dir) {
+			return $self->open_directory($full_path, $baton,
+			    $self->{r}, $self->{pool});
+		} # no warnings 'once'
+		print STDERR "$full_path already exists in repository at ",
+		    "r$self->{r} and it is not a directory (",
+		    ($t == $SVN::Node::file ? 'file' : 'unknown'),"/$t)\n";
+	} # no warnings 'once'
+	exit 1;
+}
+
+sub ensure_path {
+	my ($self, $path, $deletions) = @_;
+	my $bat = $self->{bat};
+	my $repo_path = $self->repo_path($path);
+	return $bat->{''} unless (length $repo_path);
+
+	my @p = split m#/+#, $repo_path;
+	my $c = shift @p;
+	$bat->{$c} ||= $self->open_or_add_dir($c, $bat->{''}, $deletions);
+	while (@p) {
+		my $c0 = $c;
+		$c .= '/' . shift @p;
+		$bat->{$c} ||= $self->open_or_add_dir($c, $bat->{$c0}, $deletions);
+	}
+	return $bat->{$c};
+}
+
+# Subroutine to convert a globbing pattern to a regular expression.
+# From perl cookbook.
+sub glob2pat {
+	my $globstr = shift;
+	my %patmap = ('*' => '.*', '?' => '.', '[' => '[', ']' => ']');
+	$globstr =~ s{(.)} { $patmap{$1} || "\Q$1" }ge;
+	return '^' . $globstr . '$';
+}
+
+sub check_autoprop {
+	my ($self, $pattern, $properties, $file, $fbat) = @_;
+	# Convert the globbing pattern to a regular expression.
+	my $regex = glob2pat($pattern);
+	# Check if the pattern matches the file name.
+	if($file =~ m/($regex)/) {
+		# Parse the list of properties to set.
+		my @props = split(/;/, $properties);
+		foreach my $prop (@props) {
+			# Parse 'name=value' syntax and set the property.
+			if ($prop =~ /([^=]+)=(.*)/) {
+				my ($n,$v) = ($1,$2);
+				for ($n, $v) {
+					s/^\s+//; s/\s+$//;
+				}
+				$self->change_file_prop($fbat, $n, $v);
+			}
+		}
+	}
+}
+
+sub apply_autoprops {
+	my ($self, $file, $fbat) = @_;
+	my $conf_t = ${$self->{config}}{'config'};
+	no warnings 'once';
+	# Check [miscellany]/enable-auto-props in svn configuration.
+	if (SVN::_Core::svn_config_get_bool(
+		$conf_t,
+		$SVN::_Core::SVN_CONFIG_SECTION_MISCELLANY,
+		$SVN::_Core::SVN_CONFIG_OPTION_ENABLE_AUTO_PROPS,
+		0)) {
+		# Auto-props are enabled.  Enumerate them to look for matches.
+		my $callback = sub {
+			$self->check_autoprop($_[0], $_[1], $file, $fbat);
+		};
+		SVN::_Core::svn_config_enumerate(
+			$conf_t,
+			$SVN::_Core::SVN_CONFIG_SECTION_AUTO_PROPS,
+			$callback);
+	}
+}
+
+sub A {
+	my ($self, $m, $deletions) = @_;
+	my ($dir, $file) = split_path($m->{file_b});
+	my $pbat = $self->ensure_path($dir, $deletions);
+	my $fbat = $self->add_file($self->repo_path($m->{file_b}), $pbat,
+					undef, -1);
+	print "\tA\t$m->{file_b}\n" unless $::_q;
+	$self->apply_autoprops($file, $fbat);
+	$self->chg_file($fbat, $m);
+	$self->close_file($fbat,undef,$self->{pool});
+}
+
+sub C {
+	my ($self, $m, $deletions) = @_;
+	my ($dir, $file) = split_path($m->{file_b});
+	my $pbat = $self->ensure_path($dir, $deletions);
+	my $fbat = $self->add_file($self->repo_path($m->{file_b}), $pbat,
+				$self->url_path($m->{file_a}), $self->{r});
+	print "\tC\t$m->{file_a} => $m->{file_b}\n" unless $::_q;
+	$self->chg_file($fbat, $m);
+	$self->close_file($fbat,undef,$self->{pool});
+}
+
+sub delete_entry {
+	my ($self, $path, $pbat) = @_;
+	my $rpath = $self->repo_path($path);
+	my ($dir, $file) = split_path($rpath);
+	$self->{rm}->{$dir} = 1;
+	$self->SUPER::delete_entry($rpath, $self->{r}, $pbat, $self->{pool});
+}
+
+sub R {
+	my ($self, $m, $deletions) = @_;
+	my ($dir, $file) = split_path($m->{file_b});
+	my $pbat = $self->ensure_path($dir, $deletions);
+	my $fbat = $self->add_file($self->repo_path($m->{file_b}), $pbat,
+				$self->url_path($m->{file_a}), $self->{r});
+	print "\tR\t$m->{file_a} => $m->{file_b}\n" unless $::_q;
+	$self->apply_autoprops($file, $fbat);
+	$self->chg_file($fbat, $m);
+	$self->close_file($fbat,undef,$self->{pool});
+
+	($dir, $file) = split_path($m->{file_a});
+	$pbat = $self->ensure_path($dir, $deletions);
+	$self->delete_entry($m->{file_a}, $pbat);
+}
+
+sub M {
+	my ($self, $m, $deletions) = @_;
+	my ($dir, $file) = split_path($m->{file_b});
+	my $pbat = $self->ensure_path($dir, $deletions);
+	my $fbat = $self->open_file($self->repo_path($m->{file_b}),
+				$pbat,$self->{r},$self->{pool});
+	print "\t$m->{chg}\t$m->{file_b}\n" unless $::_q;
+	$self->chg_file($fbat, $m);
+	$self->close_file($fbat,undef,$self->{pool});
+}
+
+sub T { shift->M(@_) }
+
+sub change_file_prop {
+	my ($self, $fbat, $pname, $pval) = @_;
+	$self->SUPER::change_file_prop($fbat, $pname, $pval, $self->{pool});
+}
+
+sub change_dir_prop {
+	my ($self, $pbat, $pname, $pval) = @_;
+	$self->SUPER::change_dir_prop($pbat, $pname, $pval, $self->{pool});
+}
+
+sub _chg_file_get_blob ($$$$) {
+	my ($self, $fbat, $m, $which) = @_;
+	my $fh = $::_repository->temp_acquire("git_blob_$which");
+	if ($m->{"mode_$which"} =~ /^120/) {
+		print $fh 'link ' or croak $!;
+		$self->change_file_prop($fbat,'svn:special','*');
+	} elsif ($m->{mode_a} =~ /^120/ && $m->{"mode_$which"} !~ /^120/) {
+		$self->change_file_prop($fbat,'svn:special',undef);
+	}
+	my $blob = $m->{"sha1_$which"};
+	return ($fh,) if ($blob =~ /^0{40}$/);
+	my $size = $::_repository->cat_blob($blob, $fh);
+	croak "Failed to read object $blob" if ($size < 0);
+	$fh->flush == 0 or croak $!;
+	seek $fh, 0, 0 or croak $!;
+
+	my $exp = ::md5sum($fh);
+	seek $fh, 0, 0 or croak $!;
+	return ($fh, $exp);
+}
+
+sub chg_file {
+	my ($self, $fbat, $m) = @_;
+	if ($m->{mode_b} =~ /755$/ && $m->{mode_a} !~ /755$/) {
+		$self->change_file_prop($fbat,'svn:executable','*');
+	} elsif ($m->{mode_b} !~ /755$/ && $m->{mode_a} =~ /755$/) {
+		$self->change_file_prop($fbat,'svn:executable',undef);
+	}
+	my ($fh_a, $exp_a) = _chg_file_get_blob $self, $fbat, $m, 'a';
+	my ($fh_b, $exp_b) = _chg_file_get_blob $self, $fbat, $m, 'b';
+	my $pool = SVN::Pool->new;
+	my $atd = $self->apply_textdelta($fbat, $exp_a, $pool);
+	if (-s $fh_a) {
+		my $txstream = SVN::TxDelta::new ($fh_a, $fh_b, $pool);
+		my $res = SVN::TxDelta::send_txstream($txstream, @$atd, $pool);
+		if (defined $res) {
+			die "Unexpected result from send_txstream: $res\n",
+			    "(SVN::Core::VERSION: $SVN::Core::VERSION)\n";
+		}
+	} else {
+		my $got = SVN::TxDelta::send_stream($fh_b, @$atd, $pool);
+		die "Checksum mismatch\nexpected: $exp_b\ngot: $got\n"
+		    if ($got ne $exp_b);
+	}
+	Git::temp_release($fh_b, 1);
+	Git::temp_release($fh_a, 1);
+	$pool->clear;
+}
+
+sub D {
+	my ($self, $m, $deletions) = @_;
+	my ($dir, $file) = split_path($m->{file_b});
+	my $pbat = $self->ensure_path($dir, $deletions);
+	print "\tD\t$m->{file_b}\n" unless $::_q;
+	$self->delete_entry($m->{file_b}, $pbat);
+}
+
+sub close_edit {
+	my ($self) = @_;
+	my ($p,$bat) = ($self->{pool}, $self->{bat});
+	foreach (sort { $b =~ tr#/#/# <=> $a =~ tr#/#/# } keys %$bat) {
+		next if $_ eq '';
+		$self->close_directory($bat->{$_}, $p);
+	}
+	$self->close_directory($bat->{''}, $p);
+	$self->SUPER::close_edit($p);
+	$p->clear;
+}
+
+sub abort_edit {
+	my ($self) = @_;
+	$self->SUPER::abort_edit($self->{pool});
+}
+
+sub DESTROY {
+	my $self = shift;
+	$self->SUPER::DESTROY(@_);
+	$self->{pool}->clear;
+}
+
+# this drives the editor
+sub apply_diff {
+	my ($self) = @_;
+	my $mods = $self->{mods};
+	my %o = ( D => 0, C => 1, R => 2, A => 3, M => 4, T => 5 );
+	my %deletions;
+
+	foreach my $m (@$mods) {
+		if ($m->{chg} eq "D") {
+			$deletions{$m->{file_b}} = 1;
+		}
+	}
+
+	foreach my $m (sort { $o{$a->{chg}} <=> $o{$b->{chg}} } @$mods) {
+		my $f = $m->{chg};
+		if (defined $o{$f}) {
+			$self->$f($m, \%deletions);
+		} else {
+			fatal("Invalid change type: $f");
+		}
+	}
+
+	if (defined($self->{mergeinfo})) {
+		$self->change_dir_prop($self->{bat}{''}, "svn:mergeinfo",
+			               $self->{mergeinfo});
+	}
+	$self->rmdirs if $_rmdir;
+	if (@$mods == 0 && !defined($self->{mergeinfo})) {
+		$self->abort_edit;
+	} else {
+		$self->close_edit;
+	}
+	return scalar @$mods;
+}
+
+1;
+__END__
+
+Git::SVN::Editor - commit driver for "git svn set-tree" and dcommit
+
+=head1 SYNOPSIS
+
+	use Git::SVN::Editor;
+	use Git::SVN::Ra;
+
+	my $ra = Git::SVN::Ra->new($url);
+	my %opts = (
+		r => 19,
+		log => "log message",
+		ra => $ra,
+		config => SVN::Core::config_get_config($svn_config_dir),
+		tree_a => "$commit^",
+		tree_b => "$commit",
+		editor_cb => sub { print "Committed r$_[0]\n"; },
+		mergeinfo => "/branches/foo:1-10",
+		svn_path => "trunk"
+	);
+	Git::SVN::Editor->new(\%opts)->apply_diff or print "No changes\n";
+
+	my $re = Git::SVN::Editor::glob2pat("trunk/*");
+	if ($branchname =~ /$re/) {
+		print "matched!\n";
+	}
+
+=head1 DESCRIPTION
+
+This module is an implementation detail of the "git svn" command.
+Do not use it unless you are developing git-svn.
+
+This module adapts the C<SVN::Delta::Editor> object returned by
+C<SVN::Delta::get_commit_editor> and drives it to convey the
+difference between two git tree objects to a remote Subversion
+repository.
+
+The interface will change as git-svn evolves.
+
+=head1 DEPENDENCIES
+
+Subversion perl bindings,
+the core L<Carp> and L<IO::File> modules,
+and git's L<Git> helper module.
+
+C<Git::SVN::Editor> has not been tested using callers other than
+B<git-svn> itself.
+
+=head1 SEE ALSO
+
+L<SVN::Delta>,
+L<Git::SVN::Fetcher>.
+
+=head1 INCOMPATIBILITIES
+
+None reported.
+
+=head1 BUGS
+
+None.
diff --git a/perl/Git/SVN/Fetcher.pm b/perl/Git/SVN/Fetcher.pm
index 4e9c77d7..ef8e9ed2 100644
--- a/perl/Git/SVN/Fetcher.pm
+++ b/perl/Git/SVN/Fetcher.pm
@@ -591,7 +591,8 @@ B<git-svn> itself.
 
 =head1 SEE ALSO
 
-L<SVN::Delta>.
+L<SVN::Delta>,
+L<Git::SVN::Editor>.
 
 =head1 INCOMPATIBILITIES
 
diff --git a/perl/Makefile.PL b/perl/Makefile.PL
index 424890a1..32caf214 100644
--- a/perl/Makefile.PL
+++ b/perl/Makefile.PL
@@ -28,6 +28,7 @@ my %pm = (
 	'Git.pm' => '$(INST_LIBDIR)/Git.pm',
 	'Git/I18N.pm' => '$(INST_LIBDIR)/Git/I18N.pm',
 	'Git/SVN/Fetcher.pm' => '$(INST_LIBDIR)/Git/SVN/Fetcher.pm',
+	'Git/SVN/Editor.pm' => '$(INST_LIBDIR)/Git/SVN/Editor.pm',
 	'Git/SVN/Prompt.pm' => '$(INST_LIBDIR)/Git/SVN/Prompt.pm',
 );
 
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 2/3] git-svn: make Git::SVN::RA a separate file
  2012-06-09 22:20         ` [PATCH v2 0/3] " Jonathan Nieder
  2012-06-09 22:25           ` [PATCH 1/3] git-svn: make Git::SVN::Editor a separate file Jonathan Nieder
@ 2012-06-09 22:28           ` Jonathan Nieder
  2012-06-09 22:35           ` [PATCH 3/3] git-svn: use YAML format for mergeinfo cache when possible Jonathan Nieder
  2012-06-10  9:00           ` [PATCH v2 0/3] " Eric Wong
  3 siblings, 0 replies; 26+ messages in thread
From: Jonathan Nieder @ 2012-06-09 22:28 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Jason Gross, git

This slices off another 600 or so lines from the frighteningly long
git-svn.perl script.

The Git::SVN::Ra interface is similar enough to SVN::Ra that it is
probably safe to ignore most of its implementation on first reading.
(Documenting or moving functions that do not fit that pattern is left
as an exercise to the interested reader.)

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 git-svn.perl       |  612 +-----------------------------------------------
 perl/Git/SVN/Ra.pm |  658 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 perl/Makefile.PL   |    1 +
 3 files changed, 660 insertions(+), 611 deletions(-)
 create mode 100644 perl/Git/SVN/Ra.pm

diff --git a/git-svn.perl b/git-svn.perl
index 7cec4416..24c0af2e 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -67,7 +67,6 @@ sub _req_svn {
 	}
 }
 my $can_compress = eval { require Compress::Zlib; 1};
-push @Git::SVN::Ra::ISA, 'SVN::Ra';
 use Carp qw/croak/;
 use Digest::MD5;
 use IO::File qw//;
@@ -80,6 +79,7 @@ use IPC::Open3;
 use Git;
 use Git::SVN::Editor qw//;
 use Git::SVN::Fetcher qw//;
+use Git::SVN::Ra qw//;
 use Git::SVN::Prompt qw//;
 use Memoize;  # core since 5.8.0, Jul 2002
 
@@ -4326,616 +4326,6 @@ sub remove_username {
 	$_[0] =~ s{^([^:]*://)[^@]+@}{$1};
 }
 
-package Git::SVN::Ra;
-use vars qw/@ISA $config_dir $_ignore_refs_regex $_log_window_size/;
-use strict;
-use warnings;
-my ($ra_invalid, $can_do_switch, %ignored_err, $RA);
-
-BEGIN {
-	# enforce temporary pool usage for some simple functions
-	no strict 'refs';
-	for my $f (qw/rev_proplist get_latest_revnum get_uuid get_repos_root
-	              get_file/) {
-		my $SUPER = "SUPER::$f";
-		*$f = sub {
-			my $self = shift;
-			my $pool = SVN::Pool->new;
-			my @ret = $self->$SUPER(@_,$pool);
-			$pool->clear;
-			wantarray ? @ret : $ret[0];
-		};
-	}
-}
-
-sub _auth_providers () {
-	my @rv = (
-	  SVN::Client::get_simple_provider(),
-	  SVN::Client::get_ssl_server_trust_file_provider(),
-	  SVN::Client::get_simple_prompt_provider(
-	    \&Git::SVN::Prompt::simple, 2),
-	  SVN::Client::get_ssl_client_cert_file_provider(),
-	  SVN::Client::get_ssl_client_cert_prompt_provider(
-	    \&Git::SVN::Prompt::ssl_client_cert, 2),
-	  SVN::Client::get_ssl_client_cert_pw_file_provider(),
-	  SVN::Client::get_ssl_client_cert_pw_prompt_provider(
-	    \&Git::SVN::Prompt::ssl_client_cert_pw, 2),
-	  SVN::Client::get_username_provider(),
-	  SVN::Client::get_ssl_server_trust_prompt_provider(
-	    \&Git::SVN::Prompt::ssl_server_trust),
-	  SVN::Client::get_username_prompt_provider(
-	    \&Git::SVN::Prompt::username, 2)
-	);
-
-	# earlier 1.6.x versions would segfault, and <= 1.5.x didn't have
-	# this function
-	if (::compare_svn_version('1.6.12') > 0) {
-		my $config = SVN::Core::config_get_config($config_dir);
-		my ($p, @a);
-		# config_get_config returns all config files from
-		# ~/.subversion, auth_get_platform_specific_client_providers
-		# just wants the config "file".
-		@a = ($config->{'config'}, undef);
-		$p = SVN::Core::auth_get_platform_specific_client_providers(@a);
-		# Insert the return value from
-		# auth_get_platform_specific_providers
-		unshift @rv, @$p;
-	}
-	\@rv;
-}
-
-sub escape_uri_only {
-	my ($uri) = @_;
-	my @tmp;
-	foreach (split m{/}, $uri) {
-		s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
-		push @tmp, $_;
-	}
-	join('/', @tmp);
-}
-
-sub escape_url {
-	my ($url) = @_;
-	if ($url =~ m#^(https?)://([^/]+)(.*)$#) {
-		my ($scheme, $domain, $uri) = ($1, $2, escape_uri_only($3));
-		$url = "$scheme://$domain$uri";
-	}
-	$url;
-}
-
-sub new {
-	my ($class, $url) = @_;
-	$url =~ s!/+$!!;
-	return $RA if ($RA && $RA->{url} eq $url);
-
-	::_req_svn();
-
-	SVN::_Core::svn_config_ensure($config_dir, undef);
-	my ($baton, $callbacks) = SVN::Core::auth_open_helper(_auth_providers);
-	my $config = SVN::Core::config_get_config($config_dir);
-	$RA = undef;
-	my $dont_store_passwords = 1;
-	my $conf_t = ${$config}{'config'};
-	{
-		no warnings 'once';
-		# The usage of $SVN::_Core::SVN_CONFIG_* variables
-		# produces warnings that variables are used only once.
-		# I had not found the better way to shut them up, so
-		# the warnings of type 'once' are disabled in this block.
-		if (SVN::_Core::svn_config_get_bool($conf_t,
-		    $SVN::_Core::SVN_CONFIG_SECTION_AUTH,
-		    $SVN::_Core::SVN_CONFIG_OPTION_STORE_PASSWORDS,
-		    1) == 0) {
-			SVN::_Core::svn_auth_set_parameter($baton,
-			    $SVN::_Core::SVN_AUTH_PARAM_DONT_STORE_PASSWORDS,
-			    bless (\$dont_store_passwords, "_p_void"));
-		}
-		if (SVN::_Core::svn_config_get_bool($conf_t,
-		    $SVN::_Core::SVN_CONFIG_SECTION_AUTH,
-		    $SVN::_Core::SVN_CONFIG_OPTION_STORE_AUTH_CREDS,
-		    1) == 0) {
-			$Git::SVN::Prompt::_no_auth_cache = 1;
-		}
-	} # no warnings 'once'
-	my $self = SVN::Ra->new(url => escape_url($url), auth => $baton,
-	                      config => $config,
-			      pool => SVN::Pool->new,
-	                      auth_provider_callbacks => $callbacks);
-	$self->{url} = $url;
-	$self->{svn_path} = $url;
-	$self->{repos_root} = $self->get_repos_root;
-	$self->{svn_path} =~ s#^\Q$self->{repos_root}\E(/|$)##;
-	$self->{cache} = { check_path => { r => 0, data => {} },
-	                   get_dir => { r => 0, data => {} } };
-	$RA = bless $self, $class;
-}
-
-sub check_path {
-	my ($self, $path, $r) = @_;
-	my $cache = $self->{cache}->{check_path};
-	if ($r == $cache->{r} && exists $cache->{data}->{$path}) {
-		return $cache->{data}->{$path};
-	}
-	my $pool = SVN::Pool->new;
-	my $t = $self->SUPER::check_path($path, $r, $pool);
-	$pool->clear;
-	if ($r != $cache->{r}) {
-		%{$cache->{data}} = ();
-		$cache->{r} = $r;
-	}
-	$cache->{data}->{$path} = $t;
-}
-
-sub get_dir {
-	my ($self, $dir, $r) = @_;
-	my $cache = $self->{cache}->{get_dir};
-	if ($r == $cache->{r}) {
-		if (my $x = $cache->{data}->{$dir}) {
-			return wantarray ? @$x : $x->[0];
-		}
-	}
-	my $pool = SVN::Pool->new;
-	my ($d, undef, $props) = $self->SUPER::get_dir($dir, $r, $pool);
-	my %dirents = map { $_ => { kind => $d->{$_}->kind } } keys %$d;
-	$pool->clear;
-	if ($r != $cache->{r}) {
-		%{$cache->{data}} = ();
-		$cache->{r} = $r;
-	}
-	$cache->{data}->{$dir} = [ \%dirents, $r, $props ];
-	wantarray ? (\%dirents, $r, $props) : \%dirents;
-}
-
-sub DESTROY {
-	# do not call the real DESTROY since we store ourselves in $RA
-}
-
-# get_log(paths, start, end, limit,
-#         discover_changed_paths, strict_node_history, receiver)
-sub get_log {
-	my ($self, @args) = @_;
-	my $pool = SVN::Pool->new;
-
-	# svn_log_changed_path_t objects passed to get_log are likely to be
-	# overwritten even if only the refs are copied to an external variable,
-	# so we should dup the structures in their entirety.  Using an
-	# externally passed pool (instead of our temporary and quickly cleared
-	# pool in Git::SVN::Ra) does not help matters at all...
-	my $receiver = pop @args;
-	my $prefix = "/".$self->{svn_path};
-	$prefix =~ s#/+($)##;
-	my $prefix_regex = qr#^\Q$prefix\E#;
-	push(@args, sub {
-		my ($paths) = $_[0];
-		return &$receiver(@_) unless $paths;
-		$_[0] = ();
-		foreach my $p (keys %$paths) {
-			my $i = $paths->{$p};
-			# Make path relative to our url, not repos_root
-			$p =~ s/$prefix_regex//;
-			my %s = map { $_ => $i->$_; }
-				qw/copyfrom_path copyfrom_rev action/;
-			if ($s{'copyfrom_path'}) {
-				$s{'copyfrom_path'} =~ s/$prefix_regex//;
-			}
-			$_[0]{$p} = \%s;
-		}
-		&$receiver(@_);
-	});
-
-
-	# the limit parameter was not supported in SVN 1.1.x, so we
-	# drop it.  Therefore, the receiver callback passed to it
-	# is made aware of this limitation by being wrapped if
-	# the limit passed to is being wrapped.
-	if (::compare_svn_version('1.2.0') <= 0) {
-		my $limit = splice(@args, 3, 1);
-		if ($limit > 0) {
-			my $receiver = pop @args;
-			push(@args, sub { &$receiver(@_) if (--$limit >= 0) });
-		}
-	}
-	my $ret = $self->SUPER::get_log(@args, $pool);
-	$pool->clear;
-	$ret;
-}
-
-sub trees_match {
-	my ($self, $url1, $rev1, $url2, $rev2) = @_;
-	my $ctx = SVN::Client->new(auth => _auth_providers);
-	my $out = IO::File->new_tmpfile;
-
-	# older SVN (1.1.x) doesn't take $pool as the last parameter for
-	# $ctx->diff(), so we'll create a default one
-	my $pool = SVN::Pool->new_default_sub;
-
-	$ra_invalid = 1; # this will open a new SVN::Ra connection to $url1
-	$ctx->diff([], $url1, $rev1, $url2, $rev2, 1, 1, 0, $out, $out);
-	$out->flush;
-	my $ret = (($out->stat)[7] == 0);
-	close $out or croak $!;
-
-	$ret;
-}
-
-sub get_commit_editor {
-	my ($self, $log, $cb, $pool) = @_;
-
-	my @lock = (::compare_svn_version('1.2.0') >= 0) ? (undef, 0) : ();
-	$self->SUPER::get_commit_editor($log, $cb, @lock, $pool);
-}
-
-sub gs_do_update {
-	my ($self, $rev_a, $rev_b, $gs, $editor) = @_;
-	my $new = ($rev_a == $rev_b);
-	my $path = $gs->{path};
-
-	if ($new && -e $gs->{index}) {
-		unlink $gs->{index} or die
-		  "Couldn't unlink index: $gs->{index}: $!\n";
-	}
-	my $pool = SVN::Pool->new;
-	$editor->set_path_strip($path);
-	my (@pc) = split m#/#, $path;
-	my $reporter = $self->do_update($rev_b, (@pc ? shift @pc : ''),
-	                                1, $editor, $pool);
-	my @lock = (::compare_svn_version('1.2.0') >= 0) ? (undef) : ();
-
-	# Since we can't rely on svn_ra_reparent being available, we'll
-	# just have to do some magic with set_path to make it so
-	# we only want a partial path.
-	my $sp = '';
-	my $final = join('/', @pc);
-	while (@pc) {
-		$reporter->set_path($sp, $rev_b, 0, @lock, $pool);
-		$sp .= '/' if length $sp;
-		$sp .= shift @pc;
-	}
-	die "BUG: '$sp' != '$final'\n" if ($sp ne $final);
-
-	$reporter->set_path($sp, $rev_a, $new, @lock, $pool);
-
-	$reporter->finish_report($pool);
-	$pool->clear;
-	$editor->{git_commit_ok};
-}
-
-# this requires SVN 1.4.3 or later (do_switch didn't work before 1.4.3, and
-# svn_ra_reparent didn't work before 1.4)
-sub gs_do_switch {
-	my ($self, $rev_a, $rev_b, $gs, $url_b, $editor) = @_;
-	my $path = $gs->{path};
-	my $pool = SVN::Pool->new;
-
-	my $full_url = $self->{url};
-	my $old_url = $full_url;
-	$full_url .= '/' . $path if length $path;
-	my ($ra, $reparented);
-
-	if ($old_url =~ m#^svn(\+ssh)?://# ||
-	    ($full_url =~ m#^https?://# &&
-	     escape_url($full_url) ne $full_url)) {
-		$_[0] = undef;
-		$self = undef;
-		$RA = undef;
-		$ra = Git::SVN::Ra->new($full_url);
-		$ra_invalid = 1;
-	} elsif ($old_url ne $full_url) {
-		SVN::_Ra::svn_ra_reparent($self->{session}, $full_url, $pool);
-		$self->{url} = $full_url;
-		$reparented = 1;
-	}
-
-	$ra ||= $self;
-	$url_b = escape_url($url_b);
-	my $reporter = $ra->do_switch($rev_b, '', 1, $url_b, $editor, $pool);
-	my @lock = (::compare_svn_version('1.2.0') >= 0) ? (undef) : ();
-	$reporter->set_path('', $rev_a, 0, @lock, $pool);
-	$reporter->finish_report($pool);
-
-	if ($reparented) {
-		SVN::_Ra::svn_ra_reparent($self->{session}, $old_url, $pool);
-		$self->{url} = $old_url;
-	}
-
-	$pool->clear;
-	$editor->{git_commit_ok};
-}
-
-sub longest_common_path {
-	my ($gsv, $globs) = @_;
-	my %common;
-	my $common_max = scalar @$gsv;
-
-	foreach my $gs (@$gsv) {
-		my @tmp = split m#/#, $gs->{path};
-		my $p = '';
-		foreach (@tmp) {
-			$p .= length($p) ? "/$_" : $_;
-			$common{$p} ||= 0;
-			$common{$p}++;
-		}
-	}
-	$globs ||= [];
-	$common_max += scalar @$globs;
-	foreach my $glob (@$globs) {
-		my @tmp = split m#/#, $glob->{path}->{left};
-		my $p = '';
-		foreach (@tmp) {
-			$p .= length($p) ? "/$_" : $_;
-			$common{$p} ||= 0;
-			$common{$p}++;
-		}
-	}
-
-	my $longest_path = '';
-	foreach (sort {length $b <=> length $a} keys %common) {
-		if ($common{$_} == $common_max) {
-			$longest_path = $_;
-			last;
-		}
-	}
-	$longest_path;
-}
-
-sub gs_fetch_loop_common {
-	my ($self, $base, $head, $gsv, $globs) = @_;
-	return if ($base > $head);
-	my $inc = $_log_window_size;
-	my ($min, $max) = ($base, $head < $base + $inc ? $head : $base + $inc);
-	my $longest_path = longest_common_path($gsv, $globs);
-	my $ra_url = $self->{url};
-	my $find_trailing_edge;
-	while (1) {
-		my %revs;
-		my $err;
-		my $err_handler = $SVN::Error::handler;
-		$SVN::Error::handler = sub {
-			($err) = @_;
-			skip_unknown_revs($err);
-		};
-		sub _cb {
-			my ($paths, $r, $author, $date, $log) = @_;
-			[ $paths,
-			  { author => $author, date => $date, log => $log } ];
-		}
-		$self->get_log([$longest_path], $min, $max, 0, 1, 1,
-		               sub { $revs{$_[1]} = _cb(@_) });
-		if ($err) {
-			print "Checked through r$max\r";
-		} else {
-			$find_trailing_edge = 1;
-		}
-		if ($err and $find_trailing_edge) {
-			print STDERR "Path '$longest_path' ",
-				     "was probably deleted:\n",
-				     $err->expanded_message,
-				     "\nWill attempt to follow ",
-				     "revisions r$min .. r$max ",
-				     "committed before the deletion\n";
-			my $hi = $max;
-			while (--$hi >= $min) {
-				my $ok;
-				$self->get_log([$longest_path], $min, $hi,
-				               0, 1, 1, sub {
-				               $ok = $_[1];
-				               $revs{$_[1]} = _cb(@_) });
-				if ($ok) {
-					print STDERR "r$min .. r$ok OK\n";
-					last;
-				}
-			}
-			$find_trailing_edge = 0;
-		}
-		$SVN::Error::handler = $err_handler;
-
-		my %exists = map { $_->{path} => $_ } @$gsv;
-		foreach my $r (sort {$a <=> $b} keys %revs) {
-			my ($paths, $logged) = @{$revs{$r}};
-
-			foreach my $gs ($self->match_globs(\%exists, $paths,
-			                                   $globs, $r)) {
-				if ($gs->rev_map_max >= $r) {
-					next;
-				}
-				next unless $gs->match_paths($paths, $r);
-				$gs->{logged_rev_props} = $logged;
-				if (my $last_commit = $gs->last_commit) {
-					$gs->assert_index_clean($last_commit);
-				}
-				my $log_entry = $gs->do_fetch($paths, $r);
-				if ($log_entry) {
-					$gs->do_git_commit($log_entry);
-				}
-				$INDEX_FILES{$gs->{index}} = 1;
-			}
-			foreach my $g (@$globs) {
-				my $k = "svn-remote.$g->{remote}." .
-				        "$g->{t}-maxRev";
-				Git::SVN::tmp_config($k, $r);
-			}
-			if ($ra_invalid) {
-				$_[0] = undef;
-				$self = undef;
-				$RA = undef;
-				$self = Git::SVN::Ra->new($ra_url);
-				$ra_invalid = undef;
-			}
-		}
-		# pre-fill the .rev_db since it'll eventually get filled in
-		# with '0' x40 if something new gets committed
-		foreach my $gs (@$gsv) {
-			next if $gs->rev_map_max >= $max;
-			next if defined $gs->rev_map_get($max);
-			$gs->rev_map_set($max, 0 x40);
-		}
-		foreach my $g (@$globs) {
-			my $k = "svn-remote.$g->{remote}.$g->{t}-maxRev";
-			Git::SVN::tmp_config($k, $max);
-		}
-		last if $max >= $head;
-		$min = $max + 1;
-		$max += $inc;
-		$max = $head if ($max > $head);
-	}
-	Git::SVN::gc();
-}
-
-sub get_dir_globbed {
-	my ($self, $left, $depth, $r) = @_;
-
-	my @x = eval { $self->get_dir($left, $r) };
-	return unless scalar @x == 3;
-	my $dirents = $x[0];
-	my @finalents;
-	foreach my $de (keys %$dirents) {
-		next if $dirents->{$de}->{kind} != $SVN::Node::dir;
-		if ($depth > 1) {
-			my @args = ("$left/$de", $depth - 1, $r);
-			foreach my $dir ($self->get_dir_globbed(@args)) {
-				push @finalents, "$de/$dir";
-			}
-		} else {
-			push @finalents, $de;
-		}
-	}
-	@finalents;
-}
-
-# return value: 0 -- don't ignore, 1 -- ignore
-sub is_ref_ignored {
-	my ($g, $p) = @_;
-	my $refname = $g->{ref}->full_path($p);
-	return 1 if defined($g->{ignore_refs_regex}) &&
-	            $refname =~ m!$g->{ignore_refs_regex}!;
-	return 0 unless defined($_ignore_refs_regex);
-	return 1 if $refname =~ m!$_ignore_refs_regex!o;
-	return 0;
-}
-
-sub match_globs {
-	my ($self, $exists, $paths, $globs, $r) = @_;
-
-	sub get_dir_check {
-		my ($self, $exists, $g, $r) = @_;
-
-		my @dirs = $self->get_dir_globbed($g->{path}->{left},
-		                                  $g->{path}->{depth},
-		                                  $r);
-
-		foreach my $de (@dirs) {
-			my $p = $g->{path}->full_path($de);
-			next if $exists->{$p};
-			next if (length $g->{path}->{right} &&
-				 ($self->check_path($p, $r) !=
-				  $SVN::Node::dir));
-			next unless $p =~ /$g->{path}->{regex}/;
-			$exists->{$p} = Git::SVN->init($self->{url}, $p, undef,
-					 $g->{ref}->full_path($de), 1);
-		}
-	}
-	foreach my $g (@$globs) {
-		if (my $path = $paths->{"/$g->{path}->{left}"}) {
-			if ($path->{action} =~ /^[AR]$/) {
-				get_dir_check($self, $exists, $g, $r);
-			}
-		}
-		foreach (keys %$paths) {
-			if (/$g->{path}->{left_regex}/ &&
-			    !/$g->{path}->{regex}/) {
-				next if $paths->{$_}->{action} !~ /^[AR]$/;
-				get_dir_check($self, $exists, $g, $r);
-			}
-			next unless /$g->{path}->{regex}/;
-			my $p = $1;
-			my $pathname = $g->{path}->full_path($p);
-			next if is_ref_ignored($g, $p);
-			next if $exists->{$pathname};
-			next if ($self->check_path($pathname, $r) !=
-			         $SVN::Node::dir);
-			$exists->{$pathname} = Git::SVN->init(
-			                      $self->{url}, $pathname, undef,
-			                      $g->{ref}->full_path($p), 1);
-		}
-		my $c = '';
-		foreach (split m#/#, $g->{path}->{left}) {
-			$c .= "/$_";
-			next unless ($paths->{$c} &&
-			             ($paths->{$c}->{action} =~ /^[AR]$/));
-			get_dir_check($self, $exists, $g, $r);
-		}
-	}
-	values %$exists;
-}
-
-sub minimize_url {
-	my ($self) = @_;
-	return $self->{url} if ($self->{url} eq $self->{repos_root});
-	my $url = $self->{repos_root};
-	my @components = split(m!/!, $self->{svn_path});
-	my $c = '';
-	do {
-		$url .= "/$c" if length $c;
-		eval {
-			my $ra = (ref $self)->new($url);
-			my $latest = $ra->get_latest_revnum;
-			$ra->get_log("", $latest, 0, 1, 0, 1, sub {});
-		};
-	} while ($@ && ($c = shift @components));
-	$url;
-}
-
-sub can_do_switch {
-	my $self = shift;
-	unless (defined $can_do_switch) {
-		my $pool = SVN::Pool->new;
-		my $rep = eval {
-			$self->do_switch(1, '', 0, $self->{url},
-			                 SVN::Delta::Editor->new, $pool);
-		};
-		if ($@) {
-			$can_do_switch = 0;
-		} else {
-			$rep->abort_report($pool);
-			$can_do_switch = 1;
-		}
-		$pool->clear;
-	}
-	$can_do_switch;
-}
-
-sub skip_unknown_revs {
-	my ($err) = @_;
-	my $errno = $err->apr_err();
-	# Maybe the branch we're tracking didn't
-	# exist when the repo started, so it's
-	# not an error if it doesn't, just continue
-	#
-	# Wonderfully consistent library, eh?
-	# 160013 - svn:// and file://
-	# 175002 - http(s)://
-	# 175007 - http(s):// (this repo required authorization, too...)
-	#   More codes may be discovered later...
-	if ($errno == 175007 || $errno == 175002 || $errno == 160013) {
-		my $err_key = $err->expanded_message;
-		# revision numbers change every time, filter them out
-		$err_key =~ s/\d+/\0/g;
-		$err_key = "$errno\0$err_key";
-		unless ($ignored_err{$err_key}) {
-			warn "W: Ignoring error from SVN, path probably ",
-			     "does not exist: ($errno): ",
-			     $err->expanded_message,"\n";
-			warn "W: Do not be alarmed at the above message ",
-			     "git-svn is just searching aggressively for ",
-			     "old history.\n",
-			     "This may take a while on large repositories\n";
-			$ignored_err{$err_key} = 1;
-		}
-		return;
-	}
-	die "Error from SVN, ($errno): ", $err->expanded_message,"\n";
-}
-
 package Git::SVN::Log;
 use strict;
 use warnings;
diff --git a/perl/Git/SVN/Ra.pm b/perl/Git/SVN/Ra.pm
new file mode 100644
index 00000000..e88b4e79
--- /dev/null
+++ b/perl/Git/SVN/Ra.pm
@@ -0,0 +1,658 @@
+package Git::SVN::Ra;
+use vars qw/@ISA $config_dir $_ignore_refs_regex $_log_window_size/;
+use strict;
+use warnings;
+use SVN::Client;
+use SVN::Ra;
+BEGIN {
+	@ISA = qw(SVN::Ra);
+}
+
+my ($ra_invalid, $can_do_switch, %ignored_err, $RA);
+
+BEGIN {
+	# enforce temporary pool usage for some simple functions
+	no strict 'refs';
+	for my $f (qw/rev_proplist get_latest_revnum get_uuid get_repos_root
+	              get_file/) {
+		my $SUPER = "SUPER::$f";
+		*$f = sub {
+			my $self = shift;
+			my $pool = SVN::Pool->new;
+			my @ret = $self->$SUPER(@_,$pool);
+			$pool->clear;
+			wantarray ? @ret : $ret[0];
+		};
+	}
+}
+
+sub _auth_providers () {
+	my @rv = (
+	  SVN::Client::get_simple_provider(),
+	  SVN::Client::get_ssl_server_trust_file_provider(),
+	  SVN::Client::get_simple_prompt_provider(
+	    \&Git::SVN::Prompt::simple, 2),
+	  SVN::Client::get_ssl_client_cert_file_provider(),
+	  SVN::Client::get_ssl_client_cert_prompt_provider(
+	    \&Git::SVN::Prompt::ssl_client_cert, 2),
+	  SVN::Client::get_ssl_client_cert_pw_file_provider(),
+	  SVN::Client::get_ssl_client_cert_pw_prompt_provider(
+	    \&Git::SVN::Prompt::ssl_client_cert_pw, 2),
+	  SVN::Client::get_username_provider(),
+	  SVN::Client::get_ssl_server_trust_prompt_provider(
+	    \&Git::SVN::Prompt::ssl_server_trust),
+	  SVN::Client::get_username_prompt_provider(
+	    \&Git::SVN::Prompt::username, 2)
+	);
+
+	# earlier 1.6.x versions would segfault, and <= 1.5.x didn't have
+	# this function
+	if (::compare_svn_version('1.6.12') > 0) {
+		my $config = SVN::Core::config_get_config($config_dir);
+		my ($p, @a);
+		# config_get_config returns all config files from
+		# ~/.subversion, auth_get_platform_specific_client_providers
+		# just wants the config "file".
+		@a = ($config->{'config'}, undef);
+		$p = SVN::Core::auth_get_platform_specific_client_providers(@a);
+		# Insert the return value from
+		# auth_get_platform_specific_providers
+		unshift @rv, @$p;
+	}
+	\@rv;
+}
+
+sub escape_uri_only {
+	my ($uri) = @_;
+	my @tmp;
+	foreach (split m{/}, $uri) {
+		s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
+		push @tmp, $_;
+	}
+	join('/', @tmp);
+}
+
+sub escape_url {
+	my ($url) = @_;
+	if ($url =~ m#^(https?)://([^/]+)(.*)$#) {
+		my ($scheme, $domain, $uri) = ($1, $2, escape_uri_only($3));
+		$url = "$scheme://$domain$uri";
+	}
+	$url;
+}
+
+sub new {
+	my ($class, $url) = @_;
+	$url =~ s!/+$!!;
+	return $RA if ($RA && $RA->{url} eq $url);
+
+	::_req_svn();
+
+	SVN::_Core::svn_config_ensure($config_dir, undef);
+	my ($baton, $callbacks) = SVN::Core::auth_open_helper(_auth_providers);
+	my $config = SVN::Core::config_get_config($config_dir);
+	$RA = undef;
+	my $dont_store_passwords = 1;
+	my $conf_t = ${$config}{'config'};
+	{
+		no warnings 'once';
+		# The usage of $SVN::_Core::SVN_CONFIG_* variables
+		# produces warnings that variables are used only once.
+		# I had not found the better way to shut them up, so
+		# the warnings of type 'once' are disabled in this block.
+		if (SVN::_Core::svn_config_get_bool($conf_t,
+		    $SVN::_Core::SVN_CONFIG_SECTION_AUTH,
+		    $SVN::_Core::SVN_CONFIG_OPTION_STORE_PASSWORDS,
+		    1) == 0) {
+			SVN::_Core::svn_auth_set_parameter($baton,
+			    $SVN::_Core::SVN_AUTH_PARAM_DONT_STORE_PASSWORDS,
+			    bless (\$dont_store_passwords, "_p_void"));
+		}
+		if (SVN::_Core::svn_config_get_bool($conf_t,
+		    $SVN::_Core::SVN_CONFIG_SECTION_AUTH,
+		    $SVN::_Core::SVN_CONFIG_OPTION_STORE_AUTH_CREDS,
+		    1) == 0) {
+			$Git::SVN::Prompt::_no_auth_cache = 1;
+		}
+	} # no warnings 'once'
+	my $self = SVN::Ra->new(url => escape_url($url), auth => $baton,
+	                      config => $config,
+			      pool => SVN::Pool->new,
+	                      auth_provider_callbacks => $callbacks);
+	$self->{url} = $url;
+	$self->{svn_path} = $url;
+	$self->{repos_root} = $self->get_repos_root;
+	$self->{svn_path} =~ s#^\Q$self->{repos_root}\E(/|$)##;
+	$self->{cache} = { check_path => { r => 0, data => {} },
+	                   get_dir => { r => 0, data => {} } };
+	$RA = bless $self, $class;
+}
+
+sub check_path {
+	my ($self, $path, $r) = @_;
+	my $cache = $self->{cache}->{check_path};
+	if ($r == $cache->{r} && exists $cache->{data}->{$path}) {
+		return $cache->{data}->{$path};
+	}
+	my $pool = SVN::Pool->new;
+	my $t = $self->SUPER::check_path($path, $r, $pool);
+	$pool->clear;
+	if ($r != $cache->{r}) {
+		%{$cache->{data}} = ();
+		$cache->{r} = $r;
+	}
+	$cache->{data}->{$path} = $t;
+}
+
+sub get_dir {
+	my ($self, $dir, $r) = @_;
+	my $cache = $self->{cache}->{get_dir};
+	if ($r == $cache->{r}) {
+		if (my $x = $cache->{data}->{$dir}) {
+			return wantarray ? @$x : $x->[0];
+		}
+	}
+	my $pool = SVN::Pool->new;
+	my ($d, undef, $props) = $self->SUPER::get_dir($dir, $r, $pool);
+	my %dirents = map { $_ => { kind => $d->{$_}->kind } } keys %$d;
+	$pool->clear;
+	if ($r != $cache->{r}) {
+		%{$cache->{data}} = ();
+		$cache->{r} = $r;
+	}
+	$cache->{data}->{$dir} = [ \%dirents, $r, $props ];
+	wantarray ? (\%dirents, $r, $props) : \%dirents;
+}
+
+sub DESTROY {
+	# do not call the real DESTROY since we store ourselves in $RA
+}
+
+# get_log(paths, start, end, limit,
+#         discover_changed_paths, strict_node_history, receiver)
+sub get_log {
+	my ($self, @args) = @_;
+	my $pool = SVN::Pool->new;
+
+	# svn_log_changed_path_t objects passed to get_log are likely to be
+	# overwritten even if only the refs are copied to an external variable,
+	# so we should dup the structures in their entirety.  Using an
+	# externally passed pool (instead of our temporary and quickly cleared
+	# pool in Git::SVN::Ra) does not help matters at all...
+	my $receiver = pop @args;
+	my $prefix = "/".$self->{svn_path};
+	$prefix =~ s#/+($)##;
+	my $prefix_regex = qr#^\Q$prefix\E#;
+	push(@args, sub {
+		my ($paths) = $_[0];
+		return &$receiver(@_) unless $paths;
+		$_[0] = ();
+		foreach my $p (keys %$paths) {
+			my $i = $paths->{$p};
+			# Make path relative to our url, not repos_root
+			$p =~ s/$prefix_regex//;
+			my %s = map { $_ => $i->$_; }
+				qw/copyfrom_path copyfrom_rev action/;
+			if ($s{'copyfrom_path'}) {
+				$s{'copyfrom_path'} =~ s/$prefix_regex//;
+			}
+			$_[0]{$p} = \%s;
+		}
+		&$receiver(@_);
+	});
+
+
+	# the limit parameter was not supported in SVN 1.1.x, so we
+	# drop it.  Therefore, the receiver callback passed to it
+	# is made aware of this limitation by being wrapped if
+	# the limit passed to is being wrapped.
+	if (::compare_svn_version('1.2.0') <= 0) {
+		my $limit = splice(@args, 3, 1);
+		if ($limit > 0) {
+			my $receiver = pop @args;
+			push(@args, sub { &$receiver(@_) if (--$limit >= 0) });
+		}
+	}
+	my $ret = $self->SUPER::get_log(@args, $pool);
+	$pool->clear;
+	$ret;
+}
+
+sub trees_match {
+	my ($self, $url1, $rev1, $url2, $rev2) = @_;
+	my $ctx = SVN::Client->new(auth => _auth_providers);
+	my $out = IO::File->new_tmpfile;
+
+	# older SVN (1.1.x) doesn't take $pool as the last parameter for
+	# $ctx->diff(), so we'll create a default one
+	my $pool = SVN::Pool->new_default_sub;
+
+	$ra_invalid = 1; # this will open a new SVN::Ra connection to $url1
+	$ctx->diff([], $url1, $rev1, $url2, $rev2, 1, 1, 0, $out, $out);
+	$out->flush;
+	my $ret = (($out->stat)[7] == 0);
+	close $out or croak $!;
+
+	$ret;
+}
+
+sub get_commit_editor {
+	my ($self, $log, $cb, $pool) = @_;
+
+	my @lock = (::compare_svn_version('1.2.0') >= 0) ? (undef, 0) : ();
+	$self->SUPER::get_commit_editor($log, $cb, @lock, $pool);
+}
+
+sub gs_do_update {
+	my ($self, $rev_a, $rev_b, $gs, $editor) = @_;
+	my $new = ($rev_a == $rev_b);
+	my $path = $gs->{path};
+
+	if ($new && -e $gs->{index}) {
+		unlink $gs->{index} or die
+		  "Couldn't unlink index: $gs->{index}: $!\n";
+	}
+	my $pool = SVN::Pool->new;
+	$editor->set_path_strip($path);
+	my (@pc) = split m#/#, $path;
+	my $reporter = $self->do_update($rev_b, (@pc ? shift @pc : ''),
+	                                1, $editor, $pool);
+	my @lock = (::compare_svn_version('1.2.0') >= 0) ? (undef) : ();
+
+	# Since we can't rely on svn_ra_reparent being available, we'll
+	# just have to do some magic with set_path to make it so
+	# we only want a partial path.
+	my $sp = '';
+	my $final = join('/', @pc);
+	while (@pc) {
+		$reporter->set_path($sp, $rev_b, 0, @lock, $pool);
+		$sp .= '/' if length $sp;
+		$sp .= shift @pc;
+	}
+	die "BUG: '$sp' != '$final'\n" if ($sp ne $final);
+
+	$reporter->set_path($sp, $rev_a, $new, @lock, $pool);
+
+	$reporter->finish_report($pool);
+	$pool->clear;
+	$editor->{git_commit_ok};
+}
+
+# this requires SVN 1.4.3 or later (do_switch didn't work before 1.4.3, and
+# svn_ra_reparent didn't work before 1.4)
+sub gs_do_switch {
+	my ($self, $rev_a, $rev_b, $gs, $url_b, $editor) = @_;
+	my $path = $gs->{path};
+	my $pool = SVN::Pool->new;
+
+	my $full_url = $self->{url};
+	my $old_url = $full_url;
+	$full_url .= '/' . $path if length $path;
+	my ($ra, $reparented);
+
+	if ($old_url =~ m#^svn(\+ssh)?://# ||
+	    ($full_url =~ m#^https?://# &&
+	     escape_url($full_url) ne $full_url)) {
+		$_[0] = undef;
+		$self = undef;
+		$RA = undef;
+		$ra = Git::SVN::Ra->new($full_url);
+		$ra_invalid = 1;
+	} elsif ($old_url ne $full_url) {
+		SVN::_Ra::svn_ra_reparent($self->{session}, $full_url, $pool);
+		$self->{url} = $full_url;
+		$reparented = 1;
+	}
+
+	$ra ||= $self;
+	$url_b = escape_url($url_b);
+	my $reporter = $ra->do_switch($rev_b, '', 1, $url_b, $editor, $pool);
+	my @lock = (::compare_svn_version('1.2.0') >= 0) ? (undef) : ();
+	$reporter->set_path('', $rev_a, 0, @lock, $pool);
+	$reporter->finish_report($pool);
+
+	if ($reparented) {
+		SVN::_Ra::svn_ra_reparent($self->{session}, $old_url, $pool);
+		$self->{url} = $old_url;
+	}
+
+	$pool->clear;
+	$editor->{git_commit_ok};
+}
+
+sub longest_common_path {
+	my ($gsv, $globs) = @_;
+	my %common;
+	my $common_max = scalar @$gsv;
+
+	foreach my $gs (@$gsv) {
+		my @tmp = split m#/#, $gs->{path};
+		my $p = '';
+		foreach (@tmp) {
+			$p .= length($p) ? "/$_" : $_;
+			$common{$p} ||= 0;
+			$common{$p}++;
+		}
+	}
+	$globs ||= [];
+	$common_max += scalar @$globs;
+	foreach my $glob (@$globs) {
+		my @tmp = split m#/#, $glob->{path}->{left};
+		my $p = '';
+		foreach (@tmp) {
+			$p .= length($p) ? "/$_" : $_;
+			$common{$p} ||= 0;
+			$common{$p}++;
+		}
+	}
+
+	my $longest_path = '';
+	foreach (sort {length $b <=> length $a} keys %common) {
+		if ($common{$_} == $common_max) {
+			$longest_path = $_;
+			last;
+		}
+	}
+	$longest_path;
+}
+
+sub gs_fetch_loop_common {
+	my ($self, $base, $head, $gsv, $globs) = @_;
+	return if ($base > $head);
+	my $inc = $_log_window_size;
+	my ($min, $max) = ($base, $head < $base + $inc ? $head : $base + $inc);
+	my $longest_path = longest_common_path($gsv, $globs);
+	my $ra_url = $self->{url};
+	my $find_trailing_edge;
+	while (1) {
+		my %revs;
+		my $err;
+		my $err_handler = $SVN::Error::handler;
+		$SVN::Error::handler = sub {
+			($err) = @_;
+			skip_unknown_revs($err);
+		};
+		sub _cb {
+			my ($paths, $r, $author, $date, $log) = @_;
+			[ $paths,
+			  { author => $author, date => $date, log => $log } ];
+		}
+		$self->get_log([$longest_path], $min, $max, 0, 1, 1,
+		               sub { $revs{$_[1]} = _cb(@_) });
+		if ($err) {
+			print "Checked through r$max\r";
+		} else {
+			$find_trailing_edge = 1;
+		}
+		if ($err and $find_trailing_edge) {
+			print STDERR "Path '$longest_path' ",
+				     "was probably deleted:\n",
+				     $err->expanded_message,
+				     "\nWill attempt to follow ",
+				     "revisions r$min .. r$max ",
+				     "committed before the deletion\n";
+			my $hi = $max;
+			while (--$hi >= $min) {
+				my $ok;
+				$self->get_log([$longest_path], $min, $hi,
+				               0, 1, 1, sub {
+				               $ok = $_[1];
+				               $revs{$_[1]} = _cb(@_) });
+				if ($ok) {
+					print STDERR "r$min .. r$ok OK\n";
+					last;
+				}
+			}
+			$find_trailing_edge = 0;
+		}
+		$SVN::Error::handler = $err_handler;
+
+		my %exists = map { $_->{path} => $_ } @$gsv;
+		foreach my $r (sort {$a <=> $b} keys %revs) {
+			my ($paths, $logged) = @{$revs{$r}};
+
+			foreach my $gs ($self->match_globs(\%exists, $paths,
+			                                   $globs, $r)) {
+				if ($gs->rev_map_max >= $r) {
+					next;
+				}
+				next unless $gs->match_paths($paths, $r);
+				$gs->{logged_rev_props} = $logged;
+				if (my $last_commit = $gs->last_commit) {
+					$gs->assert_index_clean($last_commit);
+				}
+				my $log_entry = $gs->do_fetch($paths, $r);
+				if ($log_entry) {
+					$gs->do_git_commit($log_entry);
+				}
+				$Git::SVN::INDEX_FILES{$gs->{index}} = 1;
+			}
+			foreach my $g (@$globs) {
+				my $k = "svn-remote.$g->{remote}." .
+				        "$g->{t}-maxRev";
+				Git::SVN::tmp_config($k, $r);
+			}
+			if ($ra_invalid) {
+				$_[0] = undef;
+				$self = undef;
+				$RA = undef;
+				$self = Git::SVN::Ra->new($ra_url);
+				$ra_invalid = undef;
+			}
+		}
+		# pre-fill the .rev_db since it'll eventually get filled in
+		# with '0' x40 if something new gets committed
+		foreach my $gs (@$gsv) {
+			next if $gs->rev_map_max >= $max;
+			next if defined $gs->rev_map_get($max);
+			$gs->rev_map_set($max, 0 x40);
+		}
+		foreach my $g (@$globs) {
+			my $k = "svn-remote.$g->{remote}.$g->{t}-maxRev";
+			Git::SVN::tmp_config($k, $max);
+		}
+		last if $max >= $head;
+		$min = $max + 1;
+		$max += $inc;
+		$max = $head if ($max > $head);
+	}
+	Git::SVN::gc();
+}
+
+sub get_dir_globbed {
+	my ($self, $left, $depth, $r) = @_;
+
+	my @x = eval { $self->get_dir($left, $r) };
+	return unless scalar @x == 3;
+	my $dirents = $x[0];
+	my @finalents;
+	foreach my $de (keys %$dirents) {
+		next if $dirents->{$de}->{kind} != $SVN::Node::dir;
+		if ($depth > 1) {
+			my @args = ("$left/$de", $depth - 1, $r);
+			foreach my $dir ($self->get_dir_globbed(@args)) {
+				push @finalents, "$de/$dir";
+			}
+		} else {
+			push @finalents, $de;
+		}
+	}
+	@finalents;
+}
+
+# return value: 0 -- don't ignore, 1 -- ignore
+sub is_ref_ignored {
+	my ($g, $p) = @_;
+	my $refname = $g->{ref}->full_path($p);
+	return 1 if defined($g->{ignore_refs_regex}) &&
+	            $refname =~ m!$g->{ignore_refs_regex}!;
+	return 0 unless defined($_ignore_refs_regex);
+	return 1 if $refname =~ m!$_ignore_refs_regex!o;
+	return 0;
+}
+
+sub match_globs {
+	my ($self, $exists, $paths, $globs, $r) = @_;
+
+	sub get_dir_check {
+		my ($self, $exists, $g, $r) = @_;
+
+		my @dirs = $self->get_dir_globbed($g->{path}->{left},
+		                                  $g->{path}->{depth},
+		                                  $r);
+
+		foreach my $de (@dirs) {
+			my $p = $g->{path}->full_path($de);
+			next if $exists->{$p};
+			next if (length $g->{path}->{right} &&
+				 ($self->check_path($p, $r) !=
+				  $SVN::Node::dir));
+			next unless $p =~ /$g->{path}->{regex}/;
+			$exists->{$p} = Git::SVN->init($self->{url}, $p, undef,
+					 $g->{ref}->full_path($de), 1);
+		}
+	}
+	foreach my $g (@$globs) {
+		if (my $path = $paths->{"/$g->{path}->{left}"}) {
+			if ($path->{action} =~ /^[AR]$/) {
+				get_dir_check($self, $exists, $g, $r);
+			}
+		}
+		foreach (keys %$paths) {
+			if (/$g->{path}->{left_regex}/ &&
+			    !/$g->{path}->{regex}/) {
+				next if $paths->{$_}->{action} !~ /^[AR]$/;
+				get_dir_check($self, $exists, $g, $r);
+			}
+			next unless /$g->{path}->{regex}/;
+			my $p = $1;
+			my $pathname = $g->{path}->full_path($p);
+			next if is_ref_ignored($g, $p);
+			next if $exists->{$pathname};
+			next if ($self->check_path($pathname, $r) !=
+			         $SVN::Node::dir);
+			$exists->{$pathname} = Git::SVN->init(
+			                      $self->{url}, $pathname, undef,
+			                      $g->{ref}->full_path($p), 1);
+		}
+		my $c = '';
+		foreach (split m#/#, $g->{path}->{left}) {
+			$c .= "/$_";
+			next unless ($paths->{$c} &&
+			             ($paths->{$c}->{action} =~ /^[AR]$/));
+			get_dir_check($self, $exists, $g, $r);
+		}
+	}
+	values %$exists;
+}
+
+sub minimize_url {
+	my ($self) = @_;
+	return $self->{url} if ($self->{url} eq $self->{repos_root});
+	my $url = $self->{repos_root};
+	my @components = split(m!/!, $self->{svn_path});
+	my $c = '';
+	do {
+		$url .= "/$c" if length $c;
+		eval {
+			my $ra = (ref $self)->new($url);
+			my $latest = $ra->get_latest_revnum;
+			$ra->get_log("", $latest, 0, 1, 0, 1, sub {});
+		};
+	} while ($@ && ($c = shift @components));
+	$url;
+}
+
+sub can_do_switch {
+	my $self = shift;
+	unless (defined $can_do_switch) {
+		my $pool = SVN::Pool->new;
+		my $rep = eval {
+			$self->do_switch(1, '', 0, $self->{url},
+			                 SVN::Delta::Editor->new, $pool);
+		};
+		if ($@) {
+			$can_do_switch = 0;
+		} else {
+			$rep->abort_report($pool);
+			$can_do_switch = 1;
+		}
+		$pool->clear;
+	}
+	$can_do_switch;
+}
+
+sub skip_unknown_revs {
+	my ($err) = @_;
+	my $errno = $err->apr_err();
+	# Maybe the branch we're tracking didn't
+	# exist when the repo started, so it's
+	# not an error if it doesn't, just continue
+	#
+	# Wonderfully consistent library, eh?
+	# 160013 - svn:// and file://
+	# 175002 - http(s)://
+	# 175007 - http(s):// (this repo required authorization, too...)
+	#   More codes may be discovered later...
+	if ($errno == 175007 || $errno == 175002 || $errno == 160013) {
+		my $err_key = $err->expanded_message;
+		# revision numbers change every time, filter them out
+		$err_key =~ s/\d+/\0/g;
+		$err_key = "$errno\0$err_key";
+		unless ($ignored_err{$err_key}) {
+			warn "W: Ignoring error from SVN, path probably ",
+			     "does not exist: ($errno): ",
+			     $err->expanded_message,"\n";
+			warn "W: Do not be alarmed at the above message ",
+			     "git-svn is just searching aggressively for ",
+			     "old history.\n",
+			     "This may take a while on large repositories\n";
+			$ignored_err{$err_key} = 1;
+		}
+		return;
+	}
+	die "Error from SVN, ($errno): ", $err->expanded_message,"\n";
+}
+
+1;
+__END__
+
+Git::SVN::Ra - Subversion remote access functions for git-svn
+
+=head1 SYNOPSIS
+
+    use Git::SVN::Ra;
+
+    my $ra = Git::SVN::Ra->new($branchurl);
+    my ($dirents, $fetched_revnum, $props) =
+        $ra->get_dir('.', $SVN::Core::INVALID_REVNUM);
+
+=head1 DESCRIPTION
+
+This is a wrapper around the L<SVN::Ra> module for use by B<git-svn>.
+It fills in some default parameters (such as the authentication
+scheme), smooths over incompatibilities between libsvn versions, adds
+caching, and implements some functions specific to B<git-svn>.
+
+Do not use it unless you are developing git-svn.  The interface will
+change as git-svn evolves.
+
+=head1 DEPENDENCIES
+
+Subversion perl bindings,
+L<Git::SVN>.
+
+C<Git::SVN::Ra> has not been tested using callers other than
+B<git-svn> itself.
+
+=head1 SEE ALSO
+
+L<SVN::Ra>.
+
+=head1 INCOMPATIBILITIES
+
+None reported.
+
+=head1 BUGS
+
+None.
diff --git a/perl/Makefile.PL b/perl/Makefile.PL
index 32caf214..08afdcb2 100644
--- a/perl/Makefile.PL
+++ b/perl/Makefile.PL
@@ -30,6 +30,7 @@ my %pm = (
 	'Git/SVN/Fetcher.pm' => '$(INST_LIBDIR)/Git/SVN/Fetcher.pm',
 	'Git/SVN/Editor.pm' => '$(INST_LIBDIR)/Git/SVN/Editor.pm',
 	'Git/SVN/Prompt.pm' => '$(INST_LIBDIR)/Git/SVN/Prompt.pm',
+	'Git/SVN/Ra.pm' => '$(INST_LIBDIR)/Git/SVN/Ra.pm',
 );
 
 # We come with our own bundled Error.pm. It's not in the set of default
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* [PATCH 3/3] git-svn: use YAML format for mergeinfo cache when possible
  2012-06-09 22:20         ` [PATCH v2 0/3] " Jonathan Nieder
  2012-06-09 22:25           ` [PATCH 1/3] git-svn: make Git::SVN::Editor a separate file Jonathan Nieder
  2012-06-09 22:28           ` [PATCH 2/3] git-svn: make Git::SVN::RA " Jonathan Nieder
@ 2012-06-09 22:35           ` Jonathan Nieder
  2012-06-13  1:34             ` Jonathan Nieder
  2012-06-10  9:00           ` [PATCH v2 0/3] " Eric Wong
  3 siblings, 1 reply; 26+ messages in thread
From: Jonathan Nieder @ 2012-06-09 22:35 UTC (permalink / raw)
  To: Eric Wong
  Cc: Junio C Hamano, Jason Gross, git, Sandro Weiser, Bdale Garbee,
	Andrew Myrick

Since v1.7.0-rc2~11 (git-svn: persistent memoization, 2010-01-30),
git-svn has maintained some private per-repository caches in
.git/svn/.caches to avoid refetching and recalculating some
mergeinfo-related information with every "git svn fetch".

These caches use the 'nstore' format from the perl core module
Storable, which can be read and written quickly and was designed for
transfer over the wire (the 'n' stands for 'network').  This format is
endianness-independent and independent of floating-point
representation.

Unfortunately the format is *not* independent of the perl version ---
new perl versions will write files that very old perl cannot read.
Worse, the format is not independent of the size of a perl integer.
So if you toggle perl's use64bitint compile-time option, then using
'git svn fetch' on your old repositories produces errors like this:

	Byte order is not compatible at ../../lib/Storable.pm (autosplit
	into ../../lib/auto/Storable/_retrieve.al) line 380, at
	/usr/share/perl/5.12/Memoize/Storable.pm line 21

That is, upgrading perl to a version that uses use64bitint for the
first time makes git-svn suddenly refuse to fetch in existing
repositories.  Removing .git/svn/.caches lets git-svn recover.

It's time to switch to a platform independent serializer backend with
better compatibility guarantees.  This patch uses YAML::Any.

Other choices were considered:

 - thawing data from Data::Dumper involves "eval".  Doing that without
   creating a security risk is fussy.

 - the JSON API works on scalars in memory and doesn't provide a
   standard way to serialize straight to disk.

YAML::Any is reasonably fast and has a pleasant API.  In most
backends, LoadFile() reads the entire file into a scalar anyway and
converts it as a second step, but having an interface that allows the
deserialization to happen on the fly without a temporary is still a
comfort.

YAML::Any is not a core perl module, so we take care to use it when
and only when it is available.  Installations without that module
should fall back to using Storable with all its quirks, keeping their
cache files in

	.git/svn/.caches/*.db

Installations with YAML peacefully coexist by keeping a separate set
of cache files in

	.git/svn/.caches/*.yaml.

In most cases, switching between is a one-time thing, so it doesn't
seem worth the complication to migrate existing caches.

The upshot: after this patch, as long as YAML::Any is installed you
can move your git repository between machines with different perl
installations and "git svn fetch" will work fine.  If you do not have
YAML::Any, the behavior is unchanged (and in particular does not get
any worse).

Reported-by: Sandro Weiser <sandro.weiser@informatik.tu-chemnitz.de>
Reported-by: Bdale Garbee <bdale@gag.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Thanks for reading.

 git-svn.perl                 |   31 +++++++++++---
 perl/Git/SVN/Memoize/YAML.pm |   93 ++++++++++++++++++++++++++++++++++++++++++
 perl/Makefile.PL             |    1 +
 3 files changed, 119 insertions(+), 6 deletions(-)
 create mode 100644 perl/Git/SVN/Memoize/YAML.pm

diff --git a/git-svn.perl b/git-svn.perl
index 24c0af2e..0b074c4c 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -2055,6 +2055,10 @@ use Time::Local;
 use Memoize;  # core since 5.8.0, Jul 2002
 use Memoize::Storable;
 use POSIX qw(:signal_h);
+my $can_use_yaml;
+BEGIN {
+	$can_use_yaml = eval { require Git::SVN::Memoize::YAML; 1};
+}
 
 my ($_gc_nr, $_gc_period);
 
@@ -3577,6 +3581,17 @@ sub has_no_changes {
 		command_oneline("rev-parse", "$commit~1^{tree}"));
 }
 
+sub tie_for_persistent_memoization {
+	my $hash = shift;
+	my $path = shift;
+
+	if ($can_use_yaml) {
+		tie %$hash => 'Git::SVN::Memoize::YAML', "$path.yaml";
+	} else {
+		tie %$hash => 'Memoize::Storable', "$path.db", 'nstore';
+	}
+}
+
 # The GIT_DIR environment variable is not always set until after the command
 # line arguments are processed, so we can't memoize in a BEGIN block.
 {
@@ -3589,22 +3604,26 @@ sub has_no_changes {
 		my $cache_path = "$ENV{GIT_DIR}/svn/.caches/";
 		mkpath([$cache_path]) unless -d $cache_path;
 
-		tie my %lookup_svn_merge_cache => 'Memoize::Storable',
-		    "$cache_path/lookup_svn_merge.db", 'nstore';
+		my %lookup_svn_merge_cache;
+		my %check_cherry_pick_cache;
+		my %has_no_changes_cache;
+
+		tie_for_persistent_memoization(\%lookup_svn_merge_cache,
+		    "$cache_path/lookup_svn_merge");
 		memoize 'lookup_svn_merge',
 			SCALAR_CACHE => 'FAULT',
 			LIST_CACHE => ['HASH' => \%lookup_svn_merge_cache],
 		;
 
-		tie my %check_cherry_pick_cache => 'Memoize::Storable',
-		    "$cache_path/check_cherry_pick.db", 'nstore';
+		tie_for_persistent_memoization(\%check_cherry_pick_cache,
+		    "$cache_path/check_cherry_pick");
 		memoize 'check_cherry_pick',
 			SCALAR_CACHE => 'FAULT',
 			LIST_CACHE => ['HASH' => \%check_cherry_pick_cache],
 		;
 
-		tie my %has_no_changes_cache => 'Memoize::Storable',
-		    "$cache_path/has_no_changes.db", 'nstore';
+		tie_for_persistent_memoization(\%has_no_changes_cache,
+		    "$cache_path/has_no_changes");
 		memoize 'has_no_changes',
 			SCALAR_CACHE => ['HASH' => \%has_no_changes_cache],
 			LIST_CACHE => 'FAULT',
diff --git a/perl/Git/SVN/Memoize/YAML.pm b/perl/Git/SVN/Memoize/YAML.pm
new file mode 100644
index 00000000..9676b8f2
--- /dev/null
+++ b/perl/Git/SVN/Memoize/YAML.pm
@@ -0,0 +1,93 @@
+package Git::SVN::Memoize::YAML;
+use warnings;
+use strict;
+use YAML::Any ();
+
+# based on Memoize::Storable.
+
+sub TIEHASH {
+	my $package = shift;
+	my $filename = shift;
+	my $truehash = (-e $filename) ? YAML::Any::LoadFile($filename) : {};
+	my $self = {FILENAME => $filename, H => $truehash};
+	bless $self => $package;
+}
+
+sub STORE {
+	my $self = shift;
+	$self->{H}{$_[0]} = $_[1];
+}
+
+sub FETCH {
+	my $self = shift;
+	$self->{H}{$_[0]};
+}
+
+sub EXISTS {
+	my $self = shift;
+	exists $self->{H}{$_[0]};
+}
+
+sub DESTROY {
+	my $self = shift;
+	YAML::Any::DumpFile($self->{FILENAME}, $self->{H});
+}
+
+sub SCALAR {
+	my $self = shift;
+	scalar(%{$self->{H}});
+}
+
+sub FIRSTKEY {
+	'Fake hash from Git::SVN::Memoize::YAML';
+}
+
+sub NEXTKEY {
+	undef;
+}
+
+1;
+__END__
+
+=head1 NAME
+
+Git::SVN::Memoize::YAML - store Memoized data in YAML format
+
+=head1 SYNOPSIS
+
+    use Memoize;
+    use Git::SVN::Memoize::YAML;
+
+    tie my %cache => 'Git::SVN::Memoize::YAML', $filename;
+    memoize('slow_function', SCALAR_CACHE => [HASH => \%cache]);
+    slow_function(arguments);
+
+=head1 DESCRIPTION
+
+This module provides a class that can be used to tie a hash to a
+YAML file.  The file is read when the hash is initialized and
+rewritten when the hash is destroyed.
+
+The intent is to allow L<Memoize> to back its cache with a file in
+YAML format, just like L<Memoize::Storable> allows L<Memoize> to
+back its cache with a file in Storable format.  Unlike the Storable
+format, the YAML format is platform-independent and fairly stable.
+
+Carps on error.
+
+=head1 DIAGNOSTICS
+
+See L<YAML::Any>.
+
+=head1 DEPENDENCIES
+
+L<YAML::Any> from CPAN.
+
+=head1 INCOMPATIBILITIES
+
+None reported.
+
+=head1 BUGS
+
+The entire cache is read into a Perl hash when loading the file,
+so this is not very scalable.
diff --git a/perl/Makefile.PL b/perl/Makefile.PL
index 08afdcb2..2c20290f 100644
--- a/perl/Makefile.PL
+++ b/perl/Makefile.PL
@@ -27,6 +27,7 @@ MAKE_FRAG
 my %pm = (
 	'Git.pm' => '$(INST_LIBDIR)/Git.pm',
 	'Git/I18N.pm' => '$(INST_LIBDIR)/Git/I18N.pm',
+	'Git/SVN/Memoize/YAML.pm' => '$(INST_LIBDIR)/Git/SVN/Memoize/YAML.pm',
 	'Git/SVN/Fetcher.pm' => '$(INST_LIBDIR)/Git/SVN/Fetcher.pm',
 	'Git/SVN/Editor.pm' => '$(INST_LIBDIR)/Git/SVN/Editor.pm',
 	'Git/SVN/Prompt.pm' => '$(INST_LIBDIR)/Git/SVN/Prompt.pm',
-- 
1.7.10

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/3] git-svn: use YAML format for mergeinfo cache when possible
  2012-06-09 22:20         ` [PATCH v2 0/3] " Jonathan Nieder
                             ` (2 preceding siblings ...)
  2012-06-09 22:35           ` [PATCH 3/3] git-svn: use YAML format for mergeinfo cache when possible Jonathan Nieder
@ 2012-06-10  9:00           ` Eric Wong
  2012-06-10 10:04             ` Jonathan Nieder
  3 siblings, 1 reply; 26+ messages in thread
From: Eric Wong @ 2012-06-10  9:00 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, Jason Gross, git

Jonathan Nieder <jrnieder@gmail.com> wrote:
> Eric Wong wrote:
> > These should die on errors, right?
> 
> Yep, they do.  I didn't bother doing the CARP_NOT thing --- let's wait
> for a bug report or rainy day and then handle errors properly (by
> invalidating the cache when appropriate).

OK.

> Patches 1-2 are independent from patch 3.  I'm sending them this way
> for no particular reason.
> 
> Thoughts of all kinds welcome, as usual.

Thanks!  I only had a trivial conflict with 2/3, but fixed it,
signed-off and pushed to master of git://bogomips.org/git-svn

> Jonathan Nieder (3):
>   git-svn: make Git::SVN::Editor a separate file
>   git-svn: make Git::SVN::RA a separate file

2/3 had a conflict with commit c26ddce86d7215b4d9687bd4c6b5dd43a3fabf31
which I resolved by hand
(git-svn: platform auth providers are working only on 1.6.15 or newer)

--- a.mbox	2012-06-10 08:58:19.000000000 +0000
+++ b.mbox	2012-06-10 08:58:23.000000000 +0000
@@ -64,7 +64,7 @@
 -
 -	# earlier 1.6.x versions would segfault, and <= 1.5.x didn't have
 -	# this function
--	if (::compare_svn_version('1.6.12') > 0) {
+-	if (::compare_svn_version('1.6.15') >= 0) {
 -		my $config = SVN::Core::config_get_config($config_dir);
 -		my ($p, @a);
 -		# config_get_config returns all config files from
@@ -688,7 +688,7 @@
 +
 +	# earlier 1.6.x versions would segfault, and <= 1.5.x didn't have
 +	# this function
-+	if (::compare_svn_version('1.6.12') > 0) {
++	if (::compare_svn_version('1.6.15') >= 0) {
 +		my $config = SVN::Core::config_get_config($config_dir);
 +		my ($p, @a);
 +		# config_get_config returns all config files from

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/3] git-svn: use YAML format for mergeinfo cache when possible
  2012-06-10  9:00           ` [PATCH v2 0/3] " Eric Wong
@ 2012-06-10 10:04             ` Jonathan Nieder
  2012-06-11 15:06               ` Junio C Hamano
  0 siblings, 1 reply; 26+ messages in thread
From: Jonathan Nieder @ 2012-06-10 10:04 UTC (permalink / raw)
  To: Eric Wong; +Cc: Junio C Hamano, Jason Gross, git

Eric Wong wrote:

> 2/3 had a conflict with commit c26ddce86d7215b4d9687bd4c6b5dd43a3fabf31
> which I resolved by hand

That was fast. :)  Checked by glancing through the output of

	git diff 9f7ad147^:git-svn.perl 9f7ad147:perl/Git/SVN/Ra.pm

Looks good.

Good night,
Jonathan

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2 0/3] git-svn: use YAML format for mergeinfo cache when possible
  2012-06-10 10:04             ` Jonathan Nieder
@ 2012-06-11 15:06               ` Junio C Hamano
  0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2012-06-11 15:06 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Eric Wong, Jason Gross, git

Jonathan Nieder <jrnieder@gmail.com> writes:

> Eric Wong wrote:
>
>> 2/3 had a conflict with commit c26ddce86d7215b4d9687bd4c6b5dd43a3fabf31
>> which I resolved by hand
>
> That was fast. :)  Checked by glancing through the output of
>
> 	git diff 9f7ad147^:git-svn.perl 9f7ad147:perl/Git/SVN/Ra.pm
>
> Looks good.
>
> Good night,
> Jonathan

Thanks, both.  Pulled.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH 3/3] git-svn: use YAML format for mergeinfo cache when possible
  2012-06-09 22:35           ` [PATCH 3/3] git-svn: use YAML format for mergeinfo cache when possible Jonathan Nieder
@ 2012-06-13  1:34             ` Jonathan Nieder
  0 siblings, 0 replies; 26+ messages in thread
From: Jonathan Nieder @ 2012-06-13  1:34 UTC (permalink / raw)
  To: Eric Wong
  Cc: Junio C Hamano, Jason Gross, git, Sandro Weiser, Bdale Garbee,
	Andrew Myrick, Tim Retout

Jonathan Nieder wrote:

> Unfortunately the format is *not* independent of the perl version ---
> new perl versions will write files that very old perl cannot read.

True and documented.

> Worse, the format is not independent of the size of a perl integer.
> So if you toggle perl's use64bitint compile-time option, then using
> 'git svn fetch' on your old repositories produces errors like this:
>
>	Byte order is not compatible at ../../lib/Storable.pm (autosplit

Weird and contrary to documentation.  First, why byte order?  Wouldn't
the problem be word size, if anything?  nstore implies network-endian,
after all.

Examining the cache file explains it:

	$ xxd <.git/svn/.caches/lookup_svn_merge.db | head -1
	0000000: 7073 7430 0408 0831 3233 3435 3637 3804  pst0...12345678.

The byte order is "12345678", which reflects both endianness and word
size.

Notice that the format is not network-endian.  That is because
Memoize::Storable ignores the 'nstore' option, for silly reasons:

	https://rt.cpan.org/Public/Bug/Display.html?id=77790

Ah.  (Thanks to Tim Retout for noticing.)

Avoiding this arcane and fragile facility when possible still seems to
me like a reasonable behavior for git-svn, but it's also nice to see
that the problem will eventually fix itself.  Sorry for the confusion.

And until then, using a renamed cache with a format (YAML) known to be
platform-agnostic means git-svn can keep working on repositories that
have caches written by unfixed versions of perl.

Ciao,
Jonathan

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2012-06-13  1:35 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-22  2:17 [PATCH] git-svn: Destroy the cache when we fail to read it Jason Gross
2011-08-22  4:04 ` Jason Gross
2011-08-22  4:10 ` [PATCH] Add tests for handling corrupted caches Jason Gross
2011-08-23  2:27 ` [PATCH] git-svn: Destroy the cache when we fail to read it Jonathan Nieder
2011-08-23  2:36   ` Jonathan Nieder
2011-08-23  5:51   ` Jason Gross
2011-08-23  8:15 ` Eric Wong
2011-08-23 17:05   ` Junio C Hamano
2011-08-23 19:58     ` Eric Wong
2012-05-27 19:25     ` [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible Jonathan Nieder
2012-05-27 19:48       ` [RFC/PATCH 2/1] fixup! " Jonathan Nieder
2012-05-27 20:14       ` [RFC/PATCH] " Eric Wong
2012-05-28  0:39         ` Jonathan Nieder
2012-05-28  6:57           ` [RFC/PATCH 0/2] git-svn: give SVN::Git::Fetcher its own file Jonathan Nieder
2012-05-28  7:00             ` [PATCH 1/2] git-svn: rename SVN::Git::* packages to Git::SVN::* Jonathan Nieder
2012-05-28  7:03             ` [PATCH 2/2] git-svn: make Git::SVN::Fetcher a separate file Jonathan Nieder
2012-05-29  0:25           ` [RFC/PATCH] git-svn: use YAML format for mergeinfo cache when possible Eric Wong
2012-05-29 20:48             ` Junio C Hamano
2012-06-09 22:20         ` [PATCH v2 0/3] " Jonathan Nieder
2012-06-09 22:25           ` [PATCH 1/3] git-svn: make Git::SVN::Editor a separate file Jonathan Nieder
2012-06-09 22:28           ` [PATCH 2/3] git-svn: make Git::SVN::RA " Jonathan Nieder
2012-06-09 22:35           ` [PATCH 3/3] git-svn: use YAML format for mergeinfo cache when possible Jonathan Nieder
2012-06-13  1:34             ` Jonathan Nieder
2012-06-10  9:00           ` [PATCH v2 0/3] " Eric Wong
2012-06-10 10:04             ` Jonathan Nieder
2012-06-11 15:06               ` Junio C Hamano

Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).