git@vger.kernel.org mailing list mirror (one of many)
From: Nathan Gray <n8gray@n8gray.org>
To: Stephen Bash <bash@genarts.com>
Cc: Andrew Sayers <andrew-git@pileofstuff.org>,
	Jonathan Nieder <jrnieder@gmail.com>, Jeff King <peff@peff.net>,
	git@vger.kernel.org, Sverre Rabbelier <srabbelier@gmail.com>,
	Dmitry Ivankov <divanorama@gmail.com>,
	Ramkumar Ramachandra <artagnon@gmail.com>,
	Sam Vilain <sam@vilain.net>, David Barr <davidbarr@google.com>
Subject: Re: Approaches to SVN to Git conversion (was: Re: [RFC] "Remote helper for Subversion" project)
Date: Tue, 6 Mar 2012 11:29:59 -0800
Message-ID: <CA+7g9Jwb=7wH7R3=ShhOGMdHXWmq4ZahocpaEuJdf+yBfCpA8A@mail.gmail.com>
In-Reply-To: <3c2ab05e-b2af-4df4-bca6-ff5512b0c73e@mail>

Hi everyone,

Soon I'm going to be undertaking a migration of a subproject from a
very messy multiproject SVN repo to git, so this is a topic that's
quite near to my heart at the moment.  More inline...

On Mon, Mar 5, 2012 at 7:27 AM, Stephen Bash <bash@genarts.com> wrote:
>
> ----- Original Message -----
>> From: "Andrew Sayers" <andrew-git@pileofstuff.org>
>> Sent: Sunday, March 4, 2012 8:36:41 AM
>> Subject: Re: [RFC] "Remote helper for Subversion" project
>>

[snip]

>> Personally, I think SVN export will always need a strong manual
>> component to get the best results, so I've put quite a bit of work
>> into designing a good SVN history format.  Like git-fast-import, it's
>> an ASCII format designed both for human and machine consumption...
>
> First, I'm very impressed that you managed to get a language like this
> up and working.  It could prove very useful going forward.  On the
> flip side, from my experiments over the last year I've actually been
> leaning toward a solution that is more implicit than explicit.  Taking
> git-svn as a model, I've been trying to define a mapping system (in
> Perl):
>
>  my %branch_spec = ( '/trunk/projname' => 'master',
>                      '/branches/*/projname' => '/refs/heads/*' );
>  my %tag_spec = ( '/tags/*/projname' => '/refs/tags/*' );

Specifying and detecting branches is a major problem in my upcoming
conversion.  We've got top-level trunk/branches/tags directories, but
underneath "branches" it's a free-for-all:

/branches/codenameA/{projectA,projectB,projectC}
/branches/codenameB   (actually a branch of projectA)
/branches/developers/joe/frobnicator-experiment (also a branch of projectA)

Clearly there's no simple regex that's going to capture this, so I'm
reduced to listing every branch of projectA, which is tedious and
error-prone.  However, what *would* work fabulously well for me is
"marker file" detection.  Every copy of projectA has a certain file at
its root.  Let's call it "markerFile.txt".  What I'd really love is a
way to say:

my %branch_markers = ('/branches/**/markerFile.txt' => '/refs/heads/**');

I'm using ** to signify that this may match multiple path components
(sorry, I don't know perl glob syntax).  A branch point is any
revision that creates a new file that matches the marker pattern.
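
To make the idea concrete, here is a rough sketch of the matching step,
written in Python rather than Perl.  Everything in it is an assumption
made for illustration: the function names (compile_marker,
marker_to_ref, find_branch_points), the rule that "**" means "one or
more path components" and "*" means "exactly one", and the shape of the
revision data (a list of (revision number, added paths) pairs).  It is
not taken from git-svn or any existing tool.

```python
import re

def compile_marker(pattern):
    """Translate a marker pattern into an anchored regex.

    '**' captures one or more path components, '*' captures exactly one.
    """
    out = []
    for part in re.split(r'(\*\*|\*)', pattern):
        if part == '**':
            out.append(r'(.+)')
        elif part == '*':
            out.append(r'([^/]+)')
        else:
            out.append(re.escape(part))
    return re.compile('^' + ''.join(out) + '$')

def marker_to_ref(path, pattern, ref_template):
    """Return the ref for a marker-file path, or None if it does not match.

    Only a single '**' in the pattern and template is handled here.
    """
    m = compile_marker(pattern).match(path)
    if m is None:
        return None
    return ref_template.replace('**', m.group(1), 1)

def find_branch_points(revisions, pattern, ref_template):
    """Yield (revision, ref) for every revision that adds a marker file."""
    points = []
    for revnum, added_paths in revisions:
        for path in added_paths:
            ref = marker_to_ref(path, pattern, ref_template)
            if ref is not None:
                points.append((revnum, ref))
    return points

PATTERN = '/branches/**/markerFile.txt'
TEMPLATE = '/refs/heads/**'
revisions = [
    (10, ['/trunk/projectA/markerFile.txt']),
    (11, ['/branches/codenameB/markerFile.txt']),
    (12, ['/branches/developers/joe/frobnicator-experiment/markerFile.txt']),
]
branch_points = find_branch_points(revisions, PATTERN, TEMPLATE)
```

One caveat with this sketch: a second markerFile.txt nested deeper inside
a project would also match, so a real tool would need to decide whether
to treat that as a separate branch.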

Ideally one could use logical connectives like AND and OR to specify a
set of patterns that could account for marker files changing over the
history of the project, but for my purposes that wouldn't be necessary
-- we've got a well-defined marker that's always present.
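
For what it's worth, the AND/OR idea could be as small as a few
predicate combinators over the set of entries at a candidate directory's
root.  This Python sketch is only an illustration; the helper names
(has, all_of, any_of) and the file names "oldMarker.txt" and "frob.h"
are invented for the example and don't come from any real project.

```python
def has(name):
    """Predicate: the directory contains an entry called `name`."""
    return lambda entries: name in entries

def all_of(*preds):
    """Logical AND of several predicates."""
    return lambda entries: all(p(entries) for p in preds)

def any_of(*preds):
    """Logical OR of several predicates."""
    return lambda entries: any(p(entries) for p in preds)

# Hypothetical rule: the current marker, OR an older pair of files that
# identified the project before the marker existed.
is_project_a = any_of(
    has('markerFile.txt'),
    all_of(has('oldMarker.txt'), has('frob.h')),
)

new_style = is_project_a({'markerFile.txt', 'src'})
old_style = is_project_a({'oldMarker.txt', 'frob.h'})
partial = is_project_a({'oldMarker.txt'})
```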

For bonus points I'd like to be able to speed things up by excluding
known-bad markers.  Say projectB has a file "badMarker.txt" at its
root and I don't want to import projectB into my new repo.  Maybe I
could specify:

my %branch_spec = (
        '/branches/**/markerFile.txt' => '/refs/heads/**',
        '/branches/**/badMarker.txt' => '!');

I'm assuming that it would be helpful for the script to have this
information (e.g. it could stop recursive searches when badMarker.txt
is found), but maybe that's not the case.
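
Here is roughly what I have in mind for the pruning, again as a Python
sketch under loud assumptions: the directory tree is modelled as a
nested dict (directories map to dicts, files map to None), and
find_branch_roots is a made-up name.  A real importer would walk SVN
revisions instead of an in-memory tree.

```python
def find_branch_roots(tree, prefix='', good='markerFile.txt',
                      bad='badMarker.txt'):
    """Return paths of directories that contain the good marker.

    The search stops descending as soon as either marker is found.
    """
    if bad in tree:
        return []            # known-bad project: prune this subtree
    if good in tree:
        return [prefix]      # branch root found: no need to look deeper
    roots = []
    for name, child in sorted(tree.items()):
        if isinstance(child, dict):
            roots.extend(
                find_branch_roots(child, prefix + '/' + name, good, bad))
    return roots

branches = {
    'codenameA': {
        'projectA': {'markerFile.txt': None},
        'projectB': {
            'badMarker.txt': None,
            # Never visited, because projectB is pruned above.
            'vendor': {'markerFile.txt': None},
        },
    },
    'codenameB': {'markerFile.txt': None},
}
roots = find_branch_roots(branches, prefix='/branches')
```

Whether the pruning saves real time depends on how expensive it is to
list a directory at a given revision, which I haven't measured.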

I'd welcome any comments or (especially!) code to try out.  ;^)

Cheers,
-Nathan

-- 
http://n8gray.org
