git@vger.kernel.org mailing list mirror (one of many)
* git-svn: importing internal externals
@ 2009-05-27 16:51 John Koleszar
  2009-05-28 11:25 ` Yann Dirson
  0 siblings, 1 reply; 4+ messages in thread
From: John Koleszar @ 2009-05-27 16:51 UTC
  To: git@vger.kernel.org

Hi,

I'm working on a one-off import of an SVN repo that makes use of
"internal" svn:externals; i.e. all URLs refer to different spots in the
same repo, potentially with peg revisions. The SVN repo holds a number
of projects, and my plan is to import them into individual git repos,
incorporating the history from any svn:external linked sub-projects.
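
As a rough illustration of what "internal" means here, the sketch below
checks whether an external's URL falls under the same repository root and
splits off a trailing peg revision. The function name, the example URLs,
and the "URL@REV" handling are assumptions made for this example; it does
not parse the full svn:externals property syntax.

```python
from typing import Optional, Tuple


def split_internal_external(
    url: str, repo_root: str
) -> Optional[Tuple[str, Optional[int]]]:
    """Return (path inside the repo, peg revision or None).

    Returns None when the URL points outside repo_root, i.e. when the
    external is not "internal". Hypothetical helper, for illustration only.
    """
    peg = None
    # A peg revision is written as a numeric "@REV" suffix on the URL.
    base, sep, suffix = url.rpartition("@")
    if sep and suffix.isdigit():
        url, peg = base, int(suffix)

    root = repo_root.rstrip("/")
    if url == root:
        return "", peg
    if url.startswith(root + "/"):
        return url[len(root) + 1:], peg
    return None
```

For example, with a repository root of http://svn.example.com/repo, the
URL http://svn.example.com/repo/libs/foo/trunk@1234 would be reported as
the path libs/foo/trunk pegged at revision 1234.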

My current strategy is to let git-svn fetch the project as it normally
would, then fix up the history on each branch by parsing the
unhandled.log with a filter-branch script. I'm currently using an index
filter, which works reasonably well, but has the undesired effect of
squashing commits on the subproject into commits on the parent project.
So if the subproject is modified in rA and rC, and the project in rB and
rD, my modified history shows only rB and rD, with rA squashed into rB
and rC squashed into rD, when I'd really like to see all 4 commits.

I have a git-svn clone of the whole repo available, and I use that to
pull out the missing external objects when recreating history. Maybe a
better idea would be to resolve the externals with my index filter on
this whole super-repo, where the history is properly linear, then
somehow filter the now-populated projects from that. I had originally
set aside this idea because I wanted git-svn to do its automagic
branch/tag extraction, and while I think I could get a git repo with the
svn layout, I'm unsure how to turn that into proper git branches and
tags.

Can anyone offer any suggestions on how to achieve this? I'm still new
to git, so nothing is jumping out at me. I'd be happy to share my filter
script (about 100 lines) if someone wants to see it.

Thanks!

John

* Re: git-svn: importing internal externals
  2009-05-27 16:51 git-svn: importing internal externals John Koleszar
@ 2009-05-28 11:25 ` Yann Dirson
  2009-05-29 21:05   ` John Koleszar
  2009-06-10 21:58   ` Yann Dirson
  0 siblings, 2 replies; 4+ messages in thread
From: Yann Dirson @ 2009-05-28 11:25 UTC
  To: John Koleszar; +Cc: git@vger.kernel.org

On Wed, May 27, 2009 at 12:51:29PM -0400, John Koleszar wrote:
> Hi,
> 
> I'm working on a one-off import of an SVN repo that makes use of
> "internal" svn:externals; i.e. all URLs refer to different spots in the
> same repo, potentially with peg revisions. The SVN repo holds a number
> of projects, and my plan is to import them into individual git repos,
> incorporating the history from any svn:external linked sub-projects.

I have started to work on exactly this, at fetch time instead of as a
post-process.  I have for now only hooked parsing of the svn:externals
properties, and just need to find the time to resume and finish.

My plan on the user side is to provide flags to map svn urls to git urls.


> My current strategy is to let git-svn fetch the project as it normally
> would, then fix up the history on each branch by parsing the
> unhandled.log with a filter-branch script. I'm currently using an index
> filter, which works reasonably well, but has the undesired effect of
> squashing commits on the subproject into commits on the parent project.
> So if the subproject is modified in rA and rC, and the project in rB and
> rD, my modified history shows only rB and rD, with rA squashed into rB
> and rC squashed into rD, when I'd really like to see all 4 commits.

Yes, this is the main issue for correctness.  For this we would have to
check, before processing any new commit, if any of the submodules got any
intervening commits, and commit them first.
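
A minimal sketch of that ordering rule, assuming revisions are plain
integers and that the lists of project and external revisions are already
known (the function name and the tuple labels are made up for this
example, not taken from git-svn):

```python
from typing import Iterable, Iterator, Tuple


def interleave(
    project_revs: Iterable[int], external_revs: Iterable[int]
) -> Iterator[Tuple[str, int]]:
    """Yield ("external" | "project", rev) in replay order.

    Before each project commit is processed, any external commits with a
    lower revision number that have not been emitted yet are emitted first.
    """
    pending = sorted(external_revs)
    i = 0
    for rev in sorted(project_revs):
        # Flush the intervening commits on the external first.
        while i < len(pending) and pending[i] < rev:
            yield ("external", pending[i])
            i += 1
        yield ("project", rev)
    # Externals changed after the last project commit come last.
    for rev in pending[i:]:
        yield ("external", rev)
```

With the external changed in r1 and r3 and the project in r2 and r4, this
yields all four commits in revision order rather than two squashed ones.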


> I'd be happy to share my filter
> script (about 100 lines) if someone wants to see it.

It can be a good idea to share your script nevertheless :)

Best regards,
-- 
Yann

* Re: git-svn: importing internal externals
  2009-05-28 11:25 ` Yann Dirson
@ 2009-05-29 21:05   ` John Koleszar
  2009-06-10 21:58   ` Yann Dirson
  1 sibling, 0 replies; 4+ messages in thread
From: John Koleszar @ 2009-05-29 21:05 UTC
  To: Yann Dirson; +Cc: git@vger.kernel.org

On Thu, 2009-05-28 at 07:25 -0400, Yann Dirson wrote:
> On Wed, May 27, 2009 at 12:51:29PM -0400, John Koleszar wrote:
> > Hi,
> > 
> > I'm working on a one-off import of an SVN repo that makes use of
> > "internal" svn:externals; i.e. all URLs refer to different spots in the
> > same repo, potentially with peg revisions. The SVN repo holds a number
> > of projects, and my plan is to import them into individual git repos,
> > incorporating the history from any svn:external linked sub-projects.
> 
[...]
> It can be a good idea to share your script nevertheless :)
> 

I hacked on this some more and got something pretty usable (for me). It
operates on a git-svn clone of the whole repository, propagates commits
to different paths if referenced by an external, rearranges the tree to
isolate each svn branch on its own head, reparents the branches to their
proper branch points, and converts any tag branches to real git tags.
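
The last step, turning tag branches into real tags, could look roughly
like the sketch below. It is not taken from the posted script: it assumes
the imported svn tags sit under refs/remotes/tags/ and only builds the
git update-ref command lines instead of running them. The function name
and the prefix constant are hypothetical.

```python
from typing import Iterable, List, Tuple

# Assumed location of svn tags after the import; adjust to the real layout.
TAG_PREFIX = "refs/remotes/tags/"


def plan_tag_conversion(refs: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Build the commands that turn tag branches into lightweight tags.

    refs is an iterable of (refname, sha) pairs. Refs outside TAG_PREFIX
    are left alone. Nothing is executed here.
    """
    commands = []
    for refname, sha in refs:
        if not refname.startswith(TAG_PREFIX):
            continue
        tag = refname[len(TAG_PREFIX):]
        # Create refs/tags/<tag> pointing at the same commit ...
        commands.append(["git", "update-ref", "refs/tags/" + tag, sha])
        # ... then delete the old remote-tracking ref.
        commands.append(["git", "update-ref", "-d", refname])
    return commands
```

Each returned list can be passed to subprocess.run() once the plan has
been reviewed.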

Don't know what the netiquette is on this list regarding attachments for
this sort of thing, so I posted it here:
http://github.com/jkoleszar/git-svn-internal-externals/tree/master

It's not as fast as I'd like, but it's workable, at least for small
repositories. Bottleneck seems to be git-update-index (100s of ms/call)
but I haven't looked into it too much. I'm sure I could be smarter in
some of my pipelines too. Some numbers (2246 revisions, ~15k files,
Core2 6600 @ 2.4GHz, tmpfs):

git-svn fetch:
197.03user 174.63system 22:36.59elapsed 27%CPU (0avgtext+0avgdata
0maxresident)k 0inputs+0outputs (0major+35448577minor)pagefaults 0swaps

propagating externals:
1381.29user 744.42system 34:28.67elapsed 102%CPU (0avgtext+0avgdata
0maxresident)k 0inputs+0outputs (2major+305234667minor)pagefaults 0swaps

rearranging heads:
46.13user 64.23system 1:52.42elapsed 98%CPU (0avgtext+0avgdata
0maxresident)k 0inputs+0outputs (4major+28752709minor)pagefaults 0swaps

reparenting branches:
151.52user 263.50system 6:19.54elapsed 109%CPU (0avgtext+0avgdata
0maxresident)k 0inputs+0outputs (2major+135830914minor)pagefaults 0swaps

Hope this is useful for someone!

John

* Re: git-svn: importing internal externals
  2009-05-28 11:25 ` Yann Dirson
  2009-05-29 21:05   ` John Koleszar
@ 2009-06-10 21:58   ` Yann Dirson
  1 sibling, 0 replies; 4+ messages in thread
From: Yann Dirson @ 2009-06-10 21:58 UTC
  To: GIT list; +Cc: John Koleszar

On Thu, May 28, 2009 at 01:25:43PM +0200, Yann Dirson wrote:
> On Wed, May 27, 2009 at 12:51:29PM -0400, John Koleszar wrote:
> > Hi,
> > 
> > I'm working on a one-off import of an SVN repo that makes use of
> > "internal" svn:externals; i.e. all URLs refer to different spots in the
> > same repo, potentially with peg revisions. The SVN repo holds a number
> > of projects, and my plan is to import them into individual git repos,
> > incorporating the history from any svn:external linked sub-projects.
> 
> I have started to work on exactly this, at fetch time instead of as a
> post-process.  I have for now only hooked parsing of the svn:externals
> properties, and just need to find the time to resume and finish.
> 
> My plan on the user side is to provide flags to map svn urls to git urls.

Just a quick note to make it public that my WIP on this issue is
available from the t/svn-externals branch at
http://repo.or.cz/w/git/ydirson.git.  Progressing slowly (as time
permits with a 3-children family and a real-life job ;), but still
progressing.

This is targeted at incremental conversion - if one is looking for
one-shot imports, John's script is today the way to go.

Best regards,
-- 
Yann
