git@vger.kernel.org mailing list mirror (one of many)
* Git:  CVS to Git import
@ 2011-11-11 23:17 Jvsrvcs
  2011-11-11 23:43 ` Jakub Narebski
  2011-11-14  2:44 ` Matthew Ogilvie
  0 siblings, 2 replies; 4+ messages in thread
From: Jvsrvcs @ 2011-11-11 23:17 UTC
  To: git

Git:  CVS to Git import

We are moving from CVS to Git and want to know if anyone has had any
experience there doing this and could share do's  / dont's, best practices
when doing the initial import.

Also are there any known problems/bugs with the cvs to git import with
regards to CVS history?

Regards,

J.V.



* Re: Git:  CVS to Git import
  2011-11-11 23:17 Git: CVS to Git import Jvsrvcs
@ 2011-11-11 23:43 ` Jakub Narebski
  2011-11-12  0:24   ` Jonathan Nieder
  2011-11-14  2:44 ` Matthew Ogilvie
  1 sibling, 1 reply; 4+ messages in thread
From: Jakub Narebski @ 2011-11-11 23:43 UTC
  To: Jvsrvcs; +Cc: git

Jvsrvcs <jvsrvcs@gmail.com> writes:

> Git:  CVS to Git import
> 
> We are moving from CVS to Git and want to know if anyone has had any
> experience there doing this and could share do's  / dont's, best practices
> when doing the initial import.
> 
> Also are there any known problems/bugs with the cvs to git import with
> regards to CVS history?

I think that Eric S Raymond "DVCS Migration Guide"

   http://www.catb.org/esr/dvcs-migration-guide.html

and reposurgeon tool (to clean up conversion artifacts)

   http://www.catb.org/esr/reposurgeon/

might help.

-- 
Jakub Narębski


* Re: Git:  CVS to Git import
  2011-11-11 23:43 ` Jakub Narebski
@ 2011-11-12  0:24   ` Jonathan Nieder
  0 siblings, 0 replies; 4+ messages in thread
From: Jonathan Nieder @ 2011-11-12  0:24 UTC
  To: Jvsrvcs; +Cc: Jakub Narebski, git

Hi,

Jakub Narebski wrote:
> Jvsrvcs <jvsrvcs@gmail.com> writes:

>> Git:  CVS to Git import
>> 
>> We are moving from CVS to Git and want to know if anyone has had any
>> experience there doing this and could share do's  / dont's, best practices
>> when doing the initial import.
[...]
> I think that Eric S Raymond "DVCS Migration Guide"
>
>    http://www.catb.org/esr/dvcs-migration-guide.html

That page says that "git cvsimport" tends to be your best bet.  But my
experience is exactly the opposite --- git-cvsimport can make a lot of
mistakes, some of them documented in the ISSUES section of its
manpage, and it is hard to notice until later.

I've had good experiences using cvs2git from
<git://repo.or.cz/cvs2svn.git> (but note that it does not support
incremental imports).
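
For reference, a basic cvs2git run looks roughly like the sketch below.
The function name and its three arguments are placeholders, not part of
any tool; the options are the ones shown in the cvs2git documentation,
so check them against the version you actually install.

```shell
# Sketch only: import_with_cvs2git and its arguments are hypothetical names.
import_with_cvs2git() {
  cvs_module=$1   # path to a COPY of the CVS repository (or a module in it)
  target=$2       # new bare Git repository to create
  tmp=$3          # scratch directory for cvs2git's output files

  mkdir -p "$tmp" &&
  # cvs2git writes file contents and commit metadata to two separate files.
  cvs2git --blobfile="$tmp/git-blob.dat" \
          --dumpfile="$tmp/git-dump.dat" \
          --username=cvs2git \
          "$cvs_module" &&
  git init --bare "$target" &&
  # Feed both files, blobs first, to git fast-import in the new repository.
  cat "$tmp/git-blob.dat" "$tmp/git-dump.dat" | git -C "$target" fast-import
}
```

Because there is no incremental mode, each run converts the whole
history again, so it is simplest to throw the target repository away
and rebuild it every time.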

> and reposurgeon tool (to clean up conversion artifacts)
>
>    http://www.catb.org/esr/reposurgeon/

Last time I checked, reposurgeon loads the entire history in memory.
For projects with a longer history, "git filter-branch" might be a
better fit.

By the way, thanks for writing.  If your experience results in some
ideas for improving the Documentation/gitcvs-migration.txt page, we'll
probably be happy to take them. :)

Good luck,
Jonathan


* Re: Git:  CVS to Git import
  2011-11-11 23:17 Git: CVS to Git import Jvsrvcs
  2011-11-11 23:43 ` Jakub Narebski
@ 2011-11-14  2:44 ` Matthew Ogilvie
  1 sibling, 0 replies; 4+ messages in thread
From: Matthew Ogilvie @ 2011-11-14  2:44 UTC
  To: Jvsrvcs; +Cc: git

On Fri, Nov 11, 2011 at 03:17:33PM -0800, Jvsrvcs wrote:
> Git:  CVS to Git import
> 
> We are moving from CVS to Git and want to know if anyone has had any
> experience there doing this and could share do's  / dont's, best practices
> when doing the initial import.

Some ideas:

I wouldn't trust "git cvsimport".  In my testing, it was actually fairly
common for the resulting git tags and branches to be inconsistent with the
original CVS tags and branches: checking out a tag from CVS and the same
tag from Git, the trees were often different.  See the manpage
for a list of some of the known issues.
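
One way to run the same check on your own import is to export a tag
from both systems and diff the two trees.  A minimal sketch, assuming a
diff that supports -r, -q and -x (GNU and BSD diff do); compare_trees is
a hypothetical helper name, and the tag, module and directory names in
the comments are placeholders.  Note that CVS keyword expansion ($Id$
and friends) can produce differences that are not import bugs.

```shell
# Sketch only. The two directories are assumed to have been produced by
# something like:
#   cvs export -kk -r SOME_TAG -d cvs_tree module
#   mkdir git_tree && git archive SOME_TAG | tar -xf - -C git_tree

# Recursively compare two trees, ignoring CVS and Git bookkeeping
# directories.  Exit status is 0 when the trees match, as with plain diff.
compare_trees() {
  diff -r -q -x CVS -x .git "$1" "$2"
}
```

Running this over every tag you care about, after each trial
conversion, gives a quick pass/fail signal for the import.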

Use cvs2git instead.

Write up your own script to do the conversion.  Iteratively inspect
the results, find ways to fix up anything you don't like,
and re-run the script.  Any "fixups" you want should be
scripted, so that you can try different things and examine
the result.  Then when the actual "real" conversion
happens, you have a minimal amount of downtime while your
already-tested script runs.

The exact fixups your script should do depend on your
circumstances, but in my case, some of things my script did included:

  - First, copy the CVS repository, and work with the copy.
  - Delete some ",v" files we weren't interested in importing into git for
    various reasons.
  - Tweak some CVS commit timestamps in some files (such as a version
    file), to reduce import oddities.  (The most common oddities
    resulted from an old CVS workflow that would often sequence:
    (a) checkout, (b) modify version number file, (c) build, (d) commit
    the new version number file, and (e) tag the sandbox.  It was
    moderately common for other changes (in other files) to
    be committed between (a) and (d), which will either cause
    strange import artifacts or actually break import tools, due to
    the out-of-order timestamps.  Tweaking back the timestamp in the
    CVS file typically allows the import tool to avoid the
    oddity.  Completely cleaning this up would have been a
    lot of work, so I focused just on improving recent
    history.)  (sed -i ...)
  - Do the bulk of the import work using cvs2git.
  - Graft on appropriate merge history (multiple parents) for
    CVS merges.  To save time, I only worried about recent merges.
     - If you have a nice consistent tag naming
       convention, there are ways to do this as part of cvs2git.
       Unfortunately, we didn't.
     - Do not refer to a previous run's commit SHA-1's; they'll
       likely change as things change.  Use CVS tags instead.
     - git rev-parse is useful for looking up current references
       to construct graft lines.
  - Use git filter-branch to both make the above grafts permanent,
    and to fix committer/author username/email.
  - Move imported tags and branches to refs/oldcvstags/*
    and refs/oldcvsbranches/*, to bury a lot of the noise
    (automatic build tags, tags applied as part of doing a
    merge, etc) to where a normal "git clone" will not grab
    them, but they can still be fetched manually if necessary.
  - Copy/rename a few recent release tags and branches to
    normal refs/tags/* and refs/heads/*, when they are actually
    useful. (git pack-refs and sed)
  - Something like: sleep 5 ; git gc --aggressive --prune='1 second ago'
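
Two of the steps above, sketched as small shell functions.  This is an
illustration under assumptions, not the script described in this
message: graft_line and plan_ref_moves are hypothetical names, the ref
moves go through "git update-ref --stdin" rather than pack-refs and sed,
and newer Git versions also offer "git replace --graft" as an
alternative to a grafts file.

```shell
# Sketch only: function names are hypothetical.

# Print one graft line: the commit's ID followed by each desired parent's
# ID.  Arguments are refs (for example imported CVS tags), resolved on
# every run so that no SHA-1 from an earlier conversion is hard-coded.
graft_line() {
  line=
  for ref in "$@"; do
    id=$(git rev-parse --verify "${ref}^{commit}") || return 1
    line="${line:+$line }$id"
  done
  printf '%s\n' "$line"
}

# Read "<object-id> <refname>" lines on stdin and print commands for
# "git update-ref --stdin" that move every tag to refs/oldcvstags/* and
# every branch to refs/oldcvsbranches/*.  Other refs are left alone.
# HEAD may be left pointing at a moved branch until the useful branches
# are copied back to refs/heads/*.
plan_ref_moves() {
  while read -r id ref; do
    case "$ref" in
      refs/tags/*)  new="refs/oldcvstags/${ref#refs/tags/}" ;;
      refs/heads/*) new="refs/oldcvsbranches/${ref#refs/heads/}" ;;
      *) continue ;;
    esac
    printf 'create %s %s\n' "$new" "$id"
    printf 'delete %s\n' "$ref"
  done
}
```

Possible usage inside the converted repository (the tag names are
placeholders):

    graft_line merged_tag main_tag side_tag >> .git/info/grafts
    git for-each-ref --format='%(objectname) %(refname)' refs/tags refs/heads |
        plan_ref_moves | git update-ref --stdin

Printing the plan first, instead of updating refs directly, makes it
easy to inspect what a run would do before applying it.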

--
Matthew Ogilvie   [mmogilvi_git@miniinfo.net]

