git@vger.kernel.org mailing list mirror (one of many)
From: "Kelly F. Hickel" <kfh@mqsoftware.com>
To: "Robin H. Johnson" <robbat2@gentoo.org>,
	"Git Mailing List" <git@vger.kernel.org>
Subject: RE: Best way to merge two repos with same content, different history
Date: Fri, 5 Jun 2009 15:08:55 -0500
Message-ID: <63BEA5E623E09F4D92233FB12A9F794303117DD2@emailmn.mqsoftware.com>
In-Reply-To: <robbat2-20090605T194802-473902673Z@orbis-terrarum.net>

> -----Original Message-----
> From: git-owner@vger.kernel.org [mailto:git-owner@vger.kernel.org] On
> Behalf Of Robin H. Johnson
> Sent: Friday, June 05, 2009 3:02 PM
> To: Git Mailing List
> Subject: Re: Best way to merge two repos with same content, different history
> 
> On Fri, Jun 05, 2009 at 02:06:25PM -0500, Kelly F. Hickel wrote:
> > Robin,
> > 	That's all good news, I have an 8 way box with 32gb of ram running a
> > 64 bit Linux, a box with 4 gb of ram panics during the conversion.
> Thanks for your data.
> 
> For comparison, our conversion box is also 8-way, but only 16GiB RAM.
> 
> I'm surprised at how long pass1 is for you, especially since you've got
> a lot less CVS Files and CVS Revisions than the Gentoo repo (I do deduce
> that your individual revisions are larger, averaging at 15KiB vs. our
> 711 bytes).
> 
> I think there's something odd in the total CVS branches/tags count
> however, as the counts there imply an average of 67 branches and 173
> tags per CVS revision. You might want to dig into that part manually and
> see about it (not sure of your Python skills). That would probably cut
> down both your pass1 and pass4 times significantly.

Robin, I'm not much with Python, so I haven't dug into the code much at
all. The numbers are high, although we do create a lot of branches (we had
to contribute a fix to CVS a year or two ago to get the branching time down
from the 2.5 hours it was taking).  At one point I carefully examined
the symbol file that cvs2git was outputting and convinced myself that it
was doing the right thing, but that was a while ago.

> 
> Hopefully mhagger will get the external blob stuff committed soon, I was
> working on validating its results.
> 
> In doing so discovered a testcase where RCSRevisionReader and
> CVSRevisionReader gave different output themselves, the latter (which is
> documented as more accurate otherwise) missing the contents of an entire
> file. It's on the cvs2svn-dev mailing list now. Tracing that first,
> thereafter comparing it to the new Git side.
> 
> > git repack -a -d -f --depth=4000 --window=4000 && git pack-refs --all
> Did those extreme depth/window values actually help size much? The
> Gentoo ones actually didn't improve significantly over depth=window=50.

I know that they were still (apparently) improving after the 200 mark, but
it took long enough at 200 that I just decided to crank the numbers way
up and let it run over the weekend.
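
One way to settle the depth/window question with numbers is to repack at a
few settings and record the resulting pack size. The following is only a
rough sketch, not something either of us ran: the function names and the
default list of settings are made up for illustration, and it assumes the
"size-pack" line (in KiB) that `git count-objects -v` prints. Since the
repack uses -d, try it on a scratch clone.

```python
import subprocess


def repack_command(depth, window):
    """Build the argv for the repack invocation quoted in this thread."""
    return ["git", "repack", "-a", "-d", "-f",
            f"--depth={depth}", f"--window={window}"]


def parse_size_pack(count_objects_output):
    """Return the size-pack value (KiB) from `git count-objects -v` output."""
    for line in count_objects_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "size-pack":
            return int(value.strip())
    raise ValueError("no size-pack line found")


def measure(repo, settings=((50, 50), (200, 200), (4000, 4000))):
    """Repack `repo` at each (depth, window) pair and collect pack sizes.

    The settings are illustrative values taken from the discussion above,
    not a recommendation. Each repack rewrites the packs in place.
    """
    results = []
    for depth, window in settings:
        subprocess.run(repack_command(depth, window), cwd=repo, check=True)
        out = subprocess.run(["git", "count-objects", "-v"], cwd=repo,
                             check=True, capture_output=True,
                             text=True).stdout
        results.append((depth, window, parse_size_pack(out)))
    return results
```

Calling measure("/path/to/scratch/clone") returns a list of
(depth, window, size_in_KiB) tuples, which makes it easy to see where the
gains flatten out before committing to a weekend-long run.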

> 
> --
> Robin Hugh Johnson
> Gentoo Linux Developer & Infra Guy
> E-Mail     : robbat2@gentoo.org
> GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85

I'll be looking forward to a newer, faster cvs2git, although I did just
get the graft idea working, so I'm not sure if we'll wait that long or not
(it would be nice not to have to muck around with it though).
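
For anyone unfamiliar with grafts: a graft is a single line in
.git/info/grafts giving a commit ID followed by the parent IDs that Git
should pretend the commit has. The sketch below only shows that line
format. The function name is hypothetical, and it says nothing about
which commits were actually grafted here. Newer Git releases deprecate
the grafts file in favor of `git replace --graft`.

```python
import re

# A full 40-character lowercase hex SHA-1, as used in the grafts file.
SHA1 = re.compile(r"^[0-9a-f]{40}$")


def graft_line(commit, parents):
    """Format one grafts-file line: the commit ID, then its pretend parents."""
    ids = [commit, *parents]
    for object_id in ids:
        if not SHA1.match(object_id):
            raise ValueError(f"not a full SHA-1: {object_id!r}")
    return " ".join(ids) + "\n"
```

Appending the returned line to .git/info/grafts in the combined
repository is what joins the two histories at that commit.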

Thanks,
Kelly

Thread overview: 12+ messages
2009-06-05 16:30 Best way to merge two repos with same content, different history Kelly F. Hickel
2009-06-05 16:53 ` Rostislav Svoboda
2009-06-05 17:10   ` Kelly F. Hickel
2009-06-05 17:19     ` Rostislav Svoboda
2009-06-05 18:46     ` Robin H. Johnson
2009-06-05 19:06       ` Best way to merge two repos with same content, different history Kelly F. Hickel
2009-06-05 20:02         ` Robin H. Johnson
2009-06-05 20:08           ` Kelly F. Hickel [this message]
2009-06-19  9:52           ` Michael Haggerty
2009-06-05 17:01 ` Best way to merge two repos with same content, different history Avery Pennarun
2009-06-05 17:11   ` Kelly F. Hickel
2009-06-05 17:15 ` Markus Heidelberg
