From: Jan Holesovsky <kendy@suse.cz>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org, releases@openoffice.org
Subject: Re: Git benchmarks at OpenOffice.org wiki
Date: Wed, 2 May 2007 16:24:24 +0200
Message-ID: <200705021624.25560.kendy@suse.cz>
In-Reply-To: <200705012346.14997.jnareb@gmail.com>

Hi Jakub,

On Tuesday 01 May 2007 23:46, Jakub Narebski wrote:

> OpenOffice.org is looking for a new SCM (Software Configuration
> Management) tool, or at least was on Friday, 19 Jan 2007;
> see: http://blogs.sun.com/GullFOSS/entry/openoffice_org_scm
>
> One of the SCMs considered is Git. One of others is Subversion.
> There is a functional git tree with the entire OOo history for testing
> purposes that can be found at: http://go-oo.org/git.
>
> What I am concerned about is some of git benchmark results at Git page
> on OpenOffice.org wiki:
>   http://wiki.services.openoffice.org/wiki/Git#Comparison
> Actually it is comparison with CVS and Subversion, although most
> benchmarks are done only for git.

I did the git numbers, so if they are wrong - blame me :-)  I am also curious
about the SVN numbers, because the SVN conversion [from my point of view]
cheats a lot.  From what I know, it does not contain the historical branches
(yes, the >3000 of them that are in the git tree), and if I understood that
correctly, instead of history in the branches, they commit just
'integration commits' [one commit for all the changes in the branch] which
breaks 'svn blame' completely.

Unfortunately, I have not had a chance to try the SVN tree yet, so I cannot
confirm or refute this myself :-(

> In 'Size of data on the server' git has CVS beat hands down: 1.3G vs
> 8.5G for sources, 591M vs 1.1G for third party. I think it is similar
> for Subversion. I hope that repository is fully packed: IIRC the Mozilla
> CVS repository import was about 0.6GB pack file, not 1.3GB.
>
> The problem is with 'Size of checkout': to start working in repository
> one needs 1.4G (sources) and 98M (third party) for CVS checkout (it is
> 1.5G for sources for Subversion checkout). Ordinarily, for a distributed SCM
> you would need size of repository + size of sources (working area),
> which is 2.8G for sources and 688M for third party stuff [files you can
> hack on + the history]. This makes some prefer to go the centralized SCM
> route, i.e. Subversion as replacement for CVS (+ CWS, ChildWorkSpace).

Considering the size OOo needs for a build (>8G without languages),
the ~1.4G overhead for history is quite bearable.  I am also surprised by
the 100M overhead for SVN - in my experience it is usually about
the size of the project itself; but maybe they have improved something in SVN
in the meantime.
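
To put the overhead in proportion, here is a rough sketch of the arithmetic
using only the figures quoted in this thread (sources tree only).  The
constant and function names are made up for illustration, and 8G is only the
lower bound for the build, so the real share is smaller:

```python
# Figures quoted in this thread (sources tree only), in GB.
CVS_CHECKOUT_GB = 1.4   # working tree only
GIT_CLONE_GB = 2.8      # working tree + full history, as quoted
BUILD_GB = 8.0          # lower bound: ">8G without languages"


def history_overhead_gb(clone_gb, checkout_gb):
    """Extra disk space a full clone needs compared to a plain checkout."""
    return clone_gb - checkout_gb


def overhead_share(overhead_gb, build_gb):
    """Fraction of (build + history) disk usage that is history."""
    return overhead_gb / (build_gb + overhead_gb)


overhead = history_overhead_gb(GIT_CLONE_GB, CVS_CHECKOUT_GB)
share = overhead_share(overhead, BUILD_GB)

print(f"history overhead: {overhead:.1f} GB")
print(f"share of build + history: at most {share:.0%}")
```

Because the build needs more than 8G, the computed share is an upper bound.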

> What might help here is splitting repository into current (e.g. from
> OOo 2.0) and historical part,

No, I don't want this ;-)

> and / or using shallow clone. Implementing 
> partial checkouts, i.e. checking out only part of working area (and
> using 'theirs' strategy for merging not-checked-out part for merges)
> would help. Splitting repository into submodules, and submodule
> support -- it depends on organization of OOo sources, would certainly
> help for third party stuff repository.

It would be better to split the OOo sources; that process has already started
[UNO runtime environment vs. OOo without URE], and I have already proposed
some more changes.

> 'Checkout time' (which should be renamed to 'Initial checkout time'),
> in which git also loses with 130 minutes (Linux, 2MBit DSL) [from
> go-oo.org], 100min (Linux, 2MBit DSL, Wireless, no proxy) [from
> go-oo.org] versus 117 minutes (Linux, 2MBit DSL), 26 minutes (Linux,
> 2MBit DSL, with compression (-z 6)) for CVS, and  60 Minutes (Windows,
> 34Mbit Line) for Subversion, would also be helped by the above.

Good point, and I already changed the page in the morning.  I also added the
checkout time that I got over a fast line [it was 44min].
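
As a sanity check on the DSL numbers, here is a small sketch of the line-rate
floor for the initial clone.  It assumes the clone transfers roughly the
packed repository size quoted above (1.3G) and ignores protocol overhead; the
helper name is hypothetical, and this is an estimate, not a measurement:

```python
def min_transfer_minutes(size_gb, line_mbit_per_s):
    """Lower bound on transfer time at full line rate, no protocol overhead."""
    size_mbit = size_gb * 1024 * 8          # GB -> Mbit
    return size_mbit / line_mbit_per_s / 60  # seconds -> minutes


GIT_REPO_GB = 1.3   # packed repository size quoted in this thread (assumed transfer size)
DSL_MBIT = 2.0      # nominal speed of the 2MBit DSL line

floor_min = min_transfer_minutes(GIT_REPO_GB, DSL_MBIT)

# Compare the two git timings reported over 2MBit DSL with the floor.
for measured in (130, 100):
    ratio = measured / floor_min
    print(f"measured {measured} min, line-rate floor {floor_min:.0f} min, "
          f"ratio {ratio:.2f}")
```

Under these assumptions the DSL timings sit fairly close to what the line
itself allows, which suggests the network, not git, dominates them.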

> What I'm really concerned about is branch switch and merging branches,
> when one of the branches is an old one (e.g. unxsplash branch), which
> takes 3min (!) according to the benchmark. 13-25sec for commit is also
> a bit long, but BRANCH SWITCHING which takes 3 MINUTES!? There is no
> comparison benchmark for CVS or Subversion, though...

I am really curious about the SVN tree.  As I said, I did not see it yet.
There is just some info about it here:
http://wiki.services.openoffice.org/wiki/SVNMigration, but I cannot check it
now, the Wiki is down :-(

> Comparison / benchmark lacks some crucial info, like what computer was
> used (CPU, RAM, HDD), what filesystem was used, git version etc. It
> does have commands used for tests (benchmarks).

For the git tests, it was:

CPU: AMD Athlon(tm) 64 Processor 3200+

RAM: 1G RAM

Disk (info from bonnie):
              ---Sequential Output (nosync)--- ---Sequential Input-- --Rnd Seek-
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --04k (03)-
Machine    MB K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU K/sec %CPU   /sec %CPU
one    1*2000 37819 77.6 44296 16.8 16982  5.1 35203 63.9 45915  6.6  152.4  0.4

Filesystem: ext3

> Could you confirm (or deny) those results? go-oo.org uses git 1.4.3.4;
> was there some improvement or bugfix related to the speed of checkout?

Regards,
Jan

