git@vger.kernel.org mailing list mirror (one of many)
From: "Jon Smirl" <jonsmirl@gmail.com>
To: "Johannes Schindelin" <Johannes.Schindelin@gmx.de>
Cc: "Nicolas Pitre" <nico@cam.org>, "Jan Holesovsky" <kendy@suse.cz>,
	"Jakub Narebski" <jnareb@gmail.com>,
	git@vger.kernel.org, "Junio C Hamano" <gitster@pobox.com>
Subject: Re: [PATCH] RFC: git lazy clone proof-of-concept
Date: Tue, 12 Feb 2008 16:25:58 -0500
Message-ID: <9e4733910802121325p7ce6b58axae71f698f76dbfd2@mail.gmail.com>
In-Reply-To: <alpine.LSU.1.00.0802122036150.3870@racer.site>

On 2/12/08, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Sun, 10 Feb 2008, Johannes Schindelin wrote:
>
> > $ /usr/bin/time git repack -a -d -f --window=150 --depth=150
> > Counting objects: 2477715, done.
> > Compressing objects:  19% (481551/2411764)
> > Compressing objects:  19% (482333/2411764)
> > fatal: Out of memory, malloc failed
> > Command exited with non-zero status 1
> > 7118.37user 54.15system 2:01:44elapsed 98%CPU (0avgtext+0avgdata
> > 0maxresident)k
> > 0inputs+0outputs (29834major+17122977minor)pagefaults 0swaps
>
> I made the window much smaller (512 megabyte), and it still runs, after 27
> hours:
>
> Compressing objects:  20% (484132/2411764)
>
> However, it seems that it only worked on about 4000 objects in the last
> 20(!) hours.

I found the same thing with gcc. 95% went down in no time and the last
5% took two hours. The 5% that got stuck were chains with 2000+ entries.

The neat thing about the multithread code is that it will keep
splitting the workload. That lets all of the easy deltas finish
instead of getting stuck behind the problem objects.

With a quad core on gcc, one core would get stuck on the problem
objects. The other three would finish their lists and start splitting
the problem list. This effectively sorts the problems to the end of
the workload. By printing each object's hash as it is completed, you
can easily identify the problem objects. If I recall correctly, on gcc
the problem was a configure file that had 2000 entries in its delta
chain. That one delta chain took over an hour to process.
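
To make the splitting behaviour concrete, here is a toy simulation of
it. It is only a sketch under stated assumptions, not the pack-objects
code: the function name simulate, the per-object costs, and the rule
"an idle worker takes the second half of the largest remaining slice"
are all made up for illustration.

```python
import heapq


def simulate(costs, n_workers):
    """Return object indices in the order they finish.

    costs[i] is a hypothetical time needed to delta-compress object i.
    Each worker owns a contiguous slice of the object list; an idle
    worker takes the second half of the largest remaining slice.
    """
    n = len(costs)
    step = -(-n // n_workers)  # ceiling division
    # segments[w] = [next unstarted index, end of slice (exclusive)]
    segments = [[min(w * step, n), min((w + 1) * step, n)]
                for w in range(n_workers)]
    events = []    # min-heap of (finish_time, worker, object_index)
    finished = []

    def start(w, now):
        lo, hi = segments[w]
        if lo >= hi:
            # Own slice is empty: split the largest remaining slice.
            victim = max(range(n_workers),
                         key=lambda v: segments[v][1] - segments[v][0])
            vlo, vhi = segments[victim]
            if vhi - vlo < 1:
                return  # no unstarted work left anywhere
            mid = (vlo + vhi) // 2
            segments[victim][1] = mid
            segments[w] = [mid, vhi]
            lo, hi = segments[w]
        segments[w][0] = lo + 1
        heapq.heappush(events, (now + costs[lo], w, lo))

    for w in range(n_workers):
        start(w, 0.0)
    while events:
        now, w, obj = heapq.heappop(events)
        finished.append(obj)  # the "print the hash as it completes" step
        start(w, now)
    return finished


# 16 cheap objects, except object 2, which stands in for a long chain.
costs = [1] * 16
costs[2] = 1000
order = simulate(costs, 4)
print(order)
```

In this model the worker that hits object 2 stays busy on it while the
others drain their own slices and then take over what is left of its
slice, so object 2 is the last index in the printed completion order.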

Could there be an N-squared type problem when 2000-entry delta chains
are encountered? Maybe something that just isn't noticeable when
depth/window=50. Has testing been done with really long object chains
to make sure that only the minimal amount of work is being done? It
seems like something is breaking down when the chain length exceeds
the window size.
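
As a back-of-the-envelope illustration of why such a problem would go
unnoticed at depth 50, the sketch below assumes a purely hypothetical
cost model in which handling entry k of a chain costs k units of work.
Nothing here is measured from git; the function name and the cost
model are assumptions made only to show the arithmetic.

```python
def quadratic_chain_cost(chain_length):
    # Hypothetical model: entry k costs k units, so the whole chain
    # costs 1 + 2 + ... + chain_length units.
    return chain_length * (chain_length + 1) // 2


short = quadratic_chain_cost(50)     # 1275 units
long_ = quadratic_chain_cost(2000)   # 2001000 units

# The chain is 40 times longer, but the modelled work grows far more.
print(2000 // 50, long_ // short)
```

Under that assumption a 2000-entry chain costs roughly 1569 times as
much as a 50-entry one, although it is only 40 times longer.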

> So, the first 19% were relatively quick.  The next percent
> not at all.
>
> Will keep you posted,
> Dscho


-- 
Jon Smirl
jonsmirl@gmail.com

