git@vger.kernel.org mailing list mirror (one of many)
From: Nicolas Pitre <nico@cam.org>
To: David Kastrup <dak@gnu.org>
Cc: git@vger.kernel.org
Subject: Re: performance on repack
Date: Thu, 16 Aug 2007 11:34:19 -0400 (EDT)
Message-ID: <alpine.LFD.0.999.0708161103080.16727@xanadu.home>
In-Reply-To: <86d4xn5287.fsf@lola.quinscape.zz>

On Thu, 16 Aug 2007, David Kastrup wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > On Wed, 15 Aug 2007, Jon Smirl wrote:
> >
> >> On 8/15/07, Martin Koegler <mkoegler@auto.tuwien.ac.at> wrote:
> >> > git-pack-objects knows the order, in which it will use the objects.  A
> >> > separate thread could pre-read the next object and wait until the main
> >> > thread starts processing it. After the read is complete, another
> >> > thread could start computing the delta index.
> >> 
> >> The hope is that the new adaptive read ahead code in the kernel will
> >> get this right and you won't need the second thread. Letting the
> >> kernel handle the read ahead will dynamically scale as other demands
> >> are made on the host. There's effectively only one read ahead cache in
> >> the system, only the kernel really knows how to divide it up between
> >> competing apps.
> >
> > No read ahead will ever help the delta search phase.
> 
> Well, the delta search phase consists of computing a delta index and
> then matching against it.

No, what I mean is what happens at a higher level, where one object is 
deltified against several base object candidates to find the best match.  
Those candidates are presorted according to a combination of 
heuristics that makes their actual access order effectively random, so no 
kernel read-ahead can help here.
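
For illustration only, here is a minimal sketch of that higher-level loop. It is not git's actual code: the sort key (basename, then largest first), the `offset` field, the window size, and the `delta_cost` function are all assumptions standing in for the real heuristics and the real delta encoder. It shows each object being tried against a window of previously sorted candidates, and that the sorted order visits on-disk offsets out of sequence.

```python
import os
from collections import deque
from difflib import SequenceMatcher


def delta_cost(base: bytes, target: bytes) -> int:
    """Hypothetical stand-in for a delta encoder: bytes of target not found in base."""
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return len(target) - matched


def sort_key(obj):
    # Assumed heuristic: group similar names together, largest object first.
    return (os.path.basename(obj["path"]), -len(obj["data"]))


def find_deltas(objects, window=10):
    """Try each object against the last `window` objects in sorted order."""
    ordered = sorted(objects, key=sort_key)
    recent = deque(maxlen=window)
    choices = {}
    for target in ordered:
        best = None
        for base in recent:
            cost = delta_cost(base["data"], target["data"])
            # Keep the cheapest base that beats storing the object whole.
            if cost < len(target["data"]) and (best is None or cost < best[0]):
                best = (cost, base["path"])
        choices[target["path"]] = best[1] if best else None
        recent.append(target)
    return ordered, choices


# Made-up objects; "offset" is an assumed position in the existing storage.
objects = [
    {"path": "src/main.c", "data": b"int main(void) { return 0; }\n", "offset": 0},
    {"path": "doc/README", "data": b"hello world\n", "offset": 100},
    {"path": "old/main.c", "data": b"int main(void) { return 1; }\n", "offset": 200},
    {"path": "doc/old/README", "data": b"hello world, again\n", "offset": 300},
]

ordered, choices = find_deltas(objects)
offsets = [obj["offset"] for obj in ordered]
print(offsets)   # order in which storage is touched
print(choices)   # chosen delta base per object, or None
```

In this toy data set the sorted order jumps back and forth across the stored offsets, which is the access pattern a sequential read-ahead cannot anticipate.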

> If I understand correctly, delta indices
> for the search window are kept, and the current file is compared
> against them.  Locality might be better if just one delta index gets
> calculated and then compared with all _upcoming_ delta candidates in
> one go.

This appears so obvious that I already attempted it a while ago.

The idea turned out to be so complex to implement correctly, and produced 
such suboptimal results in practice, that I abandoned it.

See http://marc.info/?l=git&m=114610715706599&w=2 for the details if 
you're interested.

PS: please at least CC me when replying to my mails.


Nicolas
