From: "Jon Smirl" <jonsmirl@gmail.com>
To: "Jakub Narebski" <jnareb@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: git-daemon on NSLU2
Date: Fri, 24 Aug 2007 19:46:50 -0400
Message-ID: <9e4733910708241646x7b285574t94c3d7eb32bb60c9@mail.gmail.com>
In-Reply-To: <fanmmk$f5q$1@sea.gmane.org>

On 8/24/07, Jakub Narebski <jnareb@gmail.com> wrote:
> There was an idea to special-case clone (just concatenate the packs; the
> receiving side, as someone said, can detect pack boundaries; do not
> forget to pack loose objects first) instead of using the generic fetch --all
> for clone, but no code. Code speaks louder than words (although if someone
> would provide details of pack boundary detection...)

A related concept: an initial clone of a repository does the equivalent
of repack -a on the repo before transmitting it. Why aren't we saving
those results by switching the repo over to the new pack file? Then the
next clone that comes along won't have to do anything but send the
file.

But this logic can be flipped around: if the remote needs any object
from a pack file, just send them the whole pack file and let the
remote sort it out. With this rule you can still minimize the IO,
statistically speaking.
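
To make the rule concrete, here is a minimal sketch in Python. It is
not how git-daemon works today; the function names and the dict-of-sets
representation of packs are assumptions made up for illustration only.

```python
from typing import Dict, List, Set


def packs_to_send(needed: Set[str], packs: Dict[str, Set[str]]) -> List[str]:
    """Return the names of packs that hold at least one needed object.

    `packs` maps a (hypothetical) pack name to the set of object ids it
    contains. Any pack that overlaps `needed` is sent whole.
    """
    return sorted(name for name, objs in packs.items() if objs & needed)


def surplus_objects(needed: Set[str], packs: Dict[str, Set[str]]) -> Set[str]:
    """Objects the remote receives without having asked for them."""
    sent: Set[str] = set()
    for name in packs_to_send(needed, packs):
        sent |= packs[name]
    return sent - needed
```

The surplus set is the price of the rule: the server does no per-object
selection, and the remote gets a few objects it did not strictly need.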

When a remote does a fetch you have to pack all of the loose objects.
When the loose object pile reaches 20MB or so, the fetch can trigger a
repack of the oldest half into a pack that is kept by the tree and
replaces those older loose objects. For future fetches simply apply
the rule of sending the whole pack if any object is needed.
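
A rough sketch of the trigger, again in Python. The 20MB threshold
comes from the paragraph above; reading "oldest half" as half of the
total bytes, ordered by modification time, is my assumption, as are
the names LooseObject and oldest_half.

```python
from typing import List, NamedTuple


class LooseObject(NamedTuple):
    name: str     # object id
    size: int     # size on disk, in bytes
    mtime: float  # modification time, used as the age of the object


THRESHOLD = 20 * 1024 * 1024  # "20MB or so"


def oldest_half(loose: List[LooseObject],
                threshold: int = THRESHOLD) -> List[LooseObject]:
    """Pick the oldest loose objects making up about half of the bytes.

    Returns an empty list while the pile is still below the threshold.
    """
    total = sum(obj.size for obj in loose)
    if total < threshold:
        return []
    picked: List[LooseObject] = []
    accumulated = 0
    for obj in sorted(loose, key=lambda o: o.mtime):
        if accumulated >= total / 2:
            break
        picked.append(obj)
        accumulated += obj.size
    return picked
```

The objects returned are the ones that would go into the new kept
pack; everything newer stays loose until the pile fills up again.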

The repack of the 10MB of older objects can be kicked out to another
process and copied into the tree when it is finished. At that point
the loose objects can be deleted. The git db can tolerate a process
copying in a new packfile and deleting the old objects while other
processes may be using the database, right?
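
The ordering I have in mind looks roughly like the Python sketch
below. It is an assumption about what would be safe, not a description
of what git does: each file is copied under a temporary name and
renamed into place, the index goes in after the pack, and the loose
objects are deleted only once both are visible. The function names
are hypothetical and fsync handling is left out.

```python
import os
import shutil
import tempfile
from typing import Iterable


def install_atomically(src: str, dest_dir: str) -> str:
    """Copy src into dest_dir under a temp name, then rename into place.

    The rename is atomic on POSIX when both names are on the same
    filesystem, so readers never see a half-written file.
    """
    fd, tmp = tempfile.mkstemp(dir=dest_dir, prefix="tmp_")
    os.close(fd)
    shutil.copyfile(src, tmp)
    final = os.path.join(dest_dir, os.path.basename(src))
    os.replace(tmp, final)
    return final


def swap_in_pack(pack: str, idx: str, pack_dir: str,
                 loose_paths: Iterable[str]) -> None:
    """Install the pack, then its index, then drop the loose objects."""
    install_atomically(pack, pack_dir)
    install_atomically(idx, pack_dir)
    for path in loose_paths:
        os.remove(path)
```

At every point in this sequence each object is reachable either as a
loose file or through the new pack, which is the property the question
above depends on.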

Statistically, this model shouldn't change the amount of data sent
very much. If you haven't synced your tree in a month, a few too many
objects may get sent to you. However, it should dramatically reduce
the IO load on the server caused by git protocol initial clones.

-- 
Jon Smirl
jonsmirl@gmail.com
