git@vger.kernel.org mailing list mirror (one of many)
From: Jakub Narebski <jnareb@gmail.com>
To: Jeff King <peff@peff.net>, Git Mailing List <git@vger.kernel.org>
Cc: Jon Smirl <jonsmirl@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Nicolas Pitre <nico@cam.org>,
	"Shawn O. Pearce" <spearce@spearce.org>
Subject: Re: git-daemon on NSLU2
Date: Mon, 27 Aug 2007 02:14:28 +0200
Message-ID: <200708270214.28652.jnareb@gmail.com>
In-Reply-To: <20070826093331.GC30474@coredump.intra.peff.net>

On Sun, Aug 26, 2007, Jeff King wrote:
> On Sat, Aug 25, 2007 at 11:44:07AM -0400, Jon Smirl wrote:
> 
>> A very simple solution is to sendfile() existing packs if they contain
>> any objects that the client wants and let the client deal with the
>> unwanted objects. Yes this does send extra traffic over the net, but
>> the only group significantly impacted is #2 which is the most
>> infrequent group.
>>
>> Loose objects are handled as they are currently. To optimize this
>> scheme you need to let the loose objects build up at the server and
>> then periodically sweep only the older ones into a pack. Packing the
>> entire repo into a single pack would cause recent fetches to retrieve
>> the entire pack.
> 
> I was about to write "but then 'fetch recent' clients will have to get
> the entire repo after the upstream does a 'git-repack -a -d'" but you
> seem to have figured that out already.
> 
> I'm unclear: are you proposing new behavior for git-daemon in general,
> or a special mode for resource-constrained servers? If general behavior,
> are you suggesting that we never use 'git-repack -a' on repos which
> might be cloned?

I think that the "reuse existing packs if sensible" idea (instead of always
generating a new pack) is a good one, even if at first limited to the
clone case.

There are nevertheless a few complications.

1. When this idea was discussed on the git mailing list some time ago,
somebody said that we don't need to implement the "multi pack" extension
(which was in the design from the beginning, to be added later, if I
understand correctly); it is enough to concatenate packs. The receiving
side can then detect the boundaries between packs and split them
appropriately. But is a concatenation of packs a proper pack? If not, then
we can send a concatenation of packs only if the client (receiving side)
understands it and can split it; that means checking for a protocol
extension...

2. How to detect that a request is for a clone? git-clone gets all remote
heads and then fetches from the just-received heads. But because fetching
refs and fetching objects are separate, I think we cannot use this sequence
to detect that we want a clone. We can use "no haves" as a heuristic to
detect a clone request, but "no haves" also occurs for the initial fetch of
a single branch (i.e. using the git-remote; git-fetch sequence instead of
git-clone).

3. The problem with alternates mentioned by Linus is not much of a problem,
as we can simply also consider packs from the alternate
repository/repositories. For example, if we use a single alternate, we
would send the concatenation of packs from this repository and from the
alternate (plus a pack of loose objects from this repository).
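
To make point 1 concrete, here is a rough sketch of how a receiver might find the boundaries in a concatenation of packs. It assumes the usual pack layout as I understand it: a 12-byte header ("PACK", version, object count), the object entries, then a 20-byte SHA-1 of everything before it. Under that layout a plain concatenation is not itself one valid pack, because the first header's count and checksum cover only the first pack. The helper names (make_pack, pack_end, split_packs) are made up for this illustration and are not git functions; make_pack exists only to produce test input.

```python
import hashlib
import struct
import zlib


def make_pack(objects):
    """Build a minimal version-2 pack from (type_code, payload) pairs, no deltas."""
    body = b"PACK" + struct.pack(">II", 2, len(objects))
    for type_code, payload in objects:
        size = len(payload)
        first = (type_code << 4) | (size & 0x0F)  # type in bits 6-4, low size bits
        size >>= 4
        header = bytearray()
        while size:                               # remaining size, 7 bits per byte
            header.append(first | 0x80)
            first = size & 0x7F
            size >>= 7
        header.append(first)
        body += bytes(header) + zlib.compress(payload)
    return body + hashlib.sha1(body).digest()


def pack_end(data, start):
    """Return the offset just past the pack that begins at `start`."""
    if data[start:start + 4] != b"PACK":
        raise ValueError("no pack signature at offset %d" % start)
    _version, count = struct.unpack(">II", data[start + 4:start + 12])
    pos = start + 12
    for _ in range(count):
        byte = data[pos]
        pos += 1
        obj_type = (byte >> 4) & 7
        while byte & 0x80:                # skip the rest of the size varint
            byte = data[pos]
            pos += 1
        if obj_type == 6:                 # OFS_DELTA: variable-length base offset
            byte = data[pos]
            pos += 1
            while byte & 0x80:
                byte = data[pos]
                pos += 1
        elif obj_type == 7:               # REF_DELTA: 20-byte base object id
            pos += 20
        # The entry header does not record the compressed length, so inflate
        # the stream and see how many bytes were left over (quadratic, but
        # good enough for a sketch).
        inflater = zlib.decompressobj()
        inflater.decompress(data[pos:])
        if not inflater.eof:
            raise ValueError("truncated object data")
        pos = len(data) - len(inflater.unused_data)
    end = pos + 20
    if hashlib.sha1(data[start:pos]).digest() != data[pos:end]:
        raise ValueError("pack checksum mismatch")
    return end


def split_packs(data):
    """Split a byte string holding back-to-back packs into the individual packs."""
    packs, pos = [], 0
    while pos < len(data):
        end = pack_end(data, pos)
        packs.append(data[pos:end])
        pos = end
    return packs
```

Calling split_packs(a + b) on two packs built with make_pack should give back [a, b]. The point of the sketch is only that the receiver has to walk every object entry to find where a pack ends, so it must know in advance that it may get more than one pack.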

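For point 2, the "no haves" heuristic and its weakness can be sketched as follows. This assumes the request has already been decoded into plain "want <id>" / "have <id>" lines (pkt-line framing and capabilities are ignored); looks_like_clone is a hypothetical name, not something in git-daemon or upload-pack.

```python
def looks_like_clone(request_lines):
    """Heuristic from the text: the client wants something but has nothing."""
    wants = [line for line in request_lines if line.startswith("want ")]
    haves = [line for line in request_lines if line.startswith("have ")]
    return bool(wants) and not haves


# Made-up object ids, for illustration only.
ID_A, ID_B, ID_C = "a" * 40, "b" * 40, "c" * 40

clone_request = ["want " + ID_A, "want " + ID_B]        # all heads, no haves
first_branch_fetch = ["want " + ID_A]                   # one head, no haves
incremental_fetch = ["want " + ID_A, "have " + ID_C]    # client already has history
```

Both clone_request and first_branch_fetch are classified as clones here, which is exactly the ambiguity described above: the heuristic cannot tell a full clone from the initial fetch of a single branch.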

We would probably want to have some heuristic (besides configuring
git-daemon) to choose between reusing existing packs (and sending them
concatenated) and generating a pack for sending. Note that for dumb
transports we have the opposite problem and the opposite idea: we always
send full packs for dumb transports; the idea was to use range downloading
(available at least for the http and ftp protocols) to download only the
needed fragments of packs. Perhaps if some % of a pack (by number of objects
in the pack or by size of the pack) is to be sent, then we reuse the pack
and remove the objects in the pack from consideration. No idea how to
implement that, though. Or if the number of objects in a pack to be sent
crosses some threshold, or generating a pack/doing reachability analysis
takes too long, then reuse existing packs.
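
One possible shape for the "some % of the pack" heuristic, purely as a sketch: treat each existing pack as a set of object ids, reuse a pack when a large enough fraction of it is wanted, and drop its objects from further consideration. The function name plan_transfer, the 0.8 threshold, and the choice to look at bigger packs first are all assumptions of mine, and the sketch takes the set of wanted objects as given, so it does not avoid the reachability analysis.

```python
def plan_transfer(wanted, packs, threshold=0.8):
    """Decide which existing packs to send as-is.

    wanted:    set of object ids the client needs
    packs:     dict mapping pack name -> set of object ids in that pack
    threshold: minimum fraction of a pack that must be needed to reuse it
    Returns (names of packs to reuse, objects that still need a fresh pack).
    """
    remaining = set(wanted)
    reused = []
    # Consider bigger packs first; that ordering is an arbitrary choice here.
    by_size = sorted(packs.items(), key=lambda item: len(item[1]), reverse=True)
    for name, objects in by_size:
        if not objects:
            continue
        needed = len(objects & remaining)
        if needed / len(objects) >= threshold:
            reused.append(name)
            remaining -= objects      # these objects are now covered
    return reused, remaining


example_packs = {
    "old": {"a", "b", "c", "d"},
    "recent": {"e", "f", "g", "h", "i"},
}
example_wanted = {"a", "b", "c", "d", "e", "x"}
reused, leftover = plan_transfer(example_wanted, example_packs)
```

In this example the "old" pack is wanted in full and gets reused, while only one object of "recent" is wanted, so that object and the loose object "x" are left for a freshly generated pack.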

Or you can wait for the GitTorrent protocol to be implemented, or implement
it yourself... ;-)

-- 
Jakub Narebski
Poland

