git@vger.kernel.org mailing list mirror (one of many)
From: ebiederm@xmission.com (Eric W. Biederman)
To: Linus Torvalds <torvalds@osdl.org>
Cc: Junio C Hamano <junkio@cox.net>, git@vger.kernel.org
Subject: Re: [PATCH] rev-list: add "--full-objects" flag.
Date: Sat, 09 Jul 2005 15:09:02 -0600
Message-ID: <m1pstrr8k1.fsf@ebiederm.dsl.xmission.com>
In-Reply-To: <Pine.LNX.4.58.0507071928220.25104@g5.osdl.org> (Linus Torvalds's message of "Thu, 7 Jul 2005 19:39:23 -0700 (PDT)")

Linus Torvalds <torvalds@osdl.org> writes:

> On Thu, 7 Jul 2005, Junio C Hamano wrote:
>> 
>> However it does not automatically mean that the avenue I have
>> been pursuing would not work; the server side preparation needs
>> to be a bit more careful than what I sent, which unconditionally
>> runs "prune-packed".  It instead should leave the files that
>> "--whole-trees" would have packed as plain SHA1 files, so that
>> the bulk is obtained by statically generated packs and the rest
>> can be handled in the commit-chain walker as before.

> The "fetch one object, parse it, fetch the next one, parse that.." 
> approach is just horrible.

Agreed.  That does nothing to hide latency, and depending on the
parsing cost it can even leave your network connection idle for a
noticeable amount of time.

> I ended up preferring the "rsync" thing even though rsync sucked badly on
> big object stores too, if only because when rsync got working, it at least
> nicely pipelined the transfers, and would transfer things ten times faster
> than git-ssh-pull did (maybe I'm exaggerating, but I don't think so, it
> really felt that way).

This feels to me like an implementation issue (no pipelining) rather
than a design issue (pipelining is impossible).
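
To make the distinction concrete, here is a minimal sketch that only
counts round trips.  The object graph and its names are made up for
illustration, and the model assumes every request costs one round trip
and that a batch of requests can share one.  It is not how
git-ssh-pull works, just a toy comparison of the two strategies.

```python
def serial_round_trips(graph, root):
    """Fetch one object, parse it, then fetch the next one.

    Every object costs its own round trip.
    """
    seen, stack, trips = set(), [root], 0
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        trips += 1                    # one request, one wait
        stack.extend(graph[obj])      # parsing reveals what to ask for next
    return trips


def pipelined_round_trips(graph, root):
    """Request every object already known to be needed in one batch.

    The number of waits is then bounded by the depth of the graph,
    not by the number of objects.
    """
    seen, frontier, trips = set(), {root}, 0
    while frontier:
        trips += 1                    # the whole batch shares one wait
        seen |= frontier
        frontier = {c for obj in frontier for c in graph[obj]} - seen
    return trips


# Hypothetical graph: a commit, its tree, one subtree and three blobs.
EXAMPLE_GRAPH = {
    "commit": ["tree"],
    "tree": ["blob_a", "blob_b", "subtree"],
    "subtree": ["blob_c"],
    "blob_a": [],
    "blob_b": [],
    "blob_c": [],
}
```

On this toy graph the serial walker waits six times and the batched one
four times; the gap grows with the width of the trees.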

> And the thing is, if you purely follow one tree (which is likely the
> common case for a lot of users), then you are actually always likely
> better off with the "mirror it" model. Which is _not_ a good model for
> developers (for example, me rsync'ing from Jeff's kernel repository always
> got me hundreds of useless objects), but it's fine for somebody who
> actually just wants to track somebody else.

I assume the problem with the "mirror it" model was simply that there
were too many objects?

> And then you really can use just rsync or wget or ncftpget or anything
> else that has a "fetch recursively, optimizing existing objects" mode.

Sane.  But with an intelligent fetcher and a little extra information
on a dumb server, it should still be possible to avoid fetching
branches we care nothing about.  I think that extra information is
simply the commit object graph and which packs those commit objects
are in.  I assume the commit graph information will be fairly modest.

Once you have that extra information you can generate incremental
packs whenever you upload to the server, and you can make the
incremental packs per branch.

That should allow a dumb fetcher to look at the list of commits
and just fetch those packs it cares about, and since it only has
to look in one place first it should be fairly sane.
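
As a rough sketch of that lookup, assume the server publishes two
small tables: each commit's parents, and the pack each commit lives
in.  The table layout, the function name and the pack names below are
all hypothetical; nothing like this exists in git today.

```python
def packs_to_fetch(parents, pack_of, head, have):
    """Return the packs needed to obtain `head`, newest first.

    parents : dict mapping commit id -> list of parent commit ids
    pack_of : dict mapping commit id -> name of the pack holding it
    head    : the branch head we want
    have    : set of commit ids already present locally
    """
    needed, stack, seen = [], [head], set(have)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue                  # already local, or already visited
        seen.add(commit)
        pack = pack_of[commit]
        if pack not in needed:
            needed.append(pack)
        stack.extend(parents[commit])
    return needed


# Hypothetical server-side index: a master branch c1 <- c2 <- c3 and a
# topic branch x1 forked from c1, with one incremental pack per upload.
PARENTS = {"c1": [], "c2": ["c1"], "c3": ["c2"], "x1": ["c1"]}
PACK_OF = {
    "c1": "pack-base",
    "c2": "pack-master-1",
    "c3": "pack-master-2",
    "x1": "pack-topic-1",
}
```

With `have={"c1"}` and `head="c3"` this returns only the two master
packs, and the topic pack is never touched.  With an empty `have` it
also returns the base pack.  The whole list is known before the first
pack is requested, so the transfers can be issued back to back.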

The core idea is that if the dumb-server preparation can anticipate
common access patterns (mirror a branch) and give enough information
that they can be served cheaply and pipelined, I don't expect it to
be much worse than an intelligent fetcher.

The intelligent fetch currently has the problem that it cannot be
used to bootstrap a repository.  If you don't have an ancestor of
what you are fetching, you can't fetch it.

Eric

