git@vger.kernel.org mailing list mirror (one of many)
From: David Lang <david.lang@digitalinsight.com>
To: Martin Waitz <tali@admingilde.org>
Cc: Junio C Hamano <junkio@cox.net>,
	Josef Weidendorfer <Josef.Weidendorfer@gmx.de>,
	Eric Lesh <eclesh@ucla.edu>, Matthieu Moy <Matthieu.Moy@imag.fr>,
	git@vger.kernel.org
Subject: Re: Submodule object store
Date: Mon, 26 Mar 2007 15:40:15 -0800 (PST)
Message-ID: <Pine.LNX.4.63.0703261530030.14387@qynat.qvtvafvgr.pbz>
In-Reply-To: <20070326235527.GM22773@admingilde.org>

On Tue, 27 Mar 2007, Martin Waitz wrote:

> hoi :)
>
> On Mon, Mar 26, 2007 at 03:20:34PM -0800, David Lang wrote:
>>> I want to be able to list all objects which are not reachable in the
>>> object store, without traversing all submodules at the same time.
>>> The only way I can think of to achieve this is to have one separate
>>> object store per submodule and then do the traversal per submodule.
>>
>> why do you want to optimize for the relatively rare fsck function rather
>> than the more common read functions (which would benefit from sharing
>> objects that are identical between projects)?
>
> Because I don't know how to make it _possible_ for large repositories
> otherwise.  Consider a Linux-distribution which handles each package
> as one submodule.
>
> I don't think that it's too much balanced towards fsck.
> The separated object store also helps reduce the memory requirement for
> large pushes/pulls.
> Sharing objects can be achieved by alternates if you want.

alternates require explicitly setting up the sharing.

using the same object store makes this work automatically (as a trivial
example, think of all the copies of COPYING that would end up being the
same object)
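
the COPYING case works because git names a blob by the SHA-1 of a short
header ("blob <size>" plus a NUL byte) followed by the file contents, so
identical files get the same name no matter which project they sit in. a
minimal sketch of that; the two projects, their file contents and the
shared_store dict are made up for illustration and are not git code:

```python
import hashlib


def blob_id(content: bytes) -> str:
    # Object name git gives a blob: SHA-1 over "blob <size>", a NUL byte,
    # and then the raw content.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


# Hypothetical projects that each ship an identical COPYING file.
license_text = b"This is a stand-in for the license text.\n"
projects = {
    "project-a": {"COPYING": license_text, "main.c": b"int main(void) { return 0; }\n"},
    "project-b": {"COPYING": license_text, "util.c": b"/* helpers */\n"},
}

# One shared store keyed by object id: identical content lands on the same key,
# so the second COPYING costs nothing extra.
shared_store = {}
for files in projects.values():
    for content in files.values():
        shared_store[blob_id(content)] = content

print(len(shared_store))  # 3 objects for 4 files
```

with one store per submodule the same text would be stored once per
submodule unless alternates were set up by hand.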

> If someone comes up with a nice way to handle everything in one big
> object store I would happily use that! :-)

what exactly are the problems with one big object store?

ones that I can think of:

1. when you are doing a fsck you need to walk all the trees and build the
list of objects that you know about.

   stored as a tree of binary values, you can hold a LOT of object names in
memory before running into swap.

   if the list gets much larger than available ram, fsck could grow an
option to keep those trees on disk instead.

2. when creating a pack you will eventually run into pack-size limits with
too many objects

   teach the pack creators to make packs that are subsets rather than
everything (I believe that most of the smarts are there, it just needs the
upper control logic to tell the existing things what to include)

3. when doing a pull it takes longer to figure out what to pull to get a 
duplicate of _everything_

   add a way to do a 'pull projectlist' that would look at what objects are 
needed by the project(s) requested and only try to pack up those objects
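
as a rough sketch of points 1 and 3, the same reachability walk can answer
both questions against one shared store: what is unreachable (fsck) and what
a given list of projects needs (the 'pull projectlist' idea). everything
here is a toy model, not git code: the store is a plain dict from object id
to the ids it references, and the names reachable, unreachable and
objects_for are invented for the example.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical toy store: object id -> ids it references
# (a commit points at a tree, a tree points at blobs and subtrees).
Store = Dict[str, List[str]]


def reachable(store: Store, roots: Iterable[str]) -> Set[str]:
    """Collect every object id reachable from the given roots."""
    seen: Set[str] = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue  # shared objects are visited only once
        seen.add(oid)
        stack.extend(store.get(oid, []))
    return seen


def unreachable(store: Store, project_roots: Dict[str, List[str]]) -> Set[str]:
    """fsck-style question: which stored objects does no project reach?"""
    all_roots = [r for roots in project_roots.values() for r in roots]
    return set(store) - reachable(store, all_roots)


def objects_for(store: Store, project_roots: Dict[str, List[str]],
                wanted: Iterable[str]) -> Set[str]:
    """'pull projectlist'-style question: objects needed by the named projects."""
    roots = [r for name in wanted for r in project_roots[name]]
    return reachable(store, roots)


store: Store = {
    "commit-a": ["tree-a"],
    "tree-a": ["blob-copying", "blob-a"],
    "commit-b": ["tree-b"],
    "tree-b": ["blob-copying", "blob-b"],
    "blob-copying": [], "blob-a": [], "blob-b": [], "blob-stale": [],
}
project_roots = {"a": ["commit-a"], "b": ["commit-b"]}

print(sorted(unreachable(store, project_roots)))
print(sorted(objects_for(store, project_roots, ["a"])))
```

the seen set is the in-memory list from point 1, and it is the part that
would have to spill to disk for a very large store. the result of
objects_for is also the kind of subset a pack creator could be handed for
point 2.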

what else is there that I'm not thinking of? so far these look like long-term 
problems as opposed to short-term problems, and all of them have fairly simple 
fixes that can be implemented as they become an issue.

David Lang
