git@vger.kernel.org mailing list mirror (one of many)
From: Jeff King <peff@peff.net>
To: Stefan Beller <sbeller@google.com>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	git <git@vger.kernel.org>,
	git-users@googlegroups.com,
	"Christian Couder" <christian.couder@gmail.com>
Subject: Re: How de-duplicate similar repositories with alternates
Date: Tue, 4 Dec 2018 02:06:02 -0500
Message-ID: <20181204070602.GB11010@sigill.intra.peff.net>
In-Reply-To: <CAGZ79ka1sjU+rHctRP4SVMP0GQsK2iZghkU46=f96ugqvX5Neg@mail.gmail.com>

On Thu, Nov 29, 2018 at 10:55:49AM -0800, Stefan Beller wrote:

> On Thu, Nov 29, 2018 at 7:00 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
> >
> > A co-worker asked me today how space could be saved when you have
> > multiple checkouts of the same repository (at different revs) on the
> > same machine. I said since these won't block-level de-duplicate well[1]
> > one way to do this is with alternates.
> 
> Another way is to use git-worktree, which would solve the gc issues
> mentioned below?
> 
> I view alternates as a historic artefact as the deduping
> of objects client side can be done using worktrees, and on the
> serverside - I think - most of the git hosters use namespaces
> and put a fork network into the same repository and use pack islands.

Nope, we definitely use alternates. The ref namespace support in Git is
not nearly complete enough to run a modern hosting site; it only kicks
in for upload-pack and receive-pack. Other commands (e.g., rev-list to
traverse for a history-view page) have no support at all. So we share
object storage, but not ref storage.

In theory the caller could namespace requests (e.g., the user asks for
"foo", the web site feeds "refs/forks/$id/refs/heads/foo" to git). But
any bugs are a lot more likely to lead to security problems (oops, you
accidentally wrote into somebody else's fork!). And ref storage has
traditionally been a sore point for scaling, so giving each fork its own
repo and refs helps break that up.
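
[Editor's illustration, not part of the original message.] The caller-side
namespacing described above can be sketched as a small mapping function. The
"refs/forks/$id/refs/heads/foo" layout comes from the example in the text; the
function name, the numeric fork id, and the validation rules are assumptions
made for illustration. The point is that the check on the branch name is the
only thing keeping a request inside its own fork's prefix, which is why a bug
here can become a security problem.

```python
import re

# Assumption: fork ids are plain decimal numbers. The prefix layout
# refs/forks/<id>/refs/heads/<branch> follows the example in the text.
_FORK_ID = re.compile(r"[0-9]+")


def namespaced_ref(fork_id: str, branch: str) -> str:
    """Map a user-visible branch name to a ref under one fork's prefix.

    Hypothetical helper: a hosting front end would call something like
    this before handing the ref name to git.
    """
    if not _FORK_ID.fullmatch(fork_id):
        raise ValueError(f"bad fork id: {fork_id!r}")
    # Reject empty, "." and ".." path components so the result cannot
    # point outside this fork's prefix.
    if any(part in ("", ".", "..") for part in branch.split("/")):
        raise ValueError(f"bad branch name: {branch!r}")
    return f"refs/forks/{fork_id}/refs/heads/{branch}"
```

This check is deliberately minimal and is not a substitute for git's own ref
name rules (see git-check-ref-format(1)).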

By contrast, object storage is pretty easy to share. It scales
reasonably well, and the security model is much simpler due to the
immutable nature of object names.
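
[Editor's illustration, not part of the original message.] Sharing object
storage through alternates comes down to one file: objects/info/alternates
inside a repository lists other object directories, one path per line, that
git will also search. A minimal sketch of pointing a fork at a shared object
store, assuming hypothetical directory names and a helper called
add_alternate, might look like this. Each fork keeps its own refs; only the
objects are borrowed.

```python
from pathlib import Path


def add_alternate(repo_git_dir: Path, shared_objects_dir: Path) -> Path:
    """Append shared_objects_dir to <git-dir>/objects/info/alternates.

    Hypothetical helper. The entry must be an object directory
    (".../objects"), not the top level of a repository.
    """
    info_dir = repo_git_dir / "objects" / "info"
    info_dir.mkdir(parents=True, exist_ok=True)
    alternates = info_dir / "alternates"

    entry = str(shared_objects_dir.resolve())
    existing = alternates.read_text().splitlines() if alternates.exists() else []
    # Only add the path once so repeated calls leave the file unchanged.
    if entry not in existing:
        existing.append(entry)
        alternates.write_text("\n".join(existing) + "\n")
    return alternates
```

For a fresh clone, `git clone --reference <repo>` sets up the same file. Note
that a repository borrowing objects this way depends on the shared store
keeping them, so pruning the shared store needs care.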

-Peff
