git@vger.kernel.org mailing list mirror (one of many)
From: Christian Couder <christian.couder@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>, git <git@vger.kernel.org>
Subject: Re: What's cooking in git.git (Aug 2018, #04; Fri, 17)
Date: Sat, 18 Aug 2018 14:54:27 +0200
Message-ID: <CAP8UFD3tyfvBbzUkcLyowNt0jfDR8Bv4MAFmirQPcOQJVBZisg@mail.gmail.com>
In-Reply-To: <876007rjqc.fsf@evledraar.gmail.com>

On Sat, Aug 18, 2018 at 1:34 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Sat, Aug 18 2018, Christian Couder wrote:

> > FYI this has been requested from GitLab by Drupal (as well as others)
> > see https://www.drupal.org/drupalorg/blog/developer-tools-initiative-part-5-gitlab-partnership
> > which contains:
> >
> > "The timeline for Phase 2 is dependent on GitLab’s resolution of a
> > diskspace deduplication issue, which they have committed to on our
> > behalf: https://gitlab.com/gitlab-org/gitlab-ce/issues/23029"
>
> This is not a critique of the delta islands feature, just something I'm
> curious about.
>
> Why is Drupal blocked on something like delta-islands? The blog post
> mentions they have 45k projects, which can be browsed at
> https://cgit.drupalcode.org
>
> Almost all of those are completely independent projects, so they
> wouldn't benefit from delta islands, and I'd guess >98% of them are
> in an obscure long tail and probably won't have even a single fork.
>
> That leaves forks of, say, drupal.git, which is ~150MB. The mirror on
> GitHub has 1500 forks: https://github.com/drupal Even if there were 5000
> forks of that, that would be 750GB of disk space.
>
> So accounting for backups, me being off by a lot etc. let's say that's
> 5TB. That's relatively cheap today. Are they really just holding up
> their GitLab migration plans to save something on the order of that disk
> space, or have I missed something here?
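
The estimate quoted above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the 150 MB repository size, the 5000 forks and the 5 TB budget are the figures from the quoted message, decimal units are assumed, and the function name is made up for illustration. It models the worst case where every fork is a full copy with no deduplication at all.

```python
# Figures taken from the quoted message; everything else is illustrative.
REPO_SIZE_MB = 150   # approximate size of drupal.git
FORK_COUNT = 5000    # deliberately generous number of forks
BUDGET_GB = 5000     # the "let's say that's 5TB" allowance


def naive_fork_storage_gb(repo_size_mb: int, fork_count: int) -> float:
    """Disk use in GB if each fork is stored as a full, separate copy."""
    return repo_size_mb * fork_count / 1000


total_gb = naive_fork_storage_gb(REPO_SIZE_MB, FORK_COUNT)
headroom = BUDGET_GB / total_gb

print(f"naive storage: {total_gb:.0f} GB")
print(f"headroom within 5 TB: {headroom:.1f}x")
```

The 5 TB figure therefore leaves a bit under 7x headroom over the naive total for backups and estimation error.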

I am not sure why. I haven't been in touch directly with Drupal people
and I haven't followed all the discussions with them.

When I discussed this topic with someone responsible for another
significant open source project, he told me that they don't want forks
at all on their self-hosted GitLab instance because of the disk space
and management burden that would come with them. But they would like
people to fork on a separate hosting service like gitlab.com or
github.com (but then be able to send merge/pull requests to their
self-hosted GitLab instance) because they know that developers like to
have their own fork :-)

So my guess is that Drupal people are also afraid of the possible disk
space burden, even if your numbers seem to say that they shouldn't.

> Again, not a critique of delta-islands, because it's most certainly
> useful for the likes of github/gitlab, but I wonder if for this
> particular problem it wouldn't be more straightforward of a solution for
> GitLab to allow anyone to push to
> refs/for-merge/<their-username>/<some-name-they-pick> on any
> repository. Then they could open a MR for an existing branch in the repo
> (which GitLab already supports).

My guess is that people are used to forks on GitHub and they like them
and want to have the same thing and same workflow on GitLab.

The delta islands documentation, though, says that it's possible to use
git namespaces along with delta islands, and if forks are indeed
implemented as namespaces, it will be kind of similar in the GitLab
internals to what you suggest. By the way, there are still discussions
in the GitLab issue referenced above about whether it's better for
GitLab to use git namespaces or alternates to implement deduplication
in forks (along with delta islands).
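
To make the namespace idea a bit more concrete, here is a rough sketch of how refs could be grouped into one island per fork when each fork lives in its own git namespace. It is not GitLab's implementation and not git's own code. It assumes refs are laid out as refs/namespaces/<fork>/..., that an island pattern is matched from the start of the ref name, and that capture groups are joined with "-" to form the island name. The pattern, the function name and the fork names are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical island pattern: one island per namespace, i.e. per fork.
ISLAND_PATTERN = re.compile(r"refs/namespaces/([^/]+)/")


def group_refs_by_island(refs):
    """Map island name -> refs, for refs that match the island pattern."""
    islands = defaultdict(list)
    for ref in refs:
        match = ISLAND_PATTERN.match(ref)  # match() anchors at the start
        if match:
            # Assumption: island name is the capture groups joined by "-".
            islands["-".join(match.groups())].append(ref)
    return dict(islands)


refs = [
    "refs/heads/master",                        # shared, not in a fork island
    "refs/namespaces/alice/refs/heads/master",
    "refs/namespaces/alice/refs/heads/topic",
    "refs/namespaces/bob/refs/heads/master",
]

for island, members in group_refs_by_island(refs).items():
    print(island, members)
```

Under these assumptions each fork's refs end up in their own island, which is the grouping that keeps objects reachable only from one fork from being stored as deltas against objects of another fork.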
