git@vger.kernel.org mailing list mirror (one of many)
From: Lars Schneider <larsxschneider@gmail.com>
To: Christian Couder <christian.couder@gmail.com>
Cc: git <git@vger.kernel.org>, Junio C Hamano <gitster@pobox.com>,
	Jeff King <peff@peff.net>,
	Nguyen Thai Ngoc Duy <pclouds@gmail.com>,
	Mike Hommey <mh@glandium.org>, Eric Wong <e@80x24.org>,
	Christian Couder <chriscool@tuxfamily.org>
Subject: Re: [RFC/PATCH v3 00/16] Add initial experimental external ODB support
Date: Sun, 18 Dec 2016 14:13:11 +0100
Message-ID: <3819FEA8-BF58-47D3-B60D-2840062022B7@gmail.com>
In-Reply-To: <CAP8UFD2uyq3Uf1co_BUKJX_eogdCDJ30KJZmQ1BQXNQ1dw=w3A@mail.gmail.com>


> On 13 Dec 2016, at 18:20, Christian Couder <christian.couder@gmail.com> wrote:
> 
> On Sat, Dec 3, 2016 at 7:47 PM, Lars Schneider <larsxschneider@gmail.com> wrote:
>> 
>>> On 30 Nov 2016, at 22:04, Christian Couder <christian.couder@gmail.com> wrote:
>>> 
>>> Goal
>>> ~~~~
>>> 
>>> Git can store its objects only in the form of loose objects in
>>> separate files or packed objects in a pack file.
>>> 
>>> To be able to better handle some kinds of objects, for example big
>>> blobs, it would be nice if Git could store its objects in other object
>>> databases (ODB).
>> ...
>> 
>> Minor nit: I feel the term "other" could be more expressive. Plus
>> "database" might confuse people. What do you think about
>> "External Object Storage" or something?
> 
> In the current Git code, "DB" is already used a lot. For example in
> cache.h there is:
> ...

I am not worried about Git core developers, as I don't think they would
be confused by the term "DB". I wonder whether it would make sense to have
a clearer "external name" for the average Git user (i.e. non Git devs),
or whether this would just create more confusion.


>>> * Transfer
>>> 
>>> To transfer information about the blobs stored in an external ODB, some
>>> special refs, called "odb refs", similar to replace refs, are used.
>>> 
>>> For now there should be one odb ref per blob. Each ref name should be
>>> refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
>>> in the external odb named <odbname>.
>>> 
>>> These odb refs should all point to a blob that should be stored in the
>>> Git repository and contain information about the blob stored in the
>>> external odb. This information can be specific to the external odb.
>>> The repos can then share this information using commands like:
>>> 
>>> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`
>> 
>> The "odbref" would point to a blob and the blob could contain anything,
>> right? E.g. it could contain an existing GitLFS pointer, right?
>> 
>> version https://git-lfs.github.com/spec/v1
>> oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
>> size 12345
> 
> Yes, but I think that the sha1 should be added. So yes, it could
> easily be made compatible with git LFS.

What do you mean by "the sha1 should be added"? Are you suggesting that
a sha1 be added to GitLFS?
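
To make the question concrete, here is a rough sketch of how the odb ref
naming and the pointer format quoted above might fit together. Everything
in it is an assumption for illustration only: the helper names, the odb
name "magic", and in particular the extra "sha1" line, which is not part of
the Git LFS pointer format and is just one possible reading of "the sha1
should be added". It does not call Git at all; it only builds strings.

```python
import hashlib
import re

SHA1_RE = re.compile(r"[0-9a-f]{40}")

# Pointer text as quoted in this thread (Git LFS style "key value" lines).
POINTER = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393\n"
    "size 12345\n"
)


def git_blob_sha1(data: bytes) -> str:
    """Return the SHA-1 Git assigns to a blob with this content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def odb_ref_name(odb_name: str, sha1: str) -> str:
    """Build refs/odbs/<odbname>/<sha1> as described in the cover letter."""
    if not SHA1_RE.fullmatch(sha1):
        raise ValueError(f"not a 40-hex-digit sha1: {sha1!r}")
    return f"refs/odbs/{odb_name}/{sha1}"


def odb_fetch_refspec(odb_name: str) -> str:
    """Build the refspec used in the 'git fetch' example from the cover letter."""
    pattern = f"refs/odbs/{odb_name}/*"
    return f"{pattern}:{pattern}"


def parse_pointer(text: str) -> dict:
    """Parse 'key value' lines into a dict, skipping blank lines."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def pointer_with_sha1(pointer_text: str, blob_sha1: str) -> str:
    """Append a HYPOTHETICAL 'sha1' line; this is not Git LFS behaviour."""
    if not pointer_text.endswith("\n"):
        pointer_text += "\n"
    return pointer_text + f"sha1 {blob_sha1}\n"
```

With something like this, the blob an odb ref points to would carry both
the sha256 oid that GitLFS uses and the sha1 that Git itself uses to name
the object. Whether that is what you have in mind is exactly my question.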


>>> Future work
>>> ~~~~~~~~~~~
>>> 
>>> I think that the odb refs don't prevent a regular fetch or push from
>>> wanting to send the objects that are managed by an external odb. So I
>>> am interested in suggestions about this problem. I will take a look at
>>> previous discussions and how other mechanisms (shallow clone, bundle
>>> v3, ...) handle this.
>> 
>> If the ODB configuration is stored in the Git repo similar to
>> .gitmodules then every client that clones ODB references would be able
>> to resolve them, right?
> 
> Yeah, but I am not sure that being able to resolve the odb refs will
> prevent the big blobs from being sent.
> With Git LFS, git doesn't know about the big blobs, only about the
> substituted files, but that is not the case in what I am doing.

I think the biggest problem in Git is huge blobs that are not in the
head revision. In the great majority of cases you don't need these blobs,
but you always have to transfer them during clone. That's what GitLFS
is solving today and what I hope your protocol could solve better in
the future!

Cheers,
Lars
