git@vger.kernel.org mailing list mirror (one of many)
From: "Philip Oakley" <philipoakley@iee.org>
To: "Jonathan Nieder" <jrnieder@gmail.com>
Cc: "Jonathan Tan" <jonathantanmy@google.com>, <git@vger.kernel.org>,
	"Ben Peart" <peartben@gmail.com>
Subject: Re: [RFC PATCH 0/3] Partial clone: promised blobs (formerly "missing blobs")
Date: Sat, 29 Jul 2017 13:51:16 +0100
Message-ID: <8EE0108BA72B42EA9494B571DDE2005D@PhilipOakley>
In-Reply-To: 20170717180322.GM93855@aiede.mtv.corp.google.com

From: "Jonathan Nieder" <jrnieder@gmail.com>
Sent: Monday, July 17, 2017 7:03 PM
> Hi Philip,
>
> Philip Oakley wrote:
>> From: "Jonathan Tan" <jonathantanmy@google.com>
>
>>> These patches are part of a set of patches implementing partial clone,
>>> as you can see here:
>>>
>>> https://github.com/jonathantanmy/git/tree/partialclone
> [...]
>> If I understand correctly, this method doesn't give any direct user
>> visibility of missing blobs in the file system. Is that correct?
>>
>> I was hoping that eventually the various 'on demand' approaches
>> would still allow users to continue to work as they go off-line such
>> that they can see directly (in the FS) where the missing blobs (and
>> trees) are located, so that they can continue to commit new work on
>> existing files.
>>
>> I had felt that some sort of 'gitlink' should be present (human
>> readable) as a placeholder for the missing blob/tree, e.g.
>> 'gitblob: 1234abcd' (showing the missing oid, just like sub-modules
>> can do - it's no different really).
>
> That's a reasonable thing to want, but it's a little different from
> the use cases that partial clone work so far has aimed to support.
> They are:
>
> A. Avoiding downloading all blobs (and likely trees as well) that are
>    not needed in the current operation (e.g. checkout).  This blends
>    well with the sparse checkout feature, which allows the current
>    checkout to be fairly small in a large repository.

True. In my case I was looking for a method that would allow a 'Narrow
clone' such that the local repo would be smaller (have less content), but
would feel as if all the useful files/directories were available, and there
would be placeholders at the points where the trees were pruned, both in
the object store and in the user's work-tree.

As you say, in some ways it's conceptually orthogonal to the original sparse
checkout (which has a full-width object store / repo, and then omits files
from the checkout).
>
>    GVFS uses a trick that makes it a little easier to widen a sparse
>    checkout upon access of a directory.  But the same building blocks
>    should work fine with a sparse checkout that has been set up
>    explicitly.
>
> B. Avoiding downloading large blobs, except for those needed in the
>    current operation (e.g. checkout).
>
>    When not using sparse checkout, the main benefit out of the box is
>    avoiding downloading *historical versions* of large blobs.
>

> It sounds like you are looking for a sort of placeholder outside the
> sparse checkout area.

True.

> In a way, that's orthogonal to these patches:
> even if you have all relevant blobs, you may want to avoid inflating
> them to check them out and reading them to compare to the index (i.e.
> the usual benefits of sparse checkout).

In my concept, it should be possible to create the ('sparse'/narrow) index 
from the content of the local object store, without any network connection 
(though that content is determined by the prior fetch/clone;-). The proper 
git sparse checkout could proceed from there as being a further local 
restriction on what is omitted from the worktree.
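
As a rough illustration of that idea (not part of any patch in this
series), the sketch below walks a tree using only objects that are already
present locally and records anything absent by its oid instead of fetching
it. The `store` dictionary, the made-up oids and `build_narrow_index` are
all assumptions for the example, not real git data structures or APIs.

```python
# Toy model only: 'store' stands in for the local object store, mapping a
# made-up oid to either a blob (bytes) or a tree (dict of name -> oid).
# None of these names are real git APIs.

def build_narrow_index(store, tree_oid, prefix=""):
    """Walk a tree using only locally present objects (no network).

    Returns a list of (path, oid, status) in tree order, where status is
    "present" for blobs we hold and "placeholder" for any oid that is
    absent from the local store.
    """
    entries = []
    tree = store[tree_oid]
    for name, oid in sorted(tree.items()):
        path = prefix + name
        obj = store.get(oid)
        if obj is None:
            # Missing blob or tree: keep its oid and stop descending here.
            entries.append((path, oid, "placeholder"))
        elif isinstance(obj, dict):
            entries.extend(build_narrow_index(store, oid, path + "/"))
        else:
            entries.append((path, oid, "present"))
    return entries


store = {
    "t-root": {"README": "b-readme", "src": "t-src", "assets": "t-assets"},
    "t-src": {"main.c": "b-main", "big.bin": "b-big"},
    "b-readme": b"hello\n",
    "b-main": b"int main(void) { return 0; }\n",
    # "t-assets" and "b-big" were never fetched, so they are not in the store.
}

narrow_index = build_narrow_index(store, "t-root")
for path, oid, status in narrow_index:
    print(f"{status:11} {oid:9} {path}")
```

In this toy model a further sparse-checkout step would simply filter
`narrow_index` by path, without needing to know which entries are
placeholders.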

Those missing from the narrow clone would still show as placeholders with
content ".gitnarrowtree 13a24b..<oid>", so we know what the hash oid of the
file/tree should be (and so they can be moved/renamed etc!). The index would
only know the content/structure as far as the placeholders (just like
sub-modules are a break point in the tracking, with identical caveats).
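
To make the placeholder idea concrete, here is a minimal sketch of how such
a file's content could be written and read back, and why a rename keeps the
oid intact. The exact line format, the 40-hex-digit (SHA-1 style) oid, the
padded example oid and the helper names `render_placeholder` and
`parse_placeholder` are assumptions for illustration; nothing like this
exists in git today.

```python
# Hypothetical placeholder format, modelled on the ".gitnarrowtree <oid>"
# wording above. The keyword and helpers are illustrative, not a git feature.
import re

OID_RE = re.compile(r"[0-9a-f]{40}")  # assumes SHA-1 sized hex oids
PLACEHOLDER_RE = re.compile(r"\.gitnarrowtree ([0-9a-f]{40})\n?")


def render_placeholder(oid):
    """Return the human-readable file content for a missing blob/tree."""
    if not OID_RE.fullmatch(oid):
        raise ValueError("expected a 40-hex-digit oid")
    return f".gitnarrowtree {oid}\n"


def parse_placeholder(content):
    """Return the oid if content is a placeholder, otherwise None."""
    match = PLACEHOLDER_RE.fullmatch(content)
    return match.group(1) if match else None


# A made-up oid, padded to full length for the example.
oid = "13a24b" + "0" * 34

# Toy work-tree: path -> file content.
worktree = {"assets": render_placeholder(oid)}

# Renaming the placeholder moves the file; the recorded oid travels with it.
worktree["media"] = worktree.pop("assets")
recovered = parse_placeholder(worktree["media"])
print(recovered == oid)
```

Because the oid lives in the placeholder's content rather than in its path,
a move or rename can be committed without ever having the real object.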


It would be interesting to know from Ben what level of sparseness/narrowness
has typically been seen in the BigWin GVFS repo case.

>  In a sparse checkout, you
> still might like to be able to get a listing of files outside the
> sparse area (which you can get with "git ls-tree") and you may even
> want to be able to get such a listing with plain "ls" (as with your
> proposal).
>
> Thanks and hope that helps,
> Jonathan

Thanks, yes. It has helped consolidate some of the parts of my concept that
have been in the back of my mind for a while now.

Philip 


