From: "Philip Oakley" <philipoakley@iee.org>
To: "Jonathan Nieder" <jrnieder@gmail.com>
Cc: "Jonathan Tan" <jonathantanmy@google.com>, <git@vger.kernel.org>,
"Ben Peart" <peartben@gmail.com>
Subject: Re: [RFC PATCH 0/3] Partial clone: promised blobs (formerly "missing blobs")
Date: Sat, 29 Jul 2017 13:51:16 +0100
Message-ID: <8EE0108BA72B42EA9494B571DDE2005D@PhilipOakley>
In-Reply-To: 20170717180322.GM93855@aiede.mtv.corp.google.com
From: "Jonathan Nieder" <jrnieder@gmail.com>
Sent: Monday, July 17, 2017 7:03 PM
> Hi Philip,
>
> Philip Oakley wrote:
>> From: "Jonathan Tan" <jonathantanmy@google.com>
>
>>> These patches are part of a set of patches implementing partial clone,
>>> as you can see here:
>>>
>>> https://github.com/jonathantanmy/git/tree/partialclone
> [...]
>> If I understand correctly, this method doesn't give any direct user
>> visibility of missing blobs in the file system. Is that correct?
>>
>> I was hoping that eventually the various 'on demand' approaches
>> would still allow users to continue to work as they go off-line such
>> that they can see directly (in the FS) where the missing blobs (and
>> trees) are located, so that they can continue to commit new work on
>> existing files.
>>
>> I had felt that some sort of 'gitlink' should be present (human
>> readable) as a place holder for the missing blob/tree, e.g.
>> 'gitblob: 1234abcd' (showing the missing oid, just like sub-modules
>> can do; it's no different really).
>
> That's a reasonable thing to want, but it's a little different from
> the use cases that partial clone work so far has aimed to support.
> They are:
>
> A. Avoiding downloading all blobs (and likely trees as well) that are
> not needed in the current operation (e.g. checkout). This blends
> well with the sparse checkout feature, which allows the current
> checkout to be fairly small in a large repository.
True. In my case I was looking for a method that would allow a 'Narrow
clone' such that the local repo would be smaller (have less content), but
would feel as if all the useful files/directories were available, and there
would be place holders at the points where the trees were pruned, both in
the object store and in the user's work-tree.
As you say, in some ways it's conceptually orthogonal to the original sparse
checkout (which has a full-width object store / repo, and then omits files
from the checkout).
>
> GVFS uses a trick that makes it a little easier to widen a sparse
> checkout upon access of a directory. But the same building blocks
> should work fine with a sparse checkout that has been set up
> explicitly.
>
> B. Avoiding downloading large blobs, except for those needed in the
> current operation (e.g. checkout).
>
> When not using sparse checkout, the main benefit out of the box is
> avoiding downloading *historical versions* of large blobs.
>
> It sounds like you are looking for a sort of placeholder outside the
> sparse checkout area.
True.
> In a way, that's orthogonal to these patches:
> even if you have all relevant blobs, you may want to avoid inflating
> them to check them out and reading them to compare to the index (i.e.
> the usual benefits of sparse checkout).
In my concept, it should be possible to create the ('sparse'/narrow) index
from the content of the local object store, without any network connection
(though that content is determined by the prior fetch/clone;-). The proper
git sparse checkout could proceed from there as being a further local
restriction on what is omitted from the worktree.
Those missing from the narrow clone would still show as place holders with
content ".gitnarrowtree 13a24b..<oid>", so we know what the hash oid of the
file/tree should be (so they can be moved/renamed etc!). The index would
only know the content/structure as far as the place holders (just like
sub-modules are a break point in the tracking, with identical caveats).
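To make the place-holder idea a little more concrete, here is a rough
sketch in Python. Nothing like this exists in git: the function names are
invented for illustration, and the exact file layout (the ".gitnarrowtree"
prefix, one space, then a full 40-hex-digit SHA-1 oid and a newline) is
only an assumption based on the example above.

```python
import re
from typing import Optional

# Hypothetical place-holder format: ".gitnarrowtree <40-hex oid>\n".
# This is an assumed layout for illustration, not an existing git format.
PLACEHOLDER_PREFIX = ".gitnarrowtree"
_OID_RE = r"[0-9a-f]{40}"


def make_placeholder(oid: str) -> str:
    """Return the content of a place-holder file for a missing blob/tree."""
    if not re.fullmatch(_OID_RE, oid):
        raise ValueError(f"not a full SHA-1 oid: {oid!r}")
    return f"{PLACEHOLDER_PREFIX} {oid}\n"


def parse_placeholder(content: str) -> Optional[str]:
    """Return the oid if content is a place-holder, otherwise None."""
    match = re.fullmatch(rf"\.gitnarrowtree ({_OID_RE})\n?", content)
    return match.group(1) if match else None
```

Because the place holder carries only the oid, moving or renaming it in
the work-tree would not need the missing object itself; a tool would only
have to recognise the file via something like parse_placeholder() and
record the same oid under the new path.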
It would be interesting to know from Ben what level of sparseness/narrowness
has typically been seen in the BigWin GVFS repo case.
> In a sparse checkout, you
> still might like to be able to get a listing of files outside the
> sparse area (which you can get with "git ls-tree") and you may even
> want to be able to get such a listing with plain "ls" (as with your
> proposal).
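For reference, the "git ls-tree" listing is easy to consume from a script.
A rough sketch in Python, assuming the default output format of
"git ls-tree <tree-ish>" (one "<mode> <type> <object>TAB<path>" record per
line); the helper name parse_ls_tree is made up for illustration, and the
oids in the sample are dummy values, not real objects.

```python
from typing import List, NamedTuple


class TreeEntry(NamedTuple):
    mode: str
    type: str
    oid: str
    path: str


def parse_ls_tree(output: str) -> List[TreeEntry]:
    """Parse default-format `git ls-tree` output into entries.

    Assumes ordinary path names; git quotes unusual paths unless -z is
    used, and that case is not handled in this sketch.
    """
    entries = []
    for line in output.splitlines():
        if not line:
            continue
        # The metadata and the path are separated by a single TAB.
        meta, path = line.split("\t", 1)
        mode, obj_type, oid = meta.split(" ")
        entries.append(TreeEntry(mode, obj_type, oid, path))
    return entries


# Dummy sample in the documented layout (oids are fillers).
sample = (
    "100644 blob " + "a" * 40 + "\tREADME\n"
    + "040000 tree " + "b" * 40 + "\tsrc\n"
)
```

The listing comes from tree objects rather than the work-tree, so paths
outside the sparse area still show up in it.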
>
> Thanks and hope that helps,
> Jonathan
Thanks, yes. It has helped consolidate some of the parts of my concept that
have been in the back of my mind for a while now.
Philip