From: Jonathan Nieder <jrnieder@gmail.com>
To: Vitaly Arbuzov <vit@uber.com>
Cc: Philip Oakley <philipoakley@iee.org>,
Jeff Hostetler <git@jeffhostetler.com>,
Git List <git@vger.kernel.org>
Subject: Re: How hard would it be to implement sparse fetching/pulling?
Date: Thu, 30 Nov 2017 18:51:06 -0800
Message-ID: <20171201025106.GD20640@aiede.mtv.corp.google.com>
In-Reply-To: <CANxXvsNuEmo+uaRY8t44csqzXAk3rS+D9E=LMvaLcZeg-aLvRw@mail.gmail.com>
Hi Vitaly,
Vitaly Arbuzov wrote:
> I think it would be great if we high level agree on desired user
> experience, so let me put a few possible use cases here.
I think one thing this thread is pointing to is a lack of overview
documentation about how the 'partial clone' series currently works.
The basic components are:
1. extending git protocol to (1) allow fetching only a subset of the
objects reachable from the commits being fetched and (2) later,
going back and fetching the objects that were left out.
We've also discussed some other protocol changes, e.g. to allow
obtaining the sizes of un-fetched objects without fetching the
objects themselves.
2. extending git's on-disk format to allow having some objects not be
present but only be "promised" to be obtainable from a remote
repository. When running a command that requires those objects,
the user can choose to have it either (a) error out ("airplane
mode") or (b) fetch the required objects.
It is still possible to work fully locally in such a repo, make
changes, get useful results out of "git fsck", etc. It is kind of
similar to the existing "shallow clone" feature, except that there
is a more straightforward way to obtain objects that are outside
the "shallow" clone when needed on demand.
3. improving everyday commands to require fewer objects. For
example, if I run "git log -p", then I want to see the history of
most files but I don't necessarily want to download large binary
files just to print 'Binary files differ' for them.
And by the same token, we might want to have a mode for commands
like "git log -p" to default to restricting to a particular
directory, instead of downloading files outside that directory.
There are some fundamental changes to make in this category ---
e.g. modifying the index format to not require entries for files
outside the sparse checkout, to avoid having to download the
trees for them.
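
To make item 3 concrete, here is a toy Python sketch of a patch renderer
that avoids loading objects it does not need. It is not how git
implements "git log -p"; render_changes, load_blob, is_binary_path and
prefix are hypothetical names invented for this illustration. The
assumption is that load_blob may trigger a download, so the sketch skips
it for binary paths and for paths outside the chosen directory.

```python
import difflib
from typing import Callable, List, Optional, Tuple

# Hypothetical record of one changed file: (path, old object id, new object id).
Change = Tuple[str, str, str]


def render_changes(
    changes: List[Change],
    load_blob: Callable[[str], bytes],
    is_binary_path: Callable[[str], bool],
    prefix: Optional[str] = None,
) -> List[str]:
    """Render one patch entry per change, loading as few blobs as possible."""
    out = []
    for path, old_oid, new_oid in changes:
        # Restrict to one directory: files outside it are never loaded.
        if prefix is not None and not path.startswith(prefix):
            continue
        # Binary files only get a one-line notice, so their contents are not needed.
        if is_binary_path(path):
            out.append(f"Binary files a/{path} and b/{path} differ")
            continue
        # Only now do we pay for the blob contents (possibly a fetch).
        old = load_blob(old_oid).decode().splitlines()
        new = load_blob(new_oid).decode().splitlines()
        diff = difflib.unified_diff(
            old, new, fromfile=f"a/{path}", tofile=f"b/{path}", lineterm=""
        )
        out.append("\n".join(diff))
    return out
```

The point of the design is the ordering of the checks: the cheap
decisions (path prefix, binary or not) come before any call that could
require an object that is not present locally.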
The overall goal is to make git scale better.
The existing patches do (1) and (2), though it is possible to do more
in those categories. :) We have plans to work on (3) as well.
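
As a rough illustration of how (1) and (2) fit together, here is a toy
Python model. It is only a sketch under stated assumptions, not the
on-disk format or protocol from the patches: ToyRemote, ToyRepo,
partial_fetch, the keep filter and MissingObjectError are all
hypothetical names, and objects are modeled as a plain dictionary.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Set, Tuple


class MissingObjectError(Exception):
    """A promised object is needed, but fetching is disabled ("airplane mode")."""


@dataclass
class ToyRemote:
    # Hypothetical stand-in for a remote repository: object id -> content.
    objects: Dict[str, bytes]

    def fetch_filtered(
        self, wanted: Set[str], keep: Callable[[bytes], bool]
    ) -> Tuple[Dict[str, bytes], Set[str]]:
        """Send only the wanted objects that pass the filter; report the rest."""
        sent = {oid: self.objects[oid] for oid in wanted if keep(self.objects[oid])}
        omitted = wanted - set(sent)
        return sent, omitted

    def fetch_one(self, oid: str) -> bytes:
        """Go back later for a single object that was left out."""
        return self.objects[oid]


@dataclass
class ToyRepo:
    remote: ToyRemote
    local: Dict[str, bytes] = field(default_factory=dict)
    promised: Set[str] = field(default_factory=set)  # absent, but obtainable
    airplane_mode: bool = False

    def partial_fetch(self, wanted: Iterable[str], keep: Callable[[bytes], bool]) -> None:
        sent, omitted = self.remote.fetch_filtered(set(wanted), keep)
        self.local.update(sent)
        self.promised |= omitted

    def read(self, oid: str) -> bytes:
        if oid in self.local:
            return self.local[oid]
        if oid not in self.promised:
            # Neither present nor promised: this would be real corruption.
            raise KeyError(oid)
        if self.airplane_mode:
            raise MissingObjectError(oid)
        # Fetch on demand, then treat the object as an ordinary local one.
        data = self.remote.fetch_one(oid)
        self.local[oid] = data
        self.promised.discard(oid)
        return data
```

For example, with a filter such as keep=lambda data: len(data) < 100, a
large object ends up in promised rather than local. Reading it raises
MissingObjectError while airplane_mode is set, and otherwise downloads
it once and keeps it locally.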
These are overall changes that happen at a fairly low level in git.
They mostly don't require changes command-by-command.
Thanks,
Jonathan