git@vger.kernel.org mailing list mirror (one of many)
From: Vitaly Arbuzov <vit@uber.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: Philip Oakley <philipoakley@iee.org>,
	Jeff Hostetler <git@jeffhostetler.com>,
	Git List <git@vger.kernel.org>
Subject: Re: How hard would it be to implement sparse fetching/pulling?
Date: Thu, 30 Nov 2017 19:37:24 -0800
Message-ID: <CANxXvsM4MNuXAgy51ke09u1HZqwZfmhS4-yM1bvAKc+ZniRadg@mail.gmail.com>
In-Reply-To: <20171201025106.GD20640@aiede.mtv.corp.google.com>

Makes sense. I think this aligns perfectly with our needs too.
Let me dive deeper into the patches and previous discussions that
you've kindly shared above, so that I understand the details better.

I'm very excited about what you guys have already done; it's a big deal
for the community!


On Thu, Nov 30, 2017 at 6:51 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Hi Vitaly,
>
> Vitaly Arbuzov wrote:
>
>> I think it would be great if we agree at a high level on the desired
>> user experience, so let me put a few possible use cases here.
>
> I think one thing this thread is pointing to is a lack of overview
> documentation about how the 'partial clone' series currently works.
> The basic components are:
>
>  1. extending git protocol to (1) allow fetching only a subset of the
>     objects reachable from the commits being fetched and (2) later,
>     going back and fetching the objects that were left out.
>
>     We've also discussed some other protocol changes, e.g. to allow
>     obtaining the sizes of un-fetched objects without fetching the
>     objects themselves.
>
>  2. extending git's on-disk format to allow having some objects not be
>     present but only be "promised" to be obtainable from a remote
>     repository.  When running a command that requires those objects,
>     the user can choose to have it either (a) error out ("airplane
>     mode") or (b) fetch the required objects.
>
>     It is still possible to work fully locally in such a repo, make
>     changes, get useful results out of "git fsck", etc.  It is kind of
>     similar to the existing "shallow clone" feature, except that there
>     is a more straightforward way to obtain objects that are outside
>     the "shallow" clone when needed on demand.
>
>  3. improving everyday commands to require fewer objects.  For
>     example, if I run "git log -p", then I want to see the history of
>     most files but I don't necessarily want to download large binary
>     files just to print 'Binary files differ' for them.
>
>     And by the same token, we might want to have a mode for commands
>     like "git log -p" to default to restricting to a particular
>     directory, instead of downloading files outside that directory.
>
>     There are some fundamental changes to make in this category ---
>     e.g. modifying the index format to not require entries for files
>     outside the sparse checkout, to avoid having to download the
>     trees for them.
>
> The overall goal is to make git scale better.
>
> The existing patches do (1) and (2), though it is possible to do more
> in those categories. :)  We have plans to work on (3) as well.
>
> These are overall changes that happen at a fairly low level in git.
> They mostly don't require changes command-by-command.
>
> Thanks,
> Jonathan
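
The "promised objects" idea in component (2) above can be illustrated with a toy model. This is only a sketch under stated assumptions, not how the partial clone patches are implemented: the class name PartialObjectStore, the fetch callback, and the offline flag are all hypothetical names invented for illustration, and real git stores objects very differently.

```python
class MissingObjectError(Exception):
    """A promised object is not present locally and fetching is disabled."""


class PartialObjectStore:
    """Toy object store (hypothetical; not git's actual on-disk format)."""

    def __init__(self, local, promised, fetch, offline=False):
        self.local = dict(local)       # objects actually present: id -> bytes
        self.promised = set(promised)  # ids absent locally but promised by the remote
        self.fetch = fetch             # hypothetical stand-in for a network fetch
        self.offline = offline         # True models "airplane mode"

    def read(self, oid):
        if oid in self.local:
            return self.local[oid]
        if oid not in self.promised:
            # Neither present nor promised: a genuinely missing object.
            raise KeyError(oid)
        if self.offline:
            # Option (a): error out instead of touching the network.
            raise MissingObjectError(oid)
        # Option (b): fetch on demand, then keep the object locally.
        data = self.fetch(oid)
        self.local[oid] = data
        self.promised.discard(oid)
        return data


# Usage: the dict below stands in for the remote repository.
remote = {"blob1": b"large binary contents"}
store = PartialObjectStore(
    local={"commit1": b"commit data"},
    promised={"blob1"},
    fetch=remote.__getitem__,
)
first = store.read("blob1")   # fetched lazily from the stand-in remote
second = store.read("blob1")  # now served from the local copy
```

The sketch keeps "promised" distinct from "unknown" so that a genuinely missing object can still be reported as an error, which is the property that lets a check in the spirit of "git fsck" remain useful in such a repository.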
