Re: [git-users] How hard would it be to implement sparse fetching/pulling?

git@vger.kernel.org mailing list mirror (one of many)

* Re: [git-users] How hard would it be to implement sparse fetching/pulling?
From: Konstantin Khomoutov @ 2017-11-30  8:12 UTC
  To: vit via Git for human beings; +Cc: git, Vitaly Arbuzov

On Wed, Nov 29, 2017 at 06:42:54PM -0800, vit via Git for human beings wrote:

> I'm looking for ways to improve fetch/pull/clone time for large git 
> (mono)repositories with unrelated source trees (that span across multiple 
> services).
> I've found sparse checkout approach appealing and helpful for most of 
> client-side operations (e.g. status, reset, commit, add etc)
> The problem is that there is no feature like sparse fetch/pull in git, this 
> means that ALL objects in unrelated trees are always fetched.
> It takes a lot of time for large repositories and results in some practical 
> scalability limits for git.
> This forced some large companies like Facebook and Google to move to 
> Mercurial as they were unable to improve client-side experience with git 
> and Microsoft has developed GVFS which seems to be a step back to CVCS 
> world.
[...]

(To anyone interested, there's a cross-post to the main Git list which
Vitaly failed to mention: [1]. I think it could spark some interesting
discussion.)

As to the essence of the question, I think you blame GVFS for no real
reason. While Microsoft is being Microsoft (their implementation of
GVFS is written in .NET and *requires* Windows 10, which is beyond
me), it's based on an open protocol [2] which basically assumes the
presence of a RESTful HTTP endpoint at the "Git server side" and is
apparently designed to work well with the repository format that current
stock Git uses, which makes it implementable on both sides by anyone
interested.

The second hint I have is that the idea of fetching data lazily
has been circulating among the Git developers for some time already, and
real work is being done in this area, so you could check and see
what's there [3, 4] and maybe try it out and help those who work on this
stuff.

1. https://public-inbox.org/git/CANxXvsMbpBOSRKaAi8iVUikfxtQp=kofZ60N0pHXs+R+q1k3_Q@mail.gmail.com/
2. https://github.com/Microsoft/GVFS/blob/master/Protocol.md
3. https://public-inbox.org/git/?q=lazy+fetch
4. https://public-inbox.org/git/?q=partial+clone


* Re: [git-users] How hard would it be to implement sparse fetching/pulling?
From: Jeff Hostetler @ 2017-11-30 15:48 UTC
  To: Konstantin Khomoutov, vit via Git for human beings; +Cc: git, Vitaly Arbuzov



On 11/30/2017 3:12 AM, Konstantin Khomoutov wrote:
> On Wed, Nov 29, 2017 at 06:42:54PM -0800, vit via Git for human beings wrote:
> 
>> I'm looking for ways to improve fetch/pull/clone time for large git
>> (mono)repositories with unrelated source trees (that span across multiple
>> services).
>> I've found sparse checkout approach appealing and helpful for most of
>> client-side operations (e.g. status, reset, commit, add etc)
>> The problem is that there is no feature like sparse fetch/pull in git, this
>> means that ALL objects in unrelated trees are always fetched.
>> It takes a lot of time for large repositories and results in some practical
>> scalability limits for git.
>> This forced some large companies like Facebook and Google to move to
>> Mercurial as they were unable to improve client-side experience with git
>> and Microsoft has developed GVFS which seems to be a step back to CVCS
>> world.
> [...]
> 
> (To anyone interested, there's a cross-post to the main Git list which
> Vitaly failed to mention: [1]. I think it could spark some interesting
> discussion.)
> 
> As to the essence of the question, I think you blame GVFS for no real
> reason. While Microsoft is being Microsoft (their implementation of
> GVFS is written in .NET and *requires* Windows 10, which is beyond
> me), it's based on an open protocol [2] which basically assumes the
> presence of a RESTful HTTP endpoint at the "Git server side" and is
> apparently designed to work well with the repository format that current
> stock Git uses, which makes it implementable on both sides by anyone
> interested.
> 
> The second hint I have is that the idea of fetching data lazily
> has been circulating among the Git developers for some time already, and
> real work is being done in this area, so you could check and see
> what's there [3, 4] and maybe try it out and help those who work on this
> stuff.
> 
> 1. https://public-inbox.org/git/CANxXvsMbpBOSRKaAi8iVUikfxtQp=kofZ60N0pHXs+R+q1k3_Q@mail.gmail.com/
> 2. https://github.com/Microsoft/GVFS/blob/master/Protocol.md
> 3. https://public-inbox.org/git/?q=lazy+fetch
> 4. https://public-inbox.org/git/?q=partial+clone
> 

For completeness on the git-users mailing list, here is info on the
work in progress for this feature:

https://public-inbox.org/git/e2d5470b-9252-07b4-f3cf-57076d103a17@jeffhostetler.com/

Jeff
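
For readers arriving later: the lazy-fetching work referenced in this
thread eventually surfaced in stock Git as "partial clone", driven by the
`--filter` option of `git clone`, and it combines with the
`git sparse-checkout` command. The sketch below only builds the command
lines and runs nothing. The function name, the example URL, and the
directory names are hypothetical, and it assumes a reasonably recent Git
on the client and a server that allows filtered fetches (for stock Git
servers, the `uploadpack.allowFilter` setting).

```python
from typing import List, Sequence


def partial_sparse_clone_commands(
    url: str, dest: str, paths: Sequence[str]
) -> List[List[str]]:
    """Return the git invocations for a blob-less, sparse clone.

    Nothing is executed here; the caller decides how to run the commands.
    """
    if not paths:
        raise ValueError("at least one path is required")
    return [
        # Partial clone: commits and trees are fetched up front, while
        # file contents (blobs) are fetched lazily when first needed.
        ["git", "clone", "--filter=blob:none", "--sparse", url, dest],
        # Restrict the working tree to the directories of interest.
        ["git", "-C", dest, "sparse-checkout", "set", *paths],
    ]


if __name__ == "__main__":
    # Hypothetical repository URL and service directory.
    for cmd in partial_sparse_clone_commands(
        "https://example.com/monorepo.git", "monorepo", ["services/billing"]
    ):
        print(" ".join(cmd))
```

Keeping the commands as argument lists lets them be passed to something
like `subprocess.run` without shell quoting concerns, if one chooses to
execute them.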

