From: Jonathan Tan <jonathantanmy@google.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>,
	markbt@efaref.net, git@jeffhostetler.com
Subject: Re: Proposal for "fetch-any-blob Git protocol" and server design
Date: Thu, 16 Mar 2017 10:31:09 -0700
Message-ID: <90381e66-d91f-6412-6294-701f5f780645@google.com>
In-Reply-To: <xmqqinnafml4.fsf@gitster.mtv.corp.google.com>

On 03/15/2017 10:59 AM, Junio C Hamano wrote:
> By "SHA-1s for which it wants blobs", you mean that "want" only
> allows one exact blob object name?  I think it is necessary to
> support that mode of operation as a base case, and it is a good
> starting point.
>
> When you know
>
>  - you have a "partial" clone that initially asked to contain only
>    blobs that are smaller than 10MB, and
>
>  - you are now trying to do a "git checkout v1.0 -- this/directory"
>    so that the directory is fully populated
>
> instead of enumerating all the missing blobs from the output of
> "ls-tree -r v1.0 this/directory" on separate "want" requests, you
> may want to say "I want all the blobs that are not smaller than 10MB
> in this tree object $(git rev-parse v1.0:this/directory)".
>
> I am not saying that you should add something like this right away,
> but I am wondering how you would extend the proposed system to do
> so.  Would you add "fetch-size-limited-blob-in-tree-pack" that runs
> parallel to "fetch-blob-pack" request?  Would you add a new type of
> request packet "want-blob-with-expression" for fbp-request, which is
> protected by some "protocol capability" exchange?
>
> If the former, how does a client discover if a particular server
> already supports the new "fetch-size-limited-blob-in-tree-pack"
> request, so that it does not have to send a bunch of "want" request
> by enumerating the blobs itself?  If the latter, how does a client
> discover if a particular server's "fetch-blob-pack" already supports
> the new "want-blob-with-expression" request packet?

I'm not sure if that use case is something we need to worry about (if 
you're downloading x * 10MB, uploading x * 50B shouldn't be a problem, I 
think), but if we want to handle that use case in the future, I agree 
that extending this system would be difficult.
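
To put rough numbers on the parenthetical above, here is a minimal
sketch. It assumes each "want" line costs about 50 bytes (a 4-byte
pkt-line length, "want ", a 40-hex-digit SHA-1 and a newline) and that
every requested blob is at least 10MB, taking 1MB as 10^6 bytes for
simplicity. The function name is only illustrative.

```python
WANT_LINE_BYTES = 50         # assumed: 4 (length) + 5 ("want ") + 40 (SHA-1) + 1 (newline)
MIN_BLOB_BYTES = 10 * 10**6  # assumed: each requested blob is at least 10MB


def upload_download_bytes(num_blobs):
    """Return (bytes uploaded as "want" lines, minimum bytes downloaded)."""
    return num_blobs * WANT_LINE_BYTES, num_blobs * MIN_BLOB_BYTES


up, down = upload_download_bytes(1000)
print(up, down, down // up)  # prints: 50000 10000000000 200000
```

Under these assumptions the upload is a factor of 200,000 smaller than
the download, regardless of the number of blobs.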

The best way I can think of right now is for the client to send a
fetch-blob-pack request with no "want" lines and at least one
"want-tree" line, and then, if there is an error (which will happen if
the server is old and therefore sees that there is not at least one
"want" line), to retry with the "want" lines. This allows us to add
alternative ways of specifying blobs later (if we want to), but also
means that upgrading a client without upgrading the corresponding
server incurs a round-trip penalty.
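
The retry logic could look roughly like the sketch below. Everything in
it is hypothetical: `send_request`, `ServerError`, and the exact line
formats are stand-ins for illustration, not part of the proposed
protocol. `old_server` models a server that only understands "want"
lines.

```python
class ServerError(Exception):
    """Hypothetical error raised when the server rejects a request."""


def fetch_blobs(send_request, tree_ids, blob_ids):
    """Try a "want-tree" request first; fall back to plain "want" lines.

    send_request is a hypothetical callable that takes a list of request
    lines and returns the fetched data, or raises ServerError.
    """
    try:
        return send_request([f"want-tree {t}" for t in tree_ids])
    except ServerError:
        # Old server: it saw no "want" line and reported an error.
        # Retry by enumerating the blobs, at the cost of one extra round trip.
        return send_request([f"want {b}" for b in blob_ids])


def old_server(lines):
    """Stand-in for an old server that requires at least one "want" line."""
    if not any(line.startswith("want ") for line in lines):
        raise ServerError("expected at least one want line")
    return [line.split()[1] for line in lines]


print(fetch_blobs(old_server, ["t1"], ["b1", "b2"]))  # prints: ['b1', 'b2']
```

A new server would answer the first request directly, so only the
new-client/old-server combination pays for the second round trip.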

Alternatively, we could add rudimentary support for trees now and add
filter-by-size later (so that such requests made to old servers will
download extra blobs, but at least it works), but that still doesn't
solve the general problem of specifying blobs by some rule other than
their own SHA-1 or their tree's SHA-1.


