From: Jeff Hostetler <git@jeffhostetler.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, peff@peff.net, jonathantanmy@google.com,
	Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH] partial-clone: design doc
Date: Wed, 13 Dec 2017 17:34:41 -0500
Message-ID: <f3750af4-1343-65a6-2913-a21840e4db89@jeffhostetler.com>
In-Reply-To: <xmqqzi6t6kpe.fsf@gitster.mtv.corp.google.com>



On 12/8/2017 3:14 PM, Junio C Hamano wrote:
> Jeff Hostetler <git@jeffhostetler.com> writes:
> 
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> First draft of design document for partial clone feature.
>>
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
>> ---
> 
> Thanks.
> 
>> +Non-Goals
>> +---------
>> +
>> +Partial clone is independent of and not intended to conflict with
>> +shallow-clone, refspec, or limited-ref mechanisms since these all operate
>> +at the DAG level whereas partial clone and fetch works *within* the set
>> +of commits already chosen for download.
> 
> It probably is not a huge deal (simply because it is about
> "Non-Goals") but I have no idea what "refspec" and "limited-ref
> mechanism" refer to in the above sentence, and I suspect many others
> share the same puzzlement.

I'll reword this.  There was a question on the list earlier about
having a filter for commits in addition to ones for blobs and trees.

I just wanted to emphasize that we already have ways to filter or
limit commits: --shallow-* or --single-branch in clone, and one or
more '<refspec>' args in fetch.
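
To illustrate the distinction, here is a minimal toy sketch (not Git's
implementation). It assumes a linear history and a hypothetical size
threshold; the names Commit, limit_commits, and filter_blobs are made
up for this example. Commit selection happens at the DAG level first,
and the object filter then works only within the commits already chosen.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Commit:
    name: str
    parent: Optional["Commit"]  # single-parent history keeps the toy small
    blobs: Dict[str, int]       # path -> blob size in bytes


def limit_commits(tip: Commit, depth: int) -> List[Commit]:
    """DAG-level limit: keep at most `depth` commits, walking back from the tip."""
    chosen = []
    commit = tip
    while commit is not None and len(chosen) < depth:
        chosen.append(commit)
        commit = commit.parent
    return chosen


def filter_blobs(commits: List[Commit], max_bytes: int) -> Dict[str, List[str]]:
    """Object-level filter: for each chosen commit, list the blobs that are kept
    (those no larger than the hypothetical `max_bytes` threshold)."""
    return {
        c.name: sorted(path for path, size in c.blobs.items() if size <= max_bytes)
        for c in commits
    }


c1 = Commit("c1", None, {"a.txt": 10, "big.bin": 5000})
c2 = Commit("c2", c1, {"a.txt": 12, "big.bin": 6000})
c3 = Commit("c3", c2, {"a.txt": 12, "b.txt": 3})

chosen = limit_commits(c3, depth=2)          # commit selection happens first
kept = filter_blobs(chosen, max_bytes=1000)  # the filter never adds or drops commits
```

Here the filter changes which blobs accompany c3 and c2, but the set of
commits is decided entirely by the depth limit.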

  
>> +An object may be missing due to a partial clone or fetch, or missing due
>> +to repository corruption. To differentiate these cases, the local
>> +repository specially indicates packfiles obtained from the promisor
>> +remote. These "promisor packfiles" consist of a "<name>.promisor" file
>> +with arbitrary contents (like the "<name>.keep" files), in addition to
>> +their "<name>.pack" and "<name>.idx" files. (In the future, this ability
>> +may be extended to loose objects[a].)
>> + ...
>> +Foot Notes
>> +----------
>> +
>> +[a] Remembering that loose objects are promisor objects is mainly
>> +    important for trees, since they may refer to promisor blobs that
>> +    the user does not have.  We do not need to mark loose blobs as
>> +    promisor because they do not refer to other objects.
> 
> I fail to see any logical link between the "loose" and "tree".
> Putting it differently, I do not see why "tree" is so special.
> 
> A promisor pack that contains a tree but lacks blobs the tree refers
> to would be sufficient to let us remember that these missing blobs
> are not corruption.  A loose commit or a tag that is somehow marked
> as obtained from a promisor, if it can serve just like a commit or a
> tag in a promisor pack to promise its direct pointee, would equally
> be useful (if very inefficient).
> 
> In any case, I suspect "since they may refer to promisor blobs" is a
> typo of "since they may refer to promised blobs".

Right, good point.  I was only thinking about the tree==>blob
relationship.
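
A rough sketch of the more general rule, under the assumption that any
promisor object (commit, tag, or tree) promises the objects it points
at directly. The function name and the in-memory data structures are
hypothetical; real code would read this from packfiles.

```python
from typing import Dict, Iterable, Set


def classify_missing(oid: str,
                     present: Set[str],
                     promisor_refs: Dict[str, Iterable[str]]) -> str:
    """Classify an object ID.

    present:       IDs of objects available locally.
    promisor_refs: maps each promisor object's ID to the IDs it refers to
                   directly (a commit's tree, a tree's entries, a tag's target).
    """
    if oid in present:
        return "present"
    # A missing object is "promised" if some promisor object points at it.
    for refs in promisor_refs.values():
        if oid in refs:
            return "promised"
    # Nothing vouches for it, so treat the absence as corruption.
    return "corrupt"


present = {"commit1", "tree1"}
promisor_refs = {
    "commit1": {"tree1"},  # commit -> tree
    "tree1": {"blob1"},    # tree -> blob
}
```

With this data, "blob1" is promised by "tree1", while an unknown ID
such as "blob2" is reported as corrupt.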


> 
>> +- Currently, dynamic object fetching invokes fetch-pack for each item
>> +  because most algorithms stumble upon a missing object and need to have
>> +  it resolved before continuing their work.  This may incur significant
>> +  overhead -- and multiple authentication requests -- if many objects are
>> +  needed.
>> +
>> +  We need to investigate use of a long-running process, such as proposed
>> +  in [5,6] to reduce process startup and overhead costs.
> 
> Also perhaps in some operations we can enumerate the objects we will
> need upfront and ask for them in one go (e.g. "git log -p A..B" may
> internally want to do "rev-list --objects A..B" to enumerate trees
> and blobs that we may lack upfront).  I do not think having the
> other side guess is a good idea, though.

Right.
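
As a sketch of why enumerating up front helps, the following compares
one request per missing object with a single batched request.
CountingRemote is a hypothetical stand-in for the promisor remote, not
a real Git interface; it only counts round trips.

```python
from typing import Dict, Iterable, List


class CountingRemote:
    """Hypothetical promisor remote that counts how often it is contacted."""

    def __init__(self, objects: Dict[str, bytes]):
        self.objects = dict(objects)
        self.requests = 0

    def fetch(self, oids: List[str]) -> Dict[str, bytes]:
        self.requests += 1
        return {oid: self.objects[oid] for oid in oids}


def fetch_on_demand(needed: Iterable[str], local: Dict[str, bytes],
                    remote: CountingRemote) -> None:
    """Fetch each missing object the moment it is stumbled upon."""
    for oid in needed:
        if oid not in local:
            local.update(remote.fetch([oid]))


def fetch_batched(needed: Iterable[str], local: Dict[str, bytes],
                  remote: CountingRemote) -> None:
    """Enumerate every missing object first, then ask for all of them at once."""
    missing = [oid for oid in dict.fromkeys(needed) if oid not in local]
    if missing:
        local.update(remote.fetch(missing))


store = {"a": b"1", "b": b"2", "c": b"3"}
needed = ["a", "b", "c"]

remote_one = CountingRemote(store)
local_one = {"a": b"1"}
fetch_on_demand(needed, local_one, remote_one)  # one request per missing object

remote_two = CountingRemote(store)
local_two = {"a": b"1"}
fetch_batched(needed, local_two, remote_two)    # one request in total
```

Both approaches end with the same local objects; only the number of
requests differs.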

> 
>> +- We currently only support promisor packfiles.  We need to add
>> +  support for promisor loose objects as described earlier.
> 
> The earlier description was not convincing enough to feel the need
> to me; at least not yet.

It seems like we need it if a promisor packfile gets unpacked for any
reason.  But right, I'm not sure how urgent it is.


Thanks
Jeff



