From: Jeff Hostetler <git@jeffhostetler.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, peff@peff.net, jonathantanmy@google.com,
Jeff Hostetler <jeffhost@microsoft.com>
Subject: Re: [PATCH] partial-clone: design doc
Date: Wed, 13 Dec 2017 17:34:41 -0500
Message-ID: <f3750af4-1343-65a6-2913-a21840e4db89@jeffhostetler.com>
In-Reply-To: <xmqqzi6t6kpe.fsf@gitster.mtv.corp.google.com>
On 12/8/2017 3:14 PM, Junio C Hamano wrote:
> Jeff Hostetler <git@jeffhostetler.com> writes:
>
>> From: Jeff Hostetler <jeffhost@microsoft.com>
>>
>> First draft of design document for partial clone feature.
>>
>> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
>> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
>> ---
>
> Thanks.
>
>> +Non-Goals
>> +---------
>> +
>> +Partial clone is independent of and not intended to conflict with
>> +shallow-clone, refspec, or limited-ref mechanisms since these all operate
>> +at the DAG level whereas partial clone and fetch works *within* the set
>> +of commits already chosen for download.
>
> It probably is not a huge deal (simply because it is about
> "Non-Goals") but I have no idea what "refspec" and "limited-ref
> mechanism" refer to in the above sentence, and I suspect many others
> share the same puzzlement.
I'll reword this. There was a question on the list earlier about
having a filter for commits in addition to the ones for blobs and trees.
I just wanted to emphasize that we already have ways to filter or
limit commits: --shallow-* or --single-branch in clone, and one or
more '<refspec>' args in fetch.
>> +An object may be missing due to a partial clone or fetch, or missing due
>> +to repository corruption. To differentiate these cases, the local
>> +repository specially indicates packfiles obtained from the promisor
>> +remote. These "promisor packfiles" consist of a "<name>.promisor" file
>> +with arbitrary contents (like the "<name>.keep" files), in addition to
>> +their "<name>.pack" and "<name>.idx" files. (In the future, this ability
>> +may be extended to loose objects[a].)
>> + ...
>> +Foot Notes
>> +----------
>> +
>> +[a] Remembering that loose objects are promisor objects is mainly
>> + important for trees, since they may refer to promisor blobs that
>> + the user does not have. We do not need to mark loose blobs as
>> + promisor because they do not refer to other objects.
>
> I fail to see any logical link between the "loose" and "tree".
> Putting it differently, I do not see why "tree" is so special.
>
> A promisor pack that contains a tree but lacks blobs the tree refers
> to would be sufficient to let us remember that these missing blobs
> are not corruption. A loose commit or a tag that is somehow marked
> as obtained from a promisor, if it can serve just like a commit or a
> tag in a promisor pack to promise its direct pointee, would equally
> be useful (if very inefficient).
>
> In any case, I suspect "since they may refer to promisor blobs" is a
> typo of "since they may refer to promised blobs".
Right, good point. I was only thinking about the tree ==> blob
relationship.
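
To make the more general rule concrete, here is a toy sketch in Python
(not Git code). It assumes object IDs are plain strings and that we
already know which local objects came from the promisor remote; the
function name and the sample IDs are made up for illustration. A missing
object counts as "promised" when any promisor object, whether commit, tag
or tree, points at it directly.

```python
# Toy model only: object IDs are short strings, not real Git hashes.

def classify_missing(oid, present, promisor, refs):
    """Return 'present', 'promised', or 'corrupt' for an object ID.

    present  -- set of object IDs we have locally
    promisor -- subset of `present` obtained from the promisor remote
    refs     -- dict mapping an object ID to the IDs it points at directly
    """
    if oid in present:
        return "present"
    # Any promisor object (not only a tree) promises its direct pointees.
    for holder in promisor:
        if oid in refs.get(holder, ()):
            return "promised"
    # Nothing vouches for the object, so treat its absence as corruption.
    return "corrupt"


present = {"commit1", "tree1", "tag1"}
promisor = {"commit1", "tree1"}          # tag1 was created locally
refs = {
    "commit1": ["tree1", "commit0"],     # root tree and parent commit
    "tree1": ["blob1"],
    "tag1": ["commit9"],
}

for oid in ("blob1", "commit0", "commit9"):
    print(oid, classify_missing(oid, present, promisor, refs))
```

In this model the missing parent commit is covered by the promisor commit
in the same way the missing blob is covered by the promisor tree, while
the object reachable only from the local tag is reported as corruption.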
>
>> +- Currently, dynamic object fetching invokes fetch-pack for each item
>> + because most algorithms stumble upon a missing object and need to have
>> + it resolved before continuing their work. This may incur significant
>> + overhead -- and multiple authentication requests -- if many objects are
>> + needed.
>> +
>> + We need to investigate use of a long-running process, such as proposed
>> + in [5,6] to reduce process startup and overhead costs.
>
> Also perhaps in some operations we can enumerate the objects we will
> need upfront and ask for them in one go (e.g. "git log -p A..B" may
> internally want to do "rev-list --objects A..B" to enumerate trees
> and blobs that we may lack upfront). I do not think having the
> other side guess is a good idea, though.
Right.
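
A rough sketch of the difference, in Python rather than in terms of
fetch-pack itself: the FakeRemote class and both helper functions are
hypothetical stand-ins that only count round trips. It assumes the caller
can produce the full list of needed objects up front, which is the
"rev-list --objects A..B" idea above.

```python
class FakeRemote:
    """Stand-in for the promisor remote; counts fetch requests."""

    def __init__(self):
        self.requests = 0

    def fetch(self, oids):
        # One call models one fetch invocation (process start, auth, ...).
        self.requests += 1
        return set(oids)


def fetch_one_at_a_time(needed, have, fetch):
    """Resolve each missing object as the algorithm stumbles upon it."""
    for oid in needed:
        if oid not in have:
            have.update(fetch([oid]))


def fetch_in_one_go(needed, have, fetch):
    """Enumerate everything missing first, then ask for it in one request."""
    missing = [oid for oid in needed if oid not in have]
    if missing:
        have.update(fetch(missing))


needed = ["tree1", "blob1", "blob2", "blob3"]

slow = FakeRemote()
fetch_one_at_a_time(needed, {"tree1"}, slow.fetch)

fast = FakeRemote()
fetch_in_one_go(needed, {"tree1"}, fast.fetch)

print("per-object requests:", slow.requests)
print("batched requests:", fast.requests)
```

The local side does the enumeration in both cases; the remote is never
asked to guess what else might be wanted.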
>
>> +- We currently only support promisor packfiles. We need to add support for
>> + promisor loose objects as described earlier.
>
> The earlier description was not convincing enough to feel the need
> to me; at least not yet.
It seems like we need it if a promisor packfile gets unpacked into
loose objects for any reason, since the "<name>.promisor" marker only
applies to the pack. But right, I'm not sure how urgent it is.
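
For reference, the marker is purely a per-pack file today. A minimal
Python sketch of how one could list promisor packs in a pack directory,
assuming only the "<name>.pack" / "<name>.promisor" naming from the doc
(the helper name and the sample pack names are made up, and this is not
how Git itself does the lookup):

```python
import tempfile
from pathlib import Path


def promisor_packs(pack_dir):
    """Return names of packs that have a sibling "<name>.promisor" file."""
    pack_dir = Path(pack_dir)
    return sorted(
        pack.name
        for pack in pack_dir.glob("*.pack")
        if pack.with_suffix(".promisor").exists()
    )


# Build a throwaway directory with one promisor pack and one ordinary pack.
with tempfile.TemporaryDirectory() as tmp:
    pack_dir = Path(tmp)
    for name in (
        "pack-aaa.pack", "pack-aaa.idx", "pack-aaa.promisor",
        "pack-bbb.pack", "pack-bbb.idx",
    ):
        (pack_dir / name).touch()
    found = promisor_packs(pack_dir)

print(found)
```

A loose object has no companion file like this, so once objects leave the
pack there is nothing left to record that they came from the promisor
remote.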
Thanks
Jeff