From: Junio C Hamano <gitster@pobox.com>
To: Kevin Wern <kevin.m.wern@gmail.com>
Cc: git@vger.kernel.org, Duy Nguyen <pclouds@gmail.com>
Subject: Re: Resumable clone
Date: Tue, 08 Mar 2016 09:07:00 -0800
Message-ID: <xmqq4mcgnbkb.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <CANtyZjRZzXabeWEhwCrwN_q_Zsrm1f_d+j2uDhTZeEjv3LjxaA@mail.gmail.com> (Kevin Wern's message of "Mon, 7 Mar 2016 19:33:40 -0800")

Kevin Wern <kevin.m.wern@gmail.com> writes:

> From what I understand, a pattern exists in clone to download a
> packfile when a desired object isn't found as a resource. In this
> case, if no alternative is listed in http-alternatives, the client
> automatically checks the pack index(es) to see which packfile contains
> the object it needs.

You sound as if you are describing how a fetch over the dumb commit
walker http transport works.  That does not have anything to do with
the discussion of resumable clone, though, so I am not sure where
you are going with this.

> What I believe *doesn't* exist is a
> way for the server to say, "I have a resource, in this case a
> full-history packfile, and I *prefer* you get that file instead of
> attempting to traverse the object tree." This should be implemented in
> a way that is extensible to other resource types moving forward.

Yes, that is very close to what I said in the "what remains?"
section, but with a crucial difference in a detail.  Perhaps reading
the message you are responding to again more carefully will clear
the confusion.  This is what we want to allow the server to say
(from the message you are responding to, but rephrased slightly,
hoping that it would help unconfuse you):

    I prefer not to serve a full clone to you in the usual route if
    I can avoid it.  You can help me by populating your history first
    with something else (which would bring you to a state as if you
    had cloned a potentially slightly older version of me) and then
    coming back to me for an additional fetch to complete the history.

That "something else" does not have to be, and is not expected to
be, the "full" history of the current state.  As long as it can be
used to bring the cloner to a reasonably recent state, sufficient to
make a follow-up incremental fetch inexpensive enough, it is
appropriate.
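
To make the two-step flow concrete, here is a toy model in Python.
It is only a sketch under loud assumptions: history is modelled as a
linear list of commit ids, and prime_from_static_resource() and
incremental_fetch() are hypothetical names for illustration, not Git
functions or protocol messages.

```python
def prime_from_static_resource(static_resource):
    """Step 1: populate the new repository from a pre-built resource.

    The resource may be stale; it only has to be reasonably recent.
    """
    return list(static_resource)


def incremental_fetch(server_history, local_history):
    """Step 2: go back to the server only for what is still missing."""
    have = set(local_history)
    missing = [c for c in server_history if c not in have]
    return local_history + missing, len(missing)


# Toy data (assumed): the static resource lags the server by one commit.
server_history = ["c1", "c2", "c3", "c4", "c5"]
static_resource = ["c1", "c2", "c3", "c4"]

local = prime_from_static_resource(static_resource)
local, transferred = incremental_fetch(server_history, local)
```

In this model the follow-up fetch moves a single commit, even though
the static resource never contained the "full" history.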

> I'm not sure how the server should determine the returned resource. A
> packfile alone does not guarantee the full repo history, and I'm not
> positive checking the idx file for HEAD's commit hash ensures every
> sub-object is in that file (though I feel it should, because it is
> delta-compressed).

The above reasoning does not make much technical sense.  Delta
compression does not ensure connectivity in the commit history or
commit->tree->blob containment.  Again I am not sure where you are
going with this.
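
As an illustration of why the presence of the tip commit says nothing
about the rest, here is a small Python sketch of a connectivity check.
It assumes a made-up representation (a dict mapping each object id to
the ids it refers to); missing_objects() is a hypothetical helper, not
part of Git.

```python
def missing_objects(pack, tip):
    """Return ids reachable from `tip` that are absent from `pack`.

    `pack` maps an object id to the ids it refers to
    (commit -> parents and tree, tree -> entries, blob -> nothing).
    """
    missing, seen, stack = set(), set(), [tip]
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        if oid not in pack:
            # Referenced by something we have, but not stored here.
            missing.add(oid)
            continue
        stack.extend(pack[oid])
    return missing


# Toy pack (assumed): the tip commit is present, but its parent
# commit and one of its blobs are not.
pack = {
    "commit2": ["commit1", "tree2"],
    "tree2": ["blobA", "blobB"],
    "blobA": [],
}
gaps = missing_objects(pack, "commit2")
```

Only a walk of this kind over commits, trees and blobs can tell
whether a given pack is self-contained for a given tip.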

> With that in mind, my best guess at the server
> logic for packfiles is something like:
>
> Do I have a full history packfile, and am I configured to return one?
> - If yes, then return an answer specifying the file url and type (packfile)
> - Otherwise, return some other answer indicating the client must go
> through the original cloning process (or possibly return a different
> kind of file and type, once we expand that capability)

Roughly speaking, yes.
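
A rough sketch of that decision in Python, purely for illustration.
Everything here is an assumption: the "prime_clone" config key, the
shape of the resource list, and the returned dict are invented names
and do not correspond to any existing Git configuration or protocol.
Note that the resource need not cover the full history.

```python
def choose_clone_response(config, resources):
    """Pick what to tell a cloning client.

    `config["prime_clone"]` (hypothetical) says whether the server is
    set up to offer an alternate resource; `resources` (hypothetical)
    lists candidates as dicts with "url" and "type" keys.
    """
    if config.get("prime_clone") and resources:
        res = resources[0]
        return {"action": "prime", "url": res["url"], "type": res["type"]}
    # Nothing suitable to offer: the client clones in the usual way.
    return {"action": "regular-clone"}


offer = choose_clone_response(
    {"prime_clone": True},
    [{"url": "https://example.com/repo.pack", "type": "packfile"}],
)
fallback = choose_clone_response({"prime_clone": False}, [])
```

Keeping "type" in the answer leaves room for other kinds of resource
later without changing the shape of the response.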

> Which leaves me with questions on how to test the above condition. Is
> there an expected place, such as config, where the user will specify
> the type of alternate resource, and should we assume some default if
> it isn't specified? Can the user optionally specify the exact file to
> use (I can't see why because it only invites more errors)? Should the
> specification of this option change git's behavior on update, such as
> making sure the full history is compressed? Does the existence of the
> HEAD object in the packfile ensure the repo's entire history is
> contained in that file?

Those are good points on which one should make design decisions
when building this part of the system.  The exception is your
assumption that no follow-up fetch is allowed; it forces you to
limit yourself to "full" history, which is an unnecessary requirement.
