From: Josh Triplett <josh@joshtriplett.org>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
Al Viro <viro@zeniv.linux.org.uk>,
Stefan Beller <sbeller@google.com>,
"git@vger.kernel.org" <git@vger.kernel.org>,
sarah@thesharps.us
Subject: Re: Resumable git clone?
Date: Tue, 1 Mar 2016 23:54:37 -0800
Message-ID: <20160302075437.GA8024@x>
In-Reply-To: <CACsJy8DcNrOmrKKPibV6GuSqspovBmHzUv_mRB6fZyLjw5wWzQ@mail.gmail.com>

On Wed, Mar 02, 2016 at 02:37:53PM +0700, Duy Nguyen wrote:
> On Wed, Mar 2, 2016 at 1:31 PM, Junio C Hamano <gitster@pobox.com> wrote:
> > Al Viro <viro@ZenIV.linux.org.uk> writes:
> >
> >> FWIW, I wasn't proposing to recreate the remaining bits of that _pack_;
> >> just do the normal pull with one addition: start with sending the list
> >> of sha1 of objects you are about to send and let the recipient reply
> >> with "I already have <set of sha1>, don't bother with those". And exclude
> >> those from the transfer.
> >
> > I did a quick-and-dirty unscientific experiment.
> >
> > I had a clone of Linus's repository that was about a week old, whose
> > tip was at 4de8ebef (Merge tag 'trace-fixes-v4.5-rc5' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace,
> > 2016-02-22). To bring it up to date (i.e. a pull about a week's
> > worth of progress) to f691b77b (Merge branch 'for-linus' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs, 2016-03-01):
> >
> >     $ git rev-list --objects 4de8ebef..f691b77b1fc | wc -l
> >     1396
> >     $ git rev-parse 4de8ebef..f691b77b1fc |
> >       git pack-objects --revs --delta-base-offset --stdout |
> >       wc -c
> >     2444127
> >
> > So in order to salvage some transfer out of 2.4MB, the hypothetical
> > Al protocol would first have the upload-pack give 20*1396 = 28kB
>
> It could be 10*1396 or less. If the server calculates the shortest
> unambiguous SHA-1 length (quite cheap on fully packed repo) and sends
> it to the client, the client can just send short SHA-1 instead. It's
> racy though because objects are being added to the server and abbrev
> length may go up. But we can check ambiguity for all SHA-1 sent by
> client and ask for resend for ambiguous ones.
>
> On my linux-2.6.git, 10 letters (so 5 bytes) are needed for
> unambiguous short SHA-1. But we can even go optimistic and ask the
> client for shorter SHA-1 with hope that resend won't be many.
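
For reference, the abbreviation scheme quoted above can be sketched
roughly as follows. This is only an illustration, not how git computes
abbreviations: the helper names (shortest_unambiguous_length,
ambiguous_prefixes) and the toy four-character object names are
assumptions made up for the example.

```python
def shortest_unambiguous_length(names):
    """Smallest hex prefix length at which every name in `names` is unique."""
    ordered = sorted(set(names))
    longest_shared = 0
    # After sorting, the longest prefix shared by any two names is always
    # shared by some pair of neighbours, so only neighbours are compared.
    for a, b in zip(ordered, ordered[1:]):
        shared = 0
        while shared < min(len(a), len(b)) and a[shared] == b[shared]:
            shared += 1
        longest_shared = max(longest_shared, shared)
    return longest_shared + 1


def ambiguous_prefixes(prefixes, server_names):
    """Prefixes that match more than one object name known to the server."""
    return [
        p for p in prefixes
        if sum(name.startswith(p) for name in server_names) > 1
    ]


# Toy object names (hypothetical, shortened for readability).
server = ["abc1", "abd2", "f000"]
length = shortest_unambiguous_length(server)
sent = [name[:length] for name in server]

# The race described above: a new object arrives on the server after the
# abbreviation length was computed, so one prefix must be resent in full.
server_later = server + ["abc9"]
resend = ambiguous_prefixes(sent, server_later)
```

With these toy names the computed length is 3, and the late arrival of
"abc9" makes the prefix "abc" ambiguous, which is the resend case.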

I don't think it's worth the trouble and ambiguity to send abbreviated
object names over the wire. Several simpler optimizations seem
preferable, such as sending object names in binary, and abbreviating
complete object sets ("I have these commits/trees and everything they
need recursively; I also have this stack of random objects.").
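
To put rough numbers on the encodings discussed in this thread, here is
a small back-of-the-envelope sketch. The object count and pack size are
the figures from the experiment quoted above; the listing_bytes helper
and the example object name are assumptions for illustration only, and
no framing or protocol overhead is counted.

```python
import hashlib

OBJECTS = 1396        # object count from the quoted rev-list run
PACK_BYTES = 2444127  # pack size from the quoted pack-objects run


def listing_bytes(bytes_per_name, count=OBJECTS):
    """Bytes needed to list `count` object names at a fixed size each."""
    return bytes_per_name * count


# A SHA-1 name is 40 characters as hex text but only 20 bytes in binary.
hex_name = hashlib.sha1(b"example").hexdigest()
raw_name = bytes.fromhex(hex_name)

sizes = {
    "full hex text": listing_bytes(len(hex_name)),
    "full binary": listing_bytes(len(raw_name)),
    "10 hex characters": listing_bytes(10),
    "5 binary bytes": listing_bytes(5),
}

for label, size in sizes.items():
    share = 100 * size / PACK_BYTES
    print(f"{label}: {size} bytes ({share:.2f}% of the pack)")
```

Full binary names give the 20*1396 figure quoted above, and all four
listings are small next to the 2.4MB pack.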

That would work especially well for resumable pull, or for the case of
optimizing pull during the merge window.

- Josh Triplett