git@vger.kernel.org mailing list mirror (one of many)
From: Sergey Vlasov <vsu@altlinux.ru>
To: Junio C Hamano <junkio@cox.net>
Cc: Alexandre Julliard <julliard@winehq.org>,
	"Aneesh Kumar K.V" <aneesh.kumar@gmail.com>,
	git@vger.kernel.org
Subject: Re: Shallow clone
Date: Sun, 12 Nov 2006 20:59:09 +0300
Message-ID: <20061112205909.f8951300.vsu@altlinux.ru>
In-Reply-To: <7vu015f5av.fsf@assigned-by-dhcp.cox.net>

On Sun, 12 Nov 2006 00:16:40 -0800 Junio C Hamano wrote:

> Alexandre Julliard <julliard@winehq.org> writes:
>
> > There's also a problem with the packing, a clone --depth 1 currently
> > results in a pack that's about 3 times as large as it should be.
>
> That's interesting.
>
>   : gitster; git clone -n --depth 1 git://127.0.0.1/git.git victim-001
[...]
>   -r--r--r-- 1 junio src 9.5M 2006-11-11 23:52 pack-f5f88d83....pack
>
> Repacking immediately after cloning brings it down to what is
> expected.
>
>   : gitster; git repack -a -d -f
[...]
>   -rw-rw-r-- 1 junio src 2.6M 2006-11-11 23:53 pack-f5f88d83....pack

This is due to an optimization in builtin-pack-objects.c:try_delta():

	/*
	 * We do not bother to try a delta that we discarded
	 * on an earlier try, but only when reusing delta data.
	 */
	if (!no_reuse_delta && trg_entry->in_pack &&
	    trg_entry->in_pack == src_entry->in_pack)
		return 0;

After removing this check, the pack produced by the shallow clone is
2.6M, as it should be.

The problem with this optimization is that it is only valid if we are
repacking either the same set of objects as we did earlier, or a
superset of it.  If we are packing a subset of objects, some objects in
that subset will have been delta-compressed in the original pack against
base objects that are not included in our subset.  We cannot reuse the
existing deltas for those objects, and with this optimization we never
try to find new deltas for them either.  (The optimization assumes that
when we try delta compression, we will try mostly the same base objects
as we tried when we made the existing pack, and will therefore likely
get the same result.  That is close to the truth when we are doing
"repack -a", but badly wrong when we are doing "git-upload-pack" with a
large number of common commits, and are therefore excluding a lot of
objects.)
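
To make the effect concrete, here is a rough toy model of the situation
described above.  It is only a sketch and not the real pack-objects
code: struct toy_entry, make_chain() and count_undeltified() are
hypothetical names, the existing pack is assumed to be a single delta
chain, and every delta that is actually attempted is assumed to succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a pack entry; every name here is hypothetical. */
struct toy_entry {
	int in_pack;     /* id of the pack holding the object, 0 if loose */
	int delta_base;  /* index of its base in that pack, -1 if stored whole */
	bool wanted;     /* is the object in the set we are packing now? */
};

/* Existing pack as one chain: entry i is a delta against entry i - 1. */
void make_chain(struct toy_entry *e, size_t n, int pack_id)
{
	for (size_t i = 0; i < n; i++) {
		e[i].in_pack = pack_id;
		e[i].delta_base = (int)i - 1;	/* entry 0 gets -1 */
		e[i].wanted = false;
	}
}

/*
 * Count the wanted objects that end up stored without a delta.
 * An existing delta is reused when its base is wanted too.  Otherwise
 * earlier wanted objects are tried as new bases, except that with
 * skip_same_pack set, candidates from the same pack are never tried
 * (this mirrors the check quoted above).
 */
size_t count_undeltified(const struct toy_entry *e, size_t n,
			 bool skip_same_pack)
{
	size_t whole = 0;

	for (size_t i = 0; i < n; i++) {
		bool found = false;

		if (!e[i].wanted)
			continue;
		if (e[i].delta_base >= 0) {
			assert((size_t)e[i].delta_base < n);
			if (e[e[i].delta_base].wanted)
				continue;	/* existing delta is reused */
		}
		for (size_t j = 0; j < i && !found; j++) {
			if (!e[j].wanted)
				continue;
			if (skip_same_pack && e[i].in_pack &&
			    e[i].in_pack == e[j].in_pack)
				continue;	/* never even attempted */
			found = true;		/* toy: any attempt succeeds */
		}
		if (!found)
			whole++;
	}
	return whole;
}
```

With a chain of 8 objects and all of them wanted, both settings leave
only the chain root undeltified.  If only every second object is wanted,
the model stores all 4 of them whole with the check enabled, but only 1
with the check removed.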

So any partial fetch (shallow or not) from a mostly packed repository
currently results in a suboptimal pack.  In fact, a fresh "repack -a
-d -f" is probably the worst case for a subsequent fetch (not an initial
clone) from that repository: objects for the most recent commit are
most likely to be stored without delta compression, and even if deltas
are used, they are likely in the wrong direction for someone who has an
older version and wants to update it.


> In any case, after this "shallow" stuff, repeated "fetch --depth
> 99" seems to fetch 0 object and 3400 objects alternately, and
> the shallow file alternates between 900 bytes and 11000 bytes.

I confirm this - different numbers, but the same problem...
