From: Junio C Hamano <junkio@cox.net>
To: Sergey Vlasov <vsu@altlinux.ru>
Cc: Alexandre Julliard <julliard@winehq.org>,
"Aneesh Kumar K.V" <aneesh.kumar@gmail.com>,
git@vger.kernel.org
Subject: Re: Shallow clone
Date: Sun, 12 Nov 2006 13:59:15 -0800
Message-ID: <7vd57scong.fsf@assigned-by-dhcp.cox.net>
In-Reply-To: <20061112205909.f8951300.vsu@altlinux.ru> (Sergey Vlasov's message of "Sun, 12 Nov 2006 20:59:09 +0300")

Sergey Vlasov <vsu@altlinux.ru> writes:

> This is due to optimization in builtin-pack-objects.c:try_delta():
>
> 	/*
> 	 * We do not bother to try a delta that we discarded
> 	 * on an earlier try, but only when reusing delta data.
> 	 */
> 	if (!no_reuse_delta && trg_entry->in_pack &&
> 	    trg_entry->in_pack == src_entry->in_pack)
> 		return 0;
>
> After removing this part the shallow pack after clone is 2.6M, as it
> should be.
>
> The problem with this optimization is that it is only valid if we are
> repacking either the same set of objects as we did earlier, or its
> superset. But if we are packing a subset of objects, there will be some
> objects in that subset which were delta-compressed in the original pack,
> but base objects for that deltas are not included in our subset -
> therefore we will be unable to reuse existing deltas, and with that
> optimization we will never try to use delta compression for these
> objects.
> ...
> So any partial fetch (shallow or not) from a mostly packed repository
> currently results in a suboptimal pack.
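
The situation described in the quoted text can be sketched with a toy model. This is only an illustration under stated assumptions: `struct toy_object`, `count_orphaned_deltas()` and the `in_subset` flags are hypothetical names, not anything in git. Each object records the index of its delta base in the original pack (or -1 if it is stored whole), and the function counts the objects in the subset whose stored delta cannot be reused because the base was left out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical toy model, not git's data structures: "base" is the
 * index of the object's delta base in the original pack, or -1 if
 * the object is stored whole.
 */
struct toy_object {
	int base;
};

/*
 * Count the objects selected by in_subset[] whose existing delta
 * cannot be reused because its base is not part of the subset.
 * These are the objects that would need a fresh delta search.
 */
static size_t count_orphaned_deltas(const struct toy_object *objs,
				    size_t nr,
				    const unsigned char *in_subset)
{
	size_t i, orphaned = 0;

	for (i = 0; i < nr; i++) {
		int base = objs[i].base;

		if (!in_subset[i] || base < 0)
			continue;
		assert((size_t)base < nr);
		if (!in_subset[base])
			orphaned++;
	}
	return orphaned;
}
```

With a delta chain of four objects, selecting only the last two leaves one object whose base is missing, while selecting all four leaves none.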

That is correct.  How about something like this?

I think the determination of "repacking_superset" may need to be
tweaked, because existing packs may overlap and the patch counts
an object once for every pack that contains it.

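
To illustrate the overlap concern, here is a minimal sketch on a toy model. It makes assumptions that are not in the patch: `struct toy_pack`, `count_per_pack()` and `count_distinct()` are hypothetical, and objects are plain integers rather than SHA-1 names. The first function sums per-pack counts the way the patch does; the second counts each object once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy model: a pack is just a list of integer object ids. */
struct toy_pack {
	const int *ids;
	size_t nr;
};

/*
 * Sum of per-pack counts: an object stored in two packs is
 * counted twice.
 */
static size_t count_per_pack(const struct toy_pack *packs, size_t nr_packs)
{
	size_t i, cnt = 0;

	for (i = 0; i < nr_packs; i++)
		cnt += packs[i].nr;
	return cnt;
}

static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

/* Number of distinct ids across all packs: sort a copy, skip repeats. */
static size_t count_distinct(const struct toy_pack *packs, size_t nr_packs)
{
	size_t total = count_per_pack(packs, nr_packs);
	size_t i, off = 0, distinct = 0;
	int *all;

	if (!total)
		return 0;
	all = malloc(total * sizeof(*all));
	assert(all);
	for (i = 0; i < nr_packs; i++) {
		if (!packs[i].nr)
			continue;
		memcpy(all + off, packs[i].ids, packs[i].nr * sizeof(*all));
		off += packs[i].nr;
	}
	qsort(all, total, sizeof(*all), cmp_int);
	for (i = 0; i < total; i++)
		if (!i || all[i] != all[i - 1])
			distinct++;
	free(all);
	return distinct;
}
```

For two toy packs holding {1, 2, 3, 4} and {3, 4, 5}, the per-pack sum is 7 but only 5 distinct objects exist. With nr_result == 6, a test of the form "count < nr_result" would come out false using the sum and true using the distinct count.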
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 69e5dd3..fb25124 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -64,6 +64,7 @@ struct object_entry {
 static unsigned char object_list_sha1[20];
 static int non_empty;
 static int no_reuse_delta;
+static int repacking_superset;
 static int local;
 static int incremental;
 static int allow_ofs_delta;
@@ -1172,10 +1173,13 @@ static int try_delta(struct unpacked *tr
 		return -1;

 	/*
-	 * We do not bother to try a delta that we discarded
-	 * on an earlier try, but only when reusing delta data.
+	 * When we are packing the superset of objects we have already
+	 * packed, we do not bother to try a delta that we discarded
+	 * on an earlier try.  This heuristic of course should not
+	 * kick in when we are not reusing delta, or we know we are
+	 * sending a subset of objects from a repository.
 	 */
-	if (!no_reuse_delta && trg_entry->in_pack &&
+	if (!no_reuse_delta && repacking_superset && trg_entry->in_pack &&
 	    trg_entry->in_pack == src_entry->in_pack)
 		return 0;

@@ -1493,6 +1497,16 @@ static void get_object_list(int ac, cons
 	traverse_commit_list(&revs, show_commit, show_object);
 }

+static int count_packed_objects(void)
+{
+	struct packed_git *p;
+	int cnt = 0;
+
+	for (p = packed_git; p; p = p->next)
+		cnt += num_packed_objects(p);
+	return cnt;
+}
+
 int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 {
 	SHA_CTX ctx;
@@ -1631,6 +1645,8 @@ int cmd_pack_objects(int argc, const cha
 	if (non_empty && !nr_result)
 		return 0;

+	repacking_superset = count_packed_objects() < nr_result;
+
 	SHA1_Init(&ctx);
 	list = sorted_by_sha;
 	for (i = 0; i < nr_result; i++) {
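
The condition changed in try_delta() can be read in isolation as a small predicate. The sketch below is an illustration only: `struct toy_entry` and `skip_delta_attempt()` are hypothetical stand-ins, and the two flags are passed as parameters here instead of being file-scope variables as in the patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for an object entry: only the pack the
 * object currently lives in matters here (NULL means loose).
 */
struct toy_entry {
	const void *in_pack;
};

/*
 * Return 1 when a delta attempt between src and trg may be skipped:
 * delta reuse is enabled, we believe we are packing a superset of
 * what is already packed, and both objects come from the same
 * existing pack (so an earlier repack already considered the pair).
 */
static int skip_delta_attempt(int no_reuse_delta, int repacking_superset,
			      const struct toy_entry *trg,
			      const struct toy_entry *src)
{
	assert(trg && src);
	return !no_reuse_delta && repacking_superset &&
	       trg->in_pack && trg->in_pack == src->in_pack;
}
```

With repacking_superset set to 0, as it would be for a partial fetch, the predicate never fires and every candidate pair is tried.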