From: Jiang Xin <worldhello.net@gmail.com>
To: "Junio C Hamano" <gitster@pobox.com>,
"Git List" <git@vger.kernel.org>,
"SZEDER Gábor" <szeder.dev@gmail.com>
Cc: Jiang Xin <worldhello.net@gmail.com>, Sun Chao <sunchao9@huawei.com>
Subject: [PATCH v6 0/5] pack-redundant: new algorithm to find min packs
Date: Sat, 12 Jan 2019 17:17:49 +0800
Message-ID: <20190112091754.30985-1-worldhello.net@gmail.com>
In-Reply-To: <20190110120142.22271-1-worldhello.net@gmail.com>
> Sun Chao (my former colleague at Huawei) found a bug in
> git-pack-redundant. If there are too many packs and many of them
> overlap each other, running `git pack-redundant --all` will
> exhaust all memory and the process will be killed by the kernel.
>
> There is a script in the commit log of commit 2/5, which can be used to
> create a repository with lots of redundant packs. Running `git
> pack-redundant --all` in it reproduces this issue.
On Sat, Jan 12, 2019 at 2:00 AM, Junio C Hamano <gitster@pobox.com> wrote:
> >> Yikes. Can't "git pack-objects" get the input directly without
> >> overlong printf, something along the lines of...
> >>
> >> P1=$(git -C .git/objects/pack pack-objects pack <<-EOF
> >> $A
> >> $B
> >> $C
> >> ...
> >> $R
> >> EOF
> >> )
> >
> > I found that there must be no space before <OID>, because git-pack-objects
> > does not allow that, and the matching parentheses should be on the same line.
> > So I will write it like this:
> >
> > create_pack_1() {
> > P1=$(git -C .git/objects/pack pack-objects pack <<-EOF) &&
> > $T
>
> Isn't the whole point of <<-EOF (notice the leading dash) to allow
> us to indent the here-doc with horizontal tab?
The indentation was not stripped even with `<<-EOF` because I mixed
tabs and spaces to get better alignment, and `<<-` strips only leading tabs.
If the here-doc is placed outside the parentheses, it fails on macOS, so
this reroll uses the syntax Junio suggested earlier.
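
To illustrate the `<<-EOF` behaviour: the shell removes only leading tab characters from the here-doc body and from the delimiter line, so any spaces after (or instead of) the tabs reach the command unchanged. A minimal sketch follows; the values `aaa` and `bbb` are placeholders standing in for object IDs, and the small scripts are built with `printf` only so that the tabs (`\t`) are explicit rather than invisible.

```shell
# Body and delimiter indented with a single TAB: the tab is stripped.
tab_result=$(printf 'cat <<-EOF\n\taaa\n\tbbb\n\tEOF\n' | sh)

# TAB followed by two spaces: only the tab is stripped, the spaces stay.
mixed_result=$(printf 'cat <<-EOF\n\t  aaa\n\tEOF\n' | sh)

# Spaces only: nothing is stripped.
space_result=$(printf 'cat <<-EOF\n  aaa\nEOF\n' | sh)

printf '[%s]\n' "$tab_result" "$mixed_result" "$space_result"
```

Because `git pack-objects` reads one object ID per line and rejects a leading space, only the first form is usable as its input.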
On Fri, Jan 11, 2019 at 9:19 AM, SZEDER Gábor <szeder.dev@gmail.com> wrote:
> I see that the last patch in this series removes those three
> unused functions, but that patch should be squashed into this one to
> keep Git buildable with '-Werror' or DEVELOPER=1.
>
> Furthermore, after building this patch (without '-Werror'), several
> tests in 't5323-pack-redundant.sh' fail. To avoid the test failure I
> think the fourth patch ensuring a consistent sort order should be
> squashed in as well.
Patches 3/5 through 5/5 can be squashed into patch 2/5.
## Changes since reroll v5
1: 40fea5d67f ! 1: 7e4e703083 t5323: test cases for git-pack-redundant
@@ -22,8 +22,7 @@
+
+. ./test-lib.sh
+
-+create_commits()
-+{
++create_commits() {
+ parent=
+ for name in A B C D E F G H I J K L M N O P Q R
+ do
@@ -39,54 +38,98 @@
+ parent=$oid ||
+ return 1
+ done
-+ git update-ref refs/heads/master $M
++ git update-ref refs/heads/master $R
+}
+
-+create_pack_1()
-+{
-+ P1=$(cd .git/objects/pack; printf "$T\n$A\n$B\n$C\n$D\n$E\n$F\n$R\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_1() {
++ P1=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $T
++ $A
++ $B
++ $C
++ $D
++ $E
++ $F
++ $R
++ EOF
++ ) &&
+ eval P$P1=P1:$P1
+}
+
-+create_pack_2()
-+{
-+ P2=$(cd .git/objects/pack; printf "$B\n$C\n$D\n$E\n$G\n$H\n$I\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_2() {
++ P2=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $B
++ $C
++ $D
++ $E
++ $G
++ $H
++ $I
++ EOF
++ ) &&
+ eval P$P2=P2:$P2
+}
+
-+create_pack_3()
-+{
-+ P3=$(cd .git/objects/pack; printf "$F\n$I\n$J\n$K\n$L\n$M\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_3() {
++ P3=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $F
++ $I
++ $J
++ $K
++ $L
++ $M
++ EOF
++ ) &&
+ eval P$P3=P3:$P3
+}
+
-+create_pack_4()
-+{
-+ P4=$(cd .git/objects/pack; printf "$J\n$K\n$L\n$M\n$P\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_4() {
++ P4=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $J
++ $K
++ $L
++ $M
++ $P
++ EOF
++ ) &&
+ eval P$P4=P4:$P4
+}
+
-+create_pack_5()
-+{
-+ P5=$(cd .git/objects/pack; printf "$G\n$H\n$N\n$O\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_5() {
++ P5=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $G
++ $H
++ $N
++ $O
++ EOF
++ ) &&
+ eval P$P5=P5:$P5
+}
+
-+create_pack_6()
-+{
-+ P6=$(cd .git/objects/pack; printf "$N\n$O\n$Q\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_6() {
++ P6=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $N
++ $O
++ $Q
++ EOF
++ ) &&
+ eval P$P6=P6:$P6
+}
+
-+create_pack_7()
-+{
-+ P7=$(cd .git/objects/pack; printf "$P\n$Q\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_7() {
++ P7=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $P
++ $Q
++ EOF
++ ) &&
+ eval P$P7=P7:$P7
+}
+
-+create_pack_8()
-+{
-+ P8=$(cd .git/objects/pack; printf "$A\n" | git pack-objects pack 2>/dev/null) &&
++create_pack_8() {
++ P8=$(git -C .git/objects/pack pack-objects -q pack <<-EOF
++ $A
++ EOF
++ ) &&
+ eval P$P8=P8:$P8
+}
+
@@ -110,10 +153,12 @@
+
+test_expect_success 'one of pack-2/pack-3 is redundant' '
+ git pack-redundant --all >out &&
-+ sed -E -e "s#.*/pack-(.*)\.(idx|pack)#\1#" out | \
-+ sort -u | \
-+ while read p; do eval echo "\${P$p}"; done | \
-+ sort >actual && \
++ sed \
++ -e "s#.*/pack-\(.*\)\.idx#\1#" \
++ -e "s#.*/pack-\(.*\)\.pack#\1#" out |
++ sort -u |
++ while read p; do eval echo "\${P$p}"; done |
++ sort >actual &&
+ test_cmp expected actual
+'
+
@@ -121,6 +166,7 @@
+ create_pack_6 && create_pack_7
+'
+
++# Only after calling create_pack_6, we can use $P6 variable.
+cat >expected <<EOF
+P2:$P2
+P4:$P4
@@ -129,10 +175,12 @@
+
+test_expect_success 'pack 2, 4, and 6 are redundant' '
+ git pack-redundant --all >out &&
-+ sed -E -e "s#.*/pack-(.*)\.(idx|pack)#\1#" out | \
-+ sort -u | \
-+ while read p; do eval echo "\${P$p}"; done | \
-+ sort >actual && \
++ sed \
++ -e "s#.*/pack-\(.*\)\.idx#\1#" \
++ -e "s#.*/pack-\(.*\)\.pack#\1#" out |
++ sort -u |
++ while read p; do eval echo "\${P$p}"; done |
++ sort >actual &&
+ test_cmp expected actual
+'
+
@@ -147,24 +195,26 @@
+P8:$P8
+EOF
+
-+test_expect_success 'pack-8, subset of pack-1, is also redundant' '
++test_expect_success 'pack-8 (subset of pack-1) is also redundant' '
+ git pack-redundant --all >out &&
-+ sed -E -e "s#.*/pack-(.*)\.(idx|pack)#\1#" out | \
-+ sort -u | \
-+ while read p; do eval echo "\${P$p}"; done | \
-+ sort >actual && \
++ sed \
++ -e "s#.*/pack-\(.*\)\.idx#\1#" \
++ -e "s#.*/pack-\(.*\)\.pack#\1#" out |
++ sort -u |
++ while read p; do eval echo "\${P$p}"; done |
++ sort >actual &&
+ test_cmp expected actual
+'
+
-+test_expect_success 'clear loose objects' '
++test_expect_success 'clean loose objects' '
+ git prune-packed &&
+ find .git/objects -type f | sed -e "/objects\/pack\//d" >out &&
+ test_must_be_empty out
+'
+
-+test_expect_success 'remove redundant packs' '
++test_expect_success 'remove redundant packs and pass fsck' '
+ git pack-redundant --all | xargs rm &&
-+ git fsck &&
++ git fsck --no-progress &&
+ git pack-redundant --all >out &&
+ test_must_be_empty out
+'
2: 50cd5a5b47 ! 2: 51a9c2d8a5 pack-redundant: new algorithm to find min packs
@@ -67,7 +67,7 @@
Original PR and discussions: https://github.com/jiangxin/git/pull/25
Signed-off-by: Sun Chao <sunchao9@huawei.com>
- Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
+ Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
5: b7ccdea1ad ! 3: c5eb21c23c pack-redundant: remove unused functions
@@ -6,14 +6,14 @@
`pll_free`, etc.
Signed-off-by: Sun Chao <sunchao9@huawei.com>
- Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
+ Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff --git a/builtin/pack-redundant.c b/builtin/pack-redundant.c
--- a/builtin/pack-redundant.c
+++ b/builtin/pack-redundant.c
@@
- size_t all_objects_size;
+ struct llist *all_objects;
} *local_packs = NULL, *altodb_packs = NULL;
-struct pll {
@@ -105,7 +105,7 @@
- diff = llist_copy(list);
-
- while (pl) {
-- llist_sorted_difference_inplace(diff, pl->remaining_objects);
+- llist_sorted_difference_inplace(diff, pl->all_objects);
- if (diff->size == 0) { /* we're done */
- llist_free(diff);
- return 1;
3: 6338c6fad4 ! 4: 1acdd0af1e pack-redundant: rename pack_list.all_objects
@@ -18,16 +18,7 @@
+ struct llist *remaining_objects;
} *local_packs = NULL, *altodb_packs = NULL;
- struct pll {
-@@
- diff = llist_copy(list);
-
- while (pl) {
-- llist_sorted_difference_inplace(diff, pl->all_objects);
-+ llist_sorted_difference_inplace(diff, pl->remaining_objects);
- if (diff->size == 0) { /* we're done */
- llist_free(diff);
- return 1;
+ static struct llist_item *free_nodes;
@@
{
struct pack_list *pl_a = *((struct pack_list **)a);
4: 734f4d8a8b ! 5: 306d515cda pack-redundant: consistent sort method
@@ -26,7 +26,7 @@
+ size_t all_objects_size;
} *local_packs = NULL, *altodb_packs = NULL;
- struct pll {
+ static struct llist_item *free_nodes;
@@
return ret;
}
@@ -42,20 +42,24 @@
- if (sz_a == sz_b)
- return 0;
- else if (sz_a < sz_b)
-+ /* if have the same remaining_objects, big pack first */
-+ if (pl_a->remaining_objects->size == pl_b->remaining_objects->size)
++ if (pl_a->remaining_objects->size == pl_b->remaining_objects->size) {
++ /* have the same remaining_objects, big pack first */
+ if (pl_a->all_objects_size == pl_b->all_objects_size)
+ return 0;
+ else if (pl_a->all_objects_size < pl_b->all_objects_size)
+ return 1;
+ else
+ return -1;
-+
-+ /* sort according to remaining objects, more remaining objects first */
-+ if (pl_a->remaining_objects->size < pl_b->remaining_objects->size)
++ } else if (pl_a->remaining_objects->size < pl_b->remaining_objects->size) {
++ /* sort by remaining objects, more objects first */
return 1;
- else
+- else
++ } else {
return -1;
++ }
+ }
+
+ /* Sort pack_list, greater size of remaining_objects first */
@@
for (n = 0, p = *pl; p; p = p->next)
ary[n++] = p;
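
As a side note on the `sed` change in patch 1/5: the tests no longer use `sed -E`, which not every `sed` accepts, and express the `(idx|pack)` alternation as two basic-regex `-e` commands instead. A small sketch of that pipeline is below; the pack names `1111` and `2222` are made up for illustration and only mimic the shape of `git pack-redundant --all` output.

```shell
# Hypothetical sample output: one .idx and one .pack path per redundant pack.
sample='.git/objects/pack/pack-1111.idx
.git/objects/pack/pack-1111.pack
.git/objects/pack/pack-2222.idx
.git/objects/pack/pack-2222.pack'

# Reduce each path to the pack ID using basic regular expressions only,
# then collapse the .idx/.pack pair into a single line per pack.
pack_ids=$(printf '%s\n' "$sample" |
    sed \
        -e "s#.*/pack-\(.*\)\.idx#\1#" \
        -e "s#.*/pack-\(.*\)\.pack#\1#" |
    sort -u)

printf '%s\n' "$pack_ids"
```

Each input line matches at most one of the two expressions, so the result is the same as with the single extended-regex form.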
## This reroll has the following commits:
Jiang Xin (3):
t5323: test cases for git-pack-redundant
pack-redundant: rename pack_list.all_objects
pack-redundant: consistent sort method
Sun Chao (2):
pack-redundant: new algorithm to find min packs
pack-redundant: remove unused functions
builtin/pack-redundant.c | 221 +++++++++++++++-----------------------
t/t5323-pack-redundant.sh | 207 +++++++++++++++++++++++++++++++++++
2 files changed, 292 insertions(+), 136 deletions(-)
create mode 100755 t/t5323-pack-redundant.sh
--
2.20.0.3.gc45e608566
Thread overview: 83+ messages
2018-12-18 9:58 [PATCH 1/2] pack-redundant: new algorithm to find min packs Jiang Xin
2018-12-18 9:58 ` [PATCH 2/2] pack-redundant: remove unused functions Jiang Xin
2018-12-19 12:14 ` [PATCH v2 0/3] pack-redundant: new algorithm to find min packs Jiang Xin
2019-01-02 4:34 ` [PATCH v3 " Jiang Xin
2019-01-02 4:34 ` [PATCH v3 1/3] t5323: test cases for git-pack-redundant Jiang Xin
2019-01-09 12:56 ` SZEDER Gábor
2019-01-09 16:47 ` SZEDER Gábor
2019-01-10 12:01 ` [PATCH v5 0/5] pack-redundant: new algorithm to find min packs Jiang Xin
2019-01-12 9:17 ` Jiang Xin [this message]
2019-01-30 11:47 ` [PATCH v7 0/6] " Jiang Xin
2019-02-01 16:21 ` [PATCH v9 " Jiang Xin
2019-02-01 16:21 ` [PATCH v9 1/6] t5323: test cases for git-pack-redundant Jiang Xin
2019-02-01 19:42 ` Eric Sunshine
2019-02-01 21:03 ` Junio C Hamano
2019-02-01 21:49 ` Eric Sunshine
2019-02-02 13:30 ` [PATCH v10 0/6] pack-redundant: new algorithm to find min packs Jiang Xin
2019-02-02 13:30 ` [PATCH v10 1/6] t5323: test cases for git-pack-redundant Jiang Xin
2019-02-02 13:30 ` [PATCH v10 2/6] pack-redundant: delay creation of unique_objects Jiang Xin
2019-02-02 13:30 ` [PATCH v10 3/6] pack-redundant: delete redundant code Jiang Xin
2019-02-02 13:30 ` [PATCH v10 4/6] pack-redundant: new algorithm to find min packs Jiang Xin
2019-02-02 13:30 ` [PATCH v10 5/6] pack-redundant: rename pack_list.all_objects Jiang Xin
2019-02-02 13:30 ` [PATCH v10 6/6] pack-redundant: consistent sort method Jiang Xin
2019-02-01 16:21 ` [PATCH v9 2/6] pack-redundant: delay creation of unique_objects Jiang Xin
2019-02-01 16:21 ` [PATCH v9 3/6] pack-redundant: delete redundant code Jiang Xin
2019-02-01 16:21 ` [PATCH v9 4/6] pack-redundant: new algorithm to find min packs Jiang Xin
2019-02-01 16:21 ` [PATCH v9 5/6] pack-redundant: rename pack_list.all_objects Jiang Xin
2019-02-01 16:21 ` [PATCH v9 6/6] pack-redundant: consistent sort method Jiang Xin
2019-01-30 11:47 ` [PATCH v7 1/6] t5323: test cases for git-pack-redundant Jiang Xin
2019-01-31 21:44 ` Junio C Hamano
2019-02-01 5:44 ` Jiang Xin
2019-02-01 6:11 ` Eric Sunshine
2019-02-01 7:23 ` Jiang Xin
2019-02-01 7:25 ` Jiang Xin
2019-02-01 9:51 ` Jiang Xin
2019-01-30 11:47 ` [PATCH v7 2/6] pack-redundant: delay creation of unique_objects Jiang Xin
2019-01-30 11:47 ` [PATCH v7 3/6] pack-redundant: new algorithm to find min packs Jiang Xin
2019-01-31 19:30 ` Junio C Hamano
2019-02-01 9:55 ` Jiang Xin
2019-01-30 11:47 ` [PATCH v7 4/6] pack-redundant: remove unused functions Jiang Xin
2019-01-30 15:03 ` [PATCH v8 1/1] pack-redundant: delete redundant code 16657101987
2019-01-30 11:47 ` [PATCH v7 5/6] pack-redundant: rename pack_list.all_objects Jiang Xin
2019-01-30 11:47 ` [PATCH v7 6/6] pack-redundant: consistent sort method Jiang Xin
2019-01-12 9:17 ` [PATCH v6 1/5] t5323: test cases for git-pack-redundant Jiang Xin
2019-01-12 9:17 ` [PATCH v6 2/5] pack-redundant: new algorithm to find min packs Jiang Xin
2019-01-12 9:17 ` [PATCH v6 3/5] pack-redundant: remove unused functions Jiang Xin
2019-01-12 9:17 ` [PATCH v6 4/5] pack-redundant: rename pack_list.all_objects Jiang Xin
2019-01-12 9:17 ` [PATCH v6 5/5] pack-redundant: consistent sort method Jiang Xin
2019-01-10 12:01 ` [PATCH v5 1/5] t5323: test cases for git-pack-redundant Jiang Xin
2019-01-10 21:11 ` Junio C Hamano
2019-01-11 1:59 ` Jiang Xin
2019-01-11 18:00 ` Junio C Hamano
2019-01-10 12:01 ` [PATCH v5 2/5] pack-redundant: new algorithm to find min packs Jiang Xin
2019-01-11 1:19 ` SZEDER Gábor
2019-01-10 12:01 ` [PATCH v5 3/5] pack-redundant: rename pack_list.all_objects Jiang Xin
2019-01-10 12:01 ` [PATCH v5 4/5] pack-redundant: consistent sort method Jiang Xin
2019-01-10 20:05 ` SZEDER Gábor
2019-01-10 12:01 ` [PATCH v5 5/5] pack-redundant: remove unused functions Jiang Xin
2019-01-10 3:28 ` [PATCH v3 1/3] t5323: test cases for git-pack-redundant Jiang Xin
2019-01-10 7:11 ` Johannes Sixt
2019-01-10 11:57 ` SZEDER Gábor
2019-01-10 12:25 ` Torsten Bögershausen
2019-01-10 17:36 ` Junio C Hamano
2019-01-15 20:30 ` [PATCH/RFC v1 1/1] test-lint: sed -E (or -a, -l) are not portable tboegi
2019-01-15 21:09 ` Eric Sunshine
2019-01-16 11:24 ` Ævar Arnfjörð Bjarmason
2019-01-20 7:53 ` [PATCH/RFC v2 1/1] test-lint: Only use only sed [-n] [-e command] [-f command_file] tboegi
2019-01-22 19:47 ` Junio C Hamano
2019-01-22 20:00 ` Torsten Bögershausen
2019-01-22 21:15 ` Eric Sunshine
2019-01-23 6:35 ` Torsten Bögershausen
2019-01-23 17:54 ` Junio C Hamano
2019-01-25 19:12 ` Torsten Bögershausen
2019-01-27 22:34 ` Junio C Hamano
2019-01-02 4:34 ` [PATCH v3 2/3] pack-redundant: new algorithm to find min packs Jiang Xin
2019-01-02 4:34 ` [PATCH v3 3/3] pack-redundant: remove unused functions Jiang Xin
2019-01-08 16:40 ` [PATCH v4 0/1] " 16657101987
2019-01-08 19:30 ` Junio C Hamano
2019-01-09 0:29 ` 16657101987
2019-01-08 16:43 ` [PATCH v4 1/1] " 16657101987
2019-01-08 16:45 ` [PATCH v4 0/1] " 16657101987
2018-12-19 12:14 ` [PATCH v2 1/3] t5322: test cases for git-pack-redundant Jiang Xin
2018-12-19 12:14 ` [PATCH v2 2/3] pack-redundant: new algorithm to find min packs Jiang Xin
2018-12-19 12:14 ` [PATCH v2 3/3] pack-redundant: remove unused functions Jiang Xin