* Something is broken in repack
@ 2007-12-07 23:05 Jon Smirl
2007-12-08 0:37 ` Linus Torvalds
` (4 more replies)
0 siblings, 5 replies; 82+ messages in thread
From: Jon Smirl @ 2007-12-07 23:05 UTC (permalink / raw)
To: Git Mailing List
Using this config:
[pack]
threads = 4
deltacachesize = 256M
deltacachelimit = 0
And the 330MB gcc pack for input
git repack -a -d -f --depth=250 --window=250
complete seconds RAM
10% 47 1GB
20% 29 1GB
30% 24 1GB
40% 18 1GB
50% 110 1.2GB
60% 85 1.4GB
70% 195 1.5GB
80% 186 2.5GB
90% 489 3.8GB
95% 800 4.8GB
I killed it because it started swapping
The mmaps are only about 400MB in this case.
At the end the git process had 4.4GB of physical RAM allocated.
Starting from a highly compressed pack greatly aggravates the problem.
Starting with a 2GB pack of the same data my process size only grew to
3GB with 2GB of mmaps.
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-07 23:05 Something is broken in repack Jon Smirl
@ 2007-12-08 0:37 ` Linus Torvalds
2007-12-08 1:27 ` [PATCH] pack-objects: fix delta cache size accounting Nicolas Pitre
2007-12-08 1:46 ` Something is broken in repack Nicolas Pitre
` (3 subsequent siblings)
4 siblings, 1 reply; 82+ messages in thread
From: Linus Torvalds @ 2007-12-08 0:37 UTC (permalink / raw)
To: Jon Smirl, Nicolas Pitre; +Cc: Git Mailing List
On Fri, 7 Dec 2007, Jon Smirl wrote:
>
> Using this config:
> [pack]
> threads = 4
> deltacachesize = 256M
I think deltacachesize is broken.
The code in try_delta() that replaces a delta cache entry with another one
seems very buggy wrt that whole "delta_cache_size" update. It does
delta_cache_size -= trg_entry->delta_size;
to account for the old delta going away, but it does this *after* having
already replaced trg_entry->delta_size with the new delta entry.
I suspect there are other issues going on too, but that's the one that I
noticed from a quick look-through.
Nico? I think this one is yours..
Linus
* [PATCH] pack-objects: fix delta cache size accounting
2007-12-08 0:37 ` Linus Torvalds
@ 2007-12-08 1:27 ` Nicolas Pitre
0 siblings, 0 replies; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-08 1:27 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Linus Torvalds, Jon Smirl, Git Mailing List
The wrong value was subtracted from delta_cache_size when replacing
a cached delta, as trg_entry->delta_size was used after the old size
had been replaced by the new size.
Noticed by Linus.
Signed-off-by: Nicolas Pitre <nico@cam.org>
---
On Fri, 7 Dec 2007, Linus Torvalds wrote:
> The code in try_delta() that replaces a delta cache entry with another one
> seems very buggy wrt that whole "delta_cache_size" update. It does
>
> delta_cache_size -= trg_entry->delta_size;
>
> to account for the old delta going away, but it does this *after* having
> already replaced trg_entry->delta_size with the new delta entry.
Doh! Mea culpa.
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 4f44658..350ece4 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -1422,10 +1422,6 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
}
}
- trg_entry->delta = src_entry;
- trg_entry->delta_size = delta_size;
- trg->depth = src->depth + 1;
-
/*
* Handle memory allocation outside of the cache
* accounting lock. Compiler will optimize the strangeness
@@ -1439,7 +1435,7 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
trg_entry->delta_data = NULL;
}
if (delta_cacheable(src_size, trg_size, delta_size)) {
- delta_cache_size += trg_entry->delta_size;
+ delta_cache_size += delta_size;
cache_unlock();
trg_entry->delta_data = xrealloc(delta_buf, delta_size);
} else {
@@ -1447,6 +1443,10 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
free(delta_buf);
}
+ trg_entry->delta = src_entry;
+ trg_entry->delta_size = delta_size;
+ trg->depth = src->depth + 1;
+
return 1;
}
* Re: Something is broken in repack
2007-12-07 23:05 Something is broken in repack Jon Smirl
2007-12-08 0:37 ` Linus Torvalds
@ 2007-12-08 1:46 ` Nicolas Pitre
2007-12-08 2:04 ` Jon Smirl
` (3 more replies)
2007-12-08 2:56 ` David Brown
` (2 subsequent siblings)
4 siblings, 4 replies; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-08 1:46 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List
On Fri, 7 Dec 2007, Jon Smirl wrote:
> Using this config:
> [pack]
> threads = 4
> deltacachesize = 256M
> deltacachelimit = 0
Since you get different results depending on the source pack used,
those cache settings, even if there was a bug with them, are not
significant.
> And the 330MB gcc pack for input
> git repack -a -d -f --depth=250 --window=250
>
> complete seconds RAM
> 10% 47 1GB
> 20% 29 1GB
> 30% 24 1GB
> 40% 18 1GB
> 50% 110 1.2GB
> 60% 85 1.4GB
> 70% 195 1.5GB
> 80% 186 2.5GB
> 90% 489 3.8GB
> 95% 800 4.8GB
> I killed it because it started swapping
>
> The mmaps are only about 400MB in this case.
> At the end the git process had 4.4GB of physical RAM allocated.
That's really bad.
> Starting from a highly compressed pack greatly aggravates the problem.
That is really interesting though.
> Starting with a 2GB pack of the same data my process size only grew to
> 3GB with 2GB of mmaps.
Which is quite reasonable, even if the same issue might still be there.
So the problem seems to be related to the pack access code and not the
repack code. And it must have something to do with the number of deltas
being replayed. And because the repack is attempting delta compression
roughly from newest to oldest, and because old objects are typically in
a deeper delta chain, then this might explain the logarithmic slowdown.
So something must be wrong with the delta cache in sha1_file.c somehow.
Nicolas
* Re: Something is broken in repack
2007-12-08 1:46 ` Something is broken in repack Nicolas Pitre
@ 2007-12-08 2:04 ` Jon Smirl
2007-12-08 2:28 ` Nicolas Pitre
2007-12-08 2:22 ` Jon Smirl
` (2 subsequent siblings)
3 siblings, 1 reply; 82+ messages in thread
From: Jon Smirl @ 2007-12-08 2:04 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Git Mailing List
On 12/7/07, Nicolas Pitre <nico@cam.org> wrote:
> On Fri, 7 Dec 2007, Jon Smirl wrote:
>
> > Using this config:
> > [pack]
> > threads = 4
> > deltacachesize = 256M
> > deltacachelimit = 0
>
> Since you get different results depending on the source pack used,
> those cache settings, even if there was a bug with them, are not
> significant.
>
> > And the 330MB gcc pack for input
> > git repack -a -d -f --depth=250 --window=250
> >
> > complete seconds RAM
> > 10% 47 1GB
> > 20% 29 1GB
> > 30% 24 1GB
> > 40% 18 1GB
> > 50% 110 1.2GB
> > 60% 85 1.4GB
> > 70% 195 1.5GB
> > 80% 186 2.5GB
> > 90% 489 3.8GB
> > 95% 800 4.8GB
> > I killed it because it started swapping
> >
> > The mmaps are only about 400MB in this case.
> > At the end the git process had 4.4GB of physical RAM allocated.
>
> That's really bad.
>
> > Starting from a highly compressed pack greatly aggravates the problem.
>
> That is really interesting though.
>
> > Starting with a 2GB pack of the same data my process size only grew to
> > 3GB with 2GB of mmaps.
>
> Which is quite reasonable, even if the same issue might still be there.
>
> So the problem seems to be related to the pack access code and not the
> repack code. And it must have something to do with the number of deltas
> being replayed. And because the repack is attempting delta compression
> roughly from newest to oldest, and because old objects are typically in
> a deeper delta chain, then this might explain the logarithmic slowdown.
>
> So something must be wrong with the delta cache in sha1_file.c somehow.
I applied the delta accounting patch. It took about 200MB off the
memory use, but that doesn't make a dent in the 4GB of allocations.
--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
2007-12-08 1:46 ` Something is broken in repack Nicolas Pitre
2007-12-08 2:04 ` Jon Smirl
@ 2007-12-08 2:22 ` Jon Smirl
2007-12-08 3:44 ` Harvey Harrison
2007-12-08 22:18 ` Junio C Hamano
3 siblings, 0 replies; 82+ messages in thread
From: Jon Smirl @ 2007-12-08 2:22 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Git Mailing List
On 12/7/07, Nicolas Pitre <nico@cam.org> wrote:
> So the problem seems to be related to the pack access code and not the
> repack code. And it must have something to do with the number of deltas
> being replayed. And because the repack is attempting delta compression
> roughly from newest to oldest, and because old objects are typically in
> a deeper delta chain, then this might explain the logarithmic slowdown.
What could be wrongly allocating 4GB of memory? Figure that out and
you should have your answer. The slowdown may be coming from having
to search through more and more objects in memory.
Memory consumption seems to be correlated with the depth of the delta
chain being accessed. It blows up tremendously right at the end. It
may even grow with the square of the chain length. For the normal
default case the square didn't hurt, but 250*250 = 62,500, which would
eat a huge amount of memory.
--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
2007-12-08 2:04 ` Jon Smirl
@ 2007-12-08 2:28 ` Nicolas Pitre
2007-12-08 3:29 ` Jon Smirl
0 siblings, 1 reply; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-08 2:28 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List
On Fri, 7 Dec 2007, Jon Smirl wrote:
> On 12/7/07, Nicolas Pitre <nico@cam.org> wrote:
> > On Fri, 7 Dec 2007, Jon Smirl wrote:
> >
> > > git repack -a -d -f --depth=250 --window=250
> > >
> > > complete seconds RAM
> > > 10% 47 1GB
> > > 20% 29 1GB
> > > 30% 24 1GB
> > > 40% 18 1GB
> > > 50% 110 1.2GB
> > > 60% 85 1.4GB
> > > 70% 195 1.5GB
> > > 80% 186 2.5GB
> > > 90% 489 3.8GB
> > > 95% 800 4.8GB
> > > I killed it because it started swapping
> > >
> > > The mmaps are only about 400MB in this case.
> > > At the end the git process had 4.4GB of physical RAM allocated.
> >
> > That's really bad.
> >
> > > Starting from a highly compressed pack greatly aggravates the problem.
> >
> > That is really interesting though.
> >
> > > Starting with a 2GB pack of the same data my process size only grew to
> > > 3GB with 2GB of mmaps.
> >
> > Which is quite reasonable, even if the same issue might still be there.
> >
> > So the problem seems to be related to the pack access code and not the
> > repack code. And it must have something to do with the number of deltas
> > being replayed. And because the repack is attempting delta compression
> > roughly from newest to oldest, and because old objects are typically in
> > a deeper delta chain, then this might explain the logarithmic slowdown.
> >
> > So something must be wrong with the delta cache in sha1_file.c somehow.
Staring at the cache code I don't see anything wrong with it.
> I applied the delta accounting patch. It took about 200MB off the
> memory use, but that doesn't make a dent in the 4GB of allocations.
Right. I didn't expect much from that fix.
Nicolas
* Re: Something is broken in repack
2007-12-07 23:05 Something is broken in repack Jon Smirl
2007-12-08 0:37 ` Linus Torvalds
2007-12-08 1:46 ` Something is broken in repack Nicolas Pitre
@ 2007-12-08 2:56 ` David Brown
2007-12-10 19:56 ` Nicolas Pitre
2007-12-11 2:25 ` Jon Smirl
4 siblings, 0 replies; 82+ messages in thread
From: David Brown @ 2007-12-08 2:56 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List
On Fri, Dec 07, 2007 at 06:05:38PM -0500, Jon Smirl wrote:
>Using this config:
>[pack]
> threads = 4
> deltacachesize = 256M
> deltacachelimit = 0
Just out of curiosity, does adding
[pack]
windowmemory = 256M
help? I've found repack's memory use grows very large when there are large blobs.
Dave
* Re: Something is broken in repack
2007-12-08 2:28 ` Nicolas Pitre
@ 2007-12-08 3:29 ` Jon Smirl
2007-12-08 3:37 ` David Brown
2007-12-08 3:48 ` Harvey Harrison
0 siblings, 2 replies; 82+ messages in thread
From: Jon Smirl @ 2007-12-08 3:29 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Git Mailing List
The kernel repo has the same problem but not nearly as bad.
Starting from a default pack
git repack -a -d -f --depth=1000 --window=1000
Uses 1GB of physical memory
Now do the command again.
git repack -a -d -f --depth=1000 --window=1000
Uses 1.3GB of physical memory
I suspect the gcc repo has much longer revision chains than the kernel
one since the kernel repo is only a few years old. The Mozilla repo
contained revision chains with over 2,000 revisions. Longer revision
chains result in longer delta chains.
So what is allocating the extra memory? It is either a function of the
number of entries in the chain, or related to accessing the chain,
since a chain with more entries will need to be accessed more times.
I have a 168MB kernel pack now after 15 minutes of four cores at 100%.
Here's another observation: the gcc objects are larger. The kernel has
650K objects in 190MB; gcc has 870K objects in 330MB. The average gcc
object is about 30% larger. How should the average kernel developer
interpret this?
--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
2007-12-08 3:29 ` Jon Smirl
@ 2007-12-08 3:37 ` David Brown
2007-12-08 4:22 ` Jon Smirl
2007-12-08 3:48 ` Harvey Harrison
1 sibling, 1 reply; 82+ messages in thread
From: David Brown @ 2007-12-08 3:37 UTC (permalink / raw)
To: Jon Smirl; +Cc: Nicolas Pitre, Git Mailing List
On Fri, Dec 07, 2007 at 10:29:31PM -0500, Jon Smirl wrote:
>The kernel repo has the same problem but not nearly as bad.
>
>Starting from a default pack
> git repack -a -d -f --depth=1000 --window=1000
>Uses 1GB of physical memory
>
>Now do the command again.
> git repack -a -d -f --depth=1000 --window=1000
>Uses 1.3GB of physical memory
With my repo that contains a bunch of 50MB tarfiles, I've found I must
specify --window-memory as well to keep repack from using nearly unbounded
amounts of memory. Perhaps it is the larger files found in gcc that
provoke this.
A window size of 1000 can take a lot of memory if the objects are large.
Dave
* Re: Something is broken in repack
2007-12-08 1:46 ` Something is broken in repack Nicolas Pitre
2007-12-08 2:04 ` Jon Smirl
2007-12-08 2:22 ` Jon Smirl
@ 2007-12-08 3:44 ` Harvey Harrison
2007-12-08 22:18 ` Junio C Hamano
3 siblings, 0 replies; 82+ messages in thread
From: Harvey Harrison @ 2007-12-08 3:44 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Jon Smirl, Git Mailing List
On Fri, 2007-12-07 at 20:46 -0500, Nicolas Pitre wrote:
> On Fri, 7 Dec 2007, Jon Smirl wrote:
> > And the 330MB gcc pack for input
> > git repack -a -d -f --depth=250 --window=250
> >
> > complete seconds RAM
> > 10% 47 1GB
> > 20% 29 1GB
> > 30% 24 1GB
> > 40% 18 1GB
> > 50% 110 1.2GB
> > 60% 85 1.4GB
> > 70% 195 1.5GB
> > 80% 186 2.5GB
> > 90% 489 3.8GB
> > 95% 800 4.8GB
> > I killed it because it started swapping
> >
> > The mmaps are only about 400MB in this case.
> > At the end the git process had 4.4GB of physical RAM allocated.
> > Starting with a 2GB pack of the same data my process size only grew to
> > 3GB with 2GB of mmaps.
>
> Which is quite reasonable, even if the same issue might still be there.
>
> So the problem seems to be related to the pack access code and not the
> repack code. And it must have something to do with the number of deltas
> being replayed. And because the repack is attempting delta compression
> roughly from newest to oldest, and because old objects are typically in
> a deeper delta chain, then this might explain the logarithmic slowdown.
>
> So something must be wrong with the delta cache in sha1_file.c somehow.
All I have is a qualitative observation, but during the process of
creating the pack there was a _huge_ slowdown between 10-15%
(from hundreds or dozens of objects per second down to a single object
per second, with a corresponding increase in process size). I didn't
keep any numbers at the time, but it was noticeable.
I wonder if there are a bunch of huge objects somewhere in gcc's
history?
Harvey
* Re: Something is broken in repack
2007-12-08 3:29 ` Jon Smirl
2007-12-08 3:37 ` David Brown
@ 2007-12-08 3:48 ` Harvey Harrison
1 sibling, 0 replies; 82+ messages in thread
From: Harvey Harrison @ 2007-12-08 3:48 UTC (permalink / raw)
To: Jon Smirl; +Cc: Nicolas Pitre, Git Mailing List
On Fri, 2007-12-07 at 22:29 -0500, Jon Smirl wrote:
> The kernel repo has the same problem but not nearly as bad.
>
> Starting from a default pack
> git repack -a -d -f --depth=1000 --window=1000
> Uses 1GB of physical memory
>
> Now do the command again.
> git repack -a -d -f --depth=1000 --window=1000
> Uses 1.3GB of physical memory
>
> I suspect the gcc repo has much longer revision chains than the kernel
> one since the kernel repo is only a few years old. The Mozilla repo
> contained revision chains with over 2,000 revisions. Longer revision
> chains result in longer delta chains.
I sent out a partial delta breakdown for the gcc repo earlier; here's
the whole list.
breakdown of the gcc packfile:
Total objects
1017922
ChainLength Objects Cumulative
1: 103817 103817
2: 67332 171149
3: 57520 228669
4: 52570 281239
5: 43910 325149
6: 37520 362669
7: 35248 397917
8: 29819 427736
9: 27619 455355
10: 22656 478011
11: 21073 499084
12: 18738 517822
13: 16674 534496
14: 14882 549378
15: 14424 563802
16: 12765 576567
17: 11662 588229
18: 11845 600074
19: 11694 611768
20: 9625 621393
21: 9031 630424
22: 8437 638861
23: 8217 647078
24: 7927 655005
25: 7955 662960
26: 7092 670052
27: 7004 677056
28: 6724 683780
29: 6626 690406
30: 5875 696281
31: 5970 702251
32: 5726 707977
33: 6025 714002
34: 5354 719356
35: 6413 725769
36: 4933 730702
37: 4888 735590
38: 4561 740151
39: 4366 744517
40: 4166 748683
41: 4531 753214
42: 4029 757243
43: 3701 760944
44: 3647 764591
45: 3553 768144
46: 3509 771653
47: 3473 775126
48: 3442 778568
49: 3379 781947
50: 3395 785342
51: 3315 788657
52: 3168 791825
53: 3345 795170
54: 3166 798336
55: 3237 801573
56: 2795 804368
57: 2768 807136
58: 2666 809802
59: 2723 812525
60: 2547 815072
61: 2565 817637
62: 2622 820259
63: 2521 822780
64: 2492 825272
65: 2529 827801
66: 2566 830367
67: 2685 833052
68: 2458 835510
69: 2457 837967
70: 2440 840407
71: 2410 842817
72: 2337 845154
73: 2301 847455
74: 2201 849656
75: 2127 851783
76: 2256 854039
77: 2038 856077
78: 1925 858002
79: 1965 859967
80: 1929 861896
81: 1890 863786
82: 1873 865659
83: 1964 867623
84: 1898 869521
85: 1839 871360
86: 1933 873293
87: 1876 875169
88: 1851 877020
89: 1789 878809
90: 1790 880599
91: 1804 882403
92: 1696 884099
93: 1863 885962
94: 1889 887851
95: 1766 889617
96: 1731 891348
97: 1775 893123
98: 1750 894873
99: 1767 896640
100: 1644 898284
101: 1642 899926
102: 1489 901415
103: 1532 902947
104: 1564 904511
105: 1477 905988
106: 1461 907449
107: 1383 908832
108: 1422 910254
109: 1316 911570
110: 1480 913050
111: 1329 914379
112: 1375 915754
113: 1292 917046
114: 1224 918270
115: 1123 919393
116: 1216 920609
117: 1252 921861
118: 1252 923113
119: 1346 924459
120: 1320 925779
121: 1277 927056
122: 1234 928290
123: 1200 929490
124: 1255 930745
125: 1206 931951
126: 1155 933106
127: 1246 934352
128: 1226 935578
129: 1194 936772
130: 1268 938040
131: 1334 939374
132: 1146 940520
133: 1220 941740
134: 1055 942795
135: 1110 943905
136: 1095 945000
137: 1294 946294
138: 1204 947498
139: 1218 948716
140: 1101 949817
141: 993 950810
142: 975 951785
143: 1014 952799
144: 968 953767
145: 957 954724
146: 1069 955793
147: 996 956789
148: 967 957756
149: 964 958720
150: 954 959674
151: 949 960623
152: 1001 961624
153: 1042 962666
154: 1057 963723
155: 948 964671
156: 966 965637
157: 833 966470
158: 959 967429
159: 907 968336
160: 854 969190
161: 847 970037
162: 836 970873
163: 769 971642
164: 747 972389
165: 755 973144
166: 707 973851
167: 774 974625
168: 777 975402
169: 783 976185
170: 707 976892
171: 738 977630
172: 775 978405
173: 781 979186
174: 698 979884
175: 801 980685
176: 712 981397
177: 679 982076
178: 775 982851
179: 696 983547
180: 760 984307
181: 740 985047
182: 752 985799
183: 704 986503
184: 683 987186
185: 690 987876
186: 741 988617
187: 642 989259
188: 672 989931
189: 679 990610
190: 691 991301
191: 648 991949
192: 703 992652
193: 675 993327
194: 687 994014
195: 625 994639
196: 607 995246
197: 583 995829
198: 632 996461
199: 540 997001
200: 652 997653
201: 600 998253
202: 628 998881
203: 624 999505
204: 582 1000087
205: 548 1000635
206: 520 1001155
207: 648 1001803
208: 556 1002359
209: 563 1002922
210: 508 1003430
211: 570 1004000
212: 530 1004530
213: 575 1005105
214: 527 1005632
215: 521 1006153
216: 515 1006668
217: 513 1007181
218: 460 1007641
219: 491 1008132
220: 474 1008606
221: 471 1009077
222: 482 1009559
223: 485 1010044
224: 439 1010483
225: 385 1010868
226: 385 1011253
227: 403 1011656
228: 380 1012036
229: 376 1012412
230: 377 1012789
231: 415 1013204
232: 394 1013598
233: 362 1013960
234: 334 1014294
235: 366 1014660
236: 317 1014977
237: 362 1015339
238: 343 1015682
239: 392 1016074
240: 317 1016391
241: 305 1016696
242: 319 1017015
243: 276 1017291
244: 247 1017538
245: 179 1017717
246: 111 1017828
247: 61 1017889
248: 27 1017916
249: 6 1017922
Harvey
* Re: Something is broken in repack
2007-12-08 3:37 ` David Brown
@ 2007-12-08 4:22 ` Jon Smirl
2007-12-08 4:30 ` Nicolas Pitre
0 siblings, 1 reply; 82+ messages in thread
From: Jon Smirl @ 2007-12-08 4:22 UTC (permalink / raw)
To: David Brown, Nicolas Pitre, Git Mailing List
On 12/7/07, David Brown <git@davidb.org> wrote:
> On Fri, Dec 07, 2007 at 10:29:31PM -0500, Jon Smirl wrote:
> >The kernel repo has the same problem but not nearly as bad.
> >
> >Starting from a default pack
> > git repack -a -d -f --depth=1000 --window=1000
> >Uses 1GB of physical memory
> >
> >Now do the command again.
> > git repack -a -d -f --depth=1000 --window=1000
> >Uses 1.3GB of physical memory
>
> With my repo that contains a bunch of 50MB tarfiles, I've found I must
> specify --window-memory as well to keep repack from using nearly unbounded
> amounts of memory. Perhaps it is the larger files found in gcc that
> provokes this.
>
> A window size of 1000 can take a lot of memory if the objects are large.
This is a partial solution to the problem. Adding windowmemory = 256M
took memory consumption down from 4.8GB to 2.8GB. It took an hour to
run the test.
It's not the complete solution, since my git process is still using
2.4GB of physical memory. I'm also still experiencing a lot of
slowdown in the last 10%.
Does the gcc repo contain some giant objects? Why wasn't the memory
freed after their chain was processed?
Most of the last 10% is being done on a single CPU. There must be a
chain of giant objects that is unbalancing everything.
--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
2007-12-08 4:22 ` Jon Smirl
@ 2007-12-08 4:30 ` Nicolas Pitre
2007-12-08 5:01 ` Jon Smirl
0 siblings, 1 reply; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-08 4:30 UTC (permalink / raw)
To: Jon Smirl; +Cc: David Brown, Git Mailing List
On Fri, 7 Dec 2007, Jon Smirl wrote:
> Does the gcc repo contain some giant objects? Why wasn't the memory
> freed after their chain was processed?
It should be.
> Most of the last 10% is being done on a single CPU. There must be a
> chain of giant objects that is unbalancing everything.
I'm about to send a patch to fix the thread balancing for real this
time.
Nicolas
* Re: Something is broken in repack
2007-12-08 4:30 ` Nicolas Pitre
@ 2007-12-08 5:01 ` Jon Smirl
2007-12-08 5:12 ` Nicolas Pitre
0 siblings, 1 reply; 82+ messages in thread
From: Jon Smirl @ 2007-12-08 5:01 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: David Brown, Git Mailing List
On 12/7/07, Nicolas Pitre <nico@cam.org> wrote:
> On Fri, 7 Dec 2007, Jon Smirl wrote:
>
> > Does the gcc repo contain some giant objects? Why wasn't the memory
> > freed after their chain was processed?
>
> It should be.
>
> > Most of the last 10% is being done on a single CPU. There must be a
> > chain of giant objects that is unbalancing everything.
>
> I'm about to send a patch to fix the thread balancing for real this
> time.
Something is really broken in the last 5% of that repo. I have been
processing at 97% for 30 minutes without moving to 98%.
--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
2007-12-08 5:01 ` Jon Smirl
@ 2007-12-08 5:12 ` Nicolas Pitre
0 siblings, 0 replies; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-08 5:12 UTC (permalink / raw)
To: Jon Smirl; +Cc: Git Mailing List
On Sat, 8 Dec 2007, Jon Smirl wrote:
> On 12/7/07, Nicolas Pitre <nico@cam.org> wrote:
> > On Fri, 7 Dec 2007, Jon Smirl wrote:
> >
> > > Does the gcc repo contain some giant objects? Why wasn't the memory
> > > freed after their chain was processed?
> >
> > It should be.
> >
> > > Most of the last 10% is being done on a single CPU. There must be a
> > > chain of giant objects that is unbalancing everything.
> >
> > I'm about to send a patch to fix the thread balancing for real this
> > time.
>
> Something is really broken in the last 5% of that repo. I have been
> processing at 97% for 30 minutes without moving to 98%.
This is a clear sign of a problem, indeed.
I'll be away for the weekend, so here are a few things to try out if
you feel like it:
1) Make sure the problem occurs with the thread code disabled. That
would eliminate one variable, and will help for #2.
2) Try bisecting the issue. If you can find an old Git version where
the issue doesn't appear then simply run "git bisect" to find the
exact commit causing the problem. Best done with a repo that doesn't
take ages to repack.
3) Compile Git against the dmalloc library in order to identify where
the huge memory leak is happening.
Nicolas
* Re: Something is broken in repack
2007-12-08 1:46 ` Something is broken in repack Nicolas Pitre
` (2 preceding siblings ...)
2007-12-08 3:44 ` Harvey Harrison
@ 2007-12-08 22:18 ` Junio C Hamano
2007-12-09 8:05 ` Junio C Hamano
2007-12-10 2:49 ` Nicolas Pitre
3 siblings, 2 replies; 82+ messages in thread
From: Junio C Hamano @ 2007-12-08 22:18 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Jon Smirl, Git Mailing List
Nicolas Pitre <nico@cam.org> writes:
> On Fri, 7 Dec 2007, Jon Smirl wrote:
>
>> Starting with a 2GB pack of the same data my process size only grew to
>> 3GB with 2GB of mmaps.
>
> Which is quite reasonable, even if the same issue might still be there.
>
> So the problem seems to be related to the pack access code and not the
> repack code. And it must have something to do with the number of deltas
> being replayed. And because the repack is attempting delta compression
> roughly from newest to oldest, and because old objects are typically in
> a deeper delta chain, then this might explain the logarithmic slowdown.
>
> So something must be wrong with the delta cache in sha1_file.c somehow.
I was reaching the same conclusion but haven't managed to spot anything
blatantly wrong in that area. Will need to dig more.
* Re: Something is broken in repack
2007-12-08 22:18 ` Junio C Hamano
@ 2007-12-09 8:05 ` Junio C Hamano
2007-12-09 15:19 ` Jon Smirl
2007-12-09 18:25 ` Jon Smirl
2007-12-10 2:49 ` Nicolas Pitre
1 sibling, 2 replies; 82+ messages in thread
From: Junio C Hamano @ 2007-12-09 8:05 UTC (permalink / raw)
To: Jon Smirl; +Cc: Nicolas Pitre, Git Mailing List
Junio C Hamano <gitster@pobox.com> writes:
> Nicolas Pitre <nico@cam.org> writes:
>
>> On Fri, 7 Dec 2007, Jon Smirl wrote:
>>
>>> Starting with a 2GB pack of the same data my process size only grew to
>>> 3GB with 2GB of mmaps.
>>
>> Which is quite reasonable, even if the same issue might still be there.
>>
>> So the problem seems to be related to the pack access code and not the
>> repack code. And it must have something to do with the number of deltas
>> being replayed. And because the repack is attempting delta compression
>> roughly from newest to oldest, and because old objects are typically in
>> a deeper delta chain, then this might explain the logarithmic slowdown.
>>
>> So something must be wrong with the delta cache in sha1_file.c somehow.
>
> I was reaching the same conclusion but haven't managed to spot anything
> blatantly wrong in that area. Will need to dig more.
Does this problem have correlation with the use of threads? Do you see
the same bloat with or without THREADED_DELTA_SEARCH defined?
* Re: Something is broken in repack
2007-12-09 8:05 ` Junio C Hamano
@ 2007-12-09 15:19 ` Jon Smirl
2007-12-09 18:25 ` Jon Smirl
1 sibling, 0 replies; 82+ messages in thread
From: Jon Smirl @ 2007-12-09 15:19 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Nicolas Pitre, Git Mailing List
On 12/9/07, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Nicolas Pitre <nico@cam.org> writes:
> >
> >> On Fri, 7 Dec 2007, Jon Smirl wrote:
> >>
> >>> Starting with a 2GB pack of the same data my process size only grew to
> >>> 3GB with 2GB of mmaps.
> >>
> >> Which is quite reasonable, even if the same issue might still be there.
> >>
> >> So the problem seems to be related to the pack access code and not the
> >> repack code. And it must have something to do with the number of deltas
> >> being replayed. And because the repack is attempting delta compression
> >> roughly from newest to oldest, and because old objects are typically in
> >> a deeper delta chain, then this might explain the logarithmic slowdown.
> >>
> >> So something must be wrong with the delta cache in sha1_file.c somehow.
> >
> > I was reaching the same conclusion but haven't managed to spot anything
> > blatantly wrong in that area. Will need to dig more.
>
> Does this problem have correlation with the use of threads? Do you see
> the same bloat with or without THREADED_DELTA_SEARCH defined?
>
I just started a non-threaded one. It will be four or five hours
before it finishes.
--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
2007-12-09 8:05 ` Junio C Hamano
2007-12-09 15:19 ` Jon Smirl
@ 2007-12-09 18:25 ` Jon Smirl
2007-12-10 1:07 ` Nicolas Pitre
1 sibling, 1 reply; 82+ messages in thread
From: Jon Smirl @ 2007-12-09 18:25 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Nicolas Pitre, Git Mailing List
On 12/9/07, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
> > Nicolas Pitre <nico@cam.org> writes:
> >
> >> On Fri, 7 Dec 2007, Jon Smirl wrote:
> >>
> >>> Starting with a 2GB pack of the same data my process size only grew to
> >>> 3GB with 2GB of mmaps.
> >>
> >> Which is quite reasonable, even if the same issue might still be there.
> >>
> >> So the problem seems to be related to the pack access code and not the
> >> repack code. And it must have something to do with the number of deltas
> >> being replayed. And because the repack is attempting delta compression
> >> roughly from newest to oldest, and because old objects are typically in
> >> a deeper delta chain, then this might explain the logarithmic slowdown.
> >>
> >> So something must be wrong with the delta cache in sha1_file.c somehow.
> >
> > I was reaching the same conclusion but haven't managed to spot anything
> > blatantly wrong in that area. Will need to dig more.
>
> Does this problem have correlation with the use of threads? Do you see
> the same bloat with or without THREADED_DELTA_SEARCH defined?
>
Something else seems to be wrong.
With threading turned off, 5000 CPU seconds and 13% done.
With threading turned on, threads = 1, 5000 CPU seconds, 13%
With threading turned on, threads = 2, 180 CPU seconds, 13%
With threading turned on, threads = 4, 150 CPU seconds, 13%
This can't be right; four cores are not 40x one core. So maybe the
observed logarithmic slowdown is because the percent complete is
being reported wrong in the threaded case. If that's the case we may
be looking in the wrong place for problems.
The times are only approximate, I'm using the CPU for other things.
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-09 18:25 ` Jon Smirl
@ 2007-12-10 1:07 ` Nicolas Pitre
0 siblings, 0 replies; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-10 1:07 UTC (permalink / raw
To: Jon Smirl; +Cc: Junio C Hamano, Git Mailing List
On Sun, 9 Dec 2007, Jon Smirl wrote:
> On 12/9/07, Junio C Hamano <gitster@pobox.com> wrote:
> > Junio C Hamano <gitster@pobox.com> writes:
> >
> > > Nicolas Pitre <nico@cam.org> writes:
> > >
> > >> On Fri, 7 Dec 2007, Jon Smirl wrote:
> > >>
> > >>> Starting with a 2GB pack of the same data my process size only grew to
> > >>> 3GB with 2GB of mmaps.
> > >>
> > >> Which is quite reasonable, even if the same issue might still be there.
> > >>
> > >> So the problem seems to be related to the pack access code and not the
> > >> repack code. And it must have something to do with the number of deltas
> > >> being replayed. And because the repack is attempting delta compression
> > >> roughly from newest to oldest, and because old objects are typically in
> > >> a deeper delta chain, then this might explain the logarithmic slowdown.
> > >>
> > >> So something must be wrong with the delta cache in sha1_file.c somehow.
> > >
> > > I was reaching the same conclusion but haven't managed to spot anything
> > > blatantly wrong in that area. Will need to dig more.
> >
> > Does this problem have correlation with the use of threads? Do you see
> > the same bloat with or without THREADED_DELTA_SEARCH defined?
> >
>
> Something else seems to be wrong.
>
> With threading turned off, 5000 CPU seconds and 13% done.
> With threading turned on, threads = 1, 5000 CPU seconds, 13%
> With threading turned on, threads = 2, 180 CPU seconds, 13%
> With threading turned on, threads = 4, 150 CPU seconds, 13%
>
> This can't be right, four cores are not 40x one core.
It may be right. The object list to apply delta compression on doesn't
necessarily require a uniform amount of cycles throughout. When using
multiple threads, the list is broken into parts, one for each thread, and
later parts might end up being simply much easier to process, therefore
changing the percentage figure.
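Nicolas's point about uneven chunks can be sketched numerically. This is a toy model with made-up per-object costs, not git's actual code: the threaded delta search hands each thread a contiguous slice of the object list, so equal percentages can represent very different amounts of real work.

```python
# Toy model (invented costs, not git's code): the threaded delta search
# splits the object list into contiguous chunks, one per thread.  If the
# newest objects are the expensive ones, chunk cost is very uneven, so
# "13% done" maps to different amounts of real work per thread count.

def chunk_costs(costs, nthreads):
    """Total cost of each contiguous per-thread chunk."""
    n = len(costs)
    size = (n + nthreads - 1) // nthreads
    return [sum(costs[i:i + size]) for i in range(0, n, size)]

# 100 objects; assume the newest quarter is 10x as expensive to search.
costs = [10] * 25 + [1] * 75
print(chunk_costs(costs, 1))   # -> [325]
print(chunk_costs(costs, 4))   # -> [250, 25, 25, 25]
```

With four threads, three of the four chunks are cheap and report their progress quickly, so the aggregate percentage races ahead of the actual work done — consistent with the apparent 40x difference at the 13% mark.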
> So maybe the observed logarithmic slow down is because the percent
> complete is being reported wrong in the threaded case. If that's the
> case we may be looking in the wrong place for problems.
I really doubt it.
Nicolas
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-08 22:18 ` Junio C Hamano
2007-12-09 8:05 ` Junio C Hamano
@ 2007-12-10 2:49 ` Nicolas Pitre
1 sibling, 0 replies; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-10 2:49 UTC (permalink / raw
To: Junio C Hamano; +Cc: Jon Smirl, Git Mailing List
On Sat, 8 Dec 2007, Junio C Hamano wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> > On Fri, 7 Dec 2007, Jon Smirl wrote:
> >
> >> Starting with a 2GB pack of the same data my process size only grew to
> >> 3GB with 2GB of mmaps.
> >
> > Which is quite reasonable, even if the same issue might still be there.
> >
> > So the problem seems to be related to the pack access code and not the
> > repack code. And it must have something to do with the number of deltas
> > being replayed. And because the repack is attempting delta compression
> > roughly from newest to oldest, and because old objects are typically in
> > a deeper delta chain, then this might explain the logarithmic slowdown.
> >
> > So something must be wrong with the delta cache in sha1_file.c somehow.
>
> I was reaching the same conclusion but haven't managed to spot anything
> blatantly wrong in that area. Will need to dig more.
I didn't find anything wrong there either. I'll have to run some more
gcc repacking tests myself, despite not having a blazingly fast machine
making for rather long turnarounds.
Nicolas
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-07 23:05 Something is broken in repack Jon Smirl
` (2 preceding siblings ...)
2007-12-08 2:56 ` David Brown
@ 2007-12-10 19:56 ` Nicolas Pitre
2007-12-10 20:05 ` Jon Smirl
2007-12-11 2:25 ` Jon Smirl
4 siblings, 1 reply; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-10 19:56 UTC (permalink / raw
To: Jon Smirl; +Cc: Git Mailing List
On Fri, 7 Dec 2007, Jon Smirl wrote:
> Using this config:
> [pack]
> threads = 4
> deltacachesize = 256M
> deltacachelimit = 0
>
> And the 330MB gcc pack for input
> git repack -a -d -f --depth=250 --window=250
>
> complete seconds RAM
> 10% 47 1GB
> 20% 29 1GB
> 30% 24 1GB
> 40% 18 1GB
> 50% 110 1.2GB
> 60% 85 1.4GB
> 70% 195 1.5GB
> 80% 186 2.5GB
> 90% 489 3.8GB
> 95% 800 4.8GB
> I killed it because it started swapping
>
> The mmaps are only about 400MB in this case.
> At the end the git process had 4.4GB of physical RAM allocated.
>
> Starting from a highly compressed pack greatly aggravates the problem.
> Starting with a 2GB pack of the same data my process size only grew to
> 3GB with 2GB of mmaps.
You said you had reproduced the issue, albeit not as severe, with the
Linux kernel repo. I did just that:
# to get the default pack:
$ git repack -a -f -d
# first measurement with a repack from a default pack
$ /usr/bin/time git repack -a -f --window=256 --depth=256
2572.17user 5.87system 22:46.80elapsed 188%CPU (0avgtext+0avgdata 0maxresident)k
15720inputs+356640outputs (71major+264376minor)pagefaults 0swaps
# do it again to start from a highly packed pack
$ /usr/bin/time git repack -a -f --window=256 --depth=256
2573.53user 5.62system 22:45.60elapsed 188%CPU (0avgtext+0avgdata 0maxresident)k
29176inputs+356664outputs (210major+274887minor)pagefaults 0swaps
This is with pack.threads=2 on a P4 with HT, and I'm using the machine
for other tasks as well, but the measured time is essentially the same
for both cases. Virtual memory allocation never reached 700MB in either
case.
Nicolas
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-10 19:56 ` Nicolas Pitre
@ 2007-12-10 20:05 ` Jon Smirl
2007-12-10 20:16 ` Morten Welinder
0 siblings, 1 reply; 82+ messages in thread
From: Jon Smirl @ 2007-12-10 20:05 UTC (permalink / raw
To: Nicolas Pitre; +Cc: Git Mailing List
On 12/10/07, Nicolas Pitre <nico@cam.org> wrote:
> On Fri, 7 Dec 2007, Jon Smirl wrote:
>
> > Using this config:
> > [pack]
> > threads = 4
> > deltacachesize = 256M
> > deltacachelimit = 0
> >
> > And the 330MB gcc pack for input
> > git repack -a -d -f --depth=250 --window=250
> >
> > complete seconds RAM
> > 10% 47 1GB
> > 20% 29 1GB
> > 30% 24 1GB
> > 40% 18 1GB
> > 50% 110 1.2GB
> > 60% 85 1.4GB
> > 70% 195 1.5GB
> > 80% 186 2.5GB
> > 90% 489 3.8GB
> > 95% 800 4.8GB
> > I killed it because it started swapping
> >
> > The mmaps are only about 400MB in this case.
> > At the end the git process had 4.4GB of physical RAM allocated.
> >
> > Starting from a highly compressed pack greatly aggravates the problem.
> > Starting with a 2GB pack of the same data my process size only grew to
> > 3GB with 2GB of mmaps.
>
> You said you had reproduced the issue, albeit not as severe, with the
> Linux kernel repo. I did just that:
>
> # to get the default pack:
> $ git repack -a -f -d
>
> # first measurement with a repack from a default pack
> $ /usr/bin/time git repack -a -f --window=256 --depth=256
> 2572.17user 5.87system 22:46.80elapsed 188%CPU (0avgtext+0avgdata 0maxresident)k
> 15720inputs+356640outputs (71major+264376minor)pagefaults 0swaps
>
> # do it again to start from a highly packed pack
> $ /usr/bin/time git repack -a -f --window=256 --depth=256
> 2573.53user 5.62system 22:45.60elapsed 188%CPU (0avgtext+0avgdata 0maxresident)k
> 29176inputs+356664outputs (210major+274887minor)pagefaults 0swaps
>
> This is with pack.threads=2 on a P4 with HT, and I'm using the machine
> for other tasks as well, but the measured time is essentially the same
> for both cases. Virtual memory allocation never reached 700MB in either
> case.
>
This is the mail about the kernel pack, the one you quoted is a gcc run.
The kernel repo has the same problem but not nearly as bad.
Starting from a default pack
git repack -a -d -f --depth=1000 --window=1000
Uses 1GB of physical memory
Now do the command again.
git repack -a -d -f --depth=1000 --window=1000
Uses 1.3GB of physical memory
I suspect the gcc repo has much longer revision chains than the kernel
one, since the kernel repo is only a few years old. The Mozilla repo
contained revision chains with over 2,000 revisions. Longer revision
chains result in longer delta chains.
So what is allocating the extra memory? It is either a function of the
number of entries in the chain, or something related to accessing the
chain, since a chain with more entries will be accessed more times.
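The chain-access cost being reasoned about here can be sketched with a toy model (not the actual sha1_file.c code): if object i is stored as a delta against object i-1, materializing an object at depth d replays d deltas, so without an effective cache of intermediate results the total work over a chain grows quadratically with its length.

```python
# Toy model of delta-chain replay (not the actual sha1_file.c code):
# object i is a delta against object i-1, and object 0 is the base.

def replay_cost(depth, cache):
    """Count delta applications needed to materialize the object at
    `depth`, reusing any intermediate object already in `cache`."""
    steps = 0
    d = depth
    while d > 0 and d not in cache:
        steps += 1
        d -= 1
    for i in range(d + 1, depth + 1):   # remember everything rebuilt
        cache.add(i)
    return steps

# Accessing every object on a 250-deep chain:
print(sum(replay_cost(d, set()) for d in range(1, 251)))   # no cache -> 31375
shared = set()
print(sum(replay_cost(d, shared) for d in range(1, 251)))  # cached -> 250
```

A quadratic blow-up like the uncached case would show up exactly as a slowdown that worsens toward the old, deeply-chained end of the repack, which matches the reported behavior.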
I have a 168MB kernel pack now after 15 minutes of four cores at 100%.
Here's another observation: the gcc objects are larger. The kernel has
650K objects in 190MB; gcc has 870K objects in 330MB. The average gcc
object is 30% larger. How should the average kernel developer
interpret this?
>
> Nicolas
>
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-10 20:05 ` Jon Smirl
@ 2007-12-10 20:16 ` Morten Welinder
0 siblings, 0 replies; 82+ messages in thread
From: Morten Welinder @ 2007-12-10 20:16 UTC (permalink / raw
To: Jon Smirl; +Cc: Nicolas Pitre, Git Mailing List
> Here's another observation, the gcc objects are larger. Kernel has
> 650K objects in 190MB, gcc has 870K objects in 330MB. Average gcc
> object is 30% larger. How should the average kernel developer
> interpret this?
Could this be explained by the ChangeLog file? It's large; it has tons of
revisions; it is a prime candidate for delta compression.
Morten
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-07 23:05 Something is broken in repack Jon Smirl
` (3 preceding siblings ...)
2007-12-10 19:56 ` Nicolas Pitre
@ 2007-12-11 2:25 ` Jon Smirl
2007-12-11 2:55 ` Junio C Hamano
2007-12-11 3:49 ` Nicolas Pitre
4 siblings, 2 replies; 82+ messages in thread
From: Jon Smirl @ 2007-12-11 2:25 UTC (permalink / raw
To: Git Mailing List, Nicolas Pitre
New run using same configuration. With the addition of the more
efficient load balancing patches and delta cache accounting.
Seconds are wall clock time. They are lower since the patch made
threading better at using all four cores. I am stuck at 380-390% CPU
utilization for the git process.
complete seconds RAM
10% 60 900M (includes counting)
20% 15 900M
30% 15 900M
40% 50 1.2G
50% 80 1.3G
60% 70 1.7G
70% 140 1.8G
80% 180 2.0G
90% 280 2.2G
95% 530 2.8G - 1,420 total to here, previous was 1,983
100% 1390 2.85G
During the writing phase RAM fell to 1.6G
What is being freed in the writing phase??
I have no explanation for the change in RAM usage. Two guesses come to
mind: memory fragmentation, or that the change in the way the work was
split up altered RAM usage.
Total CPU time was 195 minutes in 70 minutes clock time. About 70%
efficient. During the compress phase all four cores were active until
the last 90 seconds. Writing the objects took over 23 minutes CPU
bound on one core.
New pack file is: 270,594,853
Old one was: 344,543,752
It still has 828,660 objects
On 12/7/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> Using this config:
> [pack]
> threads = 4
> deltacachesize = 256M
> deltacachelimit = 0
>
> And the 330MB gcc pack for input
> git repack -a -d -f --depth=250 --window=250
>
> complete seconds RAM
> 10% 47 1GB
> 20% 29 1GB
> 30% 24 1GB
> 40% 18 1GB
> 50% 110 1.2GB
> 60% 85 1.4GB
> 70% 195 1.5GB
> 80% 186 2.5GB
> 90% 489 3.8GB
> 95% 800 4.8GB
> I killed it because it started swapping
>
> The mmaps are only about 400MB in this case.
> At the end the git process had 4.4GB of physical RAM allocated.
>
> Starting from a highly compressed pack greatly aggravates the problem.
> Starting with a 2GB pack of the same data my process size only grew to
> 3GB with 2GB of mmaps.
>
> --
> Jon Smirl
> jonsmirl@gmail.com
>
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 2:25 ` Jon Smirl
@ 2007-12-11 2:55 ` Junio C Hamano
2007-12-11 3:27 ` Nicolas Pitre
2007-12-11 3:49 ` Nicolas Pitre
1 sibling, 1 reply; 82+ messages in thread
From: Junio C Hamano @ 2007-12-11 2:55 UTC (permalink / raw
To: Jon Smirl; +Cc: Git Mailing List, Nicolas Pitre
"Jon Smirl" <jonsmirl@gmail.com> writes:
> 95% 530 2.8G - 1,420 total to here, previous was 1,983
> 100% 1390 2.85G
> During the writing phase RAM fell to 1.6G
> What is being freed in the writing phase??
entry->delta_data is the only thing I can think of that is freed
in that function but was allocated much earlier, before entering
the function.
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 2:55 ` Junio C Hamano
@ 2007-12-11 3:27 ` Nicolas Pitre
2007-12-11 11:08 ` David Kastrup
0 siblings, 1 reply; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-11 3:27 UTC (permalink / raw
To: Junio C Hamano; +Cc: Jon Smirl, Git Mailing List
On Mon, 10 Dec 2007, Junio C Hamano wrote:
> "Jon Smirl" <jonsmirl@gmail.com> writes:
>
> > 95% 530 2.8G - 1,420 total to here, previous was 1,983
> > 100% 1390 2.85G
> > During the writing phase RAM fell to 1.6G
> > What is being freed in the writing phase??
>
> entry->delta_data is the only thing I can think of that is freed
> in that function but was allocated much earlier, before entering
> the function.
Yet all ->delta_data instances together are limited to 256MB according
to Jon's config.
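For reference, bounded delta-cache accounting amounts to something like this sketch (invented names, not the actual pack-objects code). The bug Linus spotted earlier in the thread was subtracting the old size only after the entry's size field had already been overwritten, so the counter under-counted and the real cache could grow past the cap.

```python
# Sketch of bounded delta-cache accounting (invented names, not git's
# pack-objects code).  The old cached size must be subtracted BEFORE the
# entry is overwritten; the buggy order subtracted the already-replaced
# size field, letting the real cache exceed the configured limit.

cache_stats = {"size": 0}   # stands in for the global delta_cache_size

def cache_delta(stats, entry, new_delta):
    if entry["delta"] is not None:
        stats["size"] -= entry["size"]    # account for the old delta first
    entry["delta"] = new_delta
    entry["size"] = len(new_delta)
    stats["size"] += entry["size"]

entry = {"delta": None, "size": 0}
cache_delta(cache_stats, entry, b"x" * 1000)
cache_delta(cache_stats, entry, b"y" * 10)   # replace the cached delta
print(cache_stats["size"])                   # -> 10, not 1010
```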
Nicolas
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 2:25 ` Jon Smirl
2007-12-11 2:55 ` Junio C Hamano
@ 2007-12-11 3:49 ` Nicolas Pitre
2007-12-11 5:25 ` Jon Smirl
1 sibling, 1 reply; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-11 3:49 UTC (permalink / raw
To: Jon Smirl; +Cc: Git Mailing List
On Mon, 10 Dec 2007, Jon Smirl wrote:
> New run using same configuration. With the addition of the more
> efficient load balancing patches and delta cache accounting.
>
> Seconds are wall clock time. They are lower since the patch made
> threading better at using all four cores. I am stuck at 380-390% CPU
> utilization for the git process.
>
> complete seconds RAM
> 10% 60 900M (includes counting)
> 20% 15 900M
> 30% 15 900M
> 40% 50 1.2G
> 50% 80 1.3G
> 60% 70 1.7G
> 70% 140 1.8G
> 80% 180 2.0G
> 90% 280 2.2G
> 95% 530 2.8G - 1,420 total to here, previous was 1,983
> 100% 1390 2.85G
> During the writing phase RAM fell to 1.6G
> What is being freed in the writing phase??
The cached delta results, but you put a cap of 256MB for them.
Could you try again with that cache disabled entirely, with
pack.deltacachesize = 1 (don't use 0 as that means unbounded).
And then, while still keeping the delta cache disabled, could you try
with pack.threads = 2, and pack.threads = 1 ?
I'm sorry to ask you to do this but I don't have enough ram to even
complete a repack with threads=2 so I'm reattempting single threaded at
the moment. But I really wonder if the threading has such an effect on
memory usage.
>
> I have no explanation for the change in RAM usage. Two guesses come to
> mind. Memory fragmentation. Or the change in the way the work was
> split up altered RAM usage.
>
> Total CPU time was 195 minutes in 70 minutes clock time. About 70%
> efficient. During the compress phase all four cores were active until
> the last 90 seconds. Writing the objects took over 23 minutes CPU
> bound on one core.
>
> New pack file is: 270,594,853
> Old one was: 344,543,752
> It still has 828,660 objects
You mean the pack for the gcc repo is now less than 300MB? Wow.
Nicolas
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 3:49 ` Nicolas Pitre
@ 2007-12-11 5:25 ` Jon Smirl
2007-12-11 5:29 ` Jon Smirl
2007-12-11 6:01 ` Sean
0 siblings, 2 replies; 82+ messages in thread
From: Jon Smirl @ 2007-12-11 5:25 UTC (permalink / raw
To: Nicolas Pitre, Junio C Hamano; +Cc: Git Mailing List
On 12/10/07, Nicolas Pitre <nico@cam.org> wrote:
> On Mon, 10 Dec 2007, Jon Smirl wrote:
>
> > New run using same configuration. With the addition of the more
> > efficient load balancing patches and delta cache accounting.
> >
> > Seconds are wall clock time. They are lower since the patch made
> > threading better at using all four cores. I am stuck at 380-390% CPU
> > utilization for the git process.
> >
> > complete seconds RAM
> > 10% 60 900M (includes counting)
> > 20% 15 900M
> > 30% 15 900M
> > 40% 50 1.2G
> > 50% 80 1.3G
> > 60% 70 1.7G
> > 70% 140 1.8G
> > 80% 180 2.0G
> > 90% 280 2.2G
> > 95% 530 2.8G - 1,420 total to here, previous was 1,983
> > 100% 1390 2.85G
> > During the writing phase RAM fell to 1.6G
> > What is being freed in the writing phase??
>
> The cached delta results, but you put a cap of 256MB for them.
>
> Could you try again with that cache disabled entirely, with
> pack.deltacachesize = 1 (don't use 0 as that means unbounded).
>
> And then, while still keeping the delta cache disabled, could you try
> with pack.threads = 2, and pack.threads = 1 ?
>
> I'm sorry to ask you to do this but I don't have enough ram to even
> complete a repack with threads=2 so I'm reattempting single threaded at
> the moment. But I really wonder if the threading has such an effect on
> memory usage.
I already have a threads = 1 run going with this config. The binary and
config were the same as for the threads=4 run.
10% 28min 950M
40% 135min 950M
50% 157min 900M
60% 160min 830M
100% 170min 830M
Something is hurting bad with threads: 170 CPU minutes with one
thread, versus 195 CPU minutes with four threads.
Is there a different memory allocator that can be used when
multithreaded on gcc? This whole problem may be coming from the memory
allocation function. git is hardly interacting at all on the thread
level so it's likely a problem in the C run-time.
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
[pack]
threads = 1
deltacachesize = 256M
windowmemory = 256M
deltacachelimit = 0
[remote "origin"]
url = git://git.infradead.org/gcc.git
fetch = +refs/heads/*:refs/remotes/origin/*
[branch "trunk"]
remote = origin
merge = refs/heads/trunk
>
>
>
> >
> > I have no explanation for the change in RAM usage. Two guesses come to
> > mind. Memory fragmentation. Or the change in the way the work was
> > split up altered RAM usage.
> >
> > Total CPU time was 195 minutes in 70 minutes clock time. About 70%
> > efficient. During the compress phase all four cores were active until
> > the last 90 seconds. Writing the objects took over 23 minutes CPU
> > bound on one core.
> >
> > New pack file is: 270,594,853
> > Old one was: 344,543,752
> > It still has 828,660 objects
>
> You mean the pack for the gcc repo is now less than 300MB? Wow.
>
>
> Nicolas
>
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 5:25 ` Jon Smirl
@ 2007-12-11 5:29 ` Jon Smirl
2007-12-11 7:01 ` Jon Smirl
2007-12-11 13:31 ` Nicolas Pitre
2007-12-11 6:01 ` Sean
1 sibling, 2 replies; 82+ messages in thread
From: Jon Smirl @ 2007-12-11 5:29 UTC (permalink / raw
To: Nicolas Pitre, Junio C Hamano, gcc; +Cc: Git Mailing List
I added the gcc people to the CC; it's their repository. Maybe they
can help us sort this out.
On 12/11/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> On 12/10/07, Nicolas Pitre <nico@cam.org> wrote:
> > On Mon, 10 Dec 2007, Jon Smirl wrote:
> >
> > > New run using same configuration. With the addition of the more
> > > efficient load balancing patches and delta cache accounting.
> > >
> > > Seconds are wall clock time. They are lower since the patch made
> > > threading better at using all four cores. I am stuck at 380-390% CPU
> > > utilization for the git process.
> > >
> > > complete seconds RAM
> > > 10% 60 900M (includes counting)
> > > 20% 15 900M
> > > 30% 15 900M
> > > 40% 50 1.2G
> > > 50% 80 1.3G
> > > 60% 70 1.7G
> > > 70% 140 1.8G
> > > 80% 180 2.0G
> > > 90% 280 2.2G
> > > 95% 530 2.8G - 1,420 total to here, previous was 1,983
> > > 100% 1390 2.85G
> > > During the writing phase RAM fell to 1.6G
> > > What is being freed in the writing phase??
> >
> > The cached delta results, but you put a cap of 256MB for them.
> >
> > Could you try again with that cache disabled entirely, with
> > pack.deltacachesize = 1 (don't use 0 as that means unbounded).
> >
> > And then, while still keeping the delta cache disabled, could you try
> > with pack.threads = 2, and pack.threads = 1 ?
> >
> > I'm sorry to ask you to do this but I don't have enough ram to even
> > complete a repack with threads=2 so I'm reattempting single threaded at
> > the moment. But I really wonder if the threading has such an effect on
> > memory usage.
>
> I already have a threads = 1 running with this config. Binary and
> config were same from threads=4 run.
>
> 10% 28min 950M
> 40% 135min 950M
> 50% 157min 900M
> 60% 160min 830M
> 100% 170min 830M
>
> Something is hurting bad with threads. 170 CPU minutes with one
> thread, versus 195 CPU minutes with four threads.
>
> Is there a different memory allocator that can be used when
> multithreaded on gcc? This whole problem may be coming from the memory
> allocation function. git is hardly interacting at all on the thread
> level so it's likely a problem in the C run-time.
>
> [core]
> repositoryformatversion = 0
> filemode = true
> bare = false
> logallrefupdates = true
> [pack]
> threads = 1
> deltacachesize = 256M
> windowmemory = 256M
> deltacachelimit = 0
> [remote "origin"]
> url = git://git.infradead.org/gcc.git
> fetch = +refs/heads/*:refs/remotes/origin/*
> [branch "trunk"]
> remote = origin
> merge = refs/heads/trunk
>
>
>
>
> >
> >
> >
> > >
> > > I have no explanation for the change in RAM usage. Two guesses come to
> > > mind. Memory fragmentation. Or the change in the way the work was
> > > split up altered RAM usage.
> > >
> > > Total CPU time was 195 minutes in 70 minutes clock time. About 70%
> > > efficient. During the compress phase all four cores were active until
> > > the last 90 seconds. Writing the objects took over 23 minutes CPU
> > > bound on one core.
> > >
> > > New pack file is: 270,594,853
> > > Old one was: 344,543,752
> > > It still has 828,660 objects
> >
> > You mean the pack for the gcc repo is now less than 300MB? Wow.
> >
> >
> > Nicolas
> >
>
>
> --
> Jon Smirl
> jonsmirl@gmail.com
>
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 5:25 ` Jon Smirl
2007-12-11 5:29 ` Jon Smirl
@ 2007-12-11 6:01 ` Sean
2007-12-11 6:20 ` Jon Smirl
1 sibling, 1 reply; 82+ messages in thread
From: Sean @ 2007-12-11 6:01 UTC (permalink / raw
To: Jon Smirl; +Cc: Nicolas Pitre, Junio C Hamano, Git Mailing List
On Tue, 11 Dec 2007 00:25:55 -0500
"Jon Smirl" <jonsmirl@gmail.com> wrote:
> Something is hurting bad with threads. 170 CPU minutes with one
> thread, versus 195 CPU minutes with four threads.
>
> Is there a different memory allocator that can be used when
> multithreaded on gcc? This whole problem may be coming from the memory
> allocation function. git is hardly interacting at all on the thread
> level so it's likely a problem in the C run-time.
You might want to try Google's malloc, it's basically a drop in replacement
with some optional built-in performance monitoring capabilities. It is said
to be much faster and better at threading than glibc's:
http://code.google.com/p/google-perftools/wiki/GooglePerformanceTools
http://google-perftools.googlecode.com/svn/trunk/doc/tcmalloc.html
You can LD_PRELOAD it or link directly.
Cheers,
Sean
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 6:01 ` Sean
@ 2007-12-11 6:20 ` Jon Smirl
0 siblings, 0 replies; 82+ messages in thread
From: Jon Smirl @ 2007-12-11 6:20 UTC (permalink / raw
To: Sean; +Cc: Nicolas Pitre, Junio C Hamano, Git Mailing List
On 12/11/07, Sean <seanlkml@sympatico.ca> wrote:
> On Tue, 11 Dec 2007 00:25:55 -0500
> "Jon Smirl" <jonsmirl@gmail.com> wrote:
>
> > Something is hurting bad with threads. 170 CPU minutes with one
> > thread, versus 195 CPU minutes with four threads.
> >
> > Is there a different memory allocator that can be used when
> > multithreaded on gcc? This whole problem may be coming from the memory
> > allocation function. git is hardly interacting at all on the thread
> > level so it's likely a problem in the C run-time.
>
> You might want to try Google's malloc, it's basically a drop in replacement
> with some optional built-in performance monitoring capabilities. It is said
> to be much faster and better at threading than glibc's:
>
> http://code.google.com/p/google-perftools/wiki/GooglePerformanceTools
> http://google-perftools.googlecode.com/svn/trunk/doc/tcmalloc.html
>
>
> You can LD_PRELOAD it or link directly.
I'm 45 minutes into a run using it. It doesn't seem to be any faster
but it is reducing memory consumption significantly. The run should be
done in another 20 minutes or so.
>
> Cheers,
> Sean
>
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 5:29 ` Jon Smirl
@ 2007-12-11 7:01 ` Jon Smirl
2007-12-11 7:34 ` Andreas Ericsson
` (3 more replies)
2007-12-11 13:31 ` Nicolas Pitre
1 sibling, 4 replies; 82+ messages in thread
From: Jon Smirl @ 2007-12-11 7:01 UTC (permalink / raw
To: Nicolas Pitre, Junio C Hamano, gcc; +Cc: Git Mailing List
Switching to the Google perftools malloc
http://goog-perftools.sourceforge.net/
10% 30 828M
20% 15 831M
30% 10 834M
40% 50 1014M
50% 80 1086M
60% 80 1500M
70% 200 1.53G
80% 200 1.85G
90% 260 1.87G
95% 520 1.97G
100% 1335 2.24G
Google's allocator knocked 600MB off memory use.
Memory consumption did not fall during the write-out phase like it did
with the glibc allocator.
Since all of this is with the same code except for the change in the
threading split, those runs where memory consumption went to 4.5GB
with the glibc allocator must have triggered an extreme degree of
fragmentation.
Total CPU time was 196 CPU minutes vs 190 for glibc; Google's claims of
being faster are not borne out here.
So why does our threaded code take 20 CPU minutes longer (12%) to run
than the same code with a single thread? Clock time is obviously
faster. Are the threads working too close to each other in memory and
bouncing cache lines between the cores? The Q6600 is just two E6600s in
the same package; the caches are not shared.
Why does the threaded code need 2.24GB (Google allocator; 2.85GB glibc)
with 4 threads, but only 950MB with one thread? Where's the extra
gigabyte going?
Is there another allocator to try? One that combines Google's memory
efficiency with glibc's speed?
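One candidate for the extra gigabyte, sketched under an explicit and unverified assumption: if the window-memory limit is enforced per delta-search thread rather than globally, then Jon's pack.windowmemory = 256M setting permits four threads to hold four times as much unpacked window data as one.

```python
# Back-of-envelope only; ASSUMES pack.windowmemory is enforced per
# delta-search thread rather than globally (not verified here).

MiB = 1024 * 1024

def max_window_bytes(nthreads, window_memory_limit):
    return nthreads * window_memory_limit

print(max_window_bytes(1, 256 * MiB) // MiB)   # -> 256
print(max_window_bytes(4, 256 * MiB) // MiB)   # -> 1024
```

If that assumption holds, four threads can keep up to 1GB of window data live versus 256MB for one thread, which is in the right ballpark for the unexplained gigabyte.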
On 12/11/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> I added the gcc people to the CC; it's their repository. Maybe they
> can help us sort this out.
>
> On 12/11/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> > On 12/10/07, Nicolas Pitre <nico@cam.org> wrote:
> > > On Mon, 10 Dec 2007, Jon Smirl wrote:
> > >
> > > > New run using same configuration. With the addition of the more
> > > > efficient load balancing patches and delta cache accounting.
> > > >
> > > > Seconds are wall clock time. They are lower since the patch made
> > > > threading better at using all four cores. I am stuck at 380-390% CPU
> > > > utilization for the git process.
> > > >
> > > > complete seconds RAM
> > > > 10% 60 900M (includes counting)
> > > > 20% 15 900M
> > > > 30% 15 900M
> > > > 40% 50 1.2G
> > > > 50% 80 1.3G
> > > > 60% 70 1.7G
> > > > 70% 140 1.8G
> > > > 80% 180 2.0G
> > > > 90% 280 2.2G
> > > > 95% 530 2.8G - 1,420 total to here, previous was 1,983
> > > > 100% 1390 2.85G
> > > > During the writing phase RAM fell to 1.6G
> > > > What is being freed in the writing phase??
> > >
> > > The cached delta results, but you put a cap of 256MB for them.
> > >
> > > Could you try again with that cache disabled entirely, with
> > > pack.deltacachesize = 1 (don't use 0 as that means unbounded).
> > >
> > > And then, while still keeping the delta cache disabled, could you try
> > > with pack.threads = 2, and pack.threads = 1 ?
> > >
> > > I'm sorry to ask you to do this but I don't have enough ram to even
> > > complete a repack with threads=2 so I'm reattempting single threaded at
> > > the moment. But I really wonder if the threading has such an effect on
> > > memory usage.
> >
> > I already have a threads = 1 running with this config. Binary and
> > config were same from threads=4 run.
> >
> > 10% 28min 950M
> > 40% 135min 950M
> > 50% 157min 900M
> > 60% 160min 830M
> > 100% 170min 830M
> >
> > Something is hurting bad with threads. 170 CPU minutes with one
> > thread, versus 195 CPU minutes with four threads.
> >
> > Is there a different memory allocator that can be used when
> > multithreaded on gcc? This whole problem may be coming from the memory
> > allocation function. git is hardly interacting at all on the thread
> > level so it's likely a problem in the C run-time.
> >
> > [core]
> > repositoryformatversion = 0
> > filemode = true
> > bare = false
> > logallrefupdates = true
> > [pack]
> > threads = 1
> > deltacachesize = 256M
> > windowmemory = 256M
> > deltacachelimit = 0
> > [remote "origin"]
> > url = git://git.infradead.org/gcc.git
> > fetch = +refs/heads/*:refs/remotes/origin/*
> > [branch "trunk"]
> > remote = origin
> > merge = refs/heads/trunk
> >
> >
> >
> >
> > >
> > >
> > >
> > > >
> > > > I have no explanation for the change in RAM usage. Two guesses come to
> > > > mind. Memory fragmentation. Or the change in the way the work was
> > > > split up altered RAM usage.
> > > >
> > > > Total CPU time was 195 minutes in 70 minutes clock time. About 70%
> > > > efficient. During the compress phase all four cores were active until
> > > > the last 90 seconds. Writing the objects took over 23 minutes CPU
> > > > bound on one core.
> > > >
> > > > New pack file is: 270,594,853
> > > > Old one was: 344,543,752
> > > > It still has 828,660 objects
> > >
> > > You mean the pack for the gcc repo is now less than 300MB? Wow.
> > >
> > >
> > > Nicolas
> > >
> >
> >
> > --
> > Jon Smirl
> > jonsmirl@gmail.com
> >
>
>
> --
> Jon Smirl
> jonsmirl@gmail.com
>
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 7:01 ` Jon Smirl
@ 2007-12-11 7:34 ` Andreas Ericsson
2007-12-11 13:49 ` Nicolas Pitre
` (2 subsequent siblings)
3 siblings, 0 replies; 82+ messages in thread
From: Andreas Ericsson @ 2007-12-11 7:34 UTC (permalink / raw
To: Jon Smirl; +Cc: Nicolas Pitre, Junio C Hamano, gcc, Git Mailing List
Jon Smirl wrote:
> Switching to the Google perftools malloc
> http://goog-perftools.sourceforge.net/
>
> Google's allocator knocked 600MB off memory use.
> Memory consumption did not fall during the write-out phase like it did
> with the glibc allocator.
>
> Since all of this is with the same code except for the change in the
> threading split, those runs where memory consumption went to 4.5GB
> with the glibc allocator must have triggered an extreme degree of
> fragmentation.
>
> Total CPU time was 196 CPU minutes vs 190 for glibc; Google's claims of
> being faster are not borne out here.
>
Did you use the tcmalloc with heap checker/profiler, or tcmalloc_minimal?
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
* Re: Something is broken in repack
2007-12-11 3:27 ` Nicolas Pitre
@ 2007-12-11 11:08 ` David Kastrup
2007-12-11 12:08 ` Pierre Habouzit
0 siblings, 1 reply; 82+ messages in thread
From: David Kastrup @ 2007-12-11 11:08 UTC (permalink / raw
To: Nicolas Pitre; +Cc: Junio C Hamano, Jon Smirl, Git Mailing List
Nicolas Pitre <nico@cam.org> writes:
> On Mon, 10 Dec 2007, Junio C Hamano wrote:
>
>> "Jon Smirl" <jonsmirl@gmail.com> writes:
>>
>> > 95% 530 2.8G - 1,420 total to here, previous was 1,983
>> > 100% 1390 2.85G
>> > During the writing phase RAM fell to 1.6G
>> > What is being freed in the writing phase??
>>
>> entry->delta_data is the only thing I can think of that are freed
>> in the function that have been allocated much earlier before entering
>> the function.
>
> Yet all ->delta-data instances are limited to 256MB according to Jon's
> config.
Maybe address space fragmentation is involved here? malloc/free for
large areas works using mmap in glibc. There must be enough
_contiguous_ space for a new allocation to succeed.
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
* Re: Something is broken in repack
2007-12-11 11:08 ` David Kastrup
@ 2007-12-11 12:08 ` Pierre Habouzit
2007-12-11 12:18 ` David Kastrup
0 siblings, 1 reply; 82+ messages in thread
From: Pierre Habouzit @ 2007-12-11 12:08 UTC (permalink / raw
To: David Kastrup; +Cc: Nicolas Pitre, Junio C Hamano, Jon Smirl, Git Mailing List
On Tue, Dec 11, 2007 at 11:08:47AM +0000, David Kastrup wrote:
> Nicolas Pitre <nico@cam.org> writes:
>
> > On Mon, 10 Dec 2007, Junio C Hamano wrote:
> >
> >> "Jon Smirl" <jonsmirl@gmail.com> writes:
> >>
> >> > 95% 530 2.8G - 1,420 total to here, previous was 1,983
> >> > 100% 1390 2.85G
> >> > During the writing phase RAM fell to 1.6G
> >> > What is being freed in the writing phase??
> >>
> >> entry->delta_data is the only thing I can think of that are freed
> >> in the function that have been allocated much earlier before entering
> >> the function.
> >
> > Yet all ->delta-data instances are limited to 256MB according to Jon's
> > config.
>
> Maybe address space fragmentation is involved here? malloc/free for
> large areas works using mmap in glibc. There must be enough
> _contiguous_ space for a new allocation to succeed.
Well, that's interesting, but there is a way to know for sure instead
of taking bets. Just use valgrind --tool=massif and look at the pretty
picture; it'll tell you what was going on very accurately.
Note that I find your explanation unlikely: glibc uses mmap for sizes
over 128k by default (IIRC), and as soon as you use mmaps, that's the
kernel that deals with the address space, and it's not necessarily
contiguous, that's only true for the heap.
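For reference, a massif run against the repack would look roughly like this (a sketch; the snapshot file name varies per run):

```shell
# Heap-profile the repack; massif records allocation snapshots over time.
valgrind --tool=massif git repack -a -d -f --depth=250 --window=250

# Render the recorded profile (massif writes massif.out.<pid> in the cwd).
ms_print massif.out.*
```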
--
·O· Pierre Habouzit
··O madcoder@debian.org
OOO http://www.madism.org
* Re: Something is broken in repack
2007-12-11 12:08 ` Pierre Habouzit
@ 2007-12-11 12:18 ` David Kastrup
0 siblings, 0 replies; 82+ messages in thread
From: David Kastrup @ 2007-12-11 12:18 UTC (permalink / raw
To: Pierre Habouzit
Cc: Nicolas Pitre, Junio C Hamano, Jon Smirl, Git Mailing List
Pierre Habouzit <madcoder@artemis.madism.org> writes:
> On Tue, Dec 11, 2007 at 11:08:47AM +0000, David Kastrup wrote:
>
>> Maybe address space fragmentation is involved here? malloc/free for
>> large areas works using mmap in glibc. There must be enough
>> _contiguous_ space for a new allocation to succeed.
>
> Note that I find your explanation unlikely: glibc uses mmap for
> sizes over 128k by default (IIRC), and as soon as you use mmaps,
> that's the kernel that deals with the address space, and it's not
> necessarily contiguous, that's only true for the heap.
Every single allocation needs to be contiguous in virtual address space
and must not collide with existing virtual address space allocations.
So fragmentation is at least a logistical issue.
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
* Re: Something is broken in repack
2007-12-11 5:29 ` Jon Smirl
2007-12-11 7:01 ` Jon Smirl
@ 2007-12-11 13:31 ` Nicolas Pitre
1 sibling, 0 replies; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-11 13:31 UTC (permalink / raw
To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List
On Tue, 11 Dec 2007, Jon Smirl wrote:
> I added the gcc people to the CC, it's their repository. Maybe they
> can help up sort this out.
Unless there is a Git expert amongst the gcc crowd, I somehow doubt it.
And gcc people with an interest in Git internals are probably already on
the Git mailing list.
Nicolas
* Re: Something is broken in repack
2007-12-11 7:01 ` Jon Smirl
2007-12-11 7:34 ` Andreas Ericsson
@ 2007-12-11 13:49 ` Nicolas Pitre
2007-12-11 15:00 ` Nicolas Pitre
2007-12-11 16:33 ` Linus Torvalds
2007-12-11 17:28 ` Daniel Berlin
3 siblings, 1 reply; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-11 13:49 UTC (permalink / raw
To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List
On Tue, 11 Dec 2007, Jon Smirl wrote:
> Switching to the Google perftools malloc
> http://goog-perftools.sourceforge.net/
>
> 10% 30 828M
> 20% 15 831M
> 30% 10 834M
> 40% 50 1014M
> 50% 80 1086M
> 60% 80 1500M
> 70% 200 1.53G
> 80% 200 1.85G
> 90% 260 1.87G
> 95% 520 1.97G
> 100% 1335 2.24G
>
> Google allocator knocked 600MB off from memory use.
> Memory consumption did not fall during the write out phase like it did with gcc.
>
> Since all of this is with the same code except for changing the
> threading split, those runs where memory consumption went to 4.5GB
> with the gcc allocator must have triggered an extreme problem with
> fragmentation.
Did you mean the glibc allocator?
> Total CPU time 196 CPU minutes vs 190 for gcc. Google's claims of
> being faster are not true.
>
> So why does our threaded code take 20 CPU minutes longer (12%) to run
> than the same code with a single thread? Clock time is obviously
> faster. Are the threads working too close to each other in memory and
> bouncing cache lines between the cores? Q6600 is just two E6600s in
> the same package, the caches are not shared.
Of course there'll always be a certain amount of wasted cycles when
threaded. The locking overhead, the extra contention for IO, etc. So
12% overhead (3% per thread) when using 4 threads is not that bad I
would say.
> Why does the threaded code need 2.24GB (google allocator, 2.85GB gcc)
> with 4 threads? But only need 950MB with one thread? Where's the extra
> gigabyte going?
I really don't know.
Did you try with pack.deltacachesize set to 1 ?
And yet, this is still missing the actual issue. The issue being that
the 2.1GB pack as a _source_ doesn't cause as much memory to be
allocated even if the _result_ pack ends up being the same.
I was able to repack the 2.1GB pack on my machine which has 1GB of ram.
Now that it has been repacked, I can't repack it anymore, even when
single-threaded, as it starts crawling into swap fairly quickly. It is
really non-intuitive and actually senseless that Git would require twice
as much RAM to deal with a pack that is 7 times smaller.
Nicolas (still puzzled)
* Re: Something is broken in repack
2007-12-11 13:49 ` Nicolas Pitre
@ 2007-12-11 15:00 ` Nicolas Pitre
2007-12-11 15:36 ` Jon Smirl
2007-12-11 16:20 ` Nicolas Pitre
0 siblings, 2 replies; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-11 15:00 UTC (permalink / raw
To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List
On Tue, 11 Dec 2007, Nicolas Pitre wrote:
> And yet, this is still missing the actual issue. The issue being that
> the 2.1GB pack as a _source_ doesn't cause as much memory to be
> allocated even if the _result_ pack ends up being the same.
>
> I was able to repack the 2.1GB pack on my machine which has 1GB of ram.
> Now that it has been repacked, I can't repack it anymore, even when
> single-threaded, as it starts crawling into swap fairly quickly. It is
> really non-intuitive and actually senseless that Git would require twice
> as much RAM to deal with a pack that is 7 times smaller.
OK, here's something else for you to try:
core.deltabasecachelimit=0
pack.threads=2
pack.deltacachesize=1
With that I'm able to repack the small gcc pack on my machine with 1GB
of ram using:
git repack -a -f -d --window=250 --depth=250
and top reports a ~700m virt and ~500m res without hitting swap at all.
It is only at 25% so far, but I was unable to get that far before.
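Spelled out as commands (the `git config` form is mine; the values are the ones above):

```shell
# Repo-local settings for the repack experiment.
git config core.deltaBaseCacheLimit 0
git config pack.threads 2
git config pack.deltaCacheSize 1

# Then repack from scratch with the large window/depth.
git repack -a -f -d --window=250 --depth=250
```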
Would be curious to know what you get with 4 threads on your machine.
Nicolas
* Re: Something is broken in repack
2007-12-11 15:00 ` Nicolas Pitre
@ 2007-12-11 15:36 ` Jon Smirl
2007-12-11 16:20 ` Nicolas Pitre
1 sibling, 0 replies; 82+ messages in thread
From: Jon Smirl @ 2007-12-11 15:36 UTC (permalink / raw
To: Nicolas Pitre; +Cc: Junio C Hamano, gcc, Git Mailing List
On 12/11/07, Nicolas Pitre <nico@cam.org> wrote:
> On Tue, 11 Dec 2007, Nicolas Pitre wrote:
>
> > And yet, this is still missing the actual issue. The issue being that
> > the 2.1GB pack as a _source_ doesn't cause as much memory to be
> > allocated even if the _result_ pack ends up being the same.
> >
> > I was able to repack the 2.1GB pack on my machine which has 1GB of ram.
> > Now that it has been repacked, I can't repack it anymore, even when
> > single-threaded, as it starts crawling into swap fairly quickly. It is
> > really non-intuitive and actually senseless that Git would require twice
> > as much RAM to deal with a pack that is 7 times smaller.
>
> OK, here's something else for you to try:
>
> core.deltabasecachelimit=0
> pack.threads=2
> pack.deltacachesize=1
>
> With that I'm able to repack the small gcc pack on my machine with 1GB
> of ram using:
>
> git repack -a -f -d --window=250 --depth=250
>
> and top reports a ~700m virt and ~500m res without hitting swap at all.
> It is only at 25% so far, but I was unable to get that far before.
>
> Would be curious to know what you get with 4 threads on your machine.
Changing those parameters really slowed down counting the objects. I
used to be able to count in 45 seconds; now it takes 130 seconds. I
still have the Google allocator linked in.
4 threads, cumulative clock time
25% 200 seconds, 820/627M
55% 510 seconds, 1240/1000M - little late recording
75% 15 minutes, 1658/1500M
90% 22 minutes, 1974/1800M
it's still running but there is no significant change.
Are two types of allocations being mixed?
1) long term, global objects kept until the end of everything
2) volatile, private objects allocated only while the object is being
compressed and then freed
Separating these would make a big difference to the fragmentation
problem. Single threading probably wouldn't see a fragmentation
problem from mixing the allocation types.
When a thread is created it could allocate a private 20MB (or
whatever) pool. The volatile, private objects would come from that
pool. Long-term objects would stay in the global pool; since they are
long-term, they would just get laid down sequentially in memory.
Separating these allocation types makes things way easier for malloc.
CPU time would be helped by removing some of the locking if possible.
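A minimal sketch of that pool idea (illustrative only, not code from git): each worker gets a private bump allocator for its short-lived delta buffers, and the whole pool is returned with one free(), leaving no holes in the global heap.

```c
#include <stdlib.h>
#include <stddef.h>

/* Minimal per-thread bump allocator for short-lived delta buffers.
 * Long-lived objects would keep going through the global malloc. */
struct arena {
	char *base;
	size_t size;
	size_t used;
};

static int arena_init(struct arena *a, size_t size)
{
	a->base = malloc(size);
	a->size = size;
	a->used = 0;
	return a->base ? 0 : -1;
}

static void *arena_alloc(struct arena *a, size_t n)
{
	void *p;
	n = (n + 15) & ~(size_t)15;	/* keep returned pointers aligned */
	if (a->used + n > a->size)
		return NULL;		/* pool exhausted; caller falls back */
	p = a->base + a->used;
	a->used += n;
	return p;
}

/* One free() releases everything the thread allocated: no per-object
 * bookkeeping, and no fragmentation left behind in the global heap. */
static void arena_release(struct arena *a)
{
	free(a->base);
	a->base = NULL;
	a->size = a->used = 0;
}
```

Whether this would actually recover the gigabyte seen above is an open question; it only helps if the volatile allocations are what is fragmenting the heap.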
--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
2007-12-11 15:00 ` Nicolas Pitre
2007-12-11 15:36 ` Jon Smirl
@ 2007-12-11 16:20 ` Nicolas Pitre
2007-12-11 16:21 ` Jon Smirl
1 sibling, 1 reply; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-11 16:20 UTC (permalink / raw
To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List
On Tue, 11 Dec 2007, Nicolas Pitre wrote:
> OK, here's something else for you to try:
>
> core.deltabasecachelimit=0
> pack.threads=2
> pack.deltacachesize=1
>
> With that I'm able to repack the small gcc pack on my machine with 1GB
> of ram using:
>
> git repack -a -f -d --window=250 --depth=250
>
> and top reports a ~700m virt and ~500m res without hitting swap at all.
> It is only at 25% so far, but I was unable to get that far before.
Well, around 55% memory usage skyrocketed to 1.6GB and the system went
deep into swap. So I restarted it with no threads.
Nicolas (even more puzzled)
* Re: Something is broken in repack
2007-12-11 16:20 ` Nicolas Pitre
@ 2007-12-11 16:21 ` Jon Smirl
2007-12-12 5:12 ` Nicolas Pitre
0 siblings, 1 reply; 82+ messages in thread
From: Jon Smirl @ 2007-12-11 16:21 UTC (permalink / raw
To: Nicolas Pitre; +Cc: Junio C Hamano, gcc, Git Mailing List
On 12/11/07, Nicolas Pitre <nico@cam.org> wrote:
> On Tue, 11 Dec 2007, Nicolas Pitre wrote:
>
> > OK, here's something else for you to try:
> >
> > core.deltabasecachelimit=0
> > pack.threads=2
> > pack.deltacachesize=1
> >
> > With that I'm able to repack the small gcc pack on my machine with 1GB
> > of ram using:
> >
> > git repack -a -f -d --window=250 --depth=250
> >
> > and top reports a ~700m virt and ~500m res without hitting swap at all.
> > It is only at 25% so far, but I was unable to get that far before.
>
> Well, around 55% memory usage skyrocketed to 1.6GB and the system went
> deep into swap. So I restarted it with no threads.
>
> Nicolas (even more puzzled)
On the plus side you are seeing what I see, so it proves I am not imagining it.
--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
2007-12-11 7:01 ` Jon Smirl
2007-12-11 7:34 ` Andreas Ericsson
2007-12-11 13:49 ` Nicolas Pitre
@ 2007-12-11 16:33 ` Linus Torvalds
2007-12-11 17:21 ` Nicolas Pitre
2007-12-11 18:43 ` Jon Smirl
2007-12-11 17:28 ` Daniel Berlin
3 siblings, 2 replies; 82+ messages in thread
From: Linus Torvalds @ 2007-12-11 16:33 UTC (permalink / raw
To: Jon Smirl; +Cc: Nicolas Pitre, Junio C Hamano, gcc, Git Mailing List
On Tue, 11 Dec 2007, Jon Smirl wrote:
>
> So why does our threaded code take 20 CPU minutes longer (12%) to run
> than the same code with a single thread?
Threaded code *always* takes more CPU time. The only thing you can hope
for is a wall-clock reduction. You're seeing probably a combination of
(a) more cache misses
(b) bigger dataset active at a time
and a probably fairly minuscule
(c) threading itself tends to have some overheads.
> Q6600 is just two E6600s in the same package, the caches are not shared.
Sure they are shared. They're just not *entirely* shared. But they are
shared between each two cores, so each thread essentially has only half
the cache they had with the non-threaded version.
Threading is *not* a magic solution to all problems. It gives you
potentially twice the CPU power, but there are real downsides that you
should keep in mind.
> Why does the threaded code need 2.24GB (google allocator, 2.85GB gcc)
> with 4 threads? But only need 950MB with one thread? Where's the extra
> gigabyte going?
I suspect that it's really simple: you have a few rather big files in the
gcc history, with deep delta chains. And what happens when you have four
threads running at the same time is that they all need to keep all those
objects that they are working on - and their hash state - in memory at the
same time!
So if you want to use more threads, that _forces_ you to have a bigger
memory footprint, simply because you have more "live" objects that you
work on. Normally, that isn't much of a problem, since most source files
are small, but if you have a few deep delta chains on big files, both the
delta chain itself is going to use memory (you may have limited the size
of the cache, but it's still needed for the actual delta generation, so
it's not like the memory usage went away).
That said, I suspect there are a few things fighting you:
- threading is hard. I haven't looked a lot at the changes Nico did to do
a threaded object packer, but what I've seen does not convince me it is
correct. The "trg_entry" accesses are *mostly* protected with
"cache_lock", but nothing else really seems to be, so quite frankly, I
wouldn't trust the threaded version very much. It's off by default, and
for a good reason, I think.
For example: the packing code does this:
if (!src->data) {
read_lock();
src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz);
read_unlock();
...
and that's racy. If two threads come in at roughly the same time and
see a NULL src->data, they'll both get the lock, and they'll both
(serially) try to fill it in. It will all *work*, but one of them will
have done unnecessary work, and one of them will have their result
thrown away and leaked.
Are you hitting issues like this? I dunno. The object sorting means
that different threads normally shouldn't look at the same objects (not
even the sources), so probably not, but basically, I wouldn't trust the
threading 100%. It needs work, and it needs to stay off by default.
- you're working on a problem that isn't really even worth optimizing
that much. The *normal* case is to re-use old deltas, which makes all
of the issues you are fighting basically go away (because you only have
a few _incremental_ objects that need deltaing).
In other words: the _real_ optimizations have already been done, and
are done elsewhere, and are much smarter (the best way to optimize X is
not to make X run fast, but to avoid doing X in the first place!). The
thing you are trying to work with is the one-time-only case where you
explicitly disable that big and important optimization, and then you
complain about the end result being slow!
It's like saying that you're compiling with extreme debugging and no
optimizations, and then complaining that the end result doesn't run as
fast as if you used -O2. Except this is a hundred times worse, because
you literally asked git to do the really expensive thing that it really
really doesn't want to do ;)
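For what it's worth, the usual fix for the double-fill race described above is to re-check the pointer after taking the lock, so a losing thread reuses the winner's result instead of redoing and leaking it. A minimal sketch with stand-in names (not the real pack-objects structures):

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real pack-objects types. */
struct source { void *data; };

static pthread_mutex_t read_mutex = PTHREAD_MUTEX_INITIALIZER;
static int load_count;	/* how many times the expensive read actually ran */

/* Stand-in for read_sha1_file(): expensive, not reentrant. */
static void *expensive_read(void)
{
	load_count++;
	return malloc(16);
}

static void ensure_loaded(struct source *src)
{
	if (!src->data) {
		pthread_mutex_lock(&read_mutex);
		/* Re-check under the lock: at most one thread pays for
		 * the read, and nothing gets thrown away or leaked. */
		if (!src->data)
			src->data = expensive_read();
		pthread_mutex_unlock(&read_mutex);
	}
}

static void *worker(void *arg)
{
	ensure_loaded(arg);
	return NULL;
}
```

If, as argued elsewhere in the thread, no two threads ever share a src entry, the re-check is merely cheap insurance.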
> Is there another allocator to try? One that combines Google's
> efficiency with gcc's speed?
See above: I'd look around at threading-related bugs and check the way we
lock (or don't) accesses.
Linus
* Re: Something is broken in repack
2007-12-11 16:33 ` Linus Torvalds
@ 2007-12-11 17:21 ` Nicolas Pitre
2007-12-11 17:24 ` David Miller
2007-12-11 18:43 ` Jon Smirl
1 sibling, 1 reply; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-11 17:21 UTC (permalink / raw
To: Linus Torvalds; +Cc: Jon Smirl, Junio C Hamano, gcc, Git Mailing List
On Tue, 11 Dec 2007, Linus Torvalds wrote:
> That said, I suspect there are a few things fighting you:
>
> - threading is hard. I haven't looked a lot at the changes Nico did to do
> a threaded object packer, but what I've seen does not convince me it is
> correct. The "trg_entry" accesses are *mostly* protected with
> "cache_lock", but nothing else really seems to be, so quite frankly, I
> wouldn't trust the threaded version very much. It's off by default, and
> for a good reason, I think.
I beg to differ (of course, since I always know precisely what I do, and
like you, my code never has bugs).
Seriously though, trg_entry does not need to be protected at all. Why?
Simply because each thread has its own exclusive set of objects which no
other thread ever messes with. They never overlap.
> For example: the packing code does this:
>
> if (!src->data) {
> read_lock();
> src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz);
> read_unlock();
> ...
>
> and that's racy. If two threads come in at roughly the same time and
> see a NULL src->data, they'll both get the lock, and they'll both
> (serially) try to fill it in. It will all *work*, but one of them will
> have done unnecessary work, and one of them will have their result
> thrown away and leaked.
No. Once again, it is impossible for two threads to ever see the same
src->data at all. The lock is there simply because read_sha1_file() is
not reentrant.
> Are you hitting issues like this? I dunno. The object sorting means
> that different threads normally shouldn't look at the same objects (not
> even the sources), so probably not, but basically, I wouldn't trust the
> threading 100%. It needs work, and it needs to stay off by default.
For now it is, but I wouldn't say it really needs significant work at
this point. The latest thread patches were more about tuning than
correctness.
What the threading could be doing, though, is uncovering some other
bugs, like in the pack mmap windowing code for example. Although that
code is serialized by the read lock above, the fact that multiple
threads are hammering on it in turns means that the mmap window is
possibly seeking back and forth much more often than otherwise, possibly
leaking something in the process.
> - you're working on a problem that isn't really even worth optimizing
> that much. The *normal* case is to re-use old deltas, which makes all
> of the issues you are fighting basically go away (because you only have
> a few _incremental_ objects that need deltaing).
>
> In other words: the _real_ optimizations have already been done, and
> are done elsewhere, and are much smarter (the best way to optimize X is
> not to make X run fast, but to avoid doing X in the first place!). The
> thing you are trying to work with is the one-time-only case where you
> explicitly disable that big and important optimization, and then you
> complain about the end result being slow!
>
> It's like saying that you're compiling with extreme debugging and no
> optimizations, and then complaining that the end result doesn't run as
> fast as if you used -O2. Except this is a hundred times worse, because
> you literally asked git to do the really expensive thing that it really
> really doesn't want to do ;)
Linus, please pay attention to the _actual_ important issue here.
Sure I've been tuning the threading code in parallel to the attempt to
debug this memory usage issue.
BUT. The point is that repacking the gcc repo using "git repack -a -f
--window=250" has a radically different memory usage profile whether you
do the repack on the earlier 2.1GB pack or the later 300MB pack.
_That_ is the issue. Ironically, it is the 300MB pack that causes the
repack to blow memory usage out of proportion.
And in both cases, the threading code has to do the same
work whether the original pack was densely packed or not, since -f
throws away every existing delta anyway.
So something is fishy elsewhere than in the packing code.
Nicolas
* Re: Something is broken in repack
2007-12-11 17:21 ` Nicolas Pitre
@ 2007-12-11 17:24 ` David Miller
2007-12-11 17:44 ` Nicolas Pitre
0 siblings, 1 reply; 82+ messages in thread
From: David Miller @ 2007-12-11 17:24 UTC (permalink / raw
To: nico; +Cc: torvalds, jonsmirl, gitster, gcc, git
From: Nicolas Pitre <nico@cam.org>
Date: Tue, 11 Dec 2007 12:21:11 -0500 (EST)
> BUT. The point is that repacking the gcc repo using "git repack -a -f
> --window=250" has a radically different memory usage profile whether you
> do the repack on the earlier 2.1GB pack or the later 300MB pack.
If you repack on the smaller pack file, git has to expand more stuff
internally in order to search the deltas, whereas with the larger pack
file I bet git less often has to undelta'ify to get base object blobs
for delta search.
In fact that behavior makes perfect sense to me and I don't understand
GIT internals very well :-)
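That explanation can be sketched in a few lines (hypothetical structures, not git's real ones): materializing a deltified object means inflating its whole base chain first, so peak memory per delta-search lookup grows with chain depth, and the 300MB pack's chains are much deeper than the 2.1GB pack's.

```c
#include <stddef.h>

/* Toy model of a pack entry: either a full object or a delta
 * against a base entry (hypothetical, not git's real layout). */
struct entry {
	const struct entry *delta_base;	/* NULL for a non-delta object */
	size_t size;			/* inflated size of this object */
};

/* Bytes that must be held live to reconstruct 'e': the object itself
 * plus every base on the way down to a full (non-delta) object. */
static size_t bytes_to_materialize(const struct entry *e)
{
	size_t total = 0;
	for (; e; e = e->delta_base)
		total += e->size;
	return total;
}
```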
* Re: Something is broken in repack
2007-12-11 7:01 ` Jon Smirl
` (2 preceding siblings ...)
2007-12-11 16:33 ` Linus Torvalds
@ 2007-12-11 17:28 ` Daniel Berlin
3 siblings, 0 replies; 82+ messages in thread
From: Daniel Berlin @ 2007-12-11 17:28 UTC (permalink / raw
To: Jon Smirl; +Cc: Nicolas Pitre, Junio C Hamano, gcc, Git Mailing List
On 12/11/07, Jon Smirl <jonsmirl@gmail.com> wrote:
>
> Total CPU time 196 CPU minutes vs 190 for gcc. Google's claims of
> being faster are not true.
Depends on your allocation patterns. For our apps, it certainly is :)
Of course, I don't know if we've updated the external allocator in a
while; I'll bug the people in charge of it.
* Re: Something is broken in repack
2007-12-11 17:24 ` David Miller
@ 2007-12-11 17:44 ` Nicolas Pitre
2007-12-11 20:26 ` Andreas Ericsson
0 siblings, 1 reply; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-11 17:44 UTC (permalink / raw
To: David Miller; +Cc: Linus Torvalds, jonsmirl, Junio C Hamano, gcc, git
On Tue, 11 Dec 2007, David Miller wrote:
> From: Nicolas Pitre <nico@cam.org>
> Date: Tue, 11 Dec 2007 12:21:11 -0500 (EST)
>
> > BUT. The point is that repacking the gcc repo using "git repack -a -f
> > --window=250" has a radically different memory usage profile whether you
> > do the repack on the earlier 2.1GB pack or the later 300MB pack.
>
> If you repack on the smaller pack file, git has to expand more stuff
> internally in order to search the deltas, whereas with the larger pack
> file I bet git less often has to undelta'ify to get base object blobs
> for delta search.
Of course. I came to that conclusion two days ago. And despite being
pretty familiar with the involved code (I wrote part of it myself) I
just can't spot anything wrong with it so far.
But somehow the threading code keeps distracting people from that issue,
since it gets to do the same work whether the source pack is
densely packed or not.
Nicolas
(who wishes he had access to a much faster machine to investigate this issue)
* Re: Something is broken in repack
2007-12-11 16:33 ` Linus Torvalds
2007-12-11 17:21 ` Nicolas Pitre
@ 2007-12-11 18:43 ` Jon Smirl
2007-12-11 18:57 ` Nicolas Pitre
2007-12-11 19:17 ` Linus Torvalds
1 sibling, 2 replies; 82+ messages in thread
From: Jon Smirl @ 2007-12-11 18:43 UTC (permalink / raw
To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, gcc, Git Mailing List
On 12/11/07, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Tue, 11 Dec 2007, Jon Smirl wrote:
> >
> > So why does our threaded code take 20 CPU minutes longer (12%) to run
> > than the same code with a single thread?
>
> Threaded code *always* takes more CPU time. The only thing you can hope
> for is a wall-clock reduction. You're seeing probably a combination of
> (a) more cache misses
> (b) bigger dataset active at a time
> and a probably fairly minuscule
> (c) threading itself tends to have some overheads.
>
> > Q6600 is just two E6600s in the same package, the caches are not shared.
>
> Sure they are shared. They're just not *entirely* shared. But they are
> shared between each two cores, so each thread essentially has only half
> the cache they had with the non-threaded version.
>
> Threading is *not* a magic solution to all problems. It gives you
> potentially twice the CPU power, but there are real downsides that you
> should keep in mind.
>
> > Why does the threaded code need 2.24GB (google allocator, 2.85GB gcc)
> > with 4 threads? But only need 950MB with one thread? Where's the extra
> > gigabyte going?
>
> I suspect that it's really simple: you have a few rather big files in the
> gcc history, with deep delta chains. And what happens when you have four
> threads running at the same time is that they all need to keep all those
> objects that they are working on - and their hash state - in memory at the
> same time!
>
> So if you want to use more threads, that _forces_ you to have a bigger
> memory footprint, simply because you have more "live" objects that you
> work on. Normally, that isn't much of a problem, since most source files
> are small, but if you have a few deep delta chains on big files, both the
> delta chain itself is going to use memory (you may have limited the size
> of the cache, but it's still needed for the actual delta generation, so
> it's not like the memory usage went away).
This makes sense. Those runs that blew up to 4.5GB were a combination
of this effect and fragmentation in the gcc allocator. Google
allocator appears to be much better at controlling fragmentation.
Is there a reasonable scheme to force the chains to only be loaded
once and then shared between worker threads? The memory blow up
appears to be directly correlated with chain length.
>
> That said, I suspect there are a few things fighting you:
>
> - threading is hard. I haven't looked a lot at the changes Nico did to do
> a threaded object packer, but what I've seen does not convince me it is
> correct. The "trg_entry" accesses are *mostly* protected with
> "cache_lock", but nothing else really seems to be, so quite frankly, I
> wouldn't trust the threaded version very much. It's off by default, and
> for a good reason, I think.
>
> For example: the packing code does this:
>
> if (!src->data) {
> read_lock();
> src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz);
> read_unlock();
> ...
>
> and that's racy. If two threads come in at roughly the same time and
> > see a NULL src->data, they'll both get the lock, and they'll both
> (serially) try to fill it in. It will all *work*, but one of them will
> have done unnecessary work, and one of them will have their result
> thrown away and leaked.
That may account for the threaded version needing an extra 20 minutes
of CPU time. An extra 12% of CPU seems like too much overhead for
threading; just letting a couple of those long-chain compressions be
done twice could account for it.
>
> Are you hitting issues like this? I dunno. The object sorting means
> that different threads normally shouldn't look at the same objects (not
> even the sources), so probably not, but basically, I wouldn't trust the
> threading 100%. It needs work, and it needs to stay off by default.
>
> - you're working on a problem that isn't really even worth optimizing
> that much. The *normal* case is to re-use old deltas, which makes all
> of the issues you are fighting basically go away (because you only have
> a few _incremental_ objects that need deltaing).
I agree, this problem only occurs when people import giant
repositories. But every time someone hits these problems they declare
git to be screwed up and proceed to trash it in their blogs.
> In other words: the _real_ optimizations have already been done, and
> are done elsewhere, and are much smarter (the best way to optimize X is
> not to make X run fast, but to avoid doing X in the first place!). The
> thing you are trying to work with is the one-time-only case where you
> explicitly disable that big and important optimization, and then you
> complain about the end result being slow!
>
> It's like saying that you're compiling with extreme debugging and no
> optimizations, and then complaining that the end result doesn't run as
> fast as if you used -O2. Except this is a hundred times worse, because
> you literally asked git to do the really expensive thing that it really
> really doesn't want to do ;)
>
> > Is there another allocator to try? One that combines Google's
> > efficiency with gcc's speed?
>
> See above: I'd look around at threading-related bugs and check the way we
> lock (or don't) accesses.
>
> Linus
>
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 18:43 ` Jon Smirl
@ 2007-12-11 18:57 ` Nicolas Pitre
2007-12-11 19:17 ` Linus Torvalds
1 sibling, 0 replies; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-11 18:57 UTC (permalink / raw
To: Jon Smirl; +Cc: Linus Torvalds, Junio C Hamano, gcc, Git Mailing List
[-- Attachment #1: Type: TEXT/PLAIN, Size: 3040 bytes --]
On Tue, 11 Dec 2007, Jon Smirl wrote:
> This makes sense. Those runs that blew up to 4.5GB were a combination
> of this effect and fragmentation in the gcc allocator.
I disagree. This is insane.
> Google allocator appears to be much better at controlling fragmentation.
Indeed. And if fragmentation is indeed wasting half of Git's memory
usage then we'll have to come up with a custom memory allocator.
> Is there a reasonable scheme to force the chains to only be loaded
> once and then shared between worker threads? The memory blow up
> appears to be directly correlated with chain length.
No. That would be the equivalent of holding each revision of all files
uncompressed all at once in memory.
> > That said, I suspect there are a few things fighting you:
> >
> > - threading is hard. I haven't looked a lot at the changes Nico did to do
> > a threaded object packer, but what I've seen does not convince me it is
> > correct. The "trg_entry" accesses are *mostly* protected with
> > "cache_lock", but nothing else really seems to be, so quite frankly, I
> > wouldn't trust the threaded version very much. It's off by default, and
> > for a good reason, I think.
> >
> > For example: the packing code does this:
> >
> > if (!src->data) {
> > read_lock();
> > src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz);
> > read_unlock();
> > ...
> >
> > and that's racy. If two threads come in at roughly the same time and
> see a NULL src->data, they'll both get the lock, and they'll both
> > (serially) try to fill it in. It will all *work*, but one of them will
> > have done unnecessary work, and one of them will have their result
> > thrown away and leaked.
>
> That may account for the threaded version needing an extra 20 minutes
> of CPU time. An extra 12% of CPU seems like too much overhead for
> threading. Just letting a couple of those long chain compressions be
> done twice would explain it.
No it may not. This theory is wrong as explained before.
> >
> > Are you hitting issues like this? I dunno. The object sorting means
> > that different threads normally shouldn't look at the same objects (not
> > even the sources), so probably not, but basically, I wouldn't trust the
> > threading 100%. It needs work, and it needs to stay off by default.
> >
> > - you're working on a problem that isn't really even worth optimizing
> > that much. The *normal* case is to re-use old deltas, which makes all
> > of the issues you are fighting basically go away (because you only have
> > a few _incremental_ objects that need deltaing).
>
> I agree, this problem only occurs when people import giant
> repositories. But every time someone hits these problems they declare
> git to be screwed up and proceed to thrash it in their blogs.
It's not only for repack. Someone just reported git-blame being
unusable too due to insane memory usage, which I suspect is due to the
same issue.
Nicolas
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 18:43 ` Jon Smirl
2007-12-11 18:57 ` Nicolas Pitre
@ 2007-12-11 19:17 ` Linus Torvalds
2007-12-11 19:40 ` Junio C Hamano
1 sibling, 1 reply; 82+ messages in thread
From: Linus Torvalds @ 2007-12-11 19:17 UTC (permalink / raw
To: Jon Smirl; +Cc: Nicolas Pitre, Junio C Hamano, gcc, Git Mailing List
On Tue, 11 Dec 2007, Jon Smirl wrote:
> >
> > So if you want to use more threads, that _forces_ you to have a bigger
> > memory footprint, simply because you have more "live" objects that you
> > work on. Normally, that isn't much of a problem, since most source files
> > are small, but if you have a few deep delta chains on big files, both the
> > delta chain itself is going to use memory (you may have limited the size
> > of the cache, but it's still needed for the actual delta generation, so
> > it's not like the memory usage went away).
>
> This makes sense. Those runs that blew up to 4.5GB were a combination
> of this effect and fragmentation in the gcc allocator. Google
> allocator appears to be much better at controlling fragmentation.
Yes. I think we do have some case where we simply keep a lot of objects
around, and if we are talking reasonably large deltas, we'll have the
whole delta-chain in memory just to unpack one single object.
The delta cache size limits kick in only when we explicitly cache old
delta results (in case they will be re-used, which is rather common), it
doesn't affect the normal "I'm using this data right now" case at all.
And then fragmentation makes it much much worse. Since the allocation
patterns aren't nice (they are pretty random and depend on just the sizes
of the objects), and the lifetimes aren't always nicely nested _either_
(they become more so when you disable the cache entirely, but that's just
death for performance), I'm not surprised that there can be memory
allocators that end up having some issues.
> Is there a reasonable scheme to force the chains to only be loaded
> once and then shared between worker threads? The memory blow up
> appears to be directly correlated with chain length.
The worker threads explicitly avoid touching the same objects, and no, you
definitely don't want to explode the chains globally once, because the
whole point is that we do fit 15 years worth of history into 300MB of
pack-file thanks to having a very dense representation. The "loaded once"
part is the mmap'ing of the pack-file into memory, but if you were to
actually then try to expand the chains, you'd be talking about many *many*
more gigabytes of memory than you already see used ;)
So what you actually want to do is to just re-use already packed delta
chains directly, which is what we normally do. But you are explicitly
looking at the "--no-reuse-delta" (aka "git repack -f") case, which is why
it then blows up.
I'm sure we can find places to improve. But I would like to re-iterate the
statement that you're kind of doing a "don't do that then" case which is
really - by design - meant to be done once and never again, and is using
resources - again, pretty much by design - wildly inappropriately just to
get an initial packing done.
> That may account for the threaded version needing an extra 20 minutes
> CPU time. An extra 12% of CPU seems like too much overhead for
> threading. Just letting a couple of those long chain compressions be
> done twice
Well, Nico pointed out that those things should all be thread-private
data, so no, the race isn't there (unless there's some other bug there).
> I agree, this problem only occurs when people import giant
> repositories. But every time someone hits these problems they declare
> git to be screwed up and proceed to thrash it in their blogs.
Sure. I'd love to do global packing without paying the cost, but it really
was a design decision. Thanks to doing off-line packing ("let it run
overnight on some beefy machine") we can get better results. It's
expensive, yes. But it was pretty much meant to be expensive. It's a very
efficient compression algorithm, after all, and you're turning it up to
eleven ;)
I also suspect that the gcc archive makes things more interesting thanks
to having some rather large files. The ChangeLog is probably the worst
case (large file with *lots* of edits), but I suspect the *.po files
aren't wonderful either.
Linus
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 19:17 ` Linus Torvalds
@ 2007-12-11 19:40 ` Junio C Hamano
2007-12-11 20:34 ` Andreas Ericsson
0 siblings, 1 reply; 82+ messages in thread
From: Junio C Hamano @ 2007-12-11 19:40 UTC (permalink / raw
To: Linus Torvalds; +Cc: Jon Smirl, Nicolas Pitre, gcc, Git Mailing List
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Tue, 11 Dec 2007, Jon Smirl wrote:
>> >
>> > So if you want to use more threads, that _forces_ you to have a bigger
>> > memory footprint, simply because you have more "live" objects that you
>> > work on. Normally, that isn't much of a problem, since most source files
>> > are small, but if you have a few deep delta chains on big files, both the
>> > delta chain itself is going to use memory (you may have limited the size
>> > of the cache, but it's still needed for the actual delta generation, so
>> > it's not like the memory usage went away).
>>
>> This makes sense. Those runs that blew up to 4.5GB were a combination
>> of this effect and fragmentation in the gcc allocator. Google
>> allocator appears to be much better at controlling fragmentation.
>
> Yes. I think we do have some case where we simply keep a lot of objects
> around, and if we are talking reasonably large deltas, we'll have the
> whole delta-chain in memory just to unpack one single object.
Eh, excuse me. unpack_delta_entry()
- first unpacks the base object (this goes recursive);
- uncompresses the delta;
- applies the delta to the base to obtain the target object;
- frees delta;
- frees (but allows it to be cached) the base object;
- returns the result
So no matter how deep a chain is, you keep only one delta at a time in
core, not the whole delta chain.
> So what you actually want to do is to just re-use already packed delta
> chains directly, which is what we normally do. But you are explicitly
> looking at the "--no-reuse-delta" (aka "git repack -f") case, which is why
> it then blows up.
While that does not explain, as Nico pointed out, the huge difference
between the two repack runs that have different starting pack, I would
say it is a fair thing to say. If you have a suboptimal pack (i.e. not
enough reusable deltas, as in the 2.1GB pack case), do run "repack -f",
but if you have a good pack (i.e. 300MB pack), don't.
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 17:44 ` Nicolas Pitre
@ 2007-12-11 20:26 ` Andreas Ericsson
0 siblings, 0 replies; 82+ messages in thread
From: Andreas Ericsson @ 2007-12-11 20:26 UTC (permalink / raw
To: Nicolas Pitre
Cc: David Miller, Linus Torvalds, jonsmirl, Junio C Hamano, gcc, git
Nicolas Pitre wrote:
> On Tue, 11 Dec 2007, David Miller wrote:
>
>> From: Nicolas Pitre <nico@cam.org>
>> Date: Tue, 11 Dec 2007 12:21:11 -0500 (EST)
>>
>>> BUT. The point is that repacking the gcc repo using "git repack -a -f
>>> --window=250" has a radically different memory usage profile whether you
>>> do the repack on the earlier 2.1GB pack or the later 300MB pack.
>> If you repack on the smaller pack file, git has to expand more stuff
>> internally in order to search the deltas, whereas with the larger pack
>> file I bet git has to undelta'ify less often to get base object blobs
>> for the delta search.
>
> Of course. I came to that conclusion two days ago. And despite being
> pretty familiar with the involved code (I wrote part of it myself) I
> just can't spot anything wrong with it so far.
>
> But somehow the threading code keep distracting people from that issue
> since it gets to do the same work whether or not the source pack is
> densely packed or not.
>
> Nicolas
> (who wish he had access to a much faster machine to investigate this issue)
If it's still an issue next week, we'll have a 16 core (8 dual-core cpu's)
machine with some 32gb of ram in that'll be free for about two days.
You'll have to remind me about it though, as I've got a lot on my mind
these days.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 19:40 ` Junio C Hamano
@ 2007-12-11 20:34 ` Andreas Ericsson
0 siblings, 0 replies; 82+ messages in thread
From: Andreas Ericsson @ 2007-12-11 20:34 UTC (permalink / raw
To: Junio C Hamano
Cc: Linus Torvalds, Jon Smirl, Nicolas Pitre, gcc, Git Mailing List
Junio C Hamano wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> So what you actually want to do is to just re-use already packed delta
>> chains directly, which is what we normally do. But you are explicitly
>> looking at the "--no-reuse-delta" (aka "git repack -f") case, which is why
>> it then blows up.
>
> While that does not explain, as Nico pointed out, the huge difference
> between the two repack runs that have different starting pack, I would
> say it is a fair thing to say. If you have a suboptimal pack (i.e. not
> enough reusable deltas, as in the 2.1GB pack case), do run "repack -f",
> but if you have a good pack (i.e. 300MB pack), don't.
I think this is too much of a mystery for a lot of people to let it go.
Even I started looking into it, and I've got so little spare time just
now that I wouldn't stand much of a chance of making a contribution
even if I had written the code originally.
That being said, the fact that some git repositories really *can't*
be repacked on some machines (because repacking eats ALL virtual memory)
is really something that lowers git's reputation among huge projects.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-11 16:21 ` Jon Smirl
@ 2007-12-12 5:12 ` Nicolas Pitre
2007-12-12 8:05 ` David Kastrup
` (2 more replies)
0 siblings, 3 replies; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-12 5:12 UTC (permalink / raw
To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List
On Tue, 11 Dec 2007, Jon Smirl wrote:
> On 12/11/07, Nicolas Pitre <nico@cam.org> wrote:
> > On Tue, 11 Dec 2007, Nicolas Pitre wrote:
> >
> > > OK, here's something else for you to try:
> > >
> > > core.deltabasecachelimit=0
> > > pack.threads=2
> > > pack.deltacachesize=1
> > >
> > > With that I'm able to repack the small gcc pack on my machine with 1GB
> > > of ram using:
> > >
> > > git repack -a -f -d --window=250 --depth=250
> > >
> > > and top reports a ~700m virt and ~500m res without hitting swap at all.
> > > It is only at 25% so far, but I was unable to get that far before.
> >
> > Well, around 55% memory usage skyrocketed to 1.6GB and the system went
> > deep into swap. So I restarted it with no threads.
> >
> > Nicolas (even more puzzled)
>
> On the plus side you are seeing what I see, so it proves I am not imagining it.
Well... This is weird.
It seems that memory fragmentation is really really killing us here.
The fact that the Google allocator managed to waste quite a bit less
memory is a good indicator already.
I did modify the progress display to show accounted memory that was
allocated vs memory that was freed but still not released to the system.
At least that gives you an idea of memory allocation and fragmentation
with glibc in real time:
diff --git a/progress.c b/progress.c
index d19f80c..46ac9ef 100644
--- a/progress.c
+++ b/progress.c
@@ -8,6 +8,7 @@
* published by the Free Software Foundation.
*/
+#include <malloc.h>
#include "git-compat-util.h"
#include "progress.h"
@@ -94,10 +95,12 @@ static int display(struct progress *progress, unsigned n, const char *done)
if (progress->total) {
unsigned percent = n * 100 / progress->total;
if (percent != progress->last_percent || progress_update) {
+ struct mallinfo m = mallinfo();
progress->last_percent = percent;
- fprintf(stderr, "%s: %3u%% (%u/%u)%s%s",
- progress->title, percent, n,
- progress->total, tp, eol);
+ fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s",
+ progress->title, percent, n, progress->total,
+ m.uordblks >> 18, m.fordblks >> 18,
+ tp, eol);
fflush(stderr);
progress_update = 0;
return 1;
This shows that at some point the repack goes into a big memory surge.
I don't have enough RAM to see how fragmented memory gets though, since
it starts swapping around 50% done with 2 threads.
With only 1 thread, memory usage grows significantly at around 11% with
a pretty noticeable slowdown in the progress rate.
So I think the theory goes like this:
There is a block of big objects together in the list somewhere.
Initially, all those big objects are assigned to thread #1 out of 4.
Because those objects are big, they get really slow to delta compress,
and storing them all in a window with 250 slots takes significant
memory.
Threads 2, 3, and 4 have "easy" work loads, so they complete fairly
quickly compared to thread #1. But since the progress display is global,
you won't notice that one thread is actually crawling slowly.
To keep all threads busy until the end, those threads that are done with
their work load will steal some work from another thread, choosing the
one with the largest remaining work. That is most likely thread #1. So
as threads 2, 3, and 4 complete, they will steal from thread 1 and
populate their own window with those big objects too, and get slow too.
And because all threads get to work on those big objects towards the
end, the progress display will then show a significant slowdown, and
memory usage will almost quadruple.
Add memory fragmentation to that and you have a clogged system.
Solution:
pack.deltacachesize=1
pack.windowmemory=16M
Limiting the window memory to 16MB will automatically shrink the window
size when big objects are encountered, therefore keeping much fewer of
those objects at the same time in memory, which in turn means they will
be processed much more quickly. And somehow that must help with memory
fragmentation as well.
Setting pack.deltacachesize to 1 is simply to disable the caching of
delta results entirely which will only slow down the writing phase, but
I wanted to keep it out of the picture for now.
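For reference, the two settings above in .git/config form (the same [pack] section syntax quoted at the top of this thread; the camel-cased names are equivalent to the pack.* keys):

```ini
[pack]
	deltaCacheSize = 1
	windowMemory = 16M
```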
With the above settings, I'm currently repacking the gcc repo with 2
threads, and memory allocation never exceeded 700m virt and 400m res,
while the mallinfo shows about 350MB, and progress has reached 90% which
has never occurred on this machine with the 300MB source pack so far.
Nicolas
^ permalink raw reply related [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-12 5:12 ` Nicolas Pitre
@ 2007-12-12 8:05 ` David Kastrup
2007-12-14 16:18 ` Wolfram Gloger
2007-12-12 15:48 ` Nicolas Pitre
2007-12-12 16:13 ` Nicolas Pitre
2 siblings, 1 reply; 82+ messages in thread
From: David Kastrup @ 2007-12-12 8:05 UTC (permalink / raw
To: Nicolas Pitre; +Cc: Jon Smirl, Junio C Hamano, gcc, Git Mailing List
Nicolas Pitre <nico@cam.org> writes:
> Well... This is weird.
>
> It seems that memory fragmentation is really really killing us here.
> The fact that the Google allocator managed to waste quite a bit less
> memory is a good indicator already.
Maybe a malloc/free/mmap wrapper that records the requested sizes and
alloc/free order and dumps them to a file, so that one can make a
compact git-free standalone test case for the glibc maintainers, would
be a good thing.
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-12 5:12 ` Nicolas Pitre
2007-12-12 8:05 ` David Kastrup
@ 2007-12-12 15:48 ` Nicolas Pitre
2007-12-12 16:17 ` Paolo Bonzini
` (2 more replies)
2007-12-12 16:13 ` Nicolas Pitre
2 siblings, 3 replies; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-12 15:48 UTC (permalink / raw
To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List
On Wed, 12 Dec 2007, Nicolas Pitre wrote:
> Add memory fragmentation to that and you have a clogged system.
>
> Solution:
>
> pack.deltacachesize=1
> pack.windowmemory=16M
>
> Limiting the window memory to 16MB will automatically shrink the window
> size when big objects are encountered, therefore keeping much fewer of
> those objects at the same time in memory, which in turn means they will
> be processed much more quickly. And somehow that must help with memory
> fragmentation as well.
OK scrap that.
When I returned to the computer this morning, the repack was
completed... with a 1.3GB pack instead.
So... The gcc repo apparently really needs a large window to efficiently
compress those large objects.
But when those large objects are already well deltified and you repack
again with a large window, somehow the memory allocator is way more
involved, probably even
more so when there are several threads in parallel amplifying the issue,
and things probably get to a point of no return with regard to memory
fragmentation after a while.
So... my conclusion is that the glibc allocator has fragmentation issues
with this work load, given the notable difference with the Google
allocator, which itself might not be completely immune to fragmentation
issues of its own. And because the gcc repo requires a large window of
big objects to get good compression, then you're better not using 4
threads to repack it with -a -f. The fact that the size of the source
pack has such an influence is probably only because the increased usage
of the delta base object cache is playing a role in the global memory
allocation pattern, allowing for the bad fragmentation issue to occur.
If you could run one last test with the mallinfo patch I posted, without
the pack.windowmemory setting, and add the reported values along with
those from top, then we could formally conclude that this is a memory
fragmentation issue.
So I don't think Git itself is actually bad. The gcc repo most
certainly constitutes a nasty use case for memory allocators, but I don't
think there is much we can do about it besides possibly implementing our
own memory allocator with active defragmentation where possible (read
memcpy) at some point to give glibc's allocator some chance to breathe a
bit more.
In the mean time you might have to use only one thread and lots of
memory to repack the gcc repo, or find the perfect memory allocator to
be used with Git. After all, packing the whole gcc history to around
230MB is quite a stunt but it requires sufficient resources to
achieve it. Fortunately, like Linus said, such a wholesale repack is not
something that most users have to do anyway.
Nicolas
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-12 5:12 ` Nicolas Pitre
2007-12-12 8:05 ` David Kastrup
2007-12-12 15:48 ` Nicolas Pitre
@ 2007-12-12 16:13 ` Nicolas Pitre
2007-12-13 7:32 ` Andreas Ericsson
2 siblings, 1 reply; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-12 16:13 UTC (permalink / raw
To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List
On Wed, 12 Dec 2007, Nicolas Pitre wrote:
> I did modify the progress display to show accounted memory that was
> allocated vs memory that was freed but still not released to the system.
> At least that gives you an idea of memory allocation and fragmentation
> with glibc in real time:
>
> diff --git a/progress.c b/progress.c
> index d19f80c..46ac9ef 100644
> --- a/progress.c
> +++ b/progress.c
> @@ -8,6 +8,7 @@
> * published by the Free Software Foundation.
> */
>
> +#include <malloc.h>
> #include "git-compat-util.h"
> #include "progress.h"
>
> @@ -94,10 +95,12 @@ static int display(struct progress *progress, unsigned n, const char *done)
> if (progress->total) {
> unsigned percent = n * 100 / progress->total;
> if (percent != progress->last_percent || progress_update) {
> + struct mallinfo m = mallinfo();
> progress->last_percent = percent;
> - fprintf(stderr, "%s: %3u%% (%u/%u)%s%s",
> - progress->title, percent, n,
> - progress->total, tp, eol);
> + fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s",
> + progress->title, percent, n, progress->total,
> + m.uordblks >> 18, m.fordblks >> 18,
> + tp, eol);
Note: I didn't know what unit of memory those blocks represent, so the
shift is most probably wrong.
Nicolas
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-12 15:48 ` Nicolas Pitre
@ 2007-12-12 16:17 ` Paolo Bonzini
2007-12-12 16:37 ` Linus Torvalds
2007-12-13 13:32 ` Nguyen Thai Ngoc Duy
2 siblings, 0 replies; 82+ messages in thread
From: Paolo Bonzini @ 2007-12-12 16:17 UTC (permalink / raw
To: gcc; +Cc: git
> When I returned to the computer this morning, the repack was
> completed... with a 1.3GB pack instead.
>
> So... The gcc repo apparently really needs a large window to efficiently
> compress those large objects.
So, am I right that if you have a very well-done pack (such as gcc's),
you might want to repack in two phases:
- first discarding the old deltas and using a small window, thus
producing a bad pack that can be repacked without humongous amounts of
memory...
- ... then discarding the old deltas and producing another
well-compressed pack?
Paolo
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-12 15:48 ` Nicolas Pitre
2007-12-12 16:17 ` Paolo Bonzini
@ 2007-12-12 16:37 ` Linus Torvalds
2007-12-12 16:42 ` David Miller
` (2 more replies)
2007-12-13 13:32 ` Nguyen Thai Ngoc Duy
2 siblings, 3 replies; 82+ messages in thread
From: Linus Torvalds @ 2007-12-12 16:37 UTC (permalink / raw
To: Nicolas Pitre; +Cc: Jon Smirl, Junio C Hamano, gcc, Git Mailing List
On Wed, 12 Dec 2007, Nicolas Pitre wrote:
>
> So... my conclusion is that the glibc allocator has fragmentation issues
> with this work load, given the notable difference with the Google
> allocator, which itself might not be completely immune to fragmentation
> issues of its own.
Yes.
Note that delta following involves patterns something like
allocate (small) space for delta
for i in (1..depth) {
allocate large space for base
allocate large space for result
.. apply delta ..
free large space for base
free small space for delta
}
so if you have some stupid heap algorithm that doesn't try to merge and
re-use free'd spaces very aggressively (because that takes CPU time!), you
might have memory usage be horribly inflated by the heap having all those
holes for all the objects that got free'd in the chain that don't get
aggressively re-used.
Threaded memory allocators then make this worse by probably using totally
different heaps for different threads (in order to avoid locking), so they
will *all* have the fragmentation issue.
And if you *really* want to cause trouble for a memory allocator, what you
should try to do is to allocate the memory in one thread, and free it in
another, and then things can really explode (the freeing thread notices
that the allocation is not in its thread-local heap, so instead of really
freeing it, it puts it on a separate list of areas to be freed later by
the original thread when it needs memory - or worse, it adds it to the
local thread list, and makes it effectively totally impossible to then
ever merge different free'd allocations ever again because the freed
things will be on different heap lists!).
I'm not saying that particular case happens in git, I'm just saying that
it's not unheard of. And with the delta cache and the object lookup, it's
not at _all_ impossible that we hit the "allocate in one thread, free in
another" case!
Linus
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-12 16:37 ` Linus Torvalds
@ 2007-12-12 16:42 ` David Miller
2007-12-12 16:54 ` Linus Torvalds
2007-12-12 17:12 ` Jon Smirl
2007-12-14 16:12 ` Wolfram Gloger
2 siblings, 1 reply; 82+ messages in thread
From: David Miller @ 2007-12-12 16:42 UTC (permalink / raw
To: torvalds; +Cc: nico, jonsmirl, gitster, gcc, git
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 12 Dec 2007 08:37:10 -0800 (PST)
> I'm not saying that particular case happens in git, I'm just saying that
> it's not unheard of. And with the delta cache and the object lookup, it's
> not at _all_ impossible that we hit the "allocate in one thread, free in
> another" case!
One thing that supports these theories is that, while running
these large repacks, I notice that the RSS is roughly 2/3 of
the amount of virtual address space allocated.
I personally don't think it's unreasonable for GIT to have its
own customized allocator, at least for certain object types.
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-12 16:42 ` David Miller
@ 2007-12-12 16:54 ` Linus Torvalds
0 siblings, 0 replies; 82+ messages in thread
From: Linus Torvalds @ 2007-12-12 16:54 UTC (permalink / raw
To: David Miller; +Cc: nico, jonsmirl, gitster, gcc, git
On Wed, 12 Dec 2007, David Miller wrote:
>
> I personally don't think it's unreasonable for GIT to have its
> own customized allocator, at least for certain object types.
Well, we actually already *do* have a customized allocator, but currently
only for the actual core "object descriptor" that really just has the SHA1
and object flags in it (and a few extra words depending on object type).
Those are critical for certain loads, and small too (so using the standard
allocator wasted a _lot_ of memory). In addition, they're fixed-size and
never free'd, so a specialized allocator really can do a lot better than
any general-purpose memory allocator ever could.
But the actual object *contents* are currently all allocated with whatever
the standard libc malloc/free allocator is that you compile for (or load
dynamically). Having a specialized allocator for them is a much more
involved issue, exactly because we do have interesting allocation patterns
etc.
That said, at least those object allocations are all single-threaded (for
right now, at least), so even when git does multi-threaded stuff, the core
sha1_file.c stuff is always run under a single lock, and a simpler
allocator that doesn't care about threads is likely to be much better than
one that tries to have thread-local heaps etc.
I suspect that is what the google allocator does. It probably doesn't have
per-thread heaps, it just uses locking (and quite possibly things like
per-*size* heaps, which is much more memory-efficient and helps avoid some
of the fragmentation problems).
Locking is much slower than per-thread accesses, but it doesn't have the
issues with per-thread-fragmentation and all the problems with one thread
allocating and another one freeing.
Linus
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-12 16:37 ` Linus Torvalds
2007-12-12 16:42 ` David Miller
@ 2007-12-12 17:12 ` Jon Smirl
2007-12-14 16:12 ` Wolfram Gloger
2 siblings, 0 replies; 82+ messages in thread
From: Jon Smirl @ 2007-12-12 17:12 UTC (permalink / raw
To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, gcc, Git Mailing List
On 12/12/07, Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>
> On Wed, 12 Dec 2007, Nicolas Pitre wrote:
> >
> > So... my conclusion is that the glibc allocator has fragmentation issues
> > with this work load, given the notable difference with the Google
> > allocator, which itself might not be completely immune to fragmentation
> > issues of its own.
>
> Yes.
>
> Note that delta following involves patterns something like
>
> allocate (small) space for delta
> for i in (1..depth) {
> allocate large space for base
> allocate large space for result
> .. apply delta ..
> free large space for base
> free small space for delta
> }
Is it hard to hack up something that statically allocates a big block
of memory per thread for these two and then just reuses it?
allocate (small) space for delta
allocate large space for base
The alternating between long term and short term allocations
definitely aggravates fragmentation.
>
> so if you have some stupid heap algorithm that doesn't try to merge and
> re-use free'd spaces very aggressively (because that takes CPU time!), you
> might have memory usage be horribly inflated by the heap having all those
> holes for all the objects that got free'd in the chain that don't get
> aggressively re-used.
>
> Threaded memory allocators then make this worse by probably using totally
> different heaps for different threads (in order to avoid locking), so they
> will *all* have the fragmentation issue.
>
> And if you *really* want to cause trouble for a memory allocator, what you
> should try to do is to allocate the memory in one thread, and free it in
> another, and then things can really explode (the freeing thread notices
> that the allocation is not in its thread-local heap, so instead of really
> freeing it, it puts it on a separate list of areas to be freed later by
> the original thread when it needs memory - or worse, it adds it to the
> local thread list, and makes it effectively totally impossible to then
> ever merge different free'd allocations ever again because the freed
> things will be on different heap lists!).
>
> I'm not saying that particular case happens in git, I'm just saying that
> it's not unheard of. And with the delta cache and the object lookup, it's
> not at _all_ impossible that we hit the "allocate in one thread, free in
> another" case!
>
> Linus
>
--
Jon Smirl
jonsmirl@gmail.com
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-12 16:13 ` Nicolas Pitre
@ 2007-12-13 7:32 ` Andreas Ericsson
2007-12-14 16:03 ` Wolfram Gloger
0 siblings, 1 reply; 82+ messages in thread
From: Andreas Ericsson @ 2007-12-13 7:32 UTC (permalink / raw
To: Nicolas Pitre; +Cc: Jon Smirl, Junio C Hamano, gcc, Git Mailing List
Nicolas Pitre wrote:
> On Wed, 12 Dec 2007, Nicolas Pitre wrote:
>
>> I did modify the progress display to show accounted memory that was
>> allocated vs memory that was freed but still not released to the system.
>> At least that gives you an idea of memory allocation and fragmentation
>> with glibc in real time:
>>
>> diff --git a/progress.c b/progress.c
>> index d19f80c..46ac9ef 100644
>> --- a/progress.c
>> +++ b/progress.c
>> @@ -8,6 +8,7 @@
>> * published by the Free Software Foundation.
>> */
>>
>> +#include <malloc.h>
>> #include "git-compat-util.h"
>> #include "progress.h"
>>
>> @@ -94,10 +95,12 @@ static int display(struct progress *progress, unsigned n, const char *done)
>> if (progress->total) {
>> unsigned percent = n * 100 / progress->total;
>> if (percent != progress->last_percent || progress_update) {
>> + struct mallinfo m = mallinfo();
>> progress->last_percent = percent;
>> - fprintf(stderr, "%s: %3u%% (%u/%u)%s%s",
>> - progress->title, percent, n,
>> - progress->total, tp, eol);
>> + fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s",
>> + progress->title, percent, n, progress->total,
>> + m.uordblks >> 18, m.fordblks >> 18,
>> + tp, eol);
>
> Note: I didn't know what unit of memory those blocks represent, so the
> shift is most probably wrong.
>
Me neither, but it appears to me as if hblkhd holds the actual memory
consumed by the process. It seems to store the information in bytes,
which I find a bit dubious unless glibc has some internal multiplier.
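For what it's worth, glibc documents the mallinfo fields (uordblks, fordblks, hblkhd) as plain byte counts, so the patch's >> 18 shift reports units of 256 KiB rather than MB; >> 20 would be the MB conversion. A tiny sketch of just that unit arithmetic (not git code):

```c
/* mallinfo's fields are byte counts: a right shift by 20 converts
 * bytes to MB, while the posted patch's shift by 18 yields units of
 * 256 KiB -- numbers 4x larger than the "MB" the label suggests. */
static unsigned bytes_to_mb(unsigned long bytes)
{
	return (unsigned)(bytes >> 20);
}

static unsigned patch_shift(unsigned long bytes)
{
	return (unsigned)(bytes >> 18); /* what the posted patch prints */
}
```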
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-12 15:48 ` Nicolas Pitre
2007-12-12 16:17 ` Paolo Bonzini
2007-12-12 16:37 ` Linus Torvalds
@ 2007-12-13 13:32 ` Nguyen Thai Ngoc Duy
2007-12-13 15:32 ` Paolo Bonzini
2 siblings, 1 reply; 82+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2007-12-13 13:32 UTC (permalink / raw
To: Nicolas Pitre; +Cc: Jon Smirl, Junio C Hamano, gcc, Git Mailing List
On Dec 12, 2007 10:48 PM, Nicolas Pitre <nico@cam.org> wrote:
> In the mean time you might have to use only one thread and lots of
> memory to repack the gcc repo, or find the perfect memory allocator to
> be used with Git. After all, packing the whole gcc history to around
> 230MB is quite a stunt but it requires sufficient resources to
> achieve it. Fortunately, like Linus said, such a wholesale repack is not
> something that most users have to do anyway.
Is there an alternative to "git repack -a -d" that repacks everything
but the first pack?
--
Duy
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-13 13:32 ` Nguyen Thai Ngoc Duy
@ 2007-12-13 15:32 ` Paolo Bonzini
2007-12-13 16:29 ` Paolo Bonzini
2007-12-13 16:39 ` Johannes Sixt
0 siblings, 2 replies; 82+ messages in thread
From: Paolo Bonzini @ 2007-12-13 15:32 UTC (permalink / raw
To: git; +Cc: gcc
Nguyen Thai Ngoc Duy wrote:
> On Dec 12, 2007 10:48 PM, Nicolas Pitre <nico@cam.org> wrote:
>> In the mean time you might have to use only one thread and lots of
>> memory to repack the gcc repo, or find the perfect memory allocator to
>> be used with Git. After all, packing the whole gcc history to around
>> 230MB is quite a stunt but it requires sufficient resources to
>> achieve it. Fortunately, like Linus said, such a wholesale repack is not
>> something that most users have to do anyway.
>
> Is there an alternative to "git repack -a -d" that repacks everything
> but the first pack?
That would be a pretty good idea for big repositories. If I were to
implement it, I would actually add a .git/config option like
pack.permanent so that more than one pack could be made permanent; then
to repack really really everything you'd need "git repack -a -a -d".
Paolo
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-13 15:32 ` Paolo Bonzini
@ 2007-12-13 16:29 ` Paolo Bonzini
2007-12-13 16:39 ` Johannes Sixt
1 sibling, 0 replies; 82+ messages in thread
From: Paolo Bonzini @ 2007-12-13 16:29 UTC (permalink / raw
Cc: git, gcc
>> Is there an alternative to "git repack -a -d" that repacks everything
>> but the first pack?
>
> That would be a pretty good idea for big repositories. If I were to
> implement it, I would actually add a .git/config option like
> pack.permanent so that more than one pack could be made permanent; then
> to repack really really everything you'd need "git repack -a -a -d".
Actually there is something like this, as seen from the source of
git-repack:
for e in `cd "$PACKDIR" && find . -type f -name '*.pack' \
	| sed -e 's/^\.\///' -e 's/\.pack$//'`
do
	if [ -e "$PACKDIR/$e.keep" ]; then
		: keep
	else
		args="$args --unpacked=$e.pack"
		existing="$existing $e"
	fi
done
So, just create a file named as the pack, but with extension ".keep".
Paolo
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-13 15:32 ` Paolo Bonzini
2007-12-13 16:29 ` Paolo Bonzini
@ 2007-12-13 16:39 ` Johannes Sixt
2007-12-14 1:04 ` Jakub Narebski
1 sibling, 1 reply; 82+ messages in thread
From: Johannes Sixt @ 2007-12-13 16:39 UTC (permalink / raw
To: Paolo Bonzini; +Cc: git, gcc
Paolo Bonzini schrieb:
> Nguyen Thai Ngoc Duy wrote:
>> On Dec 12, 2007 10:48 PM, Nicolas Pitre <nico@cam.org> wrote:
>>> In the mean time you might have to use only one thread and lots of
>>> memory to repack the gcc repo, or find the perfect memory allocator to
>>> be used with Git. After all, packing the whole gcc history to around
>>> 230MB is quite a stunt but it requires sufficient resources to
>>> achieve it. Fortunately, like Linus said, such a wholesale repack is not
>>> something that most users have to do anyway.
>>
>> Is there an alternative to "git repack -a -d" that repacks everything
>> but the first pack?
>
> That would be a pretty good idea for big repositories. If I were to
> implement it, I would actually add a .git/config option like
> pack.permanent so that more than one pack could be made permanent; then
> to repack really really everything you'd need "git repack -a -a -d".
It's already there: If you have a pack .git/objects/pack/pack-foo.pack, then
"touch .git/objects/pack/pack-foo.keep" marks the pack as precious.
-- Hannes
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-13 16:39 ` Johannes Sixt
@ 2007-12-14 1:04 ` Jakub Narebski
2007-12-14 6:14 ` Paolo Bonzini
0 siblings, 1 reply; 82+ messages in thread
From: Jakub Narebski @ 2007-12-14 1:04 UTC (permalink / raw
To: git; +Cc: gcc
Johannes Sixt wrote:
> Paolo Bonzini schrieb:
>> Nguyen Thai Ngoc Duy wrote:
>>>
>>> Is there an alternative to "git repack -a -d" that repacks everything
>>> but the first pack?
>>
>> That would be a pretty good idea for big repositories. If I were to
>> implement it, I would actually add a .git/config option like
>> pack.permanent so that more than one pack could be made permanent; then
>> to repack really really everything you'd need "git repack -a -a -d".
>
> It's already there: If you have a pack .git/objects/pack/pack-foo.pack, then
> "touch .git/objects/pack/pack-foo.keep" marks the pack as precious.
Actually you can (and probably should) put a single line giving the _reason_
the pack is to be kept into the *.keep file.
Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of
all things.
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-14 1:04 ` Jakub Narebski
@ 2007-12-14 6:14 ` Paolo Bonzini
2007-12-14 6:24 ` Nguyen Thai Ngoc Duy
2007-12-14 13:25 ` Nicolas Pitre
0 siblings, 2 replies; 82+ messages in thread
From: Paolo Bonzini @ 2007-12-14 6:14 UTC (permalink / raw
To: git; +Cc: gcc
> Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of
> all things.
I found that the .keep file is not transmitted over the network (at
least I tried with git+ssh:// and http:// protocols), however.
Paolo
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-14 6:14 ` Paolo Bonzini
@ 2007-12-14 6:24 ` Nguyen Thai Ngoc Duy
2007-12-14 8:20 ` Paolo Bonzini
2007-12-14 10:40 ` Jakub Narebski
2007-12-14 13:25 ` Nicolas Pitre
1 sibling, 2 replies; 82+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2007-12-14 6:24 UTC (permalink / raw
To: Paolo Bonzini; +Cc: git, gcc
On Dec 14, 2007 1:14 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> > Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of
> > all things.
>
> I found that the .keep file is not transmitted over the network (at
> least I tried with git+ssh:// and http:// protocols), however.
I'm thinking about "git clone --keep" to mark initial packs precious.
But 'git clone' is under rewrite to C. Let's wait until C rewrite is
done.
--
Duy
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-14 6:24 ` Nguyen Thai Ngoc Duy
@ 2007-12-14 8:20 ` Paolo Bonzini
2007-12-14 9:01 ` Harvey Harrison
2007-12-14 10:40 ` Jakub Narebski
1 sibling, 1 reply; 82+ messages in thread
From: Paolo Bonzini @ 2007-12-14 8:20 UTC (permalink / raw
To: gcc; +Cc: git
> I'm thinking about "git clone --keep" to mark initial packs precious.
> But 'git clone' is under rewrite to C. Let's wait until C rewrite is
> done.
It should be the default, IMHO.
Paolo
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-14 8:20 ` Paolo Bonzini
@ 2007-12-14 9:01 ` Harvey Harrison
0 siblings, 0 replies; 82+ messages in thread
From: Harvey Harrison @ 2007-12-14 9:01 UTC (permalink / raw
To: Paolo Bonzini; +Cc: gcc, git
On Fri, 2007-12-14 at 09:20 +0100, Paolo Bonzini wrote:
> > I'm thinking about "git clone --keep" to mark initial packs precious.
> > But 'git clone' is under rewrite to C. Let's wait until C rewrite is
> > done.
>
> It should be the default, IMHO.
>
While it doesn't mark the packs as .keep, git will reuse all of the old
deltas you got in the original clone, so you're not losing anything.
Harvey
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-14 6:24 ` Nguyen Thai Ngoc Duy
2007-12-14 8:20 ` Paolo Bonzini
@ 2007-12-14 10:40 ` Jakub Narebski
2007-12-14 10:52 ` Nguyen Thai Ngoc Duy
1 sibling, 1 reply; 82+ messages in thread
From: Jakub Narebski @ 2007-12-14 10:40 UTC (permalink / raw
To: Nguyen Thai Ngoc Duy; +Cc: Paolo Bonzini, git, gcc
"Nguyen Thai Ngoc Duy" <pclouds@gmail.com> writes:
> On Dec 14, 2007 1:14 PM, Paolo Bonzini <bonzini@gnu.org> wrote:
> > > Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of
> > > all things.
> >
> > I found that the .keep file is not transmitted over the network (at
> > least I tried with git+ssh:// and http:// protocols), however.
>
> I'm thinking about "git clone --keep" to mark initial packs precious.
> But 'git clone' is under rewrite to C. Let's wait until C rewrite is
> done.
But if you clone over the network with a "smart" transport, the pack
might be network-optimized rather than disk-optimized, at least with
current git, which AFAIK regenerates the pack on clone as well.
--
Jakub Narebski
Poland
ShadeHawk on #git
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-14 10:40 ` Jakub Narebski
@ 2007-12-14 10:52 ` Nguyen Thai Ngoc Duy
0 siblings, 0 replies; 82+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2007-12-14 10:52 UTC (permalink / raw
To: Jakub Narebski, Harvey Harrison; +Cc: Paolo Bonzini, git, gcc
On Dec 14, 2007 4:01 PM, Harvey Harrison <harvey.harrison@gmail.com> wrote:
> While it doesn't mark the packs as .keep, git will reuse all of the old
> deltas you got in the original clone, so you're not losing anything.
There is another reason I want it. I have an ~800MB pack and I don't
want git to rewrite that pack every time I repack my changes. So it's
also easier on the disk (it doesn't require an extra 800MB on disk to
prepare the new pack, and it avoids a lot of writing).
On Dec 14, 2007 5:40 PM, Jakub Narebski <jnareb@gmail.com> wrote:
> But if you clone over the network with a "smart" transport, the pack
> might be network-optimized rather than disk-optimized, at least with
> current git, which AFAIK regenerates the pack on clone as well.
Um.. that's ok, it only regenerates once.
--
Duy
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-14 6:14 ` Paolo Bonzini
2007-12-14 6:24 ` Nguyen Thai Ngoc Duy
@ 2007-12-14 13:25 ` Nicolas Pitre
1 sibling, 0 replies; 82+ messages in thread
From: Nicolas Pitre @ 2007-12-14 13:25 UTC (permalink / raw
To: Paolo Bonzini; +Cc: git, gcc
On Fri, 14 Dec 2007, Paolo Bonzini wrote:
> > Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of
> > all things.
>
> I found that the .keep file is not transmitted over the network (at least I
> tried with git+ssh:// and http:// protocols), however.
That is a local policy.
Nicolas
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-13 7:32 ` Andreas Ericsson
@ 2007-12-14 16:03 ` Wolfram Gloger
0 siblings, 0 replies; 82+ messages in thread
From: Wolfram Gloger @ 2007-12-14 16:03 UTC (permalink / raw
To: ae; +Cc: nico, jonsmirl, gitster, gcc, git
Hi,
> >> if (progress->total) {
> >> unsigned percent = n * 100 / progress->total;
> >> if (percent != progress->last_percent || progress_update) {
> >> + struct mallinfo m = mallinfo();
> >> progress->last_percent = percent;
> >> - fprintf(stderr, "%s: %3u%% (%u/%u)%s%s",
> >> - progress->title, percent, n,
> >> - progress->total, tp, eol);
> >> + fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s",
> >> + progress->title, percent, n, progress->total,
> >> + m.uordblks >> 18, m.fordblks >> 18,
> >> + tp, eol);
> >
> > Note: I didn't know what unit of memory those blocks represent, so the
> > shift is most probably wrong.
> >
>
> Me neither, but it appears to me as if hblkhd holds the actual memory
> consumed by the process. It seems to store the information in bytes,
> which I find a bit dubious unless glibc has some internal multiplier.
mallinfo() will only give you the used memory for the main arena.
When you have separate arenas (likely when concurrent threads have
been used), the only way to get the full picture is to call
malloc_stats(), which prints to stderr.
Regards,
Wolfram.
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-12 16:37 ` Linus Torvalds
2007-12-12 16:42 ` David Miller
2007-12-12 17:12 ` Jon Smirl
@ 2007-12-14 16:12 ` Wolfram Gloger
2007-12-14 16:45 ` David Kastrup
2 siblings, 1 reply; 82+ messages in thread
From: Wolfram Gloger @ 2007-12-14 16:12 UTC (permalink / raw
To: torvalds; +Cc: nico, jonsmirl, gitster, gcc, git
Hi,
> Note that delta following involves patterns something like
>
> allocate (small) space for delta
> for i in (1..depth) {
> allocate large space for base
> allocate large space for result
> .. apply delta ..
> free large space for base
> free small space for delta
> }
>
> so if you have some stupid heap algorithm that doesn't try to merge and
> re-use free'd spaces very aggressively (because that takes CPU time!),
ptmalloc2 (in glibc) _per arena_ is basically best-fit. This is the
best known general strategy, but it certainly cannot be the best in
every case.
> you
> might have memory usage be horribly inflated by the heap having all those
> holes for all the objects that got free'd in the chain that don't get
> aggressively re-used.
It depends how large 'large' is -- if it exceeds the mmap() threshold
(settable with mallopt(M_MMAP_THRESHOLD, ...))
the 'large' spaces will be allocated with mmap() and won't cause
any internal fragmentation.
It might pay to experiment with this parameter if it is hard to
avoid the alloc/free large space sequence.
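A minimal sketch of the experiment suggested above (glibc-specific; the 128 KiB threshold is an arbitrary value chosen for illustration, not a recommendation):

```c
#include <malloc.h>   /* glibc-specific: mallopt(), M_MMAP_THRESHOLD */
#include <stdlib.h>

/* Ask glibc to serve allocations of 128 KiB and up via mmap(), so
 * freeing a "large space for base/result" returns the pages to the
 * kernel instead of leaving holes in the heap.  mallopt() returns 1
 * on success by glibc convention. */
static int use_mmap_for_large_blocks(void)
{
	return mallopt(M_MMAP_THRESHOLD, 128 * 1024);
}
```

Calling this once at startup (before the worker threads allocate anything) would let one measure whether the repack's peak memory use is dominated by heap fragmentation or by live data.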
> Threaded memory allocators then make this worse by probably using totally
> different heaps for different threads (in order to avoid locking), so they
> will *all* have the fragmentation issue.
Indeed.
Could someone perhaps try ptmalloc3
(http://malloc.de/malloc/ptmalloc3-current.tar.gz) on this case?
Thanks,
Wolfram.
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-12 8:05 ` David Kastrup
@ 2007-12-14 16:18 ` Wolfram Gloger
0 siblings, 0 replies; 82+ messages in thread
From: Wolfram Gloger @ 2007-12-14 16:18 UTC (permalink / raw
To: dak; +Cc: nico, jonsmirl, gitster, gcc, git
Hi,
> Maybe an malloc/free/mmap wrapper that records the requested sizes and
> alloc/free order and dumps them to file so that one can make a compact
> git-free standalone test case for the glibc maintainers might be a good
> thing.
I already have such a wrapper:
http://malloc.de/malloc/mtrace-20060529.tar.gz
But note that it does interfere with the thread scheduling, so it
can't record the exact same allocation pattern as when not using the
wrapper.
Regards,
Wolfram.
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-14 16:12 ` Wolfram Gloger
@ 2007-12-14 16:45 ` David Kastrup
2007-12-14 16:59 ` Wolfram Gloger
0 siblings, 1 reply; 82+ messages in thread
From: David Kastrup @ 2007-12-14 16:45 UTC (permalink / raw
To: Wolfram Gloger; +Cc: torvalds, nico, jonsmirl, gitster, gcc, git
Wolfram Gloger <wmglo@dent.med.uni-muenchen.de> writes:
> Hi,
>
>> Note that delta following involves patterns something like
>>
>> allocate (small) space for delta
>> for i in (1..depth) {
>> allocate large space for base
>> allocate large space for result
>> .. apply delta ..
>> free large space for base
>> free small space for delta
>> }
>>
>> so if you have some stupid heap algorithm that doesn't try to merge and
>> re-use free'd spaces very aggressively (because that takes CPU time!),
>
> ptmalloc2 (in glibc) _per arena_ is basically best-fit. This is the
> best known general strategy,
Uh what? Someone crank out his copy of "The Art of Computer
Programming", I think volume 1. Best fit is known (analyzed and proven
and documented decades ago) to be one of the worst strategies for memory
allocation. Exactly because it leads to huge fragmentation problems.
--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack
2007-12-14 16:45 ` David Kastrup
@ 2007-12-14 16:59 ` Wolfram Gloger
0 siblings, 0 replies; 82+ messages in thread
From: Wolfram Gloger @ 2007-12-14 16:59 UTC (permalink / raw
To: dak; +Cc: wmglo, torvalds, nico, jonsmirl, gitster, gcc, git
Hi,
> Uh what? Someone crank out his copy of "The Art of Computer
> Programming", I think volume 1. Best fit is known (analyzed and proven
> and documented decades ago) to be one of the worst strategies for memory
> allocation. Exactly because it leads to huge fragmentation problems.
Well, quoting http://gee.cs.oswego.edu/dl/html/malloc.html:
"As shown by Wilson et al, best-fit schemes (of various kinds and
approximations) tend to produce the least fragmentation on real loads
compared to other general approaches such as first-fit."
See [Wilson 1995] ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for
more details and references.
Regards,
Wolfram.
^ permalink raw reply [flat|nested] 82+ messages in thread
end of thread, other threads:[~2007-12-14 17:00 UTC | newest]
Thread overview: 82+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-12-07 23:05 Something is broken in repack Jon Smirl
2007-12-08 0:37 ` Linus Torvalds
2007-12-08 1:27 ` [PATCH] pack-objects: fix delta cache size accounting Nicolas Pitre
2007-12-08 1:46 ` Something is broken in repack Nicolas Pitre
2007-12-08 2:04 ` Jon Smirl
2007-12-08 2:28 ` Nicolas Pitre
2007-12-08 3:29 ` Jon Smirl
2007-12-08 3:37 ` David Brown
2007-12-08 4:22 ` Jon Smirl
2007-12-08 4:30 ` Nicolas Pitre
2007-12-08 5:01 ` Jon Smirl
2007-12-08 5:12 ` Nicolas Pitre
2007-12-08 3:48 ` Harvey Harrison
2007-12-08 2:22 ` Jon Smirl
2007-12-08 3:44 ` Harvey Harrison
2007-12-08 22:18 ` Junio C Hamano
2007-12-09 8:05 ` Junio C Hamano
2007-12-09 15:19 ` Jon Smirl
2007-12-09 18:25 ` Jon Smirl
2007-12-10 1:07 ` Nicolas Pitre
2007-12-10 2:49 ` Nicolas Pitre
2007-12-08 2:56 ` David Brown
2007-12-10 19:56 ` Nicolas Pitre
2007-12-10 20:05 ` Jon Smirl
2007-12-10 20:16 ` Morten Welinder
2007-12-11 2:25 ` Jon Smirl
2007-12-11 2:55 ` Junio C Hamano
2007-12-11 3:27 ` Nicolas Pitre
2007-12-11 11:08 ` David Kastrup
2007-12-11 12:08 ` Pierre Habouzit
2007-12-11 12:18 ` David Kastrup
2007-12-11 3:49 ` Nicolas Pitre
2007-12-11 5:25 ` Jon Smirl
2007-12-11 5:29 ` Jon Smirl
2007-12-11 7:01 ` Jon Smirl
2007-12-11 7:34 ` Andreas Ericsson
2007-12-11 13:49 ` Nicolas Pitre
2007-12-11 15:00 ` Nicolas Pitre
2007-12-11 15:36 ` Jon Smirl
2007-12-11 16:20 ` Nicolas Pitre
2007-12-11 16:21 ` Jon Smirl
2007-12-12 5:12 ` Nicolas Pitre
2007-12-12 8:05 ` David Kastrup
2007-12-14 16:18 ` Wolfram Gloger
2007-12-12 15:48 ` Nicolas Pitre
2007-12-12 16:17 ` Paolo Bonzini
2007-12-12 16:37 ` Linus Torvalds
2007-12-12 16:42 ` David Miller
2007-12-12 16:54 ` Linus Torvalds
2007-12-12 17:12 ` Jon Smirl
2007-12-14 16:12 ` Wolfram Gloger
2007-12-14 16:45 ` David Kastrup
2007-12-14 16:59 ` Wolfram Gloger
2007-12-13 13:32 ` Nguyen Thai Ngoc Duy
2007-12-13 15:32 ` Paolo Bonzini
2007-12-13 16:29 ` Paolo Bonzini
2007-12-13 16:39 ` Johannes Sixt
2007-12-14 1:04 ` Jakub Narebski
2007-12-14 6:14 ` Paolo Bonzini
2007-12-14 6:24 ` Nguyen Thai Ngoc Duy
2007-12-14 8:20 ` Paolo Bonzini
2007-12-14 9:01 ` Harvey Harrison
2007-12-14 10:40 ` Jakub Narebski
2007-12-14 10:52 ` Nguyen Thai Ngoc Duy
2007-12-14 13:25 ` Nicolas Pitre
2007-12-12 16:13 ` Nicolas Pitre
2007-12-13 7:32 ` Andreas Ericsson
2007-12-14 16:03 ` Wolfram Gloger
2007-12-11 16:33 ` Linus Torvalds
2007-12-11 17:21 ` Nicolas Pitre
2007-12-11 17:24 ` David Miller
2007-12-11 17:44 ` Nicolas Pitre
2007-12-11 20:26 ` Andreas Ericsson
2007-12-11 18:43 ` Jon Smirl
2007-12-11 18:57 ` Nicolas Pitre
2007-12-11 19:17 ` Linus Torvalds
2007-12-11 19:40 ` Junio C Hamano
2007-12-11 20:34 ` Andreas Ericsson
2007-12-11 17:28 ` Daniel Berlin
2007-12-11 13:31 ` Nicolas Pitre
2007-12-11 6:01 ` Sean
2007-12-11 6:20 ` Jon Smirl
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).