* Something is broken in repack
  From: Jon Smirl @ 2007-12-07 23:05 UTC
  To: Git Mailing List

Using this config:

    [pack]
            threads = 4
            deltacachesize = 256M
            deltacachelimit = 0

and the 330MB gcc pack for input:

    git repack -a -d -f --depth=250 --window=250

    complete  seconds  RAM
      10%        47    1GB
      20%        29    1GB
      30%        24    1GB
      40%        18    1GB
      50%       110    1.2GB
      60%        85    1.4GB
      70%       195    1.5GB
      80%       186    2.5GB
      90%       489    3.8GB
      95%       800    4.8GB

I killed it because it started swapping.

The mmaps are only about 400MB in this case. At the end the git
process had 4.4GB of physical RAM allocated.

Starting from a highly compressed pack greatly aggravates the problem.
Starting with a 2GB pack of the same data, my process size only grew
to 3GB with 2GB of mmaps.

--
Jon Smirl
jonsmirl@gmail.com
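For anyone wanting to reproduce the measurement, a minimal sketch of
the setup (the config keys and the repack invocation are the ones
quoted above; using /usr/bin/time and sampling RSS with ps are my
assumptions about the method, not something stated in the report):

    # sketch: assumes a clone of the gcc repo already packed down
    # to the 330MB pack mentioned above
    git config pack.threads 4
    git config pack.deltaCacheSize 256m
    git config pack.deltaCacheLimit 0

    # wall-clock and CPU time for the whole repack
    /usr/bin/time git repack -a -d -f --depth=250 --window=250

    # in another terminal, sample resident memory while it runs
    while sleep 10; do ps -C git -o rss=; done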
* Re: Something is broken in repack
  From: Linus Torvalds @ 2007-12-08 0:37 UTC
  To: Jon Smirl, Nicolas Pitre; Cc: Git Mailing List

On Fri, 7 Dec 2007, Jon Smirl wrote:
>
> Using this config:
> [pack]
>         threads = 4
>         deltacachesize = 256M

I think deltacachesize is broken.

The code in try_delta() that replaces a delta cache entry with another
one seems very buggy wrt that whole "delta_cache_size" update. It does

        delta_cache_size -= trg_entry->delta_size;

to account for the old delta going away, but it does this *after*
having already replaced trg_entry->delta_size with the new delta
entry.

I suspect there are other issues going on too, but that's the one that
I noticed from a quick look-through.

Nico? I think this one is yours..

                Linus
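Restated as a minimal sketch (abstracted from the real try_delta()
code; the sizes are plain numbers here purely for illustration):

    #include <stdio.h>

    static long delta_cache_size;

    int main(void)
    {
            long entry_delta_size = 100;   /* old cached delta size */
            long delta_size = 60;          /* new, smaller delta    */

            delta_cache_size = 100;

            /* buggy order: the entry is updated first ... */
            entry_delta_size = delta_size;
            /* ... so the accounting subtracts the NEW size */
            delta_cache_size -= entry_delta_size;
            delta_cache_size += delta_size;
            printf("buggy:   %ld (should be 60)\n", delta_cache_size);

            /* correct order: settle accounting, then update entry */
            delta_cache_size = 100;
            entry_delta_size = 100;
            delta_cache_size -= entry_delta_size;  /* old size */
            delta_cache_size += delta_size;        /* new size */
            entry_delta_size = delta_size;
            printf("correct: %ld\n", delta_cache_size);
            return 0;
    }

The patch that follows moves the entry update after the cache
accounting, which is the same idea.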
* [PATCH] pack-objects: fix delta cache size accounting
  From: Nicolas Pitre @ 2007-12-08 1:27 UTC
  To: Junio C Hamano; Cc: Linus Torvalds, Jon Smirl, Git Mailing List

The wrong value was subtracted from delta_cache_size when replacing a
cached delta, as trg_entry->delta_size was used after the old size had
been replaced by the new size.

Noticed by Linus.

Signed-off-by: Nicolas Pitre <nico@cam.org>
---

On Fri, 7 Dec 2007, Linus Torvalds wrote:

> The code in try_delta() that replaces a delta cache entry with
> another one seems very buggy wrt that whole "delta_cache_size"
> update. It does
>
>         delta_cache_size -= trg_entry->delta_size;
>
> to account for the old delta going away, but it does this *after*
> having already replaced trg_entry->delta_size with the new delta
> entry.

Doh!  Mea culpa.

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 4f44658..350ece4 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -1422,10 +1422,6 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
 		}
 	}
 
-	trg_entry->delta = src_entry;
-	trg_entry->delta_size = delta_size;
-	trg->depth = src->depth + 1;
-
 	/*
 	 * Handle memory allocation outside of the cache
 	 * accounting lock.  Compiler will optimize the strangeness
@@ -1439,7 +1435,7 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
 		trg_entry->delta_data = NULL;
 	}
 	if (delta_cacheable(src_size, trg_size, delta_size)) {
-		delta_cache_size += trg_entry->delta_size;
+		delta_cache_size += delta_size;
 		cache_unlock();
 		trg_entry->delta_data = xrealloc(delta_buf, delta_size);
 	} else {
@@ -1447,6 +1443,10 @@ static int try_delta(struct unpacked *trg, struct unpacked *src,
 		cache_unlock();
 		free(delta_buf);
 	}
 
+	trg_entry->delta = src_entry;
+	trg_entry->delta_size = delta_size;
+	trg->depth = src->depth + 1;
+
 	return 1;
 }
* Re: Something is broken in repack
  From: Nicolas Pitre @ 2007-12-08 1:46 UTC
  To: Jon Smirl; Cc: Git Mailing List

On Fri, 7 Dec 2007, Jon Smirl wrote:

> Using this config:
> [pack]
>         threads = 4
>         deltacachesize = 256M
>         deltacachelimit = 0

Since you get different results depending on the source pack used,
those cache settings, even if there was a bug with them, are not
significant.

> And the 330MB gcc pack for input
> git repack -a -d -f --depth=250 --window=250
> [...]
> I killed it because it started swapping
>
> The mmaps are only about 400MB in this case.
> At the end the git process had 4.4GB of physical RAM allocated.

That's really bad.

> Starting from a highly compressed pack greatly aggravates the
> problem.

That is really interesting though.

> Starting with a 2GB pack of the same data my process size only grew
> to 3GB with 2GB of mmaps.

Which is quite reasonable, even if the same issue might still be
there.

So the problem seems to be related to the pack access code and not the
repack code. And it must have something to do with the number of
deltas being replayed. And because the repack is attempting delta
compression roughly from newest to oldest, and because old objects are
typically in a deeper delta chain, this might explain the progressive
slowdown.

So something must be wrong with the delta cache in sha1_file.c
somehow.

Nicolas
* Re: Something is broken in repack
  From: Jon Smirl @ 2007-12-08 2:04 UTC
  To: Nicolas Pitre; Cc: Git Mailing List

On 12/7/07, Nicolas Pitre <nico@cam.org> wrote:
> [...]
> So the problem seems to be related to the pack access code and not
> the repack code. And it must have something to do with the number of
> deltas being replayed. And because the repack is attempting delta
> compression roughly from newest to oldest, and because old objects
> are typically in a deeper delta chain, this might explain the
> progressive slowdown.
>
> So something must be wrong with the delta cache in sha1_file.c
> somehow.

I applied the delta accounting patch. It took about 200MB off the
memory use, but that doesn't make a dent in 4GB of allocations.

--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
  From: Nicolas Pitre @ 2007-12-08 2:28 UTC
  To: Jon Smirl; Cc: Git Mailing List

On Fri, 7 Dec 2007, Jon Smirl wrote:

> On 12/7/07, Nicolas Pitre <nico@cam.org> wrote:
> > [...]
> > So something must be wrong with the delta cache in sha1_file.c
> > somehow.

Staring at the cache code I don't see anything wrong with it.

> I applied the delta accounting patch. It took about 200MB off the
> memory use, but that doesn't make a dent in 4GB of allocations.

Right. I didn't expect much from that fix.

Nicolas
* Re: Something is broken in repack
  From: Jon Smirl @ 2007-12-08 3:29 UTC
  To: Nicolas Pitre; Cc: Git Mailing List

The kernel repo has the same problem but not nearly as bad.

Starting from a default pack:
  git repack -a -d -f --depth=1000 --window=1000
uses 1GB of physical memory.

Now do the command again:
  git repack -a -d -f --depth=1000 --window=1000
uses 1.3GB of physical memory.

I suspect the gcc repo has much longer revision chains than the kernel
one, since the kernel repo is only a few years old. The Mozilla repo
contained revision chains with over 2,000 revisions. Longer revision
chains result in longer delta chains.

So what is allocating the extra memory? It is either a function of the
number of entries in the chain, or related to accessing the chain,
since a chain with more entries will need to be accessed more times.

I have a 168MB kernel pack now, after 15 minutes of four cores at
100%.

Here's another observation: the gcc objects are larger. The kernel has
650K objects in 190MB, gcc has 870K objects in 330MB (roughly 290
versus 380 bytes per object), so the average gcc object is about 30%
larger. How should the average kernel developer interpret this?

--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
  From: David Brown @ 2007-12-08 3:37 UTC
  To: Jon Smirl; Cc: Nicolas Pitre, Git Mailing List

On Fri, Dec 07, 2007 at 10:29:31PM -0500, Jon Smirl wrote:

> The kernel repo has the same problem but not nearly as bad.
>
> Starting from a default pack:
>   git repack -a -d -f --depth=1000 --window=1000
> uses 1GB of physical memory.
>
> Now do the command again:
>   git repack -a -d -f --depth=1000 --window=1000
> uses 1.3GB of physical memory.

With my repo that contains a bunch of 50MB tar files, I've found I
must specify --window-memory as well to keep repack from using nearly
unbounded amounts of memory. Perhaps it is the larger files found in
gcc that provoke this.

A window size of 1000 can take a lot of memory if the objects are
large.

Dave
* Re: Something is broken in repack
  From: Jon Smirl @ 2007-12-08 4:22 UTC
  To: David Brown, Nicolas Pitre, Git Mailing List

On 12/7/07, David Brown <git@davidb.org> wrote:
> With my repo that contains a bunch of 50MB tar files, I've found I
> must specify --window-memory as well to keep repack from using
> nearly unbounded amounts of memory. Perhaps it is the larger files
> found in gcc that provoke this.
>
> A window size of 1000 can take a lot of memory if the objects are
> large.

This is a partial solution to the problem. Adding windowmemory = 256M
took memory consumption down from 4.8GB to 2.8GB. It took an hour to
run the test.

It's not the complete solution, since my git process is still using
2.4GB of physical memory, and I'm still seeing a lot of slowdown in
the last 10%.

Does the gcc repo contain some giant objects? Why wasn't the memory
freed after their chain was processed?

Most of the last 10% is being done on a single CPU. There must be a
chain of giant objects that is unbalancing everything.

--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
  From: Nicolas Pitre @ 2007-12-08 4:30 UTC
  To: Jon Smirl; Cc: David Brown, Git Mailing List

On Fri, 7 Dec 2007, Jon Smirl wrote:

> Does the gcc repo contain some giant objects? Why wasn't the memory
> freed after their chain was processed?

It should be.

> Most of the last 10% is being done on a single CPU. There must be a
> chain of giant objects that is unbalancing everything.

I'm about to send a patch to fix the thread balancing for real this
time.

Nicolas
* Re: Something is broken in repack
  From: Jon Smirl @ 2007-12-08 5:01 UTC
  To: Nicolas Pitre; Cc: David Brown, Git Mailing List

On 12/7/07, Nicolas Pitre <nico@cam.org> wrote:
> I'm about to send a patch to fix the thread balancing for real this
> time.

Something is really broken in the last 5% of that repo. I have been
processing at 97% for 30 minutes without moving to 98%.

--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
  From: Nicolas Pitre @ 2007-12-08 5:12 UTC
  To: Jon Smirl; Cc: Git Mailing List

On Sat, 8 Dec 2007, Jon Smirl wrote:

> Something is really broken in the last 5% of that repo. I have been
> processing at 97% for 30 minutes without moving to 98%.

This is a clear sign of a problem, indeed.

I'll be away for the weekend, so here's a few things to try out if you
feel like it:

1) Make sure the problem occurs with the thread code disabled. That
   would eliminate one variable, and will help for #2.

2) Try bisecting the issue. If you can find an old Git version where
   the issue doesn't appear, then simply run "git bisect" to find the
   exact commit causing the problem. Best with a repo that doesn't
   take ages to repack.

3) Compile Git against the dmalloc library in order to identify where
   the huge memory leak is happening.

Nicolas
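Item 2 spelled out as commands might look like this (a sketch only:
the "known good" tag is a placeholder, not an established fact, and
/usr/bin/time -v is GNU time's verbose mode, which reports the maximum
resident set size):

    # bisecting git itself for the memory regression
    cd git.git
    git bisect start
    git bisect bad master          # assumed bad
    git bisect good v1.5.2         # placeholder known-good version

    # at each step: build this revision, repack a small test repo
    # with it, and judge peak memory
    make
    (cd ../test-repo &&
     PATH=/path/to/git.git:$PATH \
     /usr/bin/time -v git repack -a -d -f --depth=250 --window=250)

    git bisect good                # or: git bisect bad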
* Re: Something is broken in repack
  From: Harvey Harrison @ 2007-12-08 3:48 UTC
  To: Jon Smirl; Cc: Nicolas Pitre, Git Mailing List

On Fri, 2007-12-07 at 22:29 -0500, Jon Smirl wrote:

> I suspect the gcc repo has much longer revision chains than the
> kernel one, since the kernel repo is only a few years old. The
> Mozilla repo contained revision chains with over 2,000 revisions.
> Longer revision chains result in longer delta chains.

I sent out a partial delta breakdown for the gcc repo earlier; here's
the whole list.

Breakdown of the gcc packfile (total objects: 1017922):

ChainLength: Objects Cumulative
1: 103817 103817
2: 67332 171149
3: 57520 228669
4: 52570 281239
5: 43910 325149
6: 37520 362669
7: 35248 397917
8: 29819 427736
9: 27619 455355
10: 22656 478011
11: 21073 499084
12: 18738 517822
13: 16674 534496
14: 14882 549378
15: 14424 563802
16: 12765 576567
17: 11662 588229
18: 11845 600074
19: 11694 611768
20: 9625 621393
21: 9031 630424
22: 8437 638861
23: 8217 647078
24: 7927 655005
25: 7955 662960
26: 7092 670052
27: 7004 677056
28: 6724 683780
29: 6626 690406
30: 5875 696281
31: 5970 702251
32: 5726 707977
33: 6025 714002
34: 5354 719356
35: 6413 725769
36: 4933 730702
37: 4888 735590
38: 4561 740151
39: 4366 744517
40: 4166 748683
41: 4531 753214
42: 4029 757243
43: 3701 760944
44: 3647 764591
45: 3553 768144
46: 3509 771653
47: 3473 775126
48: 3442 778568
49: 3379 781947
50: 3395 785342
51: 3315 788657
52: 3168 791825
53: 3345 795170
54: 3166 798336
55: 3237 801573
56: 2795 804368
57: 2768 807136
58: 2666 809802
59: 2723 812525
60: 2547 815072
61: 2565 817637
62: 2622 820259
63: 2521 822780
64: 2492 825272
65: 2529 827801
66: 2566 830367
67: 2685 833052
68: 2458 835510
69: 2457 837967
70: 2440 840407
71: 2410 842817
72: 2337 845154
73: 2301 847455
74: 2201 849656
75: 2127 851783
76: 2256 854039
77: 2038 856077
78: 1925 858002
79: 1965 859967
80: 1929 861896
81: 1890 863786
82: 1873 865659
83: 1964 867623
84: 1898 869521
85: 1839 871360
86: 1933 873293
87: 1876 875169
88: 1851 877020
89: 1789 878809
90: 1790 880599
91: 1804 882403
92: 1696 884099
93: 1863 885962
94: 1889 887851
95: 1766 889617
96: 1731 891348
97: 1775 893123
98: 1750 894873
99: 1767 896640
100: 1644 898284
101: 1642 899926
102: 1489 901415
103: 1532 902947
104: 1564 904511
105: 1477 905988
106: 1461 907449
107: 1383 908832
108: 1422 910254
109: 1316 911570
110: 1480 913050
111: 1329 914379
112: 1375 915754
113: 1292 917046
114: 1224 918270
115: 1123 919393
116: 1216 920609
117: 1252 921861
118: 1252 923113
119: 1346 924459
120: 1320 925779
121: 1277 927056
122: 1234 928290
123: 1200 929490
124: 1255 930745
125: 1206 931951
126: 1155 933106
127: 1246 934352
128: 1226 935578
129: 1194 936772
130: 1268 938040
131: 1334 939374
132: 1146 940520
133: 1220 941740
134: 1055 942795
135: 1110 943905
136: 1095 945000
137: 1294 946294
138: 1204 947498
139: 1218 948716
140: 1101 949817
141: 993 950810
142: 975 951785
143: 1014 952799
144: 968 953767
145: 957 954724
146: 1069 955793
147: 996 956789
148: 967 957756
149: 964 958720
150: 954 959674
151: 949 960623
152: 1001 961624
153: 1042 962666
154: 1057 963723
155: 948 964671
156: 966 965637
157: 833 966470
158: 959 967429
159: 907 968336
160: 854 969190
161: 847 970037
162: 836 970873
163: 769 971642
164: 747 972389
165: 755 973144
166: 707 973851
167: 774 974625
168: 777 975402
169: 783 976185
170: 707 976892
171: 738 977630
172: 775 978405
173: 781 979186
174: 698 979884
175: 801 980685
176: 712 981397
177: 679 982076
178: 775 982851
179: 696 983547
180: 760 984307
181: 740 985047
182: 752 985799
183: 704 986503
184: 683 987186
185: 690 987876
186: 741 988617
187: 642 989259
188: 672 989931
189: 679 990610
190: 691 991301
191: 648 991949
192: 703 992652
193: 675 993327
194: 687 994014
195: 625 994639
196: 607 995246
197: 583 995829
198: 632 996461
199: 540 997001
200: 652 997653
201: 600 998253
202: 628 998881
203: 624 999505
204: 582 1000087
205: 548 1000635
206: 520 1001155
207: 648 1001803
208: 556 1002359
209: 563 1002922
210: 508 1003430
211: 570 1004000
212: 530 1004530
213: 575 1005105
214: 527 1005632
215: 521 1006153
216: 515 1006668
217: 513 1007181
218: 460 1007641
219: 491 1008132
220: 474 1008606
221: 471 1009077
222: 482 1009559
223: 485 1010044
224: 439 1010483
225: 385 1010868
226: 385 1011253
227: 403 1011656
228: 380 1012036
229: 376 1012412
230: 377 1012789
231: 415 1013204
232: 394 1013598
233: 362 1013960
234: 334 1014294
235: 366 1014660
236: 317 1014977
237: 362 1015339
238: 343 1015682
239: 392 1016074
240: 317 1016391
241: 305 1016696
242: 319 1017015
243: 276 1017291
244: 247 1017538
245: 179 1017717
246: 111 1017828
247: 61 1017889
248: 27 1017916
249: 6 1017922

Harvey
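A breakdown like this can be derived from git verify-pack, which ends
its verbose output with per-chain-length summary lines; a rough sketch
(the exact wording and field positions of those summary lines are from
memory, so verify them against your git version before trusting the
field numbers):

    # verify-pack -v ends with lines like
    #   chain length = 10: 22656 objects
    # so adding the cumulative column is a one-liner:
    git verify-pack -v .git/objects/pack/pack-*.idx |
    awk '/^chain length/ { cum += $5; print $4, $5, cum }'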
* Re: Something is broken in repack
  From: Jon Smirl @ 2007-12-08 2:22 UTC
  To: Nicolas Pitre; Cc: Git Mailing List

On 12/7/07, Nicolas Pitre <nico@cam.org> wrote:
> So the problem seems to be related to the pack access code and not
> the repack code. And it must have something to do with the number of
> deltas being replayed. And because the repack is attempting delta
> compression roughly from newest to oldest, and because old objects
> are typically in a deeper delta chain, this might explain the
> progressive slowdown.

What could be wrongly allocating 4GB of memory? Figure that out and
you should have your answer. The slowdown may be coming from having to
search through more and more objects in memory.

Memory consumption seems to be correlated with the depth of the delta
chain being accessed. It blows up tremendously right at the end. It
may even grow as the square of the chain length. For the normal
default case the square didn't hurt, but 250*250 = 62,500, which would
eat a huge amount of memory.

--
Jon Smirl
jonsmirl@gmail.com
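As a back-of-the-envelope illustration of that square (a sketch with
assumed numbers, not measurements: it just evaluates d(d+1)/2
reconstructions at an assumed average object size):

    #include <stdio.h>

    int main(void)
    {
            double avg_obj = 380;   /* assumed bytes per object */
            int d;

            for (d = 50; d <= 250; d += 50) {
                    /* replaying depth d from scratch touches all d
                       ancestors: 1 + 2 + ... + d = d(d+1)/2 steps */
                    double bytes = d * (d + 1) / 2.0 * avg_obj;
                    printf("depth %3d: ~%.1f MB if intermediate "
                           "results stay live\n",
                           d, bytes / (1024 * 1024));
            }
            return 0;
    }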
* Re: Something is broken in repack
  From: Harvey Harrison @ 2007-12-08 3:44 UTC
  To: Nicolas Pitre; Cc: Jon Smirl, Git Mailing List

On Fri, 2007-12-07 at 20:46 -0500, Nicolas Pitre wrote:
> On Fri, 7 Dec 2007, Jon Smirl wrote:
> > [...]
> > The mmaps are only about 400MB in this case.
> > At the end the git process had 4.4GB of physical RAM allocated.
> > Starting with a 2GB pack of the same data my process size only
> > grew to 3GB with 2GB of mmaps.
>
> Which is quite reasonable, even if the same issue might still be
> there.
>
> So the problem seems to be related to the pack access code and not
> the repack code. [...]
>
> So something must be wrong with the delta cache in sha1_file.c
> somehow.

All I have is a qualitative observation, but during the process of
creating the pack there was a _huge_ slowdown between 10-15%: from
hundreds or dozens of objects per second down to a single object per
second, with a corresponding increase in process size. I didn't keep
any numbers at the time, but it was noticeable.

I wonder if there are a bunch of huge objects somewhere in gcc's
history?

Harvey
* Re: Something is broken in repack
  From: Junio C Hamano @ 2007-12-08 22:18 UTC
  To: Nicolas Pitre; Cc: Jon Smirl, Git Mailing List

Nicolas Pitre <nico@cam.org> writes:

> On Fri, 7 Dec 2007, Jon Smirl wrote:
>
>> Starting with a 2GB pack of the same data my process size only grew
>> to 3GB with 2GB of mmaps.
>
> Which is quite reasonable, even if the same issue might still be
> there.
>
> So the problem seems to be related to the pack access code and not
> the repack code. And it must have something to do with the number of
> deltas being replayed. And because the repack is attempting delta
> compression roughly from newest to oldest, and because old objects
> are typically in a deeper delta chain, this might explain the
> progressive slowdown.
>
> So something must be wrong with the delta cache in sha1_file.c
> somehow.

I was reaching the same conclusion but haven't managed to spot
anything blatantly wrong in that area. Will need to dig more.
* Re: Something is broken in repack
  From: Junio C Hamano @ 2007-12-09 8:05 UTC
  To: Jon Smirl; Cc: Nicolas Pitre, Git Mailing List

Junio C Hamano <gitster@pobox.com> writes:

> Nicolas Pitre <nico@cam.org> writes:
>
>> [...]
>> So something must be wrong with the delta cache in sha1_file.c
>> somehow.
>
> I was reaching the same conclusion but haven't managed to spot
> anything blatantly wrong in that area. Will need to dig more.

Does this problem correlate with the use of threads? Do you see the
same bloat with or without THREADED_DELTA_SEARCH defined?
* Re: Something is broken in repack
  From: Jon Smirl @ 2007-12-09 15:19 UTC
  To: Junio C Hamano; Cc: Nicolas Pitre, Git Mailing List

On 12/9/07, Junio C Hamano <gitster@pobox.com> wrote:
> Does this problem correlate with the use of threads? Do you see the
> same bloat with or without THREADED_DELTA_SEARCH defined?

I just started a non-threaded one. It will be four or five hours
before it finishes.

--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
  From: Jon Smirl @ 2007-12-09 18:25 UTC
  To: Junio C Hamano; Cc: Nicolas Pitre, Git Mailing List

On 12/9/07, Junio C Hamano <gitster@pobox.com> wrote:
> Does this problem correlate with the use of threads? Do you see the
> same bloat with or without THREADED_DELTA_SEARCH defined?

Something else seems to be wrong.

With threading turned off:        5000 CPU seconds, 13% done
With threading on, threads = 1:   5000 CPU seconds, 13%
With threading on, threads = 2:    180 CPU seconds, 13%
With threading on, threads = 4:    150 CPU seconds, 13%

This can't be right; four cores are not 40x one core.

So maybe the observed progressive slowdown is because the percent
complete is being reported wrong in the threaded case. If that's the
case we may be looking in the wrong place for problems.

The times are only approximate; I'm using the CPU for other things.

--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
  From: Nicolas Pitre @ 2007-12-10 1:07 UTC
  To: Jon Smirl; Cc: Junio C Hamano, Git Mailing List

On Sun, 9 Dec 2007, Jon Smirl wrote:

> Something else seems to be wrong.
>
> With threading turned off:        5000 CPU seconds, 13% done
> With threading on, threads = 1:   5000 CPU seconds, 13%
> With threading on, threads = 2:    180 CPU seconds, 13%
> With threading on, threads = 4:    150 CPU seconds, 13%
>
> This can't be right; four cores are not 40x one core.

It may be right. The object list to apply delta compression on doesn't
necessarily require a uniform amount of cycles throughout. When using
multiple threads, the list is broken in parts for each thread, and
later parts might end up being simply much easier to process,
therefore changing the percentage figure.

> So maybe the observed progressive slowdown is because the percent
> complete is being reported wrong in the threaded case. If that's the
> case we may be looking in the wrong place for problems.

I really doubt it.

Nicolas
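A minimal sketch of why an equal split can behave that way (this is
not the actual pack-objects partitioning code, and the object count is
just the gcc figure reused for illustration): each thread gets the
same number of objects, but the delta-search cost per object varies
enormously, so one slice can finish far earlier and skew the combined
progress percentage.

    #include <stdio.h>

    struct slice { int start, count; };

    int main(void)
    {
            int nr_objects = 1017922, nr_threads = 4, i, chunk;
            struct slice s[4];

            chunk = nr_objects / nr_threads;
            for (i = 0; i < nr_threads; i++) {
                    s[i].start = i * chunk;
                    s[i].count = (i == nr_threads - 1)
                               ? nr_objects - i * chunk : chunk;
                    /* equal counts, but NOT equal work per slice */
                    printf("thread %d: objects %d..%d\n", i,
                           s[i].start, s[i].start + s[i].count - 1);
            }
            return 0;
    }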
* Re: Something is broken in repack
  From: Nicolas Pitre @ 2007-12-10 2:49 UTC
  To: Junio C Hamano; Cc: Jon Smirl, Git Mailing List

On Sat, 8 Dec 2007, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
>
>> [...]
>> So something must be wrong with the delta cache in sha1_file.c
>> somehow.
>
> I was reaching the same conclusion but haven't managed to spot
> anything blatantly wrong in that area. Will need to dig more.

I didn't find anything wrong there either. I'll have to run some more
gcc repacking tests myself, even though I don't have a blazingly fast
machine, which makes for rather long turnarounds.

Nicolas
* Re: Something is broken in repack
  From: David Brown @ 2007-12-08 2:56 UTC
  To: Jon Smirl; Cc: Git Mailing List

On Fri, Dec 07, 2007 at 06:05:38PM -0500, Jon Smirl wrote:

> Using this config:
> [pack]
>         threads = 4
>         deltacachesize = 256M
>         deltacachelimit = 0

Just out of curiosity, does adding

    [pack]
            windowmemory = 256M

help? I've found this to grow very large when there are large blobs.

Dave
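Equivalently, from the command line (the camel-cased key names are how
git documents them; the flat lowercase spellings used in this thread
work too, since config keys are case-insensitive):

    # cap the memory used by the delta search window, per thread
    git config pack.windowMemory 256m

    # the related knobs discussed in this thread
    git config pack.threads 4
    git config pack.deltaCacheSize 256m
    git config pack.deltaCacheLimit 0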
* Re: Something is broken in repack
  From: Nicolas Pitre @ 2007-12-10 19:56 UTC
  To: Jon Smirl; Cc: Git Mailing List

On Fri, 7 Dec 2007, Jon Smirl wrote:

> Using this config:
> [pack]
>         threads = 4
>         deltacachesize = 256M
>         deltacachelimit = 0
>
> And the 330MB gcc pack for input
> git repack -a -d -f --depth=250 --window=250
> [...]
> I killed it because it started swapping
>
> The mmaps are only about 400MB in this case.
> At the end the git process had 4.4GB of physical RAM allocated.
>
> Starting from a highly compressed pack greatly aggravates the
> problem. Starting with a 2GB pack of the same data my process size
> only grew to 3GB with 2GB of mmaps.

You said having reproduced the issue, albeit not as severe, with the
Linux kernel repo. I did just that:

    # to get the default pack:
    $ git repack -a -f -d

    # first measurement with a repack from a default pack
    $ /usr/bin/time git repack -a -f --window=256 --depth=256
    2572.17user 5.87system 22:46.80elapsed 188%CPU (0avgtext+0avgdata 0maxresident)k
    15720inputs+356640outputs (71major+264376minor)pagefaults 0swaps

    # do it again to start from a highly packed pack
    $ /usr/bin/time git repack -a -f --window=256 --depth=256
    2573.53user 5.62system 22:45.60elapsed 188%CPU (0avgtext+0avgdata 0maxresident)k
    29176inputs+356664outputs (210major+274887minor)pagefaults 0swaps

This is with pack.threads=2 on a P4 with HT, and I'm using the machine
for other tasks as well, but all measured time is sensibly the same
for both cases. Virtual memory allocation never reached 700MB in both
cases either.

Nicolas
* Re: Something is broken in repack
  From: Jon Smirl @ 2007-12-10 20:05 UTC
  To: Nicolas Pitre; Cc: Git Mailing List

On 12/10/07, Nicolas Pitre <nico@cam.org> wrote:
> You said having reproduced the issue, albeit not as severe, with the
> Linux kernel repo. I did just that:
> [...]
> This is with pack.threads=2 on a P4 with HT, and I'm using the
> machine for other tasks as well, but all measured time is sensibly
> the same for both cases. Virtual memory allocation never reached
> 700MB in both cases either.

This is the mail about the kernel pack; the one you quoted is a gcc
run.

The kernel repo has the same problem but not nearly as bad.

Starting from a default pack:
  git repack -a -d -f --depth=1000 --window=1000
uses 1GB of physical memory.

Now do the command again:
  git repack -a -d -f --depth=1000 --window=1000
uses 1.3GB of physical memory.

I suspect the gcc repo has much longer revision chains than the kernel
one, since the kernel repo is only a few years old. The Mozilla repo
contained revision chains with over 2,000 revisions. Longer revision
chains result in longer delta chains.

So what is allocating the extra memory? It is either a function of the
number of entries in the chain, or related to accessing the chain,
since a chain with more entries will need to be accessed more times.

I have a 168MB kernel pack now, after 15 minutes of four cores at
100%.

Here's another observation: the gcc objects are larger. The kernel has
650K objects in 190MB, gcc has 870K objects in 330MB. The average gcc
object is 30% larger. How should the average kernel developer
interpret this?

--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
  From: Morten Welinder @ 2007-12-10 20:16 UTC
  To: Jon Smirl; Cc: Nicolas Pitre, Git Mailing List

> Here's another observation: the gcc objects are larger. The kernel
> has 650K objects in 190MB, gcc has 870K objects in 330MB. The
> average gcc object is 30% larger. How should the average kernel
> developer interpret this?

Could this be explained by the ChangeLog file? It's large; it has tons
of revisions; it is a prime candidate for delta compression.

Morten
* Re: Something is broken in repack
  From: Jon Smirl @ 2007-12-11 2:25 UTC
  To: Git Mailing List, Nicolas Pitre

New run using the same configuration, with the addition of the more
efficient load-balancing patches and the delta cache accounting fix.

Seconds are wall clock time. They are lower since the patch made
threading better at using all four cores. I am stuck at 380-390% CPU
utilization for the git process.

complete  seconds  RAM
  10%        60    900M   (includes counting)
  20%        15    900M
  30%        15    900M
  40%        50    1.2G
  50%        80    1.3G
  60%        70    1.7G
  70%       140    1.8G
  80%       180    2.0G
  90%       280    2.2G
  95%       530    2.8G   (1,420 total to here, previous was 1,983)
 100%      1390    2.85G

During the writing phase RAM fell to 1.6G.
What is being freed in the writing phase??

I have no explanation for the change in RAM usage. Two guesses come to
mind: memory fragmentation, or the change in the way the work was
split up altered RAM usage.

Total CPU time was 195 minutes in 70 minutes of clock time, about 70%
efficient. During the compress phase all four cores were active until
the last 90 seconds. Writing the objects took over 23 minutes, CPU
bound on one core.

New pack file is: 270,594,853
Old one was:      344,543,752
It still has 828,660 objects.

On 12/7/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> Using this config:
> [pack]
>         threads = 4
>         deltacachesize = 256M
>         deltacachelimit = 0
>
> And the 330MB gcc pack for input
> git repack -a -d -f --depth=250 --window=250
> [...]
> I killed it because it started swapping

--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
  From: Junio C Hamano @ 2007-12-11 2:55 UTC
  To: Jon Smirl; Cc: Git Mailing List, Nicolas Pitre

"Jon Smirl" <jonsmirl@gmail.com> writes:

>   95%       530    2.8G   (1,420 total to here, previous was 1,983)
>  100%      1390    2.85G
> During the writing phase RAM fell to 1.6G.
> What is being freed in the writing phase??

entry->delta_data is the only thing I can think of that is freed in
that function but was allocated much earlier, before entering the
function.
* Re: Something is broken in repack
  From: Nicolas Pitre @ 2007-12-11 3:27 UTC
  To: Junio C Hamano; Cc: Jon Smirl, Git Mailing List

On Mon, 10 Dec 2007, Junio C Hamano wrote:

> "Jon Smirl" <jonsmirl@gmail.com> writes:
>
>>   95%       530    2.8G   (1,420 total to here, previous was 1,983)
>>  100%      1390    2.85G
>> During the writing phase RAM fell to 1.6G.
>> What is being freed in the writing phase??
>
> entry->delta_data is the only thing I can think of that is freed in
> that function but was allocated much earlier, before entering the
> function.

Yet all ->delta_data instances are limited to 256MB according to Jon's
config.

Nicolas
* Re: Something is broken in repack
  From: David Kastrup @ 2007-12-11 11:08 UTC
  To: Nicolas Pitre; Cc: Junio C Hamano, Jon Smirl, Git Mailing List

Nicolas Pitre <nico@cam.org> writes:

> On Mon, 10 Dec 2007, Junio C Hamano wrote:
>
>> entry->delta_data is the only thing I can think of that is freed in
>> that function but was allocated much earlier, before entering the
>> function.
>
> Yet all ->delta_data instances are limited to 256MB according to
> Jon's config.

Maybe address space fragmentation is involved here? malloc/free for
large areas works using mmap in glibc. There must be enough
_contiguous_ space for a new allocation to succeed.

--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
* Re: Something is broken in repack
  From: Pierre Habouzit @ 2007-12-11 12:08 UTC
  To: David Kastrup; Cc: Nicolas Pitre, Junio C Hamano, Jon Smirl, Git Mailing List

On Tue, Dec 11, 2007 at 11:08:47AM +0000, David Kastrup wrote:

> Maybe address space fragmentation is involved here? malloc/free for
> large areas works using mmap in glibc. There must be enough
> _contiguous_ space for a new allocation to succeed.

Well, that's interesting, but there is a way to know for sure instead
of taking bets. Just use valgrind --tool=massif and look at the pretty
picture; it'll tell what was going on very accurately.

Note that I find your explanation unlikely: glibc uses mmap for sizes
over 128k by default (IIRC), and as soon as you use mmaps, it's the
kernel that deals with the address space, and it's not necessarily
contiguous; that's only true for the heap.

--
·O·  Pierre Habouzit
··O  madcoder@debian.org
OOO  http://www.madism.org
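For the record, a sketch of that measurement (the pack-objects flags
below mirror roughly what repack -a -f passes, and massif's output
handling differs between valgrind versions, so treat the details as
assumptions to adapt):

    # repack is a wrapper script, so profile the underlying
    # pack-objects directly; it writes the pack named by the last arg
    valgrind --tool=massif \
        git pack-objects --all --no-reuse-delta \
        --window=250 --depth=250 /tmp/test-pack </dev/null

    # render the allocation graph from massif.out.<pid>
    ms_print massif.out.*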
* Re: Something is broken in repack
  From: David Kastrup @ 2007-12-11 12:18 UTC
  To: Pierre Habouzit; Cc: Nicolas Pitre, Junio C Hamano, Jon Smirl, Git Mailing List

Pierre Habouzit <madcoder@artemis.madism.org> writes:

> On Tue, Dec 11, 2007 at 11:08:47AM +0000, David Kastrup wrote:
>
>> Maybe address space fragmentation is involved here? malloc/free for
>> large areas works using mmap in glibc. There must be enough
>> _contiguous_ space for a new allocation to succeed.
>
> Note that I find your explanation unlikely: glibc uses mmap for
> sizes over 128k by default (IIRC), and as soon as you use mmaps,
> it's the kernel that deals with the address space, and it's not
> necessarily contiguous; that's only true for the heap.

Every single allocation needs to be contiguous in virtual address
space and must not collide with existing virtual address space
allocations. So fragmentation is at least a logistical issue.

--
David Kastrup, Kriemhildstr. 15, 44793 Bochum
* Re: Something is broken in repack
  From: Nicolas Pitre @ 2007-12-11 3:49 UTC
  To: Jon Smirl; Cc: Git Mailing List

On Mon, 10 Dec 2007, Jon Smirl wrote:

> New run using the same configuration, with the addition of the more
> efficient load-balancing patches and the delta cache accounting fix.
>
> Seconds are wall clock time. They are lower since the patch made
> threading better at using all four cores. I am stuck at 380-390% CPU
> utilization for the git process.
>
> complete  seconds  RAM
>   10%        60    900M   (includes counting)
>   [...]
>   95%       530    2.8G   (1,420 total to here, previous was 1,983)
>  100%      1390    2.85G
>
> During the writing phase RAM fell to 1.6G.
> What is being freed in the writing phase??

The cached delta results, but you put a cap of 256MB for them.

Could you try again with that cache disabled entirely, with
pack.deltacachesize = 1 (don't use 0 as that means unbounded)?

And then, while still keeping the delta cache disabled, could you try
with pack.threads = 2, and pack.threads = 1?

I'm sorry to ask you to do this, but I don't have enough RAM to even
complete a repack with threads=2, so I'm reattempting single threaded
at the moment. But I really wonder if the threading has such an effect
on memory usage.

> New pack file is: 270,594,853
> Old one was:      344,543,752
> It still has 828,660 objects.

You mean the pack for the gcc repo is now less than 300MB? Wow.

Nicolas
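Spelled out, the requested runs could look like this (a sketch; the
repack invocation is the one used throughout the thread):

    # 1 byte effectively disables the delta cache; 0 means unbounded
    git config pack.deltaCacheSize 1

    for t in 2 1; do
            git config pack.threads $t
            echo "=== pack.threads = $t ==="
            /usr/bin/time git repack -a -d -f --depth=250 --window=250
    done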
* Re: Something is broken in repack
  From: Jon Smirl @ 2007-12-11 5:25 UTC
  To: Nicolas Pitre, Junio C Hamano; Cc: Git Mailing List

On 12/10/07, Nicolas Pitre <nico@cam.org> wrote:
> Could you try again with that cache disabled entirely, with
> pack.deltacachesize = 1 (don't use 0 as that means unbounded)?
>
> And then, while still keeping the delta cache disabled, could you
> try with pack.threads = 2, and pack.threads = 1?
>
> I'm sorry to ask you to do this, but I don't have enough RAM to even
> complete a repack with threads=2, so I'm reattempting single
> threaded at the moment. But I really wonder if the threading has
> such an effect on memory usage.

I already have a threads = 1 run going with this config. The binary
and config were the same as the threads=4 run.

  10%    28min   950M
  40%   135min   950M
  50%   157min   900M
  60%   160min   830M
 100%   170min   830M

Something is hurting bad with threads: 170 CPU minutes with one
thread, versus 195 CPU minutes with four threads.

Is there a different memory allocator that can be used when
multithreaded with gcc? This whole problem may be coming from the
memory allocation function. git is hardly interacting at all on the
thread level, so it's likely a problem in the C run-time.

    [core]
            repositoryformatversion = 0
            filemode = true
            bare = false
            logallrefupdates = true
    [pack]
            threads = 1
            deltacachesize = 256M
            windowmemory = 256M
            deltacachelimit = 0
    [remote "origin"]
            url = git://git.infradead.org/gcc.git
            fetch = +refs/heads/*:refs/remotes/origin/*
    [branch "trunk"]
            remote = origin
            merge = refs/heads/trunk

--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack
  From: Jon Smirl @ 2007-12-11 5:29 UTC
  To: Nicolas Pitre, Junio C Hamano, gcc; Cc: Git Mailing List

I added the gcc people to the CC; it's their repository. Maybe they
can help us sort this out.

On 12/11/07, Jon Smirl <jonsmirl@gmail.com> wrote:
> I already have a threads = 1 run going with this config. The binary
> and config were the same as the threads=4 run.
>
>   10%    28min   950M
>   40%   135min   950M
>   50%   157min   900M
>   60%   160min   830M
>  100%   170min   830M
>
> Something is hurting bad with threads: 170 CPU minutes with one
> thread, versus 195 CPU minutes with four threads.
> [...]

--
Jon Smirl
jonsmirl@gmail.com
* Re: Something is broken in repack 2007-12-11 5:29 ` Jon Smirl @ 2007-12-11 7:01 ` Jon Smirl 2007-12-11 7:34 ` Andreas Ericsson ` (3 more replies) 2007-12-11 13:31 ` Nicolas Pitre 1 sibling, 4 replies; 82+ messages in thread From: Jon Smirl @ 2007-12-11 7:01 UTC (permalink / raw To: Nicolas Pitre, Junio C Hamano, gcc; +Cc: Git Mailing List Switching to the Google perftools malloc http://goog-perftools.sourceforge.net/ 10% 30 828M 20% 15 831M 30% 10 834M 40% 50 1014M 50% 80 1086M 60% 80 1500M 70% 200 1.53G 80% 200 1.85G 90% 260 1.87G 95% 520 1.97G 100% 1335 2.24G Google allocator knocked 600MB off from memory use. Memory consumption did not fall during the write out phase like it did with gcc. Since all of this is with the same code except for changing the threading split, those runs where memory consumption went to 4.5GB with the gcc allocator must have triggered an extreme problem with fragmentation. Total CPU time 196 CPU minutes vs 190 for gcc. Google's claims of being faster are not true. So why does our threaded code take 20 CPU minutes longer (12%) to run than the same code with a single thread? Clock time is obviously faster. Are the threads working too close to each other in memory and bouncing cache lines between the cores? Q6600 is just two E6600s in the same package, the caches are not shared. Why does the threaded code need 2.24GB (google allocator, 2.85GB gcc) with 4 threads? But only need 950MB with one thread? Where's the extra gigabyte going? Is there another allocator to try? One that combines Google's efficiency with gcc's speed? On 12/11/07, Jon Smirl <jonsmirl@gmail.com> wrote: > I added the gcc people to the CC, it's their repository. Maybe they > can help us sort this out. > > On 12/11/07, Jon Smirl <jonsmirl@gmail.com> wrote: > > On 12/10/07, Nicolas Pitre <nico@cam.org> wrote: > > > On Mon, 10 Dec 2007, Jon Smirl wrote: > > > > > > > New run using same configuration. With the addition of the more > > > > efficient load balancing patches and delta cache accounting. > > > > > > > > Seconds are wall clock time. They are lower since the patch made > > > > threading better at using all four cores. I am stuck at 380-390% CPU > > > > utilization for the git process. > > > > > > > > complete seconds RAM > > > > 10% 60 900M (includes counting) > > > > 20% 15 900M > > > > 30% 15 900M > > > > 40% 50 1.2G > > > > 50% 80 1.3G > > > > 60% 70 1.7G > > > > 70% 140 1.8G > > > > 80% 180 2.0G > > > > 90% 280 2.2G > > > > 95% 530 2.8G - 1,420 total to here, previous was 1,983 > > > > 100% 1390 2.85G > > > > During the writing phase RAM fell to 1.6G > > > > What is being freed in the writing phase?? > > > > > > The cached delta results, but you put a cap of 256MB for them. > > > > > > Could you try again with that cache disabled entirely, with > > > pack.deltacachesize = 1 (don't use 0 as that means unbounded). > > > > > > And then, while still keeping the delta cache disabled, could you try > > > with pack.threads = 2, and pack.threads = 1 ? > > > > > > I'm sorry to ask you to do this but I don't have enough ram to even > > > complete a repack with threads=2 so I'm reattempting single threaded at > > > the moment. But I really wonder if the threading has such an effect on > > > memory usage. > > > > I already have a threads = 1 running with this config. Binary and > > config were same from threads=4 run. > > > > 10% 28min 950M > > 40% 135min 950M > > 50% 157min 900M > > 60% 160min 830M > > 100% 170min 830M > > > > Something is hurting bad with threads. 
170 CPU minutes with one > > thread, versus 195 CPU minutes with four threads. > > > > Is there a different memory allocator that can be used when > > multithreaded on gcc? This whole problem may be coming from the memory > > allocation function. git is hardly interacting at all on the thread > > level so it's likely a problem in the C run-time. > > > > [core] > > repositoryformatversion = 0 > > filemode = true > > bare = false > > logallrefupdates = true > > [pack] > > threads = 1 > > deltacachesize = 256M > > windowmemory = 256M > > deltacachelimit = 0 > > [remote "origin"] > > url = git://git.infradead.org/gcc.git > > fetch = +refs/heads/*:refs/remotes/origin/* > > [branch "trunk"] > > remote = origin > > merge = refs/heads/trunk > > > > > > > > > > > > > > > > > > > > > > > > > I have no explanation for the change in RAM usage. Two guesses come to > > > > mind. Memory fragmentation. Or the change in the way the work was > > > > split up altered RAM usage. > > > > > > > > Total CPU time was 195 minutes in 70 minutes clock time. About 70% > > > > efficient. During the compress phase all four cores were active until > > > > the last 90 seconds. Writing the objects took over 23 minutes CPU > > > > bound on one core. > > > > > > > > New pack file is: 270,594,853 > > > > Old one was: 344,543,752 > > > > It still has 828,660 objects > > > > > > You mean the pack for the gcc repo is now less than 300MB? Wow. > > > > > > > > > Nicolas > > > > > > > > > -- > > Jon Smirl > > jonsmirl@gmail.com > > > > > -- > Jon Smirl > jonsmirl@gmail.com > -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 82+ messages in thread
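For readers trying to reproduce the allocator swap: Jon doesn't say exactly how he linked tcmalloc in, but a sketch of one way that needs no rebuild of git is to preload it (the library name and path below are assumptions that vary per system):

    LD_PRELOAD=/usr/lib/libtcmalloc_minimal.so \
        git repack -a -d -f --depth=250 --window=250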
* Re: Something is broken in repack 2007-12-11 7:01 ` Jon Smirl @ 2007-12-11 7:34 ` Andreas Ericsson 2007-12-11 13:49 ` Nicolas Pitre ` (2 subsequent siblings) 3 siblings, 0 replies; 82+ messages in thread From: Andreas Ericsson @ 2007-12-11 7:34 UTC (permalink / raw To: Jon Smirl; +Cc: Nicolas Pitre, Junio C Hamano, gcc, Git Mailing List Jon Smirl wrote: > Switching to the Google perftools malloc > http://goog-perftools.sourceforge.net/ > > Google allocator knocked 600MB off from memory use. > Memory consumption did not fall during the write out phase like it did with gcc. > > Since all of this is with the same code except for changing the > threading split, those runs where memory consumption went to 4.5GB > with the gcc allocator must have triggered an extreme problem with > fragmentation. > > Total CPU time 196 CPU minutes vs 190 for gcc. Google's claims of > being faster are not true. > Did you use the tcmalloc with heap checker/profiler, or tcmalloc_minimal? -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-11 7:01 ` Jon Smirl 2007-12-11 7:34 ` Andreas Ericsson @ 2007-12-11 13:49 ` Nicolas Pitre 2007-12-11 15:00 ` Nicolas Pitre 2007-12-11 16:33 ` Linus Torvalds 2007-12-11 17:28 ` Daniel Berlin 3 siblings, 1 reply; 82+ messages in thread From: Nicolas Pitre @ 2007-12-11 13:49 UTC (permalink / raw To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List On Tue, 11 Dec 2007, Jon Smirl wrote: > Switching to the Google perftools malloc > http://goog-perftools.sourceforge.net/ > > 10% 30 828M > 20% 15 831M > 30% 10 834M > 40% 50 1014M > 50% 80 1086M > 60% 80 1500M > 70% 200 1.53G > 80% 200 1.85G > 90% 260 1.87G > 95% 520 1.97G > 100% 1335 2.24G > > Google allocator knocked 600MB off from memory use. > Memory consumption did not fall during the write out phase like it did with gcc. > > Since all of this is with the same code except for changing the > threading split, those runs where memory consumption went to 4.5GB > with the gcc allocator must have triggered an extreme problem with > fragmentation. Did you mean the glibc allocator? > Total CPU time 196 CPU minutes vs 190 for gcc. Google's claims of > being faster are not true. > > So why does our threaded code take 20 CPU minutes longer (12%) to run > than the same code with a single thread? Clock time is obviously > faster. Are the threads working too close to each other in memory and > bouncing cache lines between the cores? Q6600 is just two E6600s in > the same package, the caches are not shared. Of course there'll always be a certain amount of wasted cycles when threaded. The locking overhead, the extra contention for IO, etc. So 12% overhead (3% per thread) when using 4 threads is not that bad I would say. > Why does the threaded code need 2.24GB (google allocator, 2.85GB gcc) > with 4 threads? But only need 950MB with one thread? Where's the extra > gigabyte going? I really don't know. Did you try with pack.deltacachesize set to 1 ? And yet, this is still missing the actual issue. The issue being that the 2.1GB pack as a _source_ doesn't cause as much memory to be allocated even if the _result_ pack ends up being the same. I was able to repack the 2.1GB pack on my machine which has 1GB of ram. Now that it has been repacked, I can't repack it anymore, even when single threaded, as it starts crawling into swap fairly quickly. It is really non-intuitive and actually senseless that Git would require twice as much RAM to deal with a pack that is 7 times smaller. Nicolas (still puzzled) ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-11 13:49 ` Nicolas Pitre @ 2007-12-11 15:00 ` Nicolas Pitre 2007-12-11 15:36 ` Jon Smirl 2007-12-11 16:20 ` Nicolas Pitre 0 siblings, 2 replies; 82+ messages in thread From: Nicolas Pitre @ 2007-12-11 15:00 UTC (permalink / raw To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List On Tue, 11 Dec 2007, Nicolas Pitre wrote: > And yet, this is still missing the actual issue. The issue being that > the 2.1GB pack as a _source_ doesn't cause as much memory to be > allocated even if the _result_ pack ends up being the same. > > I was able to repack the 2.1GB pack on my machine which has 1GB of ram. > Now that it has been repacked, I can't repack it anymore, even when > single threaded, as it start crowling into swap fairly quickly. It is > really non intuitive and actually senseless that Git would require twice > as much RAM to deal with a pack that is 7 times smaller. OK, here's something else for you to try: core.deltabasecachelimit=0 pack.threads=2 pack.deltacachesize=1 With that I'm able to repack the small gcc pack on my machine with 1GB of ram using: git repack -a -f -d --window=250 --depth=250 and top reports a ~700m virt and ~500m res without hitting swap at all. It is only at 25% so far, but I was unable to get that far before. Would be curious to know what you get with 4 threads on your machine. Nicolas ^ permalink raw reply [flat|nested] 82+ messages in thread
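For reference, Nicolas's suggested settings correspond to these commands run inside the repository (equivalent to editing .git/config by hand):

    git config core.deltabasecachelimit 0
    git config pack.threads 2
    git config pack.deltacachesize 1
    git repack -a -f -d --window=250 --depth=250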
* Re: Something is broken in repack 2007-12-11 15:00 ` Nicolas Pitre @ 2007-12-11 15:36 ` Jon Smirl 2007-12-11 16:20 ` Nicolas Pitre 1 sibling, 0 replies; 82+ messages in thread From: Jon Smirl @ 2007-12-11 15:36 UTC (permalink / raw To: Nicolas Pitre; +Cc: Junio C Hamano, gcc, Git Mailing List On 12/11/07, Nicolas Pitre <nico@cam.org> wrote: > On Tue, 11 Dec 2007, Nicolas Pitre wrote: > > > And yet, this is still missing the actual issue. The issue being that > > the 2.1GB pack as a _source_ doesn't cause as much memory to be > > allocated even if the _result_ pack ends up being the same. > > > > I was able to repack the 2.1GB pack on my machine which has 1GB of ram. > > Now that it has been repacked, I can't repack it anymore, even when > > single threaded, as it starts crawling into swap fairly quickly. It is > > really non-intuitive and actually senseless that Git would require twice > > as much RAM to deal with a pack that is 7 times smaller. > > OK, here's something else for you to try: > > core.deltabasecachelimit=0 > pack.threads=2 > pack.deltacachesize=1 > > With that I'm able to repack the small gcc pack on my machine with 1GB > of ram using: > > git repack -a -f -d --window=250 --depth=250 > > and top reports a ~700m virt and ~500m res without hitting swap at all. > It is only at 25% so far, but I was unable to get that far before. > > Would be curious to know what you get with 4 threads on your machine. Changing those parameters really slowed down counting the objects. I used to be able to count in 45 seconds; now it took 130 seconds. I still have the Google allocator linked in. 4 threads, cumulative clock time 25% 200 seconds, 820/627M 55% 510 seconds, 1240/1000M - little late recording 75% 15 minutes, 1658/1500M 90% 22 minutes, 1974/1800M it's still running but there is no significant change. Are two types of allocations being mixed? 1) long term, global objects kept until the end of everything 2) volatile, private objects allocated only while the object is being compressed and then freed Separating these would make a big difference to the fragmentation problem. Single threading probably wouldn't see a fragmentation problem from mixing the allocation types. When a thread is created it could allocate a private 20MB (or whatever) pool. The volatile, private objects would come from that pool. Long term objects would stay in the global pool. Since they are long term they will just get laid down sequentially in memory. Separating these allocation types makes things way easier for malloc. CPU time would be helped by removing some of the locking if possible. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 82+ messages in thread
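A minimal sketch of the per-thread scratch pool Jon describes (hypothetical code, not from git; the names and the 20MB figure are illustrative): short-lived buffers come from a thread-local block and are released all at once, so they never interleave with the long-term global allocations.

    #include <stdlib.h>

    /* Hypothetical thread-local pool for the volatile allocations. */
    struct scratch_pool {
            char *base;     /* one large block grabbed at thread start */
            size_t size;    /* total size, e.g. 20MB */
            size_t used;    /* bump pointer */
    };

    static __thread struct scratch_pool pool;

    static int pool_init(size_t size)
    {
            pool.base = malloc(size);
            pool.size = size;
            pool.used = 0;
            return pool.base ? 0 : -1;
    }

    /* Serve a short-lived allocation from the pool. */
    static void *pool_alloc(size_t n)
    {
            void *p;
            n = (n + 15) & ~(size_t)15;     /* keep 16-byte alignment */
            if (pool.used + n > pool.size)
                    return NULL;            /* real code would fall back to malloc() */
            p = pool.base + pool.used;
            pool.used += n;
            return p;
    }

    /* Drop all short-lived buffers at once when the object is done. */
    static void pool_reset(void)
    {
            pool.used = 0;
    }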
* Re: Something is broken in repack 2007-12-11 15:00 ` Nicolas Pitre 2007-12-11 15:36 ` Jon Smirl @ 2007-12-11 16:20 ` Nicolas Pitre 2007-12-11 16:21 ` Jon Smirl 1 sibling, 1 reply; 82+ messages in thread From: Nicolas Pitre @ 2007-12-11 16:20 UTC (permalink / raw To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List On Tue, 11 Dec 2007, Nicolas Pitre wrote: > OK, here's something else for you to try: > > core.deltabasecachelimit=0 > pack.threads=2 > pack.deltacachesize=1 > > With that I'm able to repack the small gcc pack on my machine with 1GB > of ram using: > > git repack -a -f -d --window=250 --depth=250 > > and top reports a ~700m virt and ~500m res without hitting swap at all. > It is only at 25% so far, but I was unable to get that far before. Well, around 55% memory usage skyrocketed to 1.6GB and the system went deep into swap. So I restarted it with no threads. Nicolas (even more puzzled) ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-11 16:20 ` Nicolas Pitre @ 2007-12-11 16:21 ` Jon Smirl 2007-12-12 5:12 ` Nicolas Pitre 0 siblings, 1 reply; 82+ messages in thread From: Jon Smirl @ 2007-12-11 16:21 UTC (permalink / raw To: Nicolas Pitre; +Cc: Junio C Hamano, gcc, Git Mailing List On 12/11/07, Nicolas Pitre <nico@cam.org> wrote: > On Tue, 11 Dec 2007, Nicolas Pitre wrote: > > > OK, here's something else for you to try: > > > > core.deltabasecachelimit=0 > > pack.threads=2 > > pack.deltacachesize=1 > > > > With that I'm able to repack the small gcc pack on my machine with 1GB > > of ram using: > > > > git repack -a -f -d --window=250 --depth=250 > > > > and top reports a ~700m virt and ~500m res without hitting swap at all. > > It is only at 25% so far, but I was unable to get that far before. > > Well, around 55% memory usage skyrocketed to 1.6GB and the system went > deep into swap. So I restarted it with no threads. > > Nicolas (even more puzzled) On the plus side you are seeing what I see, so it proves I am not imagining it. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-11 16:21 ` Jon Smirl @ 2007-12-12 5:12 ` Nicolas Pitre 2007-12-12 8:05 ` David Kastrup ` (2 more replies) 0 siblings, 3 replies; 82+ messages in thread From: Nicolas Pitre @ 2007-12-12 5:12 UTC (permalink / raw To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List On Tue, 11 Dec 2007, Jon Smirl wrote: > On 12/11/07, Nicolas Pitre <nico@cam.org> wrote: > > On Tue, 11 Dec 2007, Nicolas Pitre wrote: > > > > > OK, here's something else for you to try: > > > > > > core.deltabasecachelimit=0 > > > pack.threads=2 > > > pack.deltacachesize=1 > > > > > > With that I'm able to repack the small gcc pack on my machine with 1GB > > > of ram using: > > > > > > git repack -a -f -d --window=250 --depth=250 > > > > > > and top reports a ~700m virt and ~500m res without hitting swap at all. > > > It is only at 25% so far, but I was unable to get that far before. > > > > Well, around 55% memory usage skyrocketed to 1.6GB and the system went > > deep into swap. So I restarted it with no threads. > > > > Nicolas (even more puzzled) > > On the plus side you are seeing what I see, so it proves I am not imagining it. Well... This is weird. It seems that memory fragmentation is really really killing us here. The fact that the Google allocator did manage to waste quite a bit less memory is a good indicator already. I did modify the progress display to show accounted memory that was allocated vs memory that was freed but still not released to the system. At least that gives you an idea of memory allocation and fragmentation with glibc in real time: diff --git a/progress.c b/progress.c index d19f80c..46ac9ef 100644 --- a/progress.c +++ b/progress.c @@ -8,6 +8,7 @@ * published by the Free Software Foundation. */ +#include <malloc.h> #include "git-compat-util.h" #include "progress.h" @@ -94,10 +95,12 @@ static int display(struct progress *progress, unsigned n, const char *done) if (progress->total) { unsigned percent = n * 100 / progress->total; if (percent != progress->last_percent || progress_update) { + struct mallinfo m = mallinfo(); progress->last_percent = percent; - fprintf(stderr, "%s: %3u%% (%u/%u)%s%s", - progress->title, percent, n, - progress->total, tp, eol); + fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s", + progress->title, percent, n, progress->total, + m.uordblks >> 18, m.fordblks >> 18, + tp, eol); fflush(stderr); progress_update = 0; return 1; This shows that at some point the repack goes into a big memory surge. I don't have enough RAM to see how fragmented memory gets though, since it starts swapping around 50% done with 2 threads. With only 1 thread, memory usage grows significantly at around 11% with a pretty noticeable slowdown in the progress rate. So I think the theory goes like this: There is a block of big objects together in the list somewhere. Initially, all those big objects are assigned to thread #1 out of 4. Because those objects are big, they get really slow to delta compress, and storing them all in a window with 250 slots takes significant memory. Threads 2, 3, and 4 have "easy" work loads, so they complete fairly quickly compared to thread #1. But since the progress display is global, you won't notice that one thread is actually crawling slowly. To keep all threads busy until the end, those threads that are done with their work load will steal some work from another thread, choosing the one with the largest remaining work. That is most likely thread #1. 
So as threads 2, 3, and 4 complete, they will steal from thread 1 and populate their own window with those big objects too, and get slow too. And because all threads gets to work on those big objects towards the end, the progress display will then show a significant slowdown, and memory usage will almost quadruple. Add memory fragmentation to that and you have a clogged system. Solution: pack.deltacachesize=1 pack.windowmemory=16M Limiting the window memory to 16MB will automatically shrink the window size when big objects are encountered, therefore keeping much fewer of those objects at the same time in memory, which in turn means they will be processed much more quickly. And somehow that must help with memory fragmentation as well. Setting pack.deltacachesize to 1 is simply to disable the caching of delta results entirely which will only slow down the writing phase, but I wanted to keep it out of the picture for now. With the above settings, I'm currently repacking the gcc repo with 2 threads, and memory allocation never exceeded 700m virt and 400m res, while the mallinfo shows about 350MB, and progress has reached 90% which has never occurred on this machine with the 300MB source pack so far. Nicolas ^ permalink raw reply related [flat|nested] 82+ messages in thread
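In config form, the workaround above is (a sketch using the keys already discussed in this thread):

    git config pack.deltacachesize 1
    git config pack.windowmemory 16M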
* Re: Something is broken in repack 2007-12-12 5:12 ` Nicolas Pitre @ 2007-12-12 8:05 ` David Kastrup 2007-12-14 16:18 ` Wolfram Gloger 0 siblings, 1 reply; 82+ messages in thread From: David Kastrup @ 2007-12-12 8:05 UTC (permalink / raw To: Nicolas Pitre; +Cc: Jon Smirl, Junio C Hamano, gcc, Git Mailing List Nicolas Pitre <nico@cam.org> writes: > Well... This is weird. > > It seems that memory fragmentation is really really killing us here. > The fact that the Google allocator did manage to waste quite a bit less memory > is a good indicator already. Maybe a malloc/free/mmap wrapper that records the requested sizes and alloc/free order and dumps them to a file so that one can make a compact git-free standalone test case for the glibc maintainers might be a good thing. -- David Kastrup, Kriemhildstr. 15, 44793 Bochum ^ permalink raw reply [flat|nested] 82+ messages in thread
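A bare-bones sketch of such a wrapper (an illustration, not the tool Wolfram posts below; it glosses over dlsym() re-entrancy and thread safety): build it as a shared object, preload it, and every malloc/free gets logged with its size and address for later replay.

    /* trace_malloc.c -- build: gcc -shared -fPIC -o trace_malloc.so trace_malloc.c -ldl
     * use:   LD_PRELOAD=./trace_malloc.so git repack ...
     */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>
    #include <stdlib.h>

    static void *(*real_malloc)(size_t);
    static void (*real_free)(void *);
    static FILE *trace;

    static void trace_init(void)
    {
            real_malloc = dlsym(RTLD_NEXT, "malloc");
            real_free = dlsym(RTLD_NEXT, "free");
            trace = fopen("/tmp/malloc.trace", "w");
    }

    void *malloc(size_t n)
    {
            void *p;
            if (!real_malloc)
                    trace_init();
            p = real_malloc(n);
            if (trace)
                    fprintf(trace, "a %zu %p\n", n, p);   /* alloc: size, address */
            return p;
    }

    void free(void *p)
    {
            if (!real_free)
                    trace_init();
            if (trace && p)
                    fprintf(trace, "f %p\n", p);          /* free: address */
            real_free(p);
    }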
* Re: Something is broken in repack 2007-12-12 8:05 ` David Kastrup @ 2007-12-14 16:18 ` Wolfram Gloger 0 siblings, 0 replies; 82+ messages in thread From: Wolfram Gloger @ 2007-12-14 16:18 UTC (permalink / raw To: dak; +Cc: nico, jonsmirl, gitster, gcc, git Hi, > Maybe an malloc/free/mmap wrapper that records the requested sizes and > alloc/free order and dumps them to file so that one can make a compact > git-free standalone test case for the glibc maintainers might be a good > thing. I already have such a wrapper: http://malloc.de/malloc/mtrace-20060529.tar.gz But note that it does interfere with the thread scheduling, so it can't record the exact same allocation pattern as when not using the wrapper. Regards, Wolfram. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-12 5:12 ` Nicolas Pitre 2007-12-12 8:05 ` David Kastrup @ 2007-12-12 15:48 ` Nicolas Pitre 2007-12-12 16:17 ` Paolo Bonzini ` (2 more replies) 2007-12-12 16:13 ` Nicolas Pitre 2 siblings, 3 replies; 82+ messages in thread From: Nicolas Pitre @ 2007-12-12 15:48 UTC (permalink / raw To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List On Wed, 12 Dec 2007, Nicolas Pitre wrote: > Add memory fragmentation to that and you have a clogged system. > > Solution: > > pack.deltacachesize=1 > pack.windowmemory=16M > > Limiting the window memory to 16MB will automatically shrink the window > size when big objects are encountered, therefore keeping much fewer of > those objects at the same time in memory, which in turn means they will > be processed much more quickly. And somehow that must help with memory > fragmentation as well. OK scrap that. When I returned to the computer this morning, the repack was completed... with a 1.3GB pack instead. So... The gcc repo apparently really needs a large window to efficiently compress those large objects. But when those large objects are already well deltified and you repack again with a large window, somehow the memory allocator is way more involved, probably even more so when there are several threads in parallel amplifying the issue, and things probably get to a point of no return with regard to memory fragmentation after a while. So... my conclusion is that the glibc allocator has fragmentation issues with this work load, given the notable difference with the Google allocator, which itself might not be completely immune to fragmentation issues of its own. And because the gcc repo requires a large window of big objects to get good compression, you're better off not using 4 threads to repack it with -a -f. The fact that the size of the source pack has such an influence is probably only because the increased usage of the delta base object cache is playing a role in the global memory allocation pattern, allowing for the bad fragmentation issue to occur. If you could run one last test with the mallinfo patch I posted, without the pack.windowmemory setting, and adding the reported values along with those from top, then we could formally conclude that this is a memory fragmentation issue. So I don't think Git itself is actually bad. The gcc repo most certainly constitutes a nasty use case for memory allocators, but I don't think there is much we can do about it besides possibly implementing our own memory allocator with active defragmentation where possible (read memcpy) at some point to give glibc's allocator some chance to breathe a bit more. In the meantime you might have to use only one thread and lots of memory to repack the gcc repo, or find the perfect memory allocator to be used with Git. After all, packing the whole gcc history to around 230MB is quite a stunt but it requires sufficient resources to achieve it. Fortunately, like Linus said, such a wholesale repack is not something that most users have to do anyway. Nicolas ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-12 15:48 ` Nicolas Pitre @ 2007-12-12 16:17 ` Paolo Bonzini 2007-12-12 16:37 ` Linus Torvalds 2007-12-13 13:32 ` Nguyen Thai Ngoc Duy 2 siblings, 0 replies; 82+ messages in thread From: Paolo Bonzini @ 2007-12-12 16:17 UTC (permalink / raw To: gcc; +Cc: git > When I returned to the computer this morning, the repack was > completed... with a 1.3GB pack instead. > > So... The gcc repo apparently really needs a large window to efficiently > compress those large objects. So, am I right that if you have a very well-done pack (such as gcc's), you might want to repack in two phases: - first discarding the old deltas and using a small window, thus producing a bad pack that can be repacked without humongous amounts of memory... - ... then discarding the old deltas and producing another well-compressed pack? Paolo ^ permalink raw reply [flat|nested] 82+ messages in thread
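In command form, Paolo's two-phase idea would look something like this (a sketch; the small window/depth values in phase 1 are arbitrary):

    # phase 1: throw away the old deltas cheaply, using a small window
    git repack -a -d -f --window=10 --depth=10
    # phase 2: recompress from the now loosely-packed repository
    git repack -a -d -f --window=250 --depth=250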
* Re: Something is broken in repack 2007-12-12 15:48 ` Nicolas Pitre 2007-12-12 16:17 ` Paolo Bonzini @ 2007-12-12 16:37 ` Linus Torvalds 2007-12-12 16:42 ` David Miller ` (2 more replies) 2007-12-13 13:32 ` Nguyen Thai Ngoc Duy 2 siblings, 3 replies; 82+ messages in thread From: Linus Torvalds @ 2007-12-12 16:37 UTC (permalink / raw To: Nicolas Pitre; +Cc: Jon Smirl, Junio C Hamano, gcc, Git Mailing List On Wed, 12 Dec 2007, Nicolas Pitre wrote: > > So... my conclusion is that the glibc allocator has fragmentation issues > with this work load, given the notable difference with the Google > allocator, which itself might not be completely immune to fragmentation > issues of its own. Yes. Note that delta following involves patterns something like allocate (small) space for delta for i in (1..depth) { allocate large space for base allocate large space for result .. apply delta .. free large space for base free small space for delta } so if you have some stupid heap algorithm that doesn't try to merge and re-use free'd spaces very aggressively (because that takes CPU time!), you might have memory usage be horribly inflated by the heap having all those holes for all the objects that got free'd in the chain that don't get aggressively re-used. Threaded memory allocators then make this worse by probably using totally different heaps for different threads (in order to avoid locking), so they will *all* have the fragmentation issue. And if you *really* want to cause trouble for a memory allocator, what you should try to do is to allocate the memory in one thread, and free it in another, and then things can really explode (the freeing thread notices that the allocation is not in its thread-local heap, so instead of really freeing it, it puts it on a separate list of areas to be freed later by the original thread when it needs memory - or worse, it adds it to the local thread list, and makes it effectively totally impossible to then ever merge different free'd allocations ever again because the freed things will be on different heap lists!). I'm not saying that particular case happens in git, I'm just saying that it's not unheard of. And with the delta cache and the object lookup, it's not at _all_ impossible that we hit the "allocate in one thread, free in another" case! Linus ^ permalink raw reply [flat|nested] 82+ messages in thread
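Linus's pattern can be turned into a toy program that models the allocation behaviour (illustration only; sizes are made up and kept below the default glibc mmap threshold so every block comes from, and returns to, the main heap):

    #include <stdlib.h>

    /* Toy model of delta-chain application: walk a chain of DEPTH deltas,
     * each step allocating a large result, then freeing the previous base
     * and the small delta, leaving holes of mixed sizes behind. */
    int main(void)
    {
            enum { DEPTH = 250, LARGE = 64 << 10, SMALL = 1 << 10 };
            int i;
            void *base = malloc(LARGE);             /* the undeltified base object */
            for (i = 0; i < DEPTH; i++) {
                    void *delta = malloc(SMALL);    /* small space for the delta */
                    void *result = malloc(LARGE);   /* large space for the result */
                    /* ... apply delta to base, producing result ... */
                    free(base);                     /* large hole left behind */
                    free(delta);                    /* small hole left behind */
                    base = result;                  /* result is the next base */
            }
            free(base);
            return 0;
    }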
* Re: Something is broken in repack 2007-12-12 16:37 ` Linus Torvalds @ 2007-12-12 16:42 ` David Miller 2007-12-12 16:54 ` Linus Torvalds 2007-12-12 17:12 ` Jon Smirl 2007-12-14 16:12 ` Wolfram Gloger 2 siblings, 1 reply; 82+ messages in thread From: David Miller @ 2007-12-12 16:42 UTC (permalink / raw To: torvalds; +Cc: nico, jonsmirl, gitster, gcc, git From: Linus Torvalds <torvalds@linux-foundation.org> Date: Wed, 12 Dec 2007 08:37:10 -0800 (PST) > I'm not saying that particular case happens in git, I'm just saying that > it's not unheard of. And with the delta cache and the object lookup, it's > not at _all_ impossible that we hit the "allocate in one thread, free in > another" case! One thing that supports these theories is that, while running these large repacks, I notice that the RSS is roughly 2/3 of the amount of virtual address space allocated. I personally don't think it's unreasonable for GIT to have its own customized allocator at least for certain object types. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-12 16:42 ` David Miller @ 2007-12-12 16:54 ` Linus Torvalds 0 siblings, 0 replies; 82+ messages in thread From: Linus Torvalds @ 2007-12-12 16:54 UTC (permalink / raw To: David Miller; +Cc: nico, jonsmirl, gitster, gcc, git On Wed, 12 Dec 2007, David Miller wrote: > > I personally don't think it's unreasonable for GIT to have its > own customized allocator at least for certain object types. Well, we actually already *do* have a customized allocator, but currently only for the actual core "object descriptor" that really just has the SHA1 and object flags in it (and a few extra words depending on object type). Those are critical for certain loads, and small too (so using the standard allocator wasted a _lot_ of memory). In addition, they're fixed-size and never free'd, so a specialized allocator really can do a lot better than any general-purpose memory allocator ever could. But the actual object *contents* are currently all allocated with whatever the standard libc malloc/free allocator is that you compile for (or load dynamically). Having a specialized allocator for them is a much more involved issue, exactly because we do have interesting allocation patterns etc. That said, at least those object allocations are all single-threaded (for right now, at least), so even when git does multi-threaded stuff, the core sha1_file.c stuff is always run under a single lock, and a simpler allocator that doesn't care about threads is likely to be much better than one that tries to have thread-local heaps etc. I suspect that is what the google allocator does. It probably doesn't have per-thread heaps, it just uses locking (and quite possibly things like per-*size* heaps, which is much more memory-efficient and helps avoid some of the fragmentation problems). Locking is much slower than per-thread accesses, but it doesn't have the issues with per-thread-fragmentation and all the problems with one thread allocating and another one freeing. Linus ^ permalink raw reply [flat|nested] 82+ messages in thread
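The allocator Linus refers to lives in git's alloc.c; in outline it works roughly like this (a paraphrased sketch, not the verbatim source): fixed-size, never-freed nodes are carved out of large blocks, so there is no per-node bookkeeping and nothing to fragment.

    #include <stdlib.h>

    #define BLOCKING 1024   /* nodes carved per block */

    struct node_slab {
            char *next;     /* next unused node in the current block */
            size_t left;    /* nodes remaining in the current block */
    };

    /* Hand out one fixed-size node; grab a fresh block when empty.
     * Nodes are never freed individually. */
    static void *alloc_node(struct node_slab *s, size_t node_size)
    {
            void *ret;
            if (!s->left) {
                    s->next = malloc(BLOCKING * node_size);
                    if (!s->next)
                            return NULL;
                    s->left = BLOCKING;
            }
            ret = s->next;
            s->next += node_size;
            s->left--;
            return ret;
    }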
* Re: Something is broken in repack 2007-12-12 16:37 ` Linus Torvalds 2007-12-12 16:42 ` David Miller @ 2007-12-12 17:12 ` Jon Smirl 2007-12-14 16:12 ` Wolfram Gloger 2 siblings, 0 replies; 82+ messages in thread From: Jon Smirl @ 2007-12-12 17:12 UTC (permalink / raw To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, gcc, Git Mailing List On 12/12/07, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Wed, 12 Dec 2007, Nicolas Pitre wrote: > > > > So... my conclusion is that the glibc allocator has fragmentation issues > > with this work load, given the notable difference with the Google > > allocator, which itself might not be completely immune to fragmentation > > issues of its own. > > Yes. > > Note that delta following involves patterns something like > > allocate (small) space for delta > for i in (1..depth) { > allocate large space for base > allocate large space for result > .. apply delta .. > free large space for base > free small space for delta > } Is it hard to hack up something that statically allocates a big block of memory per thread for these two and then just reuses it? allocate (small) space for delta allocate large space for base The alternating between long term and short term allocations definitely aggravates fragmentation. > > so if you have some stupid heap algorithm that doesn't try to merge and > re-use free'd spaces very aggressively (because that takes CPU time!), you > might have memory usage be horribly inflated by the heap having all those > holes for all the objects that got free'd in the chain that don't get > aggressively re-used. > > Threaded memory allocators then make this worse by probably using totally > different heaps for different threads (in order to avoid locking), so they > will *all* have the fragmentation issue. > > And if you *really* want to cause trouble for a memory allocator, what you > should try to do is to allocate the memory in one thread, and free it in > another, and then things can really explode (the freeing thread notices > that the allocation is not in its thread-local heap, so instead of really > freeing it, it puts it on a separate list of areas to be freed later by > the original thread when it needs memory - or worse, it adds it to the > local thread list, and makes it effectively totally impossible to then > ever merge different free'd allocations ever again because the freed > things will be on different heap lists!). > > I'm not saying that particular case happens in git, I'm just saying that > it's not unheard of. And with the delta cache and the object lookup, it's > not at _all_ impossible that we hit the "allocate in one thread, free in > another" case! > > Linus > -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-12 16:37 ` Linus Torvalds 2007-12-12 16:42 ` David Miller 2007-12-12 17:12 ` Jon Smirl @ 2007-12-14 16:12 ` Wolfram Gloger 2007-12-14 16:45 ` David Kastrup 2 siblings, 1 reply; 82+ messages in thread From: Wolfram Gloger @ 2007-12-14 16:12 UTC (permalink / raw To: torvalds; +Cc: nico, jonsmirl, gitster, gcc, git Hi, > Note that delta following involves patterns something like > > allocate (small) space for delta > for i in (1..depth) { > allocate large space for base > allocate large space for result > .. apply delta .. > free large space for base > free small space for delta > } > > so if you have some stupid heap algorithm that doesn't try to merge and > re-use free'd spaces very aggressively (because that takes CPU time!), ptmalloc2 (in glibc) _per arena_ is basically best-fit. This is the best known general strategy, but it certainly cannot be the best in every case. > you > might have memory usage be horribly inflated by the heap having all those > holes for all the objects that got free'd in the chain that don't get > aggressively re-used. It depends how large 'large' is -- if it exceeds the mmap() threshold (settable with mallopt(M_MMAP_THRESHOLD, ...)) the 'large' spaces will be allocated with mmap() and won't cause any internal fragmentation. It might pay to experiment with this parameter if it is hard to avoid the alloc/free large space sequence. > Threaded memory allocators then make this worse by probably using totally > different heaps for different threads (in order to avoid locking), so they > will *all* have the fragmentation issue. Indeed. Could someone perhaps try ptmalloc3 (http://malloc.de/malloc/ptmalloc3-current.tar.gz) on this case? Thanks, Wolfram. ^ permalink raw reply [flat|nested] 82+ messages in thread
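The threshold tweak Wolfram mentions is a one-line experiment (sketch; the 1MB value is arbitrary, and the call must happen early, before the big allocations start), somewhere near the top of main():

    #include <malloc.h>

    /* Ask glibc to satisfy allocations of 1MB and up via mmap(), so they
     * are returned to the kernel on free() and cannot fragment the heap. */
    mallopt(M_MMAP_THRESHOLD, 1024 * 1024);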
* Re: Something is broken in repack 2007-12-14 16:12 ` Wolfram Gloger @ 2007-12-14 16:45 ` David Kastrup 2007-12-14 16:59 ` Wolfram Gloger 0 siblings, 1 reply; 82+ messages in thread From: David Kastrup @ 2007-12-14 16:45 UTC (permalink / raw To: Wolfram Gloger; +Cc: torvalds, nico, jonsmirl, gitster, gcc, git Wolfram Gloger <wmglo@dent.med.uni-muenchen.de> writes: > Hi, > >> Note that delta following involves patterns something like >> >> allocate (small) space for delta >> for i in (1..depth) { >> allocate large space for base >> allocate large space for result >> .. apply delta .. >> free large space for base >> free small space for delta >> } >> >> so if you have some stupid heap algorithm that doesn't try to merge and >> re-use free'd spaces very aggressively (because that takes CPU time!), > > ptmalloc2 (in glibc) _per arena_ is basically best-fit. This is the > best known general strategy, Uh what? Someone crank out his copy of "The Art of Computer Programming", I think volume 1. Best fit is known (analyzed and proven and documented decades ago) to be one of the worst strategies for memory allocation. Exactly because it leads to huge fragmentation problems. -- David Kastrup, Kriemhildstr. 15, 44793 Bochum ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-14 16:45 ` David Kastrup @ 2007-12-14 16:59 ` Wolfram Gloger 0 siblings, 0 replies; 82+ messages in thread From: Wolfram Gloger @ 2007-12-14 16:59 UTC (permalink / raw To: dak; +Cc: wmglo, torvalds, nico, jonsmirl, gitster, gcc, git Hi, > Uh what? Someone crank out his copy of "The Art of Computer > Programming", I think volume 1. Best fit is known (analyzed and proven > and documented decades ago) to be one of the worst strategies for memory > allocation. Exactly because it leads to huge fragmentation problems. Well, quoting http://gee.cs.oswego.edu/dl/html/malloc.html: "As shown by Wilson et al, best-fit schemes (of various kinds and approximations) tend to produce the least fragmentation on real loads compared to other general approaches such as first-fit." See [Wilson 1995] ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps for more details and references. Regards, Wolfram. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-12 15:48 ` Nicolas Pitre 2007-12-12 16:17 ` Paolo Bonzini 2007-12-12 16:37 ` Linus Torvalds @ 2007-12-13 13:32 ` Nguyen Thai Ngoc Duy 2007-12-13 15:32 ` Paolo Bonzini 2 siblings, 1 reply; 82+ messages in thread From: Nguyen Thai Ngoc Duy @ 2007-12-13 13:32 UTC (permalink / raw To: Nicolas Pitre; +Cc: Jon Smirl, Junio C Hamano, gcc, Git Mailing List On Dec 12, 2007 10:48 PM, Nicolas Pitre <nico@cam.org> wrote: > In the mean time you might have to use only one thread and lots of > memory to repack the gcc repo, or find the perfect memory allocator to > be used with Git. After all, packing the whole gcc history to around > 230MB is quite a stunt but it requires sufficient resources to > achieve it. Fortunately, like Linus said, such a wholesale repack is not > something that most users have to do anyway. Is there an alternative to "git repack -a -d" that repacks everything but the first pack? -- Duy ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-13 13:32 ` Nguyen Thai Ngoc Duy @ 2007-12-13 15:32 ` Paolo Bonzini 2007-12-13 16:29 ` Paolo Bonzini 2007-12-13 16:39 ` Johannes Sixt 0 siblings, 2 replies; 82+ messages in thread From: Paolo Bonzini @ 2007-12-13 15:32 UTC (permalink / raw To: git; +Cc: gcc Nguyen Thai Ngoc Duy wrote: > On Dec 12, 2007 10:48 PM, Nicolas Pitre <nico@cam.org> wrote: >> In the mean time you might have to use only one thread and lots of >> memory to repack the gcc repo, or find the perfect memory allocator to >> be used with Git. After all, packing the whole gcc history to around >> 230MB is quite a stunt but it requires sufficient resources to >> achieve it. Fortunately, like Linus said, such a wholesale repack is not >> something that most users have to do anyway. > > Is there an alternative to "git repack -a -d" that repacks everything > but the first pack? That would be a pretty good idea for big repositories. If I were to implement it, I would actually add a .git/config option like pack.permanent so that more than one pack could be made permanent; then to repack really really everything you'd need "git repack -a -a -d". Paolo ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-13 15:32 ` Paolo Bonzini @ 2007-12-13 16:29 ` Paolo Bonzini 2007-12-13 16:39 ` Johannes Sixt 1 sibling, 0 replies; 82+ messages in thread From: Paolo Bonzini @ 2007-12-13 16:29 UTC (permalink / raw Cc: git, gcc >> Is there an alternative to "git repack -a -d" that repacks everything >> but the first pack? > > That would be a pretty good idea for big repositories. If I were to > implement it, I would actually add a .git/config option like > pack.permanent so that more than one pack could be made permanent; then > to repack really really everything you'd need "git repack -a -a -d". Actually there is something like this, as seen from the source of git-repack: for e in `cd "$PACKDIR" && find . -type f -name '*.pack' \ | sed -e 's/^\.\///' -e 's/\.pack$//'` do if [ -e "$PACKDIR/$e.keep" ]; then : keep else args="$args --unpacked=$e.pack" existing="$existing $e" fi done So, just create a file named as the pack, but with extension ".keep". Paolo ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-13 15:32 ` Paolo Bonzini 2007-12-13 16:29 ` Paolo Bonzini @ 2007-12-13 16:39 ` Johannes Sixt 2007-12-14 1:04 ` Jakub Narebski 1 sibling, 1 reply; 82+ messages in thread From: Johannes Sixt @ 2007-12-13 16:39 UTC (permalink / raw To: Paolo Bonzini; +Cc: git, gcc Paolo Bonzini schrieb: > Nguyen Thai Ngoc Duy wrote: >> On Dec 12, 2007 10:48 PM, Nicolas Pitre <nico@cam.org> wrote: >>> In the mean time you might have to use only one thread and lots of >>> memory to repack the gcc repo, or find the perfect memory allocator to >>> be used with Git. After all, packing the whole gcc history to around >>> 230MB is quite a stunt but it requires sufficient resources to >>> achieve it. Fortunately, like Linus said, such a wholesale repack is not >>> something that most users have to do anyway. >> >> Is there an alternative to "git repack -a -d" that repacks everything >> but the first pack? > > That would be a pretty good idea for big repositories. If I were to > implement it, I would actually add a .git/config option like > pack.permanent so that more than one pack could be made permanent; then > to repack really really everything you'd need "git repack -a -a -d". It's already there: If you have a pack .git/objects/pack/pack-foo.pack, then "touch .git/objects/pack/pack-foo.keep" marks the pack as precious. -- Hannes ^ permalink raw reply [flat|nested] 82+ messages in thread
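Putting the last two messages together, marking every existing pack in a repository as precious is just (a sketch; the file contents are free-form, see Jakub's note below):

    for p in .git/objects/pack/pack-*.pack; do
        echo "base pack, do not repack" > "${p%.pack}.keep"
    done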
* Re: Something is broken in repack 2007-12-13 16:39 ` Johannes Sixt @ 2007-12-14 1:04 ` Jakub Narebski 2007-12-14 6:14 ` Paolo Bonzini 0 siblings, 1 reply; 82+ messages in thread From: Jakub Narebski @ 2007-12-14 1:04 UTC (permalink / raw To: git; +Cc: gcc Johannes Sixt wrote: > Paolo Bonzini schrieb: >> Nguyen Thai Ngoc Duy wrote: >>> >>> Is there an alternative to "git repack -a -d" that repacks everything >>> but the first pack? >> >> That would be a pretty good idea for big repositories. If I were to >> implement it, I would actually add a .git/config option like >> pack.permanent so that more than one pack could be made permanent; then >> to repack really really everything you'd need "git repack -a -a -d". > > It's already there: If you have a pack .git/objects/pack/pack-foo.pack, then > "touch .git/objects/pack/pack-foo.keep" marks the pack as precious. Actually you can (and probably should) put a single line stating the _reason_ the pack is to be kept in the *.keep file. Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of all things. -- Jakub Narebski Warsaw, Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-14 1:04 ` Jakub Narebski @ 2007-12-14 6:14 ` Paolo Bonzini 2007-12-14 6:24 ` Nguyen Thai Ngoc Duy 2007-12-14 13:25 ` Nicolas Pitre 0 siblings, 2 replies; 82+ messages in thread From: Paolo Bonzini @ 2007-12-14 6:14 UTC (permalink / raw To: git; +Cc: gcc > Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of > all things. I found that the .keep file is not transmitted over the network (at least I tried with git+ssh:// and http:// protocols), however. Paolo ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-14 6:14 ` Paolo Bonzini @ 2007-12-14 6:24 ` Nguyen Thai Ngoc Duy 2007-12-14 8:20 ` Paolo Bonzini 2007-12-14 10:40 ` Jakub Narebski 1 sibling, 2 replies; 82+ messages in thread From: Nguyen Thai Ngoc Duy @ 2007-12-14 6:24 UTC (permalink / raw To: Paolo Bonzini; +Cc: git, gcc On Dec 14, 2007 1:14 PM, Paolo Bonzini <bonzini@gnu.org> wrote: > > Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of > > all things. > > I found that the .keep file is not transmitted over the network (at > least I tried with git+ssh:// and http:// protocols), however. I'm thinking about "git clone --keep" to mark initial packs precious. But 'git clone' is being rewritten in C. Let's wait until the C rewrite is done. -- Duy ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-14 6:24 ` Nguyen Thai Ngoc Duy @ 2007-12-14 8:20 ` Paolo Bonzini 2007-12-14 9:01 ` Harvey Harrison 2007-12-14 10:40 ` Jakub Narebski 1 sibling, 1 reply; 82+ messages in thread From: Paolo Bonzini @ 2007-12-14 8:20 UTC (permalink / raw To: gcc; +Cc: git > I'm thinking about "git clone --keep" to mark initial packs precious. > But 'git clone' is under rewrite to C. Let's wait until C rewrite is > done. It should be the default, IMHO. Paolo ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-14 8:20 ` Paolo Bonzini @ 2007-12-14 9:01 ` Harvey Harrison 0 siblings, 0 replies; 82+ messages in thread From: Harvey Harrison @ 2007-12-14 9:01 UTC (permalink / raw To: Paolo Bonzini; +Cc: gcc, git On Fri, 2007-12-14 at 09:20 +0100, Paolo Bonzini wrote: > > I'm thinking about "git clone --keep" to mark initial packs precious. > > But 'git clone' is under rewrite to C. Let's wait until C rewrite is > > done. > > It should be the default, IMHO. > While it doesn't mark the packs as .keep, git will reuse all of the old deltas you got in the original clone, so you're not losing anything. Harvey ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-14 6:24 ` Nguyen Thai Ngoc Duy 2007-12-14 8:20 ` Paolo Bonzini @ 2007-12-14 10:40 ` Jakub Narebski 2007-12-14 10:52 ` Nguyen Thai Ngoc Duy 1 sibling, 1 reply; 82+ messages in thread From: Jakub Narebski @ 2007-12-14 10:40 UTC (permalink / raw To: Nguyen Thai Ngoc Duy; +Cc: Paolo Bonzini, git, gcc "Nguyen Thai Ngoc Duy" <pclouds@gmail.com> writes: > On Dec 14, 2007 1:14 PM, Paolo Bonzini <bonzini@gnu.org> wrote: > > > Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of > > > all things. > > > > I found that the .keep file is not transmitted over the network (at > > least I tried with git+ssh:// and http:// protocols), however. > > I'm thinking about "git clone --keep" to mark initial packs precious. > But 'git clone' is under rewrite to C. Let's wait until C rewrite is > done. But if you clone via network, pack might be network optimized if you use "smart" transport, not disk optimized, at least with current git which regenerates pack also on clone AFAIK. -- Jakub Narebski Poland ShadeHawk on #git ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-14 10:40 ` Jakub Narebski @ 2007-12-14 10:52 ` Nguyen Thai Ngoc Duy 0 siblings, 0 replies; 82+ messages in thread From: Nguyen Thai Ngoc Duy @ 2007-12-14 10:52 UTC (permalink / raw To: Jakub Narebski, Harvey Harrison; +Cc: Paolo Bonzini, git, gcc On Dec 14, 2007 4:01 PM, Harvey Harrison <harvey.harrison@gmail.com> wrote: > While it doesn't mark the packs as .keep, git will reuse all of the old > deltas you got in the original clone, so you're not losing anything. There is another reason I want it. I have an ~800MB pack and I don't want git to rewrite the pack every time I repack my changes. So it's also about disk usage (it doesn't require 800MB on disk to prepare the new pack, and doesn't write as much). On Dec 14, 2007 5:40 PM, Jakub Narebski <jnareb@gmail.com> wrote: > But if you clone via network, pack might be network optimized if you > use "smart" transport, not disk optimized, at least with current git > which regenerates pack also on clone AFAIK. Um.. that's OK, it just regenerates once. -- Duy ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-14 6:14 ` Paolo Bonzini 2007-12-14 6:24 ` Nguyen Thai Ngoc Duy @ 2007-12-14 13:25 ` Nicolas Pitre 1 sibling, 0 replies; 82+ messages in thread From: Nicolas Pitre @ 2007-12-14 13:25 UTC (permalink / raw To: Paolo Bonzini; +Cc: git, gcc On Fri, 14 Dec 2007, Paolo Bonzini wrote: > > Hmmm... it is even documented in git-gc(1)... and git-index-pack(1) of > > all things. > > I found that the .keep file is not transmitted over the network (at least I > tried with git+ssh:// and http:// protocols), however. That is a local policy. Nicolas ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-12 5:12 ` Nicolas Pitre 2007-12-12 8:05 ` David Kastrup 2007-12-12 15:48 ` Nicolas Pitre @ 2007-12-12 16:13 ` Nicolas Pitre 2007-12-13 7:32 ` Andreas Ericsson 2 siblings, 1 reply; 82+ messages in thread From: Nicolas Pitre @ 2007-12-12 16:13 UTC (permalink / raw To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List On Wed, 12 Dec 2007, Nicolas Pitre wrote: > I did modify the progress display to show accounted memory that was > allocated vs memory that was freed but still not released to the system. > At least that gives you an idea of memory allocation and fragmentation > with glibc in real time: > > diff --git a/progress.c b/progress.c > index d19f80c..46ac9ef 100644 > --- a/progress.c > +++ b/progress.c > @@ -8,6 +8,7 @@ > * published by the Free Software Foundation. > */ > > +#include <malloc.h> > #include "git-compat-util.h" > #include "progress.h" > > @@ -94,10 +95,12 @@ static int display(struct progress *progress, unsigned n, const char *done) > if (progress->total) { > unsigned percent = n * 100 / progress->total; > if (percent != progress->last_percent || progress_update) { > + struct mallinfo m = mallinfo(); > progress->last_percent = percent; > - fprintf(stderr, "%s: %3u%% (%u/%u)%s%s", > - progress->title, percent, n, > - progress->total, tp, eol); > + fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s", > + progress->title, percent, n, progress->total, > + m.uordblks >> 18, m.fordblks >> 18, > + tp, eol); Note: I didn't know what unit of memory those blocks represent, so the shift is most probably wrong. Nicolas ^ permalink raw reply [flat|nested] 82+ messages in thread
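For what it's worth, glibc's mallinfo reports uordblks and fordblks in bytes, so a 20-bit shift would print megabytes (note the struct fields are plain int, so the values wrap above 2GB); the fprintf in the patch above would become:

    fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s",
            progress->title, percent, n, progress->total,
            m.uordblks >> 20, m.fordblks >> 20,
            tp, eol);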
* Re: Something is broken in repack 2007-12-12 16:13 ` Nicolas Pitre @ 2007-12-13 7:32 ` Andreas Ericsson 2007-12-14 16:03 ` Wolfram Gloger 0 siblings, 1 reply; 82+ messages in thread From: Andreas Ericsson @ 2007-12-13 7:32 UTC (permalink / raw To: Nicolas Pitre; +Cc: Jon Smirl, Junio C Hamano, gcc, Git Mailing List Nicolas Pitre wrote: > On Wed, 12 Dec 2007, Nicolas Pitre wrote: > >> I did modify the progress display to show accounted memory that was >> allocated vs memory that was freed but still not released to the system. >> At least that gives you an idea of memory allocation and fragmentation >> with glibc in real time: >> >> diff --git a/progress.c b/progress.c >> index d19f80c..46ac9ef 100644 >> --- a/progress.c >> +++ b/progress.c >> @@ -8,6 +8,7 @@ >> * published by the Free Software Foundation. >> */ >> >> +#include <malloc.h> >> #include "git-compat-util.h" >> #include "progress.h" >> >> @@ -94,10 +95,12 @@ static int display(struct progress *progress, unsigned n, const char *done) >> if (progress->total) { >> unsigned percent = n * 100 / progress->total; >> if (percent != progress->last_percent || progress_update) { >> + struct mallinfo m = mallinfo(); >> progress->last_percent = percent; >> - fprintf(stderr, "%s: %3u%% (%u/%u)%s%s", >> - progress->title, percent, n, >> - progress->total, tp, eol); >> + fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s", >> + progress->title, percent, n, progress->total, >> + m.uordblks >> 18, m.fordblks >> 18, >> + tp, eol); > > Note: I didn't know what unit of memory those blocks represents, so the > shift is most probably wrong. > Me neither, but it appears to me as if hblkhd holds the actual memory consumed by the process. It seems to store the information in bytes, which I find a bit dubious unless glibc has some internal multiplier. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-13 7:32 ` Andreas Ericsson @ 2007-12-14 16:03 ` Wolfram Gloger 0 siblings, 0 replies; 82+ messages in thread From: Wolfram Gloger @ 2007-12-14 16:03 UTC (permalink / raw To: ae; +Cc: nico, jonsmirl, gitster, gcc, git Hi, > >> if (progress->total) { > >> unsigned percent = n * 100 / progress->total; > >> if (percent != progress->last_percent || progress_update) { > >> + struct mallinfo m = mallinfo(); > >> progress->last_percent = percent; > >> - fprintf(stderr, "%s: %3u%% (%u/%u)%s%s", > >> - progress->title, percent, n, > >> - progress->total, tp, eol); > >> + fprintf(stderr, "%s: %3u%% (%u/%u) %u/%uMB%s%s", > >> + progress->title, percent, n, progress->total, > >> + m.uordblks >> 18, m.fordblks >> 18, > >> + tp, eol); > > > > Note: I didn't know what unit of memory those blocks represents, so the > > shift is most probably wrong. > > > > Me neither, but it appears to me as if hblkhd holds the actual memory > consumed by the process. It seems to store the information in bytes, > which I find a bit dubious unless glibc has some internal multiplier. mallinfo() will only give you the used memory for the main arena. When you have separate arenas (likely when concurrent threads have been used), the only way to get the full picture is to call malloc_stats(), which prints to stderr. Regards, Wolfram. ^ permalink raw reply [flat|nested] 82+ messages in thread
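A sketch of Wolfram's alternative in the same spot of the progress code (malloc_stats() is the real glibc call; exactly where to hook it is a judgment call):

    #include <malloc.h>
    ...
    /* prints per-arena and total in-use/system bytes to stderr */
    malloc_stats();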
* Re: Something is broken in repack 2007-12-11 7:01 ` Jon Smirl 2007-12-11 7:34 ` Andreas Ericsson 2007-12-11 13:49 ` Nicolas Pitre @ 2007-12-11 16:33 ` Linus Torvalds 2007-12-11 17:21 ` Nicolas Pitre 2007-12-11 18:43 ` Jon Smirl 2007-12-11 17:28 ` Daniel Berlin 3 siblings, 2 replies; 82+ messages in thread From: Linus Torvalds @ 2007-12-11 16:33 UTC (permalink / raw To: Jon Smirl; +Cc: Nicolas Pitre, Junio C Hamano, gcc, Git Mailing List On Tue, 11 Dec 2007, Jon Smirl wrote: > > So why does our threaded code take 20 CPU minutes longer (12%) to run > than the same code with a single thread? Threaded code *always* takes more CPU time. The only thing you can hope for is a wall-clock reduction. You're seeing probably a combination of (a) more cache misses (b) bigger dataset active at a time and a probably fairly minuscule (c) threading itself tends to have some overheads. > Q6600 is just two E6600s in the same package, the caches are not shared. Sure they are shared. They're just not *entirely* shared. But they are shared between each two cores, so each thread essentially has only half the cache they had with the non-threaded version. Threading is *not* a magic solution to all problems. It gives you potentially twice the CPU power, but there are real downsides that you should keep in mind. > Why does the threaded code need 2.24GB (google allocator, 2.85GB gcc) > with 4 threads? But only need 950MB with one thread? Where's the extra > gigabyte going? I suspect that it's really simple: you have a few rather big files in the gcc history, with deep delta chains. And what happens when you have four threads running at the same time is that they all need to keep all those objects that they are working on - and their hash state - in memory at the same time! So if you want to use more threads, that _forces_ you to have a bigger memory footprint, simply because you have more "live" objects that you work on. Normally, that isn't much of a problem, since most source files are small, but if you have a few deep delta chains on big files, the delta chain itself is going to use memory (you may have limited the size of the cache, but it's still needed for the actual delta generation, so it's not like the memory usage went away). That said, I suspect there are a few things fighting you: - threading is hard. I haven't looked a lot at the changes Nico did to do a threaded object packer, but what I've seen does not convince me it is correct. The "trg_entry" accesses are *mostly* protected with "cache_lock", but nothing else really seems to be, so quite frankly, I wouldn't trust the threaded version very much. It's off by default, and for a good reason, I think. For example: the packing code does this: if (!src->data) { read_lock(); src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz); read_unlock(); ... and that's racy. If two threads come in at roughly the same time and see a NULL src->data, they'll both get the lock, and they'll both (serially) try to fill it in. It will all *work*, but one of them will have done unnecessary work, and one of them will have their result thrown away and leaked. Are you hitting issues like this? I dunno. The object sorting means that different threads normally shouldn't look at the same objects (not even the sources), so probably not, but basically, I wouldn't trust the threading 100%. It needs work, and it needs to stay off by default. - you're working on a problem that isn't really even worth optimizing that much. 
The *normal* case is to re-use old deltas, which makes all of the issues you are fighting basically go away (because you only have a few _incremental_ objects that need deltaing). In other words: the _real_ optimizations have already been done, and are done elsewhere, and are much smarter (the best way to optimize X is not to make X run fast, but to avoid doing X in the first place!). The thing you are trying to work with is the one-time-only case where you explicitly disable that big and important optimization, and then you complain about the end result being slow! It's like saying that you're compiling with extreme debugging and no optimizations, and then complaining that the end result doesn't run as fast as if you used -O2. Except this is a hundred times worse, because you literally asked git to do the really expensive thing that it really really doesn't want to do ;) > Is there another allocator to try? One that combines Google's > efficiency with gcc's speed? See above: I'd look around at threading-related bugs and check the way we lock (or don't) accesses. Linus ^ permalink raw reply [flat|nested] 82+ messages in thread
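For illustration, a self-contained sketch of the lazy-initialization pattern Linus points at, with the usual remedy of re-checking under the lock; load_object() and ensure_loaded() are hypothetical stand-ins rather than the actual pack-objects code, and memory-ordering subtleties are ignored:

    #include <pthread.h>
    #include <stdlib.h>

    static pthread_mutex_t read_mutex = PTHREAD_MUTEX_INITIALIZER;

    struct unpacked {
        void *data;
        unsigned long size;
    };

    /* Stand-in for read_sha1_file(): an expensive, non-reentrant load. */
    static void *load_object(unsigned long *sz)
    {
        *sz = 4096;
        return calloc(1, *sz);
    }

    static void ensure_loaded(struct unpacked *src)
    {
        if (!src->data) {                    /* cheap unlocked peek */
            pthread_mutex_lock(&read_mutex);
            if (!src->data)                  /* re-check under the lock */
                src->data = load_object(&src->size);
            pthread_mutex_unlock(&read_mutex);
        }
    }

With the unlocked check alone, two threads can both observe NULL and both perform the load, and the loser's buffer is wasted or leaked; the second check closes that window. (As Nicolas explains below, pack-objects actually sidesteps the issue entirely: no two threads ever share an entry.)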
* Re: Something is broken in repack 2007-12-11 16:33 ` Linus Torvalds @ 2007-12-11 17:21 ` Nicolas Pitre 2007-12-11 17:24 ` David Miller 2007-12-11 18:43 ` Jon Smirl 1 sibling, 1 reply; 82+ messages in thread From: Nicolas Pitre @ 2007-12-11 17:21 UTC (permalink / raw To: Linus Torvalds; +Cc: Jon Smirl, Junio C Hamano, gcc, Git Mailing List [-- Attachment #1: Type: TEXT/PLAIN, Size: 4146 bytes --] On Tue, 11 Dec 2007, Linus Torvalds wrote: > That said, I suspect there are a few things fighting you: > > - threading is hard. I haven't looked a lot at the changes Nico did to do > a threaded object packer, but what I've seen does not convince me it is > correct. The "trg_entry" accesses are *mostly* protected with > "cache_lock", but nothing else really seems to be, so quite frankly, I > wouldn't trust the threaded version very much. It's off by default, and > for a good reason, I think. I beg to differ (of course, since I always know precisely what I do, and like you, my code never has bugs). Seriously though, the trg_entry does not have to be protected at all. Why? Simply because each thread has its own exclusive set of objects which no other threads ever mess with. They never overlap. > For example: the packing code does this: > > if (!src->data) { > read_lock(); > src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz); > read_unlock(); > ... > > and that's racy. If two threads come in at roughly the same time and > see a NULL src->data, they'll both get the lock, and they'll both > (serially) try to fill it in. It will all *work*, but one of them will > have done unnecessary work, and one of them will have their result > thrown away and leaked. No. Once again, it is impossible for two threads to ever see the same src->data at all. The lock is there simply because read_sha1_file() is not reentrant. > Are you hitting issues like this? I dunno. The object sorting means > that different threads normally shouldn't look at the same objects (not > even the sources), so probably not, but basically, I wouldn't trust the > threading 100%. It needs work, and it needs to stay off by default. For now it is, but I wouldn't say it really needs significant work at this point. The latest thread patches were more about tuning than correctness. What the threading could be doing, though, is uncovering some other bugs, like in the pack mmap windowing code for example. Although that code is serialized by the read lock above, the fact that multiple threads are hammering on it in turns means that the mmap window is possibly seeking back and forth much more often than otherwise, possibly leaking something in the process. > - you're working on a problem that isn't really even worth optimizing > that much. The *normal* case is to re-use old deltas, which makes all > of the issues you are fighting basically go away (because you only have > a few _incremental_ objects that need deltaing). > > In other words: the _real_ optimizations have already been done, and > are done elsewhere, and are much smarter (the best way to optimize X is > not to make X run fast, but to avoid doing X in the first place!). The > thing you are trying to work with is the one-time-only case where you > explicitly disable that big and important optimization, and then you > complain about the end result being slow! > > It's like saying that you're compiling with extreme debugging and no > optimizations, and then complaining that the end result doesn't run as > fast as if you used -O2. 
Except this is a hundred times worse, because > you literally asked git to do the really expensive thing that it really > really doesn't want to do ;) Linus, please pay attention to the _actual_ important issue here. Sure, I've been tuning the threading code in parallel to the attempt to debug this memory usage issue. BUT. The point is that repacking the gcc repo using "git repack -a -f --window=250" has a radically different memory usage profile depending on whether you do the repack on the earlier 2.1GB pack or the later 300MB pack. _That_ is the issue. Ironically, it is the 300MB pack that causes the repack to blow memory usage out of proportion. And in both cases, the threading code has to do the same work whether or not the original pack was densely packed, since -f throws away all existing deltas anyway. So something is fishy somewhere other than in the packing code. Nicolas ^ permalink raw reply [flat|nested] 82+ messages in thread
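A sketch of the partitioning Nicolas describes, under the assumption that the object list is split into disjoint contiguous slices, one per thread; the names are illustrative rather than the actual pack-objects structures:

    #include <pthread.h>

    struct object_entry;                 /* opaque for this sketch */

    struct thread_params {
        pthread_t thread;
        struct object_entry **list;      /* this thread's exclusive slice */
        unsigned list_size;
    };

    /* Hand each of nr_threads workers a disjoint slice of the list, so
     * no entry (and hence no trg_entry) is ever visible to two threads. */
    static void partition_work(struct object_entry **list, unsigned nr,
                               struct thread_params *p, int nr_threads)
    {
        unsigned chunk = nr / nr_threads;
        int i;

        for (i = 0; i < nr_threads; i++) {
            unsigned size = (i == nr_threads - 1) ? nr : chunk;
            p[i].list = list;
            p[i].list_size = size;
            list += size;
            nr -= size;
        }
    }

Because the slices never overlap, per-entry fields need no locking at all; only shared services, such as the non-reentrant read_sha1_file() and the delta cache accounting, take a lock.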
* Re: Something is broken in repack 2007-12-11 17:21 ` Nicolas Pitre @ 2007-12-11 17:24 ` David Miller 2007-12-11 17:44 ` Nicolas Pitre 0 siblings, 1 reply; 82+ messages in thread From: David Miller @ 2007-12-11 17:24 UTC (permalink / raw To: nico; +Cc: torvalds, jonsmirl, gitster, gcc, git From: Nicolas Pitre <nico@cam.org> Date: Tue, 11 Dec 2007 12:21:11 -0500 (EST) > BUT. The point is that repacking the gcc repo using "git repack -a -f > --window=250" has a radically different memory usage profile depending on > whether you do the repack on the earlier 2.1GB pack or the later 300MB pack. If you repack on the smaller pack file, git has to expand more stuff internally in order to search the deltas, whereas with the larger pack file I bet git has to undelta'ify less often to get base object blobs for delta search. In fact that behavior makes perfect sense to me, and I don't understand GIT internals very well :-) ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-11 17:24 ` David Miller @ 2007-12-11 17:44 ` Nicolas Pitre 2007-12-11 20:26 ` Andreas Ericsson 0 siblings, 1 reply; 82+ messages in thread From: Nicolas Pitre @ 2007-12-11 17:44 UTC (permalink / raw To: David Miller; +Cc: Linus Torvalds, jonsmirl, Junio C Hamano, gcc, git On Tue, 11 Dec 2007, David Miller wrote: > From: Nicolas Pitre <nico@cam.org> > Date: Tue, 11 Dec 2007 12:21:11 -0500 (EST) > > > BUT. The point is that repacking the gcc repo using "git repack -a -f > > --window=250" has a radically different memory usage profile depending on > > whether you do the repack on the earlier 2.1GB pack or the later 300MB pack. > > If you repack on the smaller pack file, git has to expand more stuff > internally in order to search the deltas, whereas with the larger pack > file I bet git has to undelta'ify less often to get base object blobs > for delta search. Of course. I came to that conclusion two days ago. And despite being pretty familiar with the involved code (I wrote part of it myself) I just can't spot anything wrong with it so far. But somehow the threading code keeps distracting people from that issue since it gets to do the same work whether or not the source pack is densely packed. Nicolas (who wishes he had access to a much faster machine to investigate this issue) ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-11 17:44 ` Nicolas Pitre @ 2007-12-11 20:26 ` Andreas Ericsson 0 siblings, 0 replies; 82+ messages in thread From: Andreas Ericsson @ 2007-12-11 20:26 UTC (permalink / raw To: Nicolas Pitre Cc: David Miller, Linus Torvalds, jonsmirl, Junio C Hamano, gcc, git Nicolas Pitre wrote: > On Tue, 11 Dec 2007, David Miller wrote: > >> From: Nicolas Pitre <nico@cam.org> >> Date: Tue, 11 Dec 2007 12:21:11 -0500 (EST) >> >>> BUT. The point is that repacking the gcc repo using "git repack -a -f >>> --window=250" has a radically different memory usage profile depending on whether you >>> do the repack on the earlier 2.1GB pack or the later 300MB pack. >> If you repack on the smaller pack file, git has to expand more stuff >> internally in order to search the deltas, whereas with the larger pack >> file I bet git has to undelta'ify less often to get base object blobs >> for delta search. > > Of course. I came to that conclusion two days ago. And despite being > pretty familiar with the involved code (I wrote part of it myself) I > just can't spot anything wrong with it so far. > > But somehow the threading code keeps distracting people from that issue > since it gets to do the same work whether or not the source pack is > densely packed. > > Nicolas > (who wishes he had access to a much faster machine to investigate this issue) If it's still an issue next week, we'll have a 16-core (8 dual-core CPUs) machine with some 32GB of RAM coming in that'll be free for about two days. You'll have to remind me about it though, as I've got a lot on my mind these days. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-11 16:33 ` Linus Torvalds 2007-12-11 17:21 ` Nicolas Pitre @ 2007-12-11 18:43 ` Jon Smirl 2007-12-11 18:57 ` Nicolas Pitre 2007-12-11 19:17 ` Linus Torvalds 1 sibling, 2 replies; 82+ messages in thread From: Jon Smirl @ 2007-12-11 18:43 UTC (permalink / raw To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, gcc, Git Mailing List On 12/11/07, Linus Torvalds <torvalds@linux-foundation.org> wrote: > > > On Tue, 11 Dec 2007, Jon Smirl wrote: > > > > So why does our threaded code take 20 CPU minutes longer (12%) to run > > than the same code with a single thread? > > Threaded code *always* takes more CPU time. The only thing you can hope > for is a wall-clock reduction. You're seeing probably a combination of > (a) more cache misses > (b) bigger dataset active at a time > and a probably fairly minuscule > (c) threading itself tends to have some overheads. > > > Q6600 is just two E6600s in the same package, the caches are not shared. > > Sure they are shared. They're just not *entirely* shared. But they are > shared between each two cores, so each thread essentially has only half > the cache they had with the non-threaded version. > > Threading is *not* a magic solution to all problems. It gives you > potentially twice the CPU power, but there are real downsides that you > should keep in mind. > > > Why does the threaded code need 2.24GB (google allocator, 2.85GB gcc) > > with 4 threads? But only need 950MB with one thread? Where's the extra > > gigabyte going? > > I suspect that it's really simple: you have a few rather big files in the > gcc history, with deep delta chains. And what happens when you have four > threads running at the same time is that they all need to keep all those > objects that they are working on - and their hash state - in memory at the > same time! > > So if you want to use more threads, that _forces_ you to have a bigger > memory footprint, simply because you have more "live" objects that you > work on. Normally, that isn't much of a problem, since most source files > are small, but if you have a few deep delta chains on big files, both the > delta chain itself is going to use memory (you may have limited the size > of the cache, but it's still needed for the actual delta generation, so > it's not like the memory usage went away). This makes sense. Those runs that blew up to 4.5GB were a combination of this effect and fragmentation in the gcc allocator. Google allocator appears to be much better at controlling fragmentation. Is there a reasonable scheme to force the chains to only be loaded once and then shared between worker threads? The memory blow up appears to be directly correlated with chain length. > > That said, I suspect there are a few things fighting you: > > - threading is hard. I haven't looked a lot at the changes Nico did to do > a threaded object packer, but what I've seen does not convince me it is > correct. The "trg_entry" accesses are *mostly* protected with > "cache_lock", but nothing else really seems to be, so quite frankly, I > wouldn't trust the threaded version very much. It's off by default, and > for a good reason, I think. > > For example: the packing code does this: > > if (!src->data) { > read_lock(); > src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz); > read_unlock(); > ... > > and that's racy. If two threads come in at roughly the same time and > see a NULL src->data, they'll both get the lock, and they'll both > (serially) try to fill it in. 
It will all *work*, but one of them will > have done unnecessary work, and one of them will have their result > thrown away and leaked. That may account for the threaded version needing an extra 20 minutes CPU time. An extra 12% of CPU seems like too much overhead for threading. Just letting a couple of those long chain compressions be done twice could account for it. > > Are you hitting issues like this? I dunno. The object sorting means > that different threads normally shouldn't look at the same objects (not > even the sources), so probably not, but basically, I wouldn't trust the > threading 100%. It needs work, and it needs to stay off by default. > > - you're working on a problem that isn't really even worth optimizing > that much. The *normal* case is to re-use old deltas, which makes all > of the issues you are fighting basically go away (because you only have > a few _incremental_ objects that need deltaing). I agree, this problem only occurs when people import giant repositories. But every time someone hits these problems they declare git to be screwed up and proceed to trash it in their blogs. > In other words: the _real_ optimizations have already been done, and > are done elsewhere, and are much smarter (the best way to optimize X is > not to make X run fast, but to avoid doing X in the first place!). The > thing you are trying to work with is the one-time-only case where you > explicitly disable that big and important optimization, and then you > complain about the end result being slow! > > It's like saying that you're compiling with extreme debugging and no > optimizations, and then complaining that the end result doesn't run as > fast as if you used -O2. Except this is a hundred times worse, because > you literally asked git to do the really expensive thing that it really > really doesn't want to do ;) > > > Is there another allocator to try? One that combines Google's > > efficiency with gcc's speed? > > See above: I'd look around at threading-related bugs and check the way we > lock (or don't) accesses. > > Linus > -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-11 18:43 ` Jon Smirl @ 2007-12-11 18:57 ` Nicolas Pitre 2007-12-11 19:17 ` Linus Torvalds 1 sibling, 0 replies; 82+ messages in thread From: Nicolas Pitre @ 2007-12-11 18:57 UTC (permalink / raw To: Jon Smirl; +Cc: Linus Torvalds, Junio C Hamano, gcc, Git Mailing List [-- Attachment #1: Type: TEXT/PLAIN, Size: 3040 bytes --] On Tue, 11 Dec 2007, Jon Smirl wrote: > This makes sense. Those runs that blew up to 4.5GB were a combination > of this effect and fragmentation in the gcc allocator. I disagree. This is insane. > Google allocator appears to be much better at controlling fragmentation. Indeed. And if fragmentation is indeed wasting half of Git's memory usage then we'll have to come up with a custom memory allocator. > Is there a reasonable scheme to force the chains to only be loaded > once and then shared between worker threads? The memory blow up > appears to be directly correlated with chain length. No. That would be the equivalent of holding each revision of all files uncompressed all at once in memory. > > That said, I suspect there are a few things fighting you: > > > > - threading is hard. I haven't looked a lot at the changes Nico did to do > > a threaded object packer, but what I've seen does not convince me it is > > correct. The "trg_entry" accesses are *mostly* protected with > > "cache_lock", but nothing else really seems to be, so quite frankly, I > > wouldn't trust the threaded version very much. It's off by default, and > > for a good reason, I think. > > > > For example: the packing code does this: > > > > if (!src->data) { > > read_lock(); > > src->data = read_sha1_file(src_entry->idx.sha1, &type, &sz); > > read_unlock(); > > ... > > > > and that's racy. If two threads come in at roughly the same time and > > see a NULL src->data, they'll both get the lock, and they'll both > > (serially) try to fill it in. It will all *work*, but one of them will > > have done unnecessary work, and one of them will have their result > > thrown away and leaked. > > That may account for the threaded version needing an extra 20 minutes > CPU time. An extra 12% of CPU seems like too much overhead for > threading. Just letting a couple of those long chain compressions be > done twice could account for it. No it may not. This theory is wrong as explained before. > > > > Are you hitting issues like this? I dunno. The object sorting means > > that different threads normally shouldn't look at the same objects (not > > even the sources), so probably not, but basically, I wouldn't trust the > > threading 100%. It needs work, and it needs to stay off by default. > > > > - you're working on a problem that isn't really even worth optimizing > > that much. The *normal* case is to re-use old deltas, which makes all > > of the issues you are fighting basically go away (because you only have > > a few _incremental_ objects that need deltaing). > > I agree, this problem only occurs when people import giant > repositories. But every time someone hits these problems they declare > git to be screwed up and proceed to trash it in their blogs. It's not only for repack. Someone just reported git-blame being unusable too due to insane memory usage, which I suspect is due to the same issue. Nicolas ^ permalink raw reply [flat|nested] 82+ messages in thread
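If a custom allocator ever became necessary, one classic shape is a pool ("bump") allocator that carves small, same-lifetime objects out of large blocks, sidestepping per-chunk fragmentation; this is a generic sketch, not something proposed in the thread:

    #include <stdlib.h>

    struct mem_pool {
        struct mem_pool *next;
        size_t used, cap;
        unsigned char space[];           /* block the objects live in */
    };

    /* Allocate len bytes from the current block, starting a new one
     * when it is full; individual frees are given up in exchange for
     * freeing whole blocks at once and near-zero fragmentation. */
    static void *pool_alloc(struct mem_pool **head, size_t len)
    {
        struct mem_pool *p = *head;

        len = (len + 7) & ~(size_t)7;    /* keep 8-byte alignment */
        if (!p || p->cap - p->used < len) {
            size_t cap = len > (1 << 20) ? len : (1 << 20);
            p = malloc(sizeof(*p) + cap);
            if (!p)
                return NULL;
            p->next = *head;
            p->used = 0;
            p->cap = cap;
            *head = p;
        }
        p->used += len;
        return p->space + p->used - len;
    }

The trade-off is that nothing can be freed early, so it only fits allocations whose lifetimes genuinely end together.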
* Re: Something is broken in repack 2007-12-11 18:43 ` Jon Smirl 2007-12-11 18:57 ` Nicolas Pitre @ 2007-12-11 19:17 ` Linus Torvalds 2007-12-11 19:40 ` Junio C Hamano 1 sibling, 1 reply; 82+ messages in thread From: Linus Torvalds @ 2007-12-11 19:17 UTC (permalink / raw To: Jon Smirl; +Cc: Nicolas Pitre, Junio C Hamano, gcc, Git Mailing List On Tue, 11 Dec 2007, Jon Smirl wrote: > > > > So if you want to use more threads, that _forces_ you to have a bigger > > memory footprint, simply because you have more "live" objects that you > > work on. Normally, that isn't much of a problem, since most source files > > are small, but if you have a few deep delta chains on big files, both the > > delta chain itself is going to use memory (you may have limited the size > > of the cache, but it's still needed for the actual delta generation, so > > it's not like the memory usage went away). > > This makes sense. Those runs that blew up to 4.5GB were a combination > of this effect and fragmentation in the gcc allocator. Google > allocator appears to be much better at controlling fragmentation. Yes. I think we do have some case where we simply keep a lot of objects around, and if we are talking reasonably large deltas, we'll have the whole delta-chain in memory just to unpack one single object. The delta cache size limits kick in only when we explicitly cache old delta results (in case they will be re-used, which is rather common); they don't affect the normal "I'm using this data right now" case at all. And then fragmentation makes it much much worse. Since the allocation patterns aren't nice (they are pretty random and depend on just the sizes of the objects), and the lifetimes aren't always nicely nested _either_ (they become more so when you disable the cache entirely, but that's just death for performance), I'm not surprised that there can be memory allocators that end up having some issues. > Is there a reasonable scheme to force the chains to only be loaded > once and then shared between worker threads? The memory blow up > appears to be directly correlated with chain length. The worker threads explicitly avoid touching the same objects, and no, you definitely don't want to explode the chains globally once, because the whole point is that we do fit 15 years' worth of history into 300MB of pack-file thanks to having a very dense representation. The "loaded once" part is the mmap'ing of the pack-file into memory, but if you were to actually then try to expand the chains, you'd be talking about many *many* more gigabytes of memory than you already see used ;) So what you actually want to do is to just re-use already packed delta chains directly, which is what we normally do. But you are explicitly looking at the "--no-reuse-delta" (aka "git repack -f") case, which is why it then blows up. I'm sure we can find places to improve. But I would like to reiterate the statement that you're kind of doing a "don't do that then" case which is really - by design - meant to be done once and never again, and is using resources - again, pretty much by design - wildly inappropriately just to get an initial packing done. > That may account for the threaded version needing an extra 20 minutes > CPU time. An extra 12% of CPU seems like too much overhead for > threading. Just letting a couple of those long chain compressions be > done twice could account for it. Well, Nico pointed out that those things should all be thread-private data, so no, the race isn't there (unless there's some other bug there). 
> I agree, this problem only occurs when people import giant > repositories. But every time someone hits these problems they declare > git to be screwed up and proceed to trash it in their blogs. Sure. I'd love to do global packing without paying the cost, but it really was a design decision. Thanks to doing off-line packing ("let it run overnight on some beefy machine") we can get better results. It's expensive, yes. But it was pretty much meant to be expensive. It's a very efficient compression algorithm, after all, and you're turning it up to eleven ;) I also suspect that the gcc archive makes things more interesting thanks to having some rather large files. The ChangeLog is probably the worst case (large file with *lots* of edits), but I suspect the *.po files aren't wonderful either. Linus ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-11 19:17 ` Linus Torvalds @ 2007-12-11 19:40 ` Junio C Hamano 2007-12-11 20:34 ` Andreas Ericsson 0 siblings, 1 reply; 82+ messages in thread From: Junio C Hamano @ 2007-12-11 19:40 UTC (permalink / raw To: Linus Torvalds; +Cc: Jon Smirl, Nicolas Pitre, gcc, Git Mailing List Linus Torvalds <torvalds@linux-foundation.org> writes: > On Tue, 11 Dec 2007, Jon Smirl wrote: >> > >> > So if you want to use more threads, that _forces_ you to have a bigger >> > memory footprint, simply because you have more "live" objects that you >> > work on. Normally, that isn't much of a problem, since most source files >> > are small, but if you have a few deep delta chains on big files, both the >> > delta chain itself is going to use memory (you may have limited the size >> > of the cache, but it's still needed for the actual delta generation, so >> > it's not like the memory usage went away). >> >> This makes sense. Those runs that blew up to 4.5GB were a combination >> of this effect and fragmentation in the gcc allocator. Google >> allocator appears to be much better at controlling fragmentation. > > Yes. I think we do have some case where we simply keep a lot of objects > around, and if we are talking reasonably large deltas, we'll have the > whole delta-chain in memory just to unpack one single object. Eh, excuse me. unpack_delta_entry() - first unpacks the base object (this goes recursive); - uncompresses the delta; - applies the delta to the base to obtain the target object; - frees delta; - frees (but allows it to be cached) the base object; - returns the result So no matter how deep a chain is, you keep only one delta at a time in core, not the whole delta-chain in core. > So what you actually want to do is to just re-use already packed delta > chains directly, which is what we normally do. But you are explicitly > looking at the "--no-reuse-delta" (aka "git repack -f") case, which is why > it then blows up. While that does not explain, as Nico pointed out, the huge difference between the two repack runs that have different starting packs, I would say it is a fair thing to say. If you have a suboptimal pack (i.e. not enough reusable deltas, as in the 2.1GB pack case), do run "repack -f", but if you have a good pack (i.e. 300MB pack), don't. ^ permalink raw reply [flat|nested] 82+ messages in thread
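In pseudo-C, the flow Junio outlines looks roughly like this; patch_delta() is git's real delta application routine, while the pack handle and the other helpers are hypothetical stand-ins:

    #include <stdlib.h>
    #include <sys/types.h>               /* off_t */

    struct pack;                         /* hypothetical pack handle */

    /* hypothetical stand-ins for the real pack access routines */
    extern void *unpack_entry_sketch(struct pack *, off_t, unsigned long *);
    extern off_t delta_base_offset(struct pack *, off_t);
    extern void *inflate_delta(struct pack *, off_t, unsigned long *);
    /* the real thing, from patch-delta.c */
    extern void *patch_delta(const void *src, unsigned long src_size,
                             const void *delta, unsigned long delta_size,
                             unsigned long *dst_size);

    static void *unpack_delta_entry_sketch(struct pack *p, off_t obj_off,
                                           unsigned long *sizep)
    {
        unsigned long base_size, delta_size;
        void *base, *delta, *result;

        /* unpack the base object -- the recursive step */
        base = unpack_entry_sketch(p, delta_base_offset(p, obj_off),
                                   &base_size);
        /* uncompress this object's delta */
        delta = inflate_delta(p, obj_off, &delta_size);
        /* apply the delta to the base to obtain the target object */
        result = patch_delta(base, base_size, delta, delta_size, sizep);
        free(delta);                     /* delta dies as soon as applied */
        free(base);                      /* base released (git may cache it) */
        return result;
    }

So at any instant roughly one base, one delta and one result are in core, independent of how deep the chain is.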
* Re: Something is broken in repack 2007-12-11 19:40 ` Junio C Hamano @ 2007-12-11 20:34 ` Andreas Ericsson 0 siblings, 0 replies; 82+ messages in thread From: Andreas Ericsson @ 2007-12-11 20:34 UTC (permalink / raw To: Junio C Hamano Cc: Linus Torvalds, Jon Smirl, Nicolas Pitre, gcc, Git Mailing List Junio C Hamano wrote: > Linus Torvalds <torvalds@linux-foundation.org> writes: > >> So what you actually want to do is to just re-use already packed delta >> chains directly, which is what we normally do. But you are explicitly >> looking at the "--no-reuse-delta" (aka "git repack -f") case, which is why >> it then blows up. > > While that does not explain, as Nico pointed out, the huge difference > between the two repack runs that have different starting packs, I would > say it is a fair thing to say. If you have a suboptimal pack (i.e. not > enough reusable deltas, as in the 2.1GB pack case), do run "repack -f", > but if you have a good pack (i.e. 300MB pack), don't. I think this is too much of a mystery for a lot of people to let it go. Even I started looking into it, and I've got so little spare time just now that I wouldn't stand much of a chance of making a contribution even if I had written the code originally. That being said, I think the fact that some git repositories really *can't* be repacked on some machines (because repacking eats ALL virtual memory) is really something that lowers git's reputation among huge projects. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-11 7:01 ` Jon Smirl ` (2 preceding siblings ...) 2007-12-11 16:33 ` Linus Torvalds @ 2007-12-11 17:28 ` Daniel Berlin 3 siblings, 0 replies; 82+ messages in thread From: Daniel Berlin @ 2007-12-11 17:28 UTC (permalink / raw To: Jon Smirl; +Cc: Nicolas Pitre, Junio C Hamano, gcc, Git Mailing List On 12/11/07, Jon Smirl <jonsmirl@gmail.com> wrote: > > Total CPU time 196 CPU minutes vs 190 for gcc. Google's claims of > being faster are not true. Depends on your allocation patterns. For our apps, it certainly is :) Of course, I don't know if we've updated the external allocator in a while, I'll bug the people in charge of it. ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-11 5:29 ` Jon Smirl 2007-12-11 7:01 ` Jon Smirl @ 2007-12-11 13:31 ` Nicolas Pitre 1 sibling, 0 replies; 82+ messages in thread From: Nicolas Pitre @ 2007-12-11 13:31 UTC (permalink / raw To: Jon Smirl; +Cc: Junio C Hamano, gcc, Git Mailing List On Tue, 11 Dec 2007, Jon Smirl wrote: > I added the gcc people to the CC, it's their repository. Maybe they > can help us sort this out. Unless there is a Git expert amongst the gcc crowd, I somehow doubt it. And gcc people with an interest in Git internals are probably already on the Git mailing list. Nicolas ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-11 5:25 ` Jon Smirl 2007-12-11 5:29 ` Jon Smirl @ 2007-12-11 6:01 ` Sean 2007-12-11 6:20 ` Jon Smirl 1 sibling, 1 reply; 82+ messages in thread From: Sean @ 2007-12-11 6:01 UTC (permalink / raw To: Jon Smirl; +Cc: Nicolas Pitre, Junio C Hamano, Git Mailing List On Tue, 11 Dec 2007 00:25:55 -0500 "Jon Smirl" <jonsmirl@gmail.com> wrote: > Something is hurting bad with threads. 170 CPU minutes with one > thread, versus 195 CPU minutes with four threads. > > Is there a different memory allocator that can be used when > multithreaded on gcc? This whole problem may be coming from the memory > allocation function. git is hardly interacting at all on the thread > level so it's likely a problem in the C run-time. You might want to try Google's malloc, it's basically a drop-in replacement with some optional built-in performance monitoring capabilities. It is said to be much faster and better at threading than glibc's: http://code.google.com/p/google-perftools/wiki/GooglePerformanceTools http://google-perftools.googlecode.com/svn/trunk/doc/tcmalloc.html You can LD_PRELOAD it or link directly. Cheers, Sean ^ permalink raw reply [flat|nested] 82+ messages in thread
* Re: Something is broken in repack 2007-12-11 6:01 ` Sean @ 2007-12-11 6:20 ` Jon Smirl 0 siblings, 0 replies; 82+ messages in thread From: Jon Smirl @ 2007-12-11 6:20 UTC (permalink / raw To: Sean; +Cc: Nicolas Pitre, Junio C Hamano, Git Mailing List On 12/11/07, Sean <seanlkml@sympatico.ca> wrote: > On Tue, 11 Dec 2007 00:25:55 -0500 > "Jon Smirl" <jonsmirl@gmail.com> wrote: > > > Something is hurting bad with threads. 170 CPU minutes with one > > thread, versus 195 CPU minutes with four threads. > > > > Is there a different memory allocator that can be used when > > multithreaded on gcc? This whole problem may be coming from the memory > > allocation function. git is hardly interacting at all on the thread > > level so it's likely a problem in the C run-time. > > You might want to try Google's malloc, it's basically a drop-in replacement > with some optional built-in performance monitoring capabilities. It is said > to be much faster and better at threading than glibc's: > > http://code.google.com/p/google-perftools/wiki/GooglePerformanceTools > > http://google-perftools.googlecode.com/svn/trunk/doc/tcmalloc.html > > > You can LD_PRELOAD it or link directly. I'm 45 minutes into a run using it. It doesn't seem to be any faster but it is reducing memory consumption significantly. The run should be done in another 20 minutes or so. > > Cheers, > Sean > -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 82+ messages in thread
end of thread, other threads:[~2007-12-14 17:00 UTC | newest] Thread overview: 82+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-12-07 23:05 Something is broken in repack Jon Smirl 2007-12-08 0:37 ` Linus Torvalds 2007-12-08 1:27 ` [PATCH] pack-objects: fix delta cache size accounting Nicolas Pitre 2007-12-08 1:46 ` Something is broken in repack Nicolas Pitre 2007-12-08 2:04 ` Jon Smirl 2007-12-08 2:28 ` Nicolas Pitre 2007-12-08 3:29 ` Jon Smirl 2007-12-08 3:37 ` David Brown 2007-12-08 4:22 ` Jon Smirl 2007-12-08 4:30 ` Nicolas Pitre 2007-12-08 5:01 ` Jon Smirl 2007-12-08 5:12 ` Nicolas Pitre 2007-12-08 3:48 ` Harvey Harrison 2007-12-08 2:22 ` Jon Smirl 2007-12-08 3:44 ` Harvey Harrison 2007-12-08 22:18 ` Junio C Hamano 2007-12-09 8:05 ` Junio C Hamano 2007-12-09 15:19 ` Jon Smirl 2007-12-09 18:25 ` Jon Smirl 2007-12-10 1:07 ` Nicolas Pitre 2007-12-10 2:49 ` Nicolas Pitre 2007-12-08 2:56 ` David Brown 2007-12-10 19:56 ` Nicolas Pitre 2007-12-10 20:05 ` Jon Smirl 2007-12-10 20:16 ` Morten Welinder 2007-12-11 2:25 ` Jon Smirl 2007-12-11 2:55 ` Junio C Hamano 2007-12-11 3:27 ` Nicolas Pitre 2007-12-11 11:08 ` David Kastrup 2007-12-11 12:08 ` Pierre Habouzit 2007-12-11 12:18 ` David Kastrup 2007-12-11 3:49 ` Nicolas Pitre 2007-12-11 5:25 ` Jon Smirl 2007-12-11 5:29 ` Jon Smirl 2007-12-11 7:01 ` Jon Smirl 2007-12-11 7:34 ` Andreas Ericsson 2007-12-11 13:49 ` Nicolas Pitre 2007-12-11 15:00 ` Nicolas Pitre 2007-12-11 15:36 ` Jon Smirl 2007-12-11 16:20 ` Nicolas Pitre 2007-12-11 16:21 ` Jon Smirl 2007-12-12 5:12 ` Nicolas Pitre 2007-12-12 8:05 ` David Kastrup 2007-12-14 16:18 ` Wolfram Gloger 2007-12-12 15:48 ` Nicolas Pitre 2007-12-12 16:17 ` Paolo Bonzini 2007-12-12 16:37 ` Linus Torvalds 2007-12-12 16:42 ` David Miller 2007-12-12 16:54 ` Linus Torvalds 2007-12-12 17:12 ` Jon Smirl 2007-12-14 16:12 ` Wolfram Gloger 2007-12-14 16:45 ` David Kastrup 2007-12-14 16:59 ` Wolfram Gloger 2007-12-13 13:32 ` Nguyen Thai Ngoc Duy 2007-12-13 15:32 ` Paolo Bonzini 2007-12-13 16:29 ` Paolo Bonzini 2007-12-13 16:39 ` Johannes Sixt 2007-12-14 1:04 ` Jakub Narebski 2007-12-14 6:14 ` Paolo Bonzini 2007-12-14 6:24 ` Nguyen Thai Ngoc Duy 2007-12-14 8:20 ` Paolo Bonzini 2007-12-14 9:01 ` Harvey Harrison 2007-12-14 10:40 ` Jakub Narebski 2007-12-14 10:52 ` Nguyen Thai Ngoc Duy 2007-12-14 13:25 ` Nicolas Pitre 2007-12-12 16:13 ` Nicolas Pitre 2007-12-13 7:32 ` Andreas Ericsson 2007-12-14 16:03 ` Wolfram Gloger 2007-12-11 16:33 ` Linus Torvalds 2007-12-11 17:21 ` Nicolas Pitre 2007-12-11 17:24 ` David Miller 2007-12-11 17:44 ` Nicolas Pitre 2007-12-11 20:26 ` Andreas Ericsson 2007-12-11 18:43 ` Jon Smirl 2007-12-11 18:57 ` Nicolas Pitre 2007-12-11 19:17 ` Linus Torvalds 2007-12-11 19:40 ` Junio C Hamano 2007-12-11 20:34 ` Andreas Ericsson 2007-12-11 17:28 ` Daniel Berlin 2007-12-11 13:31 ` Nicolas Pitre 2007-12-11 6:01 ` Sean 2007-12-11 6:20 ` Jon Smirl