git@vger.kernel.org mailing list mirror (one of many)
* Performance of various compressors
@ 2005-04-21  5:06 Mike Taht
  2005-04-21  5:14 ` Mike Taht
  2005-04-22 20:38 ` Performance of various compressors Aaron Lehmann
  0 siblings, 2 replies; 11+ messages in thread
From: Mike Taht @ 2005-04-21  5:06 UTC
  To: git

I started rolling a tool to measure various aspects of git performance. 
I will start looking at merge next, and at workloads different from the 
kernel (gcc4 anyone?) ...

The only data points worth sharing at this point are:

That doing the compression at a level of 3, rather than the max of 9, 
cuts the CPU time required for a big git commit by over half, and that 
this actually translates into a win on the I/O to disk. (These tests 
were performed on a dual opteron 842.)

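The benefits of compression aren't very much for git right now: going 
from level 3 to level 9 shrinks .git/objects by under 5% as reported by 
du -s (115218 vs. 110106, see below).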

And: A big git commit is I/O bound. But we knew that. Maybe it's 
possible to make it less I/O bound.

Git branch: 7a4c67965de68ae7bc7aa1fde33f8eb9d8114697
Tree: 2.6.11.7 source tree
Branch: N/a
Merge File: N/a
HW: dual opteron 242
Mem: 1GB
Disk: seagate barracuda
Filesystem: Reiser3
Git add: N/a
Cache: Hot
Git Commit: 44.97user 5.94system 1:45.24elapsed 48%CPU
Git Merge:
Options:
Feature: Test of compression=9 (std git)

du -s .git/objects  110106  # du is probably not the right thing
du -s --apparent-size .git/objects 58979

Git branch: 9e272677621c91784cf2533123a41745178f0701
Tree: 2.6.11.7 source tree
Branch: N/a
Merge File: N/a
HW: dual opteron 242
Mem: 1GB
Disk: seagate barracuda
Disk mode: udma5
Filesystem: Reiser3
Git add: N/a
Cache: Hot
Git Commit: 16.79user 6.15system 1:21.92elapsed 28%CPU
Git Merge:
Options:
Feature: Test of compression=3 (std git)

du -s .git/objects  115218
du -s --apparent-size .git/objects 64274

There's some variety in the best/worst case timings for I/O for the 
compression=3 case...

16.79user 6.15system 1:21.92elapsed 28%CPU
16.68user 5.71system 1:13.19elapsed 30%CPU
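
The trade-off in the two runs above can be worked out directly from the
reported figures. The following is only a sketch: the helper name
parse_time is made up for illustration, and it assumes the m:ss.cc
elapsed format shown in these `time` summaries.

```python
import re

# Matches summaries such as "44.97user 5.94system 1:45.24elapsed 48%CPU".
TIME_RE = re.compile(r"([\d.]+)user ([\d.]+)system (\d+):([\d.]+)elapsed")


def parse_time(line):
    """Return user, system and elapsed times in seconds."""
    user, system, minutes, seconds = TIME_RE.match(line).groups()
    return {
        "user": float(user),
        "system": float(system),
        "elapsed": int(minutes) * 60 + float(seconds),
    }


# Figures copied from the compression=9 and compression=3 runs.
level9 = parse_time("44.97user 5.94system 1:45.24elapsed 48%CPU")
level3 = parse_time("16.79user 6.15system 1:21.92elapsed 28%CPU")

cpu_saved = 1 - level3["user"] / level9["user"]
elapsed_saved = 1 - level3["elapsed"] / level9["elapsed"]
size_growth = 115218 / 110106 - 1  # du -s .git/objects, level 3 vs. 9

print(f"user CPU time saved:   {cpu_saved:.1%}")
print(f"wall-clock time saved: {elapsed_saved:.1%}")
print(f"on-disk size growth:   {size_growth:.1%}")
```

With these inputs, level 3 uses about 63% less user CPU time and about
22% less wall-clock time, for about 4.6% more space by du -s.
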
-- 

Mike Taht


lastly -

Timings of git commit with tmpfs (note, these were done with an ancient, 
5 hour old version of git and the script)

Hot cache, tmpfs .git compression=9
44.97user 2.76system 0:47.72elapsed 100%CPU

Hot cache, tmpfs .git, compression=6
Wed Apr 20 20:18:11 PDT 2005
23.55user 2.83system 0:26.36elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+568680minor)pagefaults 0swaps
109620  .git/objects
58618   .git/objects



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Performance of various compressors
  2005-04-21  5:06 Performance of various compressors Mike Taht
@ 2005-04-21  5:14 ` Mike Taht
  2005-04-21  5:22   ` [PATCH] experimental - " Mike Taht
  2005-04-22 20:38 ` Performance of various compressors Aaron Lehmann
  1 sibling, 1 reply; 11+ messages in thread
From: Mike Taht @ 2005-04-21  5:14 UTC
  To: git

Just to clarify: this was a git add of the linux-2.6.11.7 sources (sorry, 
untimed), followed by timing the git commit.

Mo betta data latah.

Mike Taht wrote:
> [snip]


-- 

Mike Taht


   "The chief contribution of Protestantism to human thought is its 
massive proof
that God is a bore.
	-- H.L. Mencken, "The Aesthetic Recoil," American Mercury, July, 1931."


* [PATCH] experimental - Performance of various compressors
  2005-04-21  5:14 ` Mike Taht
@ 2005-04-21  5:22   ` Mike Taht
  2005-04-21 10:23     ` HOWTO: PATCH: don't hardcode path-to-bash, use sys/limits.h Klaus Robert Suetterlin
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Taht @ 2005-04-21  5:22 UTC
  To: Mike Taht; +Cc: git




Don't apply this patch and change GIT_COMPRESSION unless you know what 
you are doing and why you are doing it. You will break an older version 
of git. You may break a newer version of git. You have been warned.

I also note that there's a bzlib out there.
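
One plausible reading of the warning, sketched here under an assumption
the post itself does not state: if some version of git derives an
object's name from the deflated bytes rather than from the raw content,
then changing the level changes the name. Python's zlib module wraps
the same library; the sample data is made up.

```python
import hashlib
import zlib

# Stand-in for a source file; any content works.
data = b"int main(void) { return 0; }\n" * 200

level9 = zlib.compress(data, 9)  # what Z_BEST_COMPRESSION selects
level3 = zlib.compress(data, 3)

# Either stream inflates back to identical content...
assert zlib.decompress(level9) == data
assert zlib.decompress(level3) == data

# ...but the deflated bytes differ, so a hash taken over them differs.
print("sha1 of level-9 stream:", hashlib.sha1(level9).hexdigest())
print("sha1 of level-3 stream:", hashlib.sha1(level3).hexdigest())

# A hash over the raw content does not depend on the level at all.
print("sha1 of raw content:   ", hashlib.sha1(data).hexdigest())
```

Reading is unaffected either way, since inflate does not need to know
which level produced a stream.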

cache.h: 828d660ab82bb35a1ca632a2ba4620dc483889bd
--- a/cache.h
+++ b/cache.h
@@ -16,6 +16,8 @@
  #include <openssl/sha.h>
  #include <zlib.h>

+#define GIT_COMPRESSION Z_BEST_COMPRESSION
+
  /*
   * Basic data structures for the directory cache
   *
sha1_file.c: 754e8b4e9ea8104df48152f875d6b874304e2a62
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -199,7 +199,7 @@ int write_sha1_file(char *buf, unsigned

         /* Set it up */
         memset(&stream, 0, sizeof(stream));
-       deflateInit(&stream, Z_BEST_COMPRESSION);
+       deflateInit(&stream, GIT_COMPRESSION);
         size = deflateBound(&stream, len);
         compressed = malloc(size);

update-cache.c: a09883541c745c76413c62109a80f40df4b7a7fb
--- a/update-cache.c
+++ b/update-cache.c
@@ -40,7 +40,7 @@ static int index_fd(unsigned char *sha1,
         SHA1_Final(sha1, &c);

         memset(&stream, 0, sizeof(stream));
-       deflateInit(&stream, Z_BEST_COMPRESSION);
+       deflateInit(&stream, GIT_COMPRESSION);

         /*
          * ASCII size + nul byte



Mike Taht wrote:
> [snip]


-- 

Mike Taht


   ""His mind is like a steel trap -- full of mice."
	-- Foghorn Leghorn"


* HOWTO: PATCH: don't hardcode path-to-bash, use sys/limits.h
  2005-04-21  5:22   ` [PATCH] experimental - " Mike Taht
@ 2005-04-21 10:23     ` Klaus Robert Suetterlin
  2005-04-21 14:31       ` Alecs King
  0 siblings, 1 reply; 11+ messages in thread
From: Klaus Robert Suetterlin @ 2005-04-21 10:23 UTC
  To: Mike Taht; +Cc: git

Hi,

I supply a patch that dehardcodes the path to bash (which is not /bin on all computers) and adds sys/limits.h to provide ULONG_MAX.

If this is not the right way to supply patches, or if this email misses some crucial point, please tell me so and explain.

-- 
Robert Suetterlin (robert@mpe.mpg.de)
phone: (+49)89 / 30000-3546   fax: (+49)89 / 30000-3950

commit 5f6caff82b1f3b5931d92aaff99be6d8dbad10ca
tree d7ea8aeefbbc2ab63cb5acd41b647b1b5f11fb83
parent cd1c034369b73da7503da365fa556aab27004814
author Klaus Robert Suetterlin <krs@xdt04.mpe-garching.mpg.de> 1114078431 +0200
committer Klaus Robert Suetterlin <krs@xdt04.mpe-garching.mpg.de> 1114078431 +0200

Don't hardcode the path-to-bash please.

Index: commit.c
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/commit.c  (mode:100644 sha1:eda45d7e15358ed6f2cd0502de2a08987307fc98)
+++ d7ea8aeefbbc2ab63cb5acd41b647b1b5f11fb83/commit.c  (mode:100644 sha1:cfe9a8ddf6ee2702e3923cb22240f9f9ed1bd04c)
@@ -1,3 +1,4 @@
+#include <sys/limits.h>
 #include "commit.h"
 #include "cache.h"
 #include <string.h>
Index: gitdiff-do
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/gitdiff-do  (mode:100755 sha1:afed4e40b259a61b0f12979ba7326f26743bc553)
+++ d7ea8aeefbbc2ab63cb5acd41b647b1b5f11fb83/gitdiff-do  (mode:100755 sha1:218dfabeb4a5dcbd2cf58bd6f672f385690ec397)
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
 #
 # Make a diff between two GIT trees.
 # Copyright (c) Petr Baudis, 2005
Index: gitlog.sh
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/gitlog.sh  (mode:100755 sha1:a496a864f9586e47a4d7bd3ae0af0b3e07b7deb8)
+++ d7ea8aeefbbc2ab63cb5acd41b647b1b5f11fb83/gitlog.sh  (mode:100755 sha1:7b3aa8a89bc64273c648920ccd1686859754803e)
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
 #
 # Make a log of changes in a GIT branch.
 #


* Re: HOWTO: PATCH: don't hardcode path-to-bash, use sys/limits.h
  2005-04-21 10:23     ` HOWTO: PATCH: don't hardcode path-to-bash, use sys/limits.h Klaus Robert Suetterlin
@ 2005-04-21 14:31       ` Alecs King
  2005-04-21 19:42         ` [PATCH] #!/bin/sh --> #!/usr/bin/env bash Alecs King
  0 siblings, 1 reply; 11+ messages in thread
From: Alecs King @ 2005-04-21 14:31 UTC
  To: git

On Thu, Apr 21, 2005 at 12:23:26PM +0200, Klaus Robert Suetterlin wrote:
> Hi,
> 
> I supply a patch that dehardcodes the path to bash (which is not /bin
> on all computers) and adds sys/limits.h to provide ULONG_MAX.

Hi, i did a similar patch a while back. As for ULONG_MAX, not every
system has <sys/limits.h>, i think <limits.h> is the right place to go.

The patch below was tested on both debian and fbsd.


commit 2deea74db72fb57a8b80e7945f23814112b22723
tree 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd
parent cd1c034369b73da7503da365fa556aab27004814
author Alecs King <alecsk ! gmail d@t com> 1114075114 +0800
committer Alecs King <alecsk ! gmail d@t com> 1114075114 +0800

trivial fix for making it more portable

Index: commit-tree.c
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/commit-tree.c  (mode:100644 sha1:043c7aa371101a1ea8cfc467279abf6c8acc7fd1)
+++ 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/commit-tree.c  (mode:100644 sha1:8a1f12dca07041d203ce22442b8470d42d322ef5)
@@ -252,7 +252,7 @@
 
 	then -= offset;
 
-	snprintf(result, maxlen, "%lu %5.5s", then, p);
+	snprintf(result, maxlen, "%lu %5.5s", (unsigned long) then, p);
 }
 
 static void check_valid(unsigned char *sha1, const char *expect)
Index: commit.c
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/commit.c  (mode:100644 sha1:eda45d7e15358ed6f2cd0502de2a08987307fc98)
+++ 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/commit.c  (mode:100644 sha1:9f0668eb68cec56a738a58fe930ae0ae2960e2b2)
@@ -1,6 +1,7 @@
 #include "commit.h"
 #include "cache.h"
 #include <string.h>
+#include <limits.h>
 
 const char *commit_type = "commit";
 
Index: gitdiff-do
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/gitdiff-do  (mode:100755 sha1:afed4e40b259a61b0f12979ba7326f26743bc553)
+++ 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitdiff-do  (mode:100755 sha1:218dfabeb4a5dcbd2cf58bd6f672f385690ec397)
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
 #
 # Make a diff between two GIT trees.
 # Copyright (c) Petr Baudis, 2005
Index: gitlog.sh
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/gitlog.sh  (mode:100755 sha1:a496a864f9586e47a4d7bd3ae0af0b3e07b7deb8)
+++ 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitlog.sh  (mode:100755 sha1:7b3aa8a89bc64273c648920ccd1686859754803e)
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
 #
 # Make a log of changes in a GIT branch.
 #
Index: revision.h
===================================================================
--- c0260bfb82da04aeff4e598ced5295d6ae2e262d/revision.h  (mode:100644 sha1:46cc10440be781cea4993aca37ee35e251495084)
+++ 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/revision.h  (mode:100644 sha1:f0754f5d8ea3da52503b8ea8c16b34566e4ae6e0)
@@ -10,6 +10,7 @@
  * definition for this rev, and not just seen it as
  * a parent target.
  */
+#include <limits.h>
 #define marked(rev)	((rev)->flags & 0xffff)
 #define SEEN 0x10000
 #define USED 0x20000

-- 
Alecs King


* [PATCH] #!/bin/sh --> #!/usr/bin/env bash
  2005-04-21 14:31       ` Alecs King
@ 2005-04-21 19:42         ` Alecs King
  2005-04-22  7:37           ` H. Peter Anvin
  0 siblings, 1 reply; 11+ messages in thread
From: Alecs King @ 2005-04-21 19:42 UTC
  To: git

On Thu, Apr 21, 2005 at 10:31:02PM +0800, Alecs King wrote:
> On Thu, Apr 21, 2005 at 12:23:26PM +0200, Klaus Robert Suetterlin wrote:
> > Hi,
> > 
> > I supply a patch that dehardcodes the path to bash (which is not /bin
> > on all computers) and adds sys/limits.h to provide ULONG_MAX.
> 
> Hi, i did a similar patch a while back. As for ULONG_MAX, not every
> system has <sys/limits.h>, i think <limits.h> is the right place to go.
> 
> The patch below was tested on both debian and fbsd.
>  
> [snip]

And as for bash, only gitdiff-do and gitlog.sh 'explicitly' use bash
instead of /bin/sh.  On most Linux distros, /bin/sh is just a symbolic
link to bash.  But not on some others.  I found gitlsobj.sh could not
work using a plain /bin/sh on fbsd.  To make life easier, i think it
might be better if we all explicitly use bash for all shell scripts.
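
Roughly speaking, `#!/usr/bin/env bash` asks env to find bash by
searching PATH, so the script no longer depends on where bash is
installed. A small sketch of that lookup follows; the helper name
resolve_interpreter is made up, and shutil.which only approximates
what env does.

```python
import os
import shutil


def resolve_interpreter(name):
    """Return the first executable called `name` on PATH, or None."""
    return shutil.which(name)


for candidate in ("bash", "sh"):
    path = resolve_interpreter(candidate)
    print(candidate, "->", path or "not found on PATH")

# A hardcoded shebang only works if this particular path exists.
print("/bin/bash exists:", os.path.exists("/bin/bash"))
```

On a system where bash lives somewhere other than /bin, the first
lookup still succeeds while the last check prints False.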

The patch below assumes the patch from my previous message has been
applied.


commit 341cd1241815178d567ce612c97c2bb5a663021a
tree abb16c39fe8354383b632f7fa9dd4611ff66e1d1
parent 2deea74db72fb57a8b80e7945f23814112b22723
author Alecs King <alecsk ! gmail d@t com> 1114107613 +0800
committer Alecs King <alecsk ! gmail d@t com> 1114107613 +0800

Explicitly use bash
#!/bin/sh ==> #!/usr/bin/env bash

Index: gitXlntree.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitXlntree.sh  (mode:100755 sha1:c474913d09906739d8175f1b430720a3ac67e798)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitXlntree.sh  (mode:100755 sha1:adc01eeb56f394a6168ae1f6f1fe4c40e1c2aecc)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Provide an independent view to the objects database.
 # Copyright (c) Petr Baudis, 2005
Index: gitXnormid.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitXnormid.sh  (mode:100755 sha1:c0d53afabe8662ebfc3c697faf08b0a2b43c93f7)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitXnormid.sh  (mode:100755 sha1:9b311aca57bd8b7012f45d730c6fd26d5fb5d2b2)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Internal: Normalize the given ID to a tree ID.
 # Copyright (c) Petr Baudis, 2005
Index: gitadd.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitadd.sh  (mode:100755 sha1:3f5e9a2d6b452d596cd853f1585113bdb356a2e3)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitadd.sh  (mode:100755 sha1:6feb7372e95be4546af17e0c6b55d10c9a1c441d)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Add new file to a GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitaddremote.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitaddremote.sh  (mode:100755 sha1:a117b9e8d14b977143caa48c26fc51794e8b7135)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitaddremote.sh  (mode:100755 sha1:bccaa9068063b07d13012477861c6706b7cd40a6)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Add new "remote" to the GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitapply.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitapply.sh  (mode:100755 sha1:7703809dc0743c6e4c1fa5b7d922a4efc16b4276)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitapply.sh  (mode:100755 sha1:794ea5ed6acdd34e34742a17cbd784dcbf738289)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Apply a diff generated by git diff.
 # Copyright (c) Petr Baudis, 2005
Index: gitcancel.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitcancel.sh  (mode:100755 sha1:74b4083d67eda87d88a6f92c6c66877bba8bda8a)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitcancel.sh  (mode:100755 sha1:c320ee98e2ed0b13a68de3b2ec4e4a8451b5189a)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Cancels current edits in the working tree.
 # Copyright (c) Petr Baudis, 2005
Index: gitcommit.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitcommit.sh  (mode:100755 sha1:a13bef2c84492ed75679d7d52bb710df35544f8a)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitcommit.sh  (mode:100755 sha1:0207f402cc5107de2a4685f6fcade081c41d91e9)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Commit into a GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitdiff.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitdiff.sh  (mode:100755 sha1:8e14a868f513f4ec524a2c8974c8d202c6824038)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitdiff.sh  (mode:100755 sha1:e27915d4172717ddd4d01269877312b08ed2acc4)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Make a diff between two GIT trees.
 # Copyright (c) Petr Baudis, 2005
Index: gitexport.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitexport.sh  (mode:100755 sha1:5b94424beca55ffe6b5535e4975e6e63c1bae672)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitexport.sh  (mode:100755 sha1:428cd9d845598e320556729b6098505132a4e7c4)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Exports a particular revision from a GIT repository.
 # Copyright (c) Johannes E. Schindelin, 2005
Index: gitfork.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitfork.sh  (mode:100755 sha1:b827c3037ac4f3cdfb6708bf8edb60944f59318a)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitfork.sh  (mode:100755 sha1:ce26f985ebb48b6a3127ac8afd427ba30ba5668a)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Create a branch sharing the objects database.
 # Copyright (c) Petr Baudis, 2005
Index: gitinit.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitinit.sh  (mode:100755 sha1:9905166859827893e326b01bdc3970ff6d51064d)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitinit.sh  (mode:100755 sha1:bc00e9ee709aabeb4764b77ac4e5a19212fa5857)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Initialize a GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitls.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitls.sh  (mode:100755 sha1:c8d2220eae66addd49493cdb32af21b6c0217b23)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitls.sh  (mode:100755 sha1:a05883b09512bd1d1fe31e1c6d43f01a395c58a1)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # List contents of a particular tree in a GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitlsobj.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitlsobj.sh  (mode:100755 sha1:423a1bc7476bad7bf40f1b3ddb03d83fdcf1f9cd)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitlsobj.sh  (mode:100755 sha1:3f4426eeac7cc5ad51a46632319814fbf62b2cc3)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # List objects of the GIT repository.
 # Copyright (c) Randy Dunlap, 2005
Index: gitlsremote.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitlsremote.sh  (mode:100755 sha1:2212be93aaa8a371e83cafb69fa21a7a1b24ed13)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitlsremote.sh  (mode:100755 sha1:29657d7a899ffb425a36ec04bf1c62aa1ecc14d7)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Lists remote GIT repositories
 # Copyright (c) Steven Cole 2005
Index: gitmerge-file.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitmerge-file.sh  (mode:100755 sha1:820de487babb76ce419b6823c8fe4c58608d0c8c)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitmerge-file.sh  (mode:100755 sha1:237186eaefc4a503c386e4a0e7c28818e6704db7)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Copyright (c) Linus Torvalds, 2005
 #
Index: gitmerge.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitmerge.sh  (mode:100755 sha1:bc68f6cda84cbf1165d71b17d6207b3c46a8cad4)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitmerge.sh  (mode:100755 sha1:92e552700a40c5e1f7339c9b1f261cb39206a3c3)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Merge a branch to the current tree.
 # Copyright (c) Petr Baudis, 2005
Index: gitpatch.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitpatch.sh  (mode:100755 sha1:580e3e6b0c23625abd2288be35ee33a787a1ba3c)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitpatch.sh  (mode:100755 sha1:fd00c88133c874ac71a90a045a313363f9f22350)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Make a patch from a given commit.
 # Copyright (c) Petr Baudis, 2005
Index: gitpull.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitpull.sh  (mode:100755 sha1:0cafc0270ea91aaf099f398b7e5cd360be9ea086)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitpull.sh  (mode:100755 sha1:7f847f39e0b2aa150fe195d8d4f6f0d62487ae72)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Pulls changes from "remote" to the local GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitrm.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitrm.sh  (mode:100755 sha1:3fa31f9a1ae843dcb184b8371ff60f626e8820b3)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitrm.sh  (mode:100755 sha1:e014b979ea7b8f7ae69eabc7dd146c8a7f286d19)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Remove a file from a GIT repository.
 # Copyright (c) Petr Baudis, 2005
Index: gitseek.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitseek.sh  (mode:100755 sha1:b80969a4ba040202827ea7532235abab15ca9392)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitseek.sh  (mode:100755 sha1:035b78a93307da8f67f7447ed3a182a6d17d2c50)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Seek the working tree to a given commit.
 # Copyright (c) Petr Baudis, 2005
Index: gitstatus.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gitstatus.sh  (mode:100755 sha1:7d5209ea838106eb2ab5bde2704997508a22a4e8)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gitstatus.sh  (mode:100755 sha1:9cfb2ce947082002cff3e5497ca2a994c4bbb101)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Show status of entries in your working tree.
 # Copyright (c) Petr Baudis, 2005
Index: gittag.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gittag.sh  (mode:100755 sha1:9e1e200deda54b2401d6d685f0d5305cfbfa38ca)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gittag.sh  (mode:100755 sha1:19c7f3ecffa55f117c27c9a1d8de67f65805f1c7)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Mark certain commit by a tag.
 # Copyright (c) Petr Baudis, 2005
Index: gittrack.sh
===================================================================
--- 0c92ac3af53457b6b9651cf82d98ce3a7b166dcd/gittrack.sh  (mode:100755 sha1:7509d4adb2b2c50cd2acdf9126fc57cff79e6009)
+++ abb16c39fe8354383b632f7fa9dd4611ff66e1d1/gittrack.sh  (mode:100755 sha1:bdf33313f4ad0c4f7b6b235fbc68fc85f226a33a)
@@ -1,4 +1,4 @@
-#!/bin/sh
+#!/usr/bin/env bash
 #
 # Makes your working tree track the given branch.
 # Copyright (c) Petr Baudis, 2005

-- 
Alecs King


* Re: [PATCH] #!/bin/sh --> #!/usr/bin/env bash
  2005-04-21 19:42         ` [PATCH] #!/bin/sh --> #!/usr/bin/env bash Alecs King
@ 2005-04-22  7:37           ` H. Peter Anvin
  2005-04-23  2:34             ` David A. Wheeler
  0 siblings, 1 reply; 11+ messages in thread
From: H. Peter Anvin @ 2005-04-22  7:37 UTC
  To: Alecs King; +Cc: git

Alecs King wrote:
> 
> And as for bash, only gitdiff-do and gitlog.sh 'explicitly' use bash
> instead of /bin/sh.  On most Linux distros, /bin/sh is just a symbolic
> link to bash.  But not on some others.  I found gitlsobj.sh could not
> work using a plain /bin/sh on fbsd.  To make life easier, i think it
> might be better if we all explicitly use bash for all shell scripts.
> 

How about #!/bin/bash (build from .in files if you feel it necessary to 
support systems which don't have bash in /bin) instead of doubling the 
number of execs?
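
A sketch of what building scripts from .in files could look like: the
interpreter path is resolved once at build time and written into the
shebang, so no env lookup happens at run time. The @BASH@ placeholder
and the render_script helper are assumptions for illustration, not
something proposed in this thread.

```python
import shutil


def render_script(template_text, bash_path):
    """Replace the hypothetical @BASH@ placeholder with a concrete path."""
    return template_text.replace("@BASH@", bash_path)


# Hypothetical contents of a gitlog.sh.in template.
template = "#!@BASH@\n#\n# Make a log of changes in a GIT branch.\n#\n"

# Resolve bash on the build machine; fall back to the conventional path.
bash_path = shutil.which("bash") or "/bin/bash"

print(render_script(template, bash_path), end="")
```

The generated script then starts with a direct path such as
#!/usr/local/bin/bash on a system that keeps bash there.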

	-hpa


* Re: Performance of various compressors
  2005-04-21  5:06 Performance of various compressors Mike Taht
  2005-04-21  5:14 ` Mike Taht
@ 2005-04-22 20:38 ` Aaron Lehmann
  2005-04-25 12:17   ` git I/O performance (was: Performance of various compressors) Klaus Robert Suetterlin
  1 sibling, 1 reply; 11+ messages in thread
From: Aaron Lehmann @ 2005-04-22 20:38 UTC
  To: Mike Taht; +Cc: git

On Wed, Apr 20, 2005 at 10:06:38PM -0700, Mike Taht wrote:
> That doing the compression at a level of 3, rather than the max of 9, 
> cuts the cpu time required for a big git commit by over half, and that 
> that actually translates into a win on the I/O to disk. (these tests 
> were performed on a dual opteron 842)

If (de)compression is slowing things down, you might want to check out
lzo (http://www.oberhumer.com/opensource/lzo/). I tested it on the
2.6.11 kernel source and found that lzo -7 output is only 2% larger
than gzip -3, but lzo decompression is almost 3 times faster. The
downside is that lzo took 5 times longer to perform the compression at
-7. Compression with lzo -3 is 3.5 times faster than gzip -3, but it
produces a file that's 37% bigger. Unfortunately, lzo has no settings
in between -3 and -7. I'd expect git to be more sensitive to
decompression speeds, though.

BTW, lzo decompression speed is not affected by the compression level.


* Re: [PATCH] #!/bin/sh --> #!/usr/bin/env bash
  2005-04-22  7:37           ` H. Peter Anvin
@ 2005-04-23  2:34             ` David A. Wheeler
  2005-04-23  6:16               ` H. Peter Anvin
  0 siblings, 1 reply; 11+ messages in thread
From: David A. Wheeler @ 2005-04-23  2:34 UTC
  To: H. Peter Anvin; +Cc: Alecs King, git


> Alecs King wrote:
> 
>>
>> And as for bash, only gitdiff-do and gitlog.sh 'explicitly' use bash
>> instead of /bin/sh.  On most Linux distros, /bin/sh is just a symbolic
>> link to bash.  But not on some others.  I found gitlsobj.sh could not
>> work using a plain /bin/sh on fbsd.  To make life easier, i think it
>> might be better if we all explicitly use bash for all shell scripts.


H. Peter Anvin wrote:
> How about #!/bin/bash (build from .in files if you feel it necessary to 
> support systems which don't have bash in /bin) instead of doubling the 
> number of execs?

If # of execs is that critical, it probably should not be in
bash anyway.  OpenBSD (at least 3.1)'s bash appears to be in
/usr/local/bin/bash, NOT /bin/bash.
I'd go with the /usr/bin/env solution for now;
it maximizes the "it just works" factor, and
when it comes time for .in files much of the cogito code (at least)
will probably be rewritten in Perl, and anything performance-sensitive
will be in C.

--- David A. Wheeler


* Re: [PATCH] #!/bin/sh --> #!/usr/bin/env bash
  2005-04-23  2:34             ` David A. Wheeler
@ 2005-04-23  6:16               ` H. Peter Anvin
  0 siblings, 0 replies; 11+ messages in thread
From: H. Peter Anvin @ 2005-04-23  6:16 UTC
  To: dwheeler; +Cc: Alecs King, git

David A. Wheeler wrote:
> 
> If # of execs is that critical, it probably should not be in
> bash anyway.  OpenBSD (at least 3.1)'s bash appears to be in
> /usr/local/bin/bash, NOT /bin/bash.
> I'd go with the /usr/bin/env solution for now;
> it maximizes the "it just works" factor, and
> when it comes time for .in files much of the cogito code (at least)
> will probably be rewritten in Perl, and anything performance-sensitive
> will be in C.
> 

Makes sense.

	-hpa


* git I/O performance (was: Performance of various compressors)
  2005-04-22 20:38 ` Performance of various compressors Aaron Lehmann
@ 2005-04-25 12:17   ` Klaus Robert Suetterlin
  0 siblings, 0 replies; 11+ messages in thread
From: Klaus Robert Suetterlin @ 2005-04-25 12:17 UTC
  To: Aaron Lehmann; +Cc: Mike Taht, git

I did some statistics on the freebsd /usr/src/sys directory, as I
did not have access to the linux kernel sources.

This is 5435 files, about 81MB in total (according to du -sh).  I ran

find sys/ -type f -exec gzip -9 {} +
find sys/ -type f -exec gzip -d {} +

and similar calls to get an impression of how different compression
levels and compressors act on the kind of data most likely to be
handled by the git backend storage.
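
A minimal sketch of the same kind of measurement, for readers who want
to repeat it on their own tree.  It is not the procedure used above:
it assumes zlib (the deflate library behind gzip) called in-process
rather than one gzip run per file, and the function name
measure_levels is made up for this illustration.

```python
import os
import time
import zlib


def measure_levels(root, levels=(1, 3, 6, 9)):
    """Compress every regular file under root at each zlib level.

    Returns {level: (total_compressed_bytes, seconds_spent_compressing)}.
    File contents are read up front, so only compression is timed
    (this corresponds to the hot-cache case).
    """
    blobs = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path):
                with open(path, "rb") as fh:
                    blobs.append(fh.read())

    results = {}
    for level in levels:
        start = time.perf_counter()
        total = sum(len(zlib.compress(blob, level)) for blob in blobs)
        results[level] = (total, time.perf_counter() - start)
    return results
```

Calling measure_levels("sys/") returns one (size, time) pair per
level.  Absolute numbers will differ from a gzip run, since no
processes are spawned and no output files are written.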

On a 700MHz p3, UDMA33, freebsd 5.3, ffs (soft updates) I get:

compressor | level | size | compress | uncompress   (times: seconds or m:ss)
-----------+-------+------+----------+-----------
gzip       |   9   | 28M  | 1:19     | 30
gzip       |   6   | 28M  | 31.7     | 30
gzip       |   3   | 30M  | 26.1     | 28.7
gzip       |   1   | 31M  | 23.6     | 29.8
bzip2      |   9   | 27M  | 2:14     | 37.4
bzip2      |   6   | 27M  | 2:11     | 38.8
bzip2      |   3   | 27M  | 2:10     | 38.3
lzop       |   9   | 32M  | 2:15     | 35.4
lzop       |   7   | 32M  | 57.9     | 40.3
lzop       |   3   | 39M  | 36.0     | 44.4


These speeds are for the case where our working set fits into the
filesystem caches.  This will be the most common case, as normal
commits will not check in the whole tree.

That is, we should really use gzip -6.  It gives the best compression
in a reasonable time.  bzip2 can't really compress those tiny files
efficiently.  lzop is limited by open/close (see below).

BTW, I also did this for the whole /usr/src of freebsd (which is
35000 files and 350MB; du -sh gives 398MB).  There, too, the numbers
look best for gzip -6.

The files we work with seem to have an average uncompressed size
of 10-15kB and seem to compress by about a factor of three.

So I did a test in C: open("file%d"), write(file, buf, 10000),
close(file).  I repeated this for 35000 files, as in the freebsd src
case, to get some statistics.  The gprof output tells me that
open+close take the same amount of time as the write.  (You should
really try to do rm test[0-9]* on those 35000 files :))  I wrote
the full 10000 bytes to check the case where we have no compression
at all.  As compression gets better we will become more and more
open/close limited.

This means we are open/close limited in git.  Even if we compress
the files to zero size, the open+close half of the time remains, so
we cannot get faster by more than a factor of two!
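
The factor-of-two bound can be checked with a little arithmetic.  The
sketch below is a simplified model, not a measurement: it assumes
write time scales linearly with the bytes written, open+close time is
unaffected by compression, and the CPU time of the compressor itself
is ignored.  The function name is hypothetical.

```python
def best_case_speedup(open_close_time, write_time, compression_ratio):
    """Speedup over writing uncompressed data under the model above.

    compression_ratio is uncompressed size / compressed size.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    before = open_close_time + write_time
    after = open_close_time + write_time / compression_ratio
    return before / after
```

With open+close equal to the write time, as gprof reported, a
compression factor of three gives a speedup of about 1.5, and the
speedup approaches but never reaches 2 as the ratio grows.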

Earlier messages in this thread showed that we are also limited by
the filesystem cache, so we should use compression and efficient
prefetch to get the best performance out of it.  Even though
compression cannot make us more than a factor of two faster (even
delta compression won't help!  It would make things worse IMHO), we
can get a lot worse (like ten times slower) for large sets on slow
machines with little memory.

I also tried to get a better ratio by using the standard db.h btree
database, so I wouldn't have to open and close all those files.
Unfortunately the btree is about twice as large as the files, so I
had to write twice as much data to disk (800MB).  Also, db->put
is much more complicated than write.  So the test ended up taking
about 10% more time than the open/write/close case.  Maybe in the
case of a smaller work set (e.g. 1000 files instead of 35000) this
might provide faster backend speeds.  One could also optimise speed
by tweaking the access method parameters.
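
For comparison, here is a sketch of the single-file approach.  It does
not use db.h: it substitutes SQLite (also B-tree based, and available
in the Python standard library), so its size overhead and timings will
not match the btree numbers above.  The table layout and function name
are assumptions made for this illustration.

```python
import os
import sqlite3
import time


def store_blobs_in_one_file(db_path, count=1000, size=10000):
    """Insert `count` blobs of `size` bytes into a new SQLite file.

    db_path must not exist yet.
    Returns (seconds_elapsed, database_file_size_in_bytes).
    """
    buf = b"\0" * size
    start = time.perf_counter()
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE objects (name TEXT PRIMARY KEY, data BLOB)")
        with conn:  # commit all inserts as one transaction
            conn.executemany(
                "INSERT INTO objects VALUES (?, ?)",
                ((f"file{i}", buf) for i in range(count)),
            )
    finally:
        conn.close()
    return time.perf_counter() - start, os.path.getsize(db_path)
```

Dividing the returned file size by count * size gives the storage
overhead, which can be set against the "about twice as large" figure
seen with the db.h btree.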


On Fri, Apr 22, 2005 at 01:38:01PM -0700, Aaron Lehmann wrote:
> On Wed, Apr 20, 2005 at 10:06:38PM -0700, Mike Taht wrote:
> > That doing the compression at a level of 3, rather than the max of 9, 
> > cuts the cpu time required for a big git commit by over half, and that 
> > that actually translates into a win on the I/O to disk. (these tests 
> > were performed on a dual opteron 842)
> 
> If (de)compression is slowing things down, you might want to check out
> lzo (http://www.oberhumer.com/opensource/lzo/). I tested it on the
> 2.6.11 kernel source and found that lzo -7 output is only 2% larger
> than gzip -3, but lzo decompression is almost 3 times faster. The
> downside is that lzo took 5 times longer to perform the compression at
> -7. Compression with lzo -3 is 3.5 times faster than gzip -3, but it
> produces a file that's 37% bigger. Unfortunately, lzo has no settings
> in between -3 and -7. I'd expect git to be more sensitive to
> decompression speeds, though.
> 
> BTW, lzo decompression speed is not affected by the compression level.

-- 
Robert Suetterlin (robert@mpe.mpg.de)
phone: (+49)89 / 30000-3546   fax: (+49)89 / 30000-3950



Thread overview: 11+ messages
2005-04-21  5:06 Performance of various compressors Mike Taht
2005-04-21  5:14 ` Mike Taht
2005-04-21  5:22   ` [PATCH] experimental - " Mike Taht
2005-04-21 10:23     ` HOWTO: PATCH: don't hardcode path-to-bash, use sys/limits.h Klaus Robert Suetterlin
2005-04-21 14:31       ` Alecs King
2005-04-21 19:42         ` [PATCH] #!/bin/sh --> #!/usr/bin/env bash Alecs King
2005-04-22  7:37           ` H. Peter Anvin
2005-04-23  2:34             ` David A. Wheeler
2005-04-23  6:16               ` H. Peter Anvin
2005-04-22 20:38 ` Performance of various compressors Aaron Lehmann
2005-04-25 12:17   ` git I/O performance (was: Performance of various compressors) Klaus Robert Suetterlin
