From: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jakub Narebski <jnareb@gmail.com>,
Brendan Miller <catphive@catphive.net>,
git@vger.kernel.org
Subject: Re: obnoxious CLI complaints
Date: Sat, 12 Sep 2009 00:01:50 +0200
Message-ID: <4AAAC8CE.8020302@lsrfire.ath.cx>
In-Reply-To: <alpine.LFD.2.01.0909110744030.3654@localhost.localdomain>
On 11.09.2009 16:47, Linus Torvalds wrote:
>
>
> On Fri, 11 Sep 2009, René Scharfe wrote:
>>
>> Using zlib directly avoids the overhead of a pipe and of buffering the
>> output for blocked writes; surprisingly (to me), it isn't any faster.
>
> In fact, it should be slower.
>
> On SMP, you're quite likely better off using the pipe, and compressing on
> another CPU. Of course, it's usually the case that the compression is _so_
> much slower than generating the tar-file (especially for the hot-cache
> case) that it doesn't matter or the pipe overhead is even a slowdown.
>
> But especially if generating the tar-file has some delays in it
> (cold-cache object lookup, whatever), the "compress in separate process"
> is likely simply better, because you can compress while the other process
> is looking up data for the tar.
Yes, that makes sense and can be seen here (quad core, Fedora 11, best
of five consecutive runs, Linux kernel repo):
    # git v1.6.5-rc0
    $ time git archive --format=tar v2.6.31 | gzip -6 >/dev/null
    real    0m16.591s
    user    0m19.769s
    sys     0m0.474s

    # git v1.6.5-rc0 + patch
    $ time ../git/git archive --format=tar.gz -6 v2.6.31 >/dev/null
    real    0m20.390s
    user    0m20.299s
    sys     0m0.088s
User time is quite similar, real time is lower when using a pipe.
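A best-of-N wall-clock measurement like the one above can be scripted. What follows is only a minimal Python sketch, not the method used for these numbers (those came from the shell's `time`). The helper name `best_of` is hypothetical, and it records real time only, not user or sys time.

```python
import subprocess
import time


def best_of(command, runs=5):
    """Return the lowest wall-clock time in seconds over `runs` executions.

    `command` is handed to the shell so that a pipeline such as
    "producer | compressor" is timed as one unit. Note that the shell
    reports only the exit status of the last command in a pipeline.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(command, shell=True, check=True,
                       stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


# Example use, inside a clone of the Linux kernel repository:
#   best_of("git archive --format=tar v2.6.31 | gzip -6")
#   best_of("git archive --format=zip -6 v2.6.31")
```

Taking the minimum rather than the mean is a deliberate choice here: it filters out runs disturbed by other activity on the machine, which matches the "best of five" approach.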
But what has bugged me since I added zip support is this result:
    # git v1.6.5-rc0
    $ time git archive --format=zip -6 v2.6.31 >/dev/null
    real    0m16.471s
    user    0m16.340s
    sys     0m0.128s
I'd have expected this to be the slowest case, because it's compressing
all files separately, i.e. it needs to create and flush the compression
context lots of times instead of only once as in the two cases above.
And it's sequential and uses zlib, just like the tar.gz format. I
suspect the convenience function gzwrite() accounts for the extra time
in the tar.gz case.
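To make the two usage patterns concrete, here is a small Python sketch of "one compression context for everything" versus "a fresh context per file". It is an illustration under assumptions, not git's code: the function names are hypothetical, Python's zlib module has no gzwrite(), and so this says nothing about whether gzwrite() is the actual cause of the difference.

```python
import zlib


def compress_single_stream(chunks, level=6):
    """Feed all chunks through one deflate context, flushed once at the end
    (the pattern of a compressed tar stream)."""
    comp = zlib.compressobj(level)
    out = [comp.compress(chunk) for chunk in chunks]
    out.append(comp.flush())
    return b"".join(out)


def compress_per_member(chunks, level=6):
    """Create and flush a separate raw-deflate context for every chunk
    (the pattern of per-file compression in a zip archive)."""
    members = []
    for chunk in chunks:
        # Negative wbits selects raw deflate without a zlib header.
        comp = zlib.compressobj(level, zlib.DEFLATED, -zlib.MAX_WBITS)
        members.append(comp.compress(chunk) + comp.flush())
    return members
```

Timing both functions over the same set of file contents would show how much the repeated context setup costs on its own, separate from any wrapper overhead.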
Oh, I just discovered pigz (http://zlib.net/pigz/), a parallel gzip:
    # git v1.6.5-rc0, pigz 2.1.5
    $ time git archive --format=tar v2.6.31 | pigz -6 >/dev/null
    real    0m6.251s
    user    0m21.383s
    sys     0m0.547s
So pipes win. :) I still need to investigate why zip is as (relatively)
fast as it is, though.
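The pigz numbers show about 21.4s of user time packed into about 6.3s of real time, i.e. roughly 3.4 cores kept busy. The general idea of block-parallel compression can be sketched in Python as below. This is explicitly not pigz's algorithm: the sketch emits independent, concatenated gzip members, a simpler technique that costs some compression ratio. The function name and the block size are assumptions chosen for illustration, and it relies on CPython's zlib releasing the GIL while compressing.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor


def parallel_gzip(data, block_size=128 * 1024, level=6, workers=4):
    """Compress `data` as a sequence of independent gzip members,
    one per block, using a pool of worker threads."""
    blocks = [data[i:i + block_size]
              for i in range(0, len(data), block_size)]
    if not blocks:
        blocks = [b""]  # still emit one valid (empty) gzip member
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so members stay in sequence.
        members = pool.map(
            lambda block: gzip.compress(block, compresslevel=level),
            blocks)
        return b"".join(members)
```

The result is a valid multi-member gzip file, so standard tools and `gzip.decompress()` can read it back as a single stream.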
René