git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "René Scharfe" <rene.scharfe@lsrfire.ath.cx>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Jakub Narebski <jnareb@gmail.com>,
	Brendan Miller <catphive@catphive.net>,
	git@vger.kernel.org
Subject: Re: obnoxious CLI complaints
Date: Sat, 12 Sep 2009 00:01:50 +0200	[thread overview]
Message-ID: <4AAAC8CE.8020302@lsrfire.ath.cx> (raw)
In-Reply-To: <alpine.LFD.2.01.0909110744030.3654@localhost.localdomain>

Am 11.09.2009 16:47, schrieb Linus Torvalds:
>
>
> On Fri, 11 Sep 2009, René Scharfe wrote:
>>
>> Using zlib directly avoids the overhead of a pipe and of buffering the
>> output for blocked writes; surprisingly (to me), it isn't any faster.
>
> In fact, it should be slower.
>
> On SMP, you're quite likely better off using the pipe, and compressing on
> another CPU. Of course, it's usually the case that the compression is _so_
> much slower than generating the tar-file (especially for the hot-cache
> case) that it doesn't matter or the pipe overhead is even a slowdown.
>
> But especially if generating the tar-file has some delays in it
> (cold-cache object lookup, whatever), the "compress in separate process"
> is likely simply better, because you can compress while the other process
> is looking up data for the tar.

Yes, that makes sense and can be seen here (quad core, Fedora 11, best
of five consecutive runs, Linux kernel repo):

	# git v1.6.5-rc0
	$ time git archive --format=tar v2.6.31 | gzip -6 >/dev/null

	real	0m16.591s
	user	0m19.769s
	sys	0m0.474s

	# git v1.6.5-rc0 + patch
	$ time ../git/git archive --format=tar.gz -6 v2.6.31 >/dev/null

	real	0m20.390s
	user	0m20.299s
	sys	0m0.088s

User time is quite similar, real time is lower when using a pipe.

But what has bugged me since I added zip support is this result:

	# git v1.6.5-rc0
	$ time git archive --format=zip -6 v2.6.31 >/dev/null

	real	0m16.471s
	user	0m16.340s
	sys	0m0.128s

I'd have expected this to be the slowest case, because it's compressing
all files separately, i.e. it needs to create and flush the compression
context lots of times instead of only once as in the two cases above.
And it's sequential and uses zlib, just like the tar.gz format.  I
suspect the convenience function gzwrite() adds this overhead.


Oh, I just discovered pigz (http://zlib.net/pigz/), a parallel gzip:

	# git v1.6.5-rc0, pigz 2.1.5
	$ time git archive --format=tar v2.6.31 | pigz -6 >/dev/null

	real	0m6.251s
	user	0m21.383s
	sys	0m0.547s

So pipes win. :)  Still need to investigate why zip is as (relatively)
fast as it is, though.

René

  reply	other threads:[~2009-09-11 22:02 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-09 21:27 obnoxious CLI complaints Brendan Miller
2009-09-09 21:54 ` Jakub Narebski
2009-09-09 22:06   ` Wincent Colaiuta
2009-09-10 16:50     ` Jakub Narebski
2009-09-10 18:53       ` Junio C Hamano
2009-09-10 22:19         ` René Scharfe
2009-09-11  3:15         ` Björn Steinbrink
2009-09-10 19:46       ` John Tapsell
2009-09-10 20:17         ` Sverre Rabbelier
2009-09-10 20:23         ` Jakub Narebski
2009-09-10 22:04           ` John Tapsell
2009-09-10 22:49             ` Junio C Hamano
2009-09-10 23:19               ` demerphq
2009-09-11  0:37                 ` Junio C Hamano
2009-09-11  0:18               ` John Tapsell
2009-09-11  0:25                 ` Junio C Hamano
2009-09-10  0:09   ` Brendan Miller
2009-09-10  1:25     ` Todd Zullinger
2009-09-10  9:16     ` Jakub Narebski
2009-09-10 18:18       ` Eric Schaefer
2009-09-10 18:52         ` Sverre Rabbelier
2009-09-10 22:19       ` René Scharfe
2009-09-11 14:47         ` Linus Torvalds
2009-09-11 22:01           ` René Scharfe [this message]
2009-09-11 22:16             ` Linus Torvalds
2009-09-12 10:31     ` Dmitry Potapov
2009-09-12 18:32       ` John Tapsell
2009-09-12 21:44         ` Dmitry Potapov
2009-09-12 22:21           ` John Tapsell
2009-09-12 22:35             ` A Large Angry SCM
2009-09-12 22:43             ` Dmitry Potapov
2009-09-12 23:08               ` John Tapsell
2009-09-13  2:47                 ` Junio C Hamano
2009-09-13 17:36                   ` [PATCH 1/2] git-archive: add '-o' as a alias for '--output' Dmitry Potapov
2009-09-13 17:36                     ` [PATCH 2/2] teach git-archive to auto detect the output format Dmitry Potapov
2009-09-13 18:52                       ` Junio C Hamano
2009-09-13 20:17                         ` [PATCH v2 " Dmitry Potapov
2009-09-13 21:27                           ` Junio C Hamano
2009-09-13 18:34                     ` [PATCH 1/2] git-archive: add '-o' as a alias for '--output' Junio C Hamano
2009-09-13 20:13                       ` [PATCH v2 " Dmitry Potapov
2009-09-17  0:48                   ` obnoxious CLI complaints Brendan Miller
2009-09-17  1:27                     ` Junio C Hamano
2009-09-09 21:58 ` Sverre Rabbelier
2009-09-09 22:58 ` Pierre Habouzit
2009-09-10  1:32 ` Björn Steinbrink
2009-09-10 18:54   ` Matthieu Moy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4AAAC8CE.8020302@lsrfire.ath.cx \
    --to=rene.scharfe@lsrfire.ath.cx \
    --cc=catphive@catphive.net \
    --cc=git@vger.kernel.org \
    --cc=jnareb@gmail.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).