From: Linus Torvalds <torvalds@osdl.org>
To: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Cc: Eric Wong <normalperson@yhbt.net>,
	Alex Riesen <raa.lkml@gmail.com>, Sam Vilain <sam@vilain.net>,
	Junio C Hamano <junkio@cox.net>,
	git@vger.kernel.org
Subject: Re: [PATCH] fmt-merge-msg: avoid open "-|" list form for Perl 5.6
Date: Fri, 24 Feb 2006 08:14:52 -0800 (PST)
Message-ID: <Pine.LNX.4.64.0602240800240.3771@g5.osdl.org>
In-Reply-To: <Pine.LNX.4.63.0602241440330.9461@wbgn013.biozentrum.uni-wuerzburg.de>



On Fri, 24 Feb 2006, Johannes Schindelin wrote:
> 
> Sorry, but no. Really no. Pipes have several advantages over temporary 
> files:
> 
> - The second program can already work on the data before the first 
>   finishes.

This really is a _huge_ issue in general, although probably not a very 
big one in this case.

This is what I talked about when I said "streaming" data. Look at the 
difference between

	git whatchanged -s drivers/usb

and

	git log drivers/usb

in the kernel repo. They give almost the same output, but...

Notice how one starts _immediately_, while the other starts after a few 
seconds (or, if you have a slow machine, and an unpacked archive, after 
tens of seconds or longer).

And the reason is that "git log" uses "git-rev-list" with a path limiter, 
and currently that ends up having to walk basically the whole history in 
order to generate a minimal graph.

In contrast, "git-whatchanged" uses "git-diff-tree" to limit the output, 
and git-diff-tree doesn't care about "minimal graph" or crud like that: it 
just cares about discarding any local commits that aren't interesting. It 
doesn't need to worry about updating parent chains etc, so it can do it 
all incrementally - and can thus start output as soon as it gets anything 
at all.
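
To make the difference concrete, here is a minimal sketch in Python. It is not how git-rev-list or git-diff-tree are implemented; the names `filter_batch`, `filter_streaming` and `counting_source` are hypothetical and exist only for this illustration. The batch version has to read all of its input before it can hand back anything, while the streaming version produces its first result after reading just enough input to find it.

```python
from typing import Callable, Iterable, Iterator, List


def filter_batch(items: Iterable[int], keep: Callable[[int], bool]) -> List[int]:
    """Collect every match first; nothing is available until the input is exhausted."""
    return [item for item in items if keep(item)]


def filter_streaming(items: Iterable[int], keep: Callable[[int], bool]) -> Iterator[int]:
    """Yield each match as soon as it is seen; later input is not read yet."""
    for item in items:
        if keep(item):
            yield item


def counting_source(n: int, consumed: List[int]) -> Iterator[int]:
    """Hypothetical stand-in for a history walk; records how many items were read."""
    for i in range(n):
        consumed[0] = i + 1
        yield i


if __name__ == "__main__":
    def interesting(x: int) -> bool:
        return x % 10 == 3

    read_streaming = [0]
    first = next(filter_streaming(counting_source(1000, read_streaming), interesting))
    print(first, read_streaming[0])      # first match after reading 4 items

    read_batch = [0]
    matches = filter_batch(counting_source(1000, read_batch), interesting)
    print(matches[0], read_batch[0])     # same first match, but all 1000 items were read
```

The first result is the same either way. What differs is how much work has to happen before it shows up, and that gap grows with the size of the input.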

Now, maybe you think that "a few seconds" isn't a big deal. Sure, it's 
actually fast as hell, considering what it is doing, and anybody should be 
really really impressed that we can do that at all.

But (a) it _is_ a huge deal. Responsiveness is really important. And 
worse: (b) it scales badly with repository size. Creating the whole 
data-set before starting to output it really doesn't scale.

Now, I have ways to make "git-rev-list" better. It doesn't really need to 
walk the _whole_ history for its path limiting before it can start 
outputting stuff: it really _could_ do things more incrementally. However, 
it's a real bitch sometimes to work with incremental data when you don't 
know everything, so it gets a lot more complicated. 

So my point isn't that "git log drivers/usb" will get less and less 
responsive over time. I can fix that - eventually. My point is that in 
order to make it more responsive, I need to make it less synchronous. More 
"streaming". 

And that is where a pipe is so much better than a file. It's very 
fundamentally a streaming interface.
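
As a rough illustration of that property (a sketch, not anything from the git sources), the Python snippet below starts a producer process and reads its first line of output over a pipe while the producer is still running. The `PRODUCER` script and the function name are made up for this example; the producer deliberately blocks on its stdin after the first line so that "still running" is not a matter of timing.

```python
import subprocess
import sys

# Hypothetical producer: emits one line, then blocks until told to continue.
PRODUCER = r"""
import sys
print("first", flush=True)
sys.stdin.readline()          # wait for the consumer's go-ahead
print("second", flush=True)
"""


def read_first_line_while_producer_runs():
    """Return (first line, whether the producer was still alive, remaining output)."""
    proc = subprocess.Popen(
        [sys.executable, "-c", PRODUCER],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # The consumer gets data as soon as the producer writes it to the pipe.
    first = proc.stdout.readline().strip()
    still_running = proc.poll() is None   # producer has not exited yet

    # Let the producer finish, then drain whatever is left in the pipe.
    proc.stdin.write("go\n")
    proc.stdin.close()
    rest = proc.stdout.read().strip()
    proc.wait()
    return first, still_running, rest


if __name__ == "__main__":
    print(read_first_line_while_producer_runs())
```

With a temporary file in place of the pipe, the consumer would normally have to wait for the producer to exit before it could know the file was complete, so it could not start on "first" this early.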

However, I suspect some of these issues are non-issues for the Perl 
programs that work with a few entries at a time.

		Linus
