git@vger.kernel.org mailing list mirror (one of many)
From: Andreas Ericsson <ae@op5.se>
To: Marco Costalba <mcostalba@gmail.com>
Cc: Linus Torvalds <torvalds@osdl.org>,
	Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <junkio@cox.net>, Alex Riesen <raa.lkml@gmail.com>,
	Shawn Pearce <spearce@spearce.org>
Subject: Re: [RFC \ WISH] Add -o option to git-rev-list
Date: Mon, 11 Dec 2006 14:40:55 +0100
Message-ID: <457D5FE7.3010309@op5.se>
In-Reply-To: <e5bfff550612110459w205cb9b3lf735359012f84f7c@mail.gmail.com>

Marco Costalba wrote:
> On 12/11/06, Andreas Ericsson <ae@op5.se> wrote:
>> Marco Costalba wrote:
>> > On 12/10/06, Linus Torvalds <torvalds@osdl.org> wrote:
>> >>
>> >> Why don't you use the pipe and standard read()?
>> >>
>> >> Even if you use "popen()" and get a "FILE *" back, you can still do
>> >>
>> >>         int fd = fileno(file);
>> >>
>> >> and use the raw IO capabilities.
>> >>
>> >> The thing is, temporary files can actually be faster under Linux just
>> >> because the Linux page-cache simply kicks ass. But it's not going to be
>> >> _that_ big of a difference, and you need all that crazy "wait for rev-list
>> >> to finish" and the "clean up temp-file on errors" etc crap, so there's no
>> >> way it's a better solution.
>> >>
>> >
>> > Two things.
>> >
>> > - memory use: the next natural step with files is, instead of loading
>> > the file content in memory and *keeping it there*, to load one
>> > chunk at a time, index the chunk and discard it. At the end we keep in
>> > memory only the indexing info to quickly get to the data when needed,
>> > but the bulk of the data stays in the file.
>> >
>>
>> memory usage vs speed tradeoff. Since qgit is a pure user-app, I think
>> it's safe to opt for the memory hungry option. If people run it on too
>> lowbie hardware they'll just have to make do with other ways of viewing
>> the DAG or shutting down some other programs.
>>
>> > - This is probably my ignorance, but experimenting with popen() I
>> > found I could not know *when* git-rev-list ends because both feof()
>> > and ferror() give 0 after a fread() with git-rev-list already defunct.
>> > Not having a reference to the process (it is hidden behind popen() ),
>> > I had to check for 0 bytes read after a successful read (to avoid
>> > racing in case I ask the pipe before the first data is ready) to
>> > know that the job is finished and call pclose().
>> >
>>
>> (coding in MUA, so highly untested)
>>
> 
> Thanks Andreas, I will do some tests with your code. But at first
> sight I fail to see (I'm not an expert on this though ;-)  ) where the
> difference is from using popen() and fileno() to get the file
> descriptors.
> 

The difference is read() vs fread(): read() goes straight to the file
descriptor, so no libc buffers. When I did comparisons with this (a long
time ago, I don't have the test-program around) in the style of

	n = read(out[0], buf, sizeof(buf));
	write(fileno(stdout), buf, n);

with a command line like this:

	cat any-file | test-program > /dev/null

I saw a constant ~10ms increase in execution time compared to

	cat any-file > /dev/null

regardless of the size of "any-file", so I assume this overhead comes 
from the extra fork(), which you'll never get rid of unless you use 
libgit.a.
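
For reference, here is a minimal sketch of the pipe()/fork()/read()
pattern discussed in this thread, assuming a POSIX system. It is not the
code from my earlier mail; the name drain_child() and the 4096-byte
buffer are made up for illustration. It shows the two points raised
above: read() returning 0 is the end-of-output signal once the parent
has closed its own copy of the write end, and because the parent holds
the pid it can reap the child itself instead of relying on pclose().

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): run argv[0] with its stdout
 * connected to a pipe, drain the pipe with raw read(), then reap the
 * child. Returns the number of bytes the child wrote, or -1 on error.
 */
long drain_child(char *const argv[], int *status)
{
	int out[2];
	char buf[4096];
	long total = 0;
	ssize_t n;
	pid_t pid;

	if (pipe(out) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(out[0]);
		close(out[1]);
		return -1;
	}
	if (pid == 0) {
		/* child: stdout becomes the write end of the pipe */
		close(out[0]);
		if (out[1] != STDOUT_FILENO) {
			if (dup2(out[1], STDOUT_FILENO) < 0)
				_exit(127);
			close(out[1]);
		}
		execvp(argv[0], argv);
		_exit(127);	/* only reached if exec failed */
	}

	/* parent: close our write end, or read() will never report EOF */
	close(out[1]);

	for (;;) {
		n = read(out[0], buf, sizeof(buf));
		if (n > 0) {
			total += n;	/* a real caller would parse buf[0..n) here */
			continue;
		}
		if (n == 0)
			break;		/* EOF: all write ends are closed */
		if (errno == EINTR)
			continue;	/* interrupted by a signal, retry */
		total = -1;		/* genuine read error */
		break;
	}
	close(out[0]);

	/* we have the pid, so collect the exit status ourselves */
	while (waitpid(pid, status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	return total;
}
```

A caller would pass an argv-style array such as { "git-rev-list",
"HEAD", NULL } and inspect *status with WIFEXITED()/WEXITSTATUS()
afterwards. Note that this still pays for the fork(), so it does not
remove the fixed overhead mentioned above.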

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
