From: Andreas Ericsson <ae@op5.se>
To: Marco Costalba <mcostalba@gmail.com>
Cc: Linus Torvalds <torvalds@osdl.org>,
Git Mailing List <git@vger.kernel.org>,
Junio C Hamano <junkio@cox.net>, Alex Riesen <raa.lkml@gmail.com>,
Shawn Pearce <spearce@spearce.org>
Subject: Re: [RFC \ WISH] Add -o option to git-rev-list
Date: Mon, 11 Dec 2006 14:40:55 +0100
Message-ID: <457D5FE7.3010309@op5.se>
In-Reply-To: <e5bfff550612110459w205cb9b3lf735359012f84f7c@mail.gmail.com>
Marco Costalba wrote:
> On 12/11/06, Andreas Ericsson <ae@op5.se> wrote:
>> Marco Costalba wrote:
>> > On 12/10/06, Linus Torvalds <torvalds@osdl.org> wrote:
>> >>
>> >> Why don't you use the pipe and standard read()?
>> >>
>> >> Even if you use "popen()" and get a "FILE *" back, you can still do
>> >>
>> >> int fd = fileno(file);
>> >>
>> >> and use the raw IO capabilities.
>> >>
>> >> The thing is, temporary files can actually be faster under Linux just
>> >> because the Linux page-cache simply kicks ass. But it's not going to be
>> >> _that_ big of a difference, and you need all that crazy "wait for rev-list
>> >> to finish" and the "clean up temp-file on errors" etc crap, so there's no
>> >> way it's a better solution.
>> >>
>> >
>> > Two things.
>> >
>> > - memory use: the next natural step with files is, instead of loading
>> > the file content in memory and *keep it there*, we could load one
>> > chunk at a time, index the chunk and discard. At the end we keep in
>> > memory only indexing info to quickly get to the data when needed, but
>> > the big part of data stay on the file.
>> >
>>
>> memory usage vs speed tradeoff. Since qgit is a pure user-app, I think
>> it's safe to opt for the memory hungry option. If people run it on too
>> lowbie hardware they'll just have to make do with other ways of viewing
>> the DAG or shutting down some other programs.
>>
>> > - This is probably my ignorance, but experimenting with popen() I
>> > found I could not know *when* git-rev-list ends because both feof()
>> > and ferror() give 0 after a fread() with git-rev-list already defunct.
>> > Not having a reference to the process (it is hidden behind popen() ),
>> > I had to check for 0 bytes read after a successful read (to avoid
>> > racing in case I ask the pipe before the first data is ready) to
>> > know that job is finished and call pclose().
>> >
>>
>> (coding in MUA, so highly untested)
>>
>
> Thanks Andreas, I will do some tests with your code. But at first
> sight I fail to see (I'm not an expert on this though ;-) ) where is
> the difference from using popen() and fileno() to get the file
> descriptors.
>
The difference is read() vs fread(), so no libc buffers. When I did
comparisons with this (a long time ago, I don't have the test program
around any more), in the style of

	n = read(out[0], buf, sizeof(buf));
	write(fileno(stdout), buf, n);

with a command line like this:

	cat any-file | test-program > /dev/null

I saw a static ~10ms increase in execution time compared to

	cat any-file > /dev/null

regardless of the size of "any-file", so I assume this overhead comes
from the extra fork(), which you'll never get rid of unless you use
libgit.a.
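
For reference, a minimal sketch of the pipe()/fork()/read() approach,
which also covers the "when has the child finished?" question: on a
blocking pipe, read() returns 0 only once every write end is closed, and
because the parent holds the pid it can reap the child with waitpid().
This is not the code from earlier in the thread. The helper name
run_and_count() and the 64k buffer size are assumptions made for
illustration, and a real caller would parse the buffer rather than just
count bytes.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): run argv[0] with its stdout
 * connected to a pipe, drain the pipe with raw read(), and return the
 * number of bytes seen, or -1 on error. *status receives the child's
 * wait status.
 */
static ssize_t run_and_count(char *const argv[], int *status)
{
	int out[2];
	char buf[65536];
	ssize_t n, total = 0;
	pid_t pid;

	assert(argv != NULL && argv[0] != NULL);

	if (pipe(out) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(out[0]);
		close(out[1]);
		return -1;
	}
	if (pid == 0) {
		/* child: make the write end of the pipe our stdout */
		close(out[0]);
		if (out[1] != STDOUT_FILENO) {
			if (dup2(out[1], STDOUT_FILENO) < 0)
				_exit(127);
			close(out[1]);
		}
		execvp(argv[0], argv);
		_exit(127);	/* only reached if exec failed */
	}

	/* parent: drop our copy of the write end, or read() never sees EOF */
	close(out[1]);

	for (;;) {
		n = read(out[0], buf, sizeof(buf));
		if (n > 0) {
			total += n;	/* a real caller would consume buf[0..n) here */
			continue;
		}
		if (n == 0)
			break;		/* EOF: the child closed its end */
		if (errno == EINTR)
			continue;	/* interrupted by a signal, retry */
		total = -1;
		break;
	}
	close(out[0]);

	/* we own the pid, so we can reap the child and get its exit status */
	while (waitpid(pid, status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	return total;
}
```

A caller would build an argv such as { "git-rev-list", "--all", NULL },
call run_and_count(argv, &status), and then check WIFEXITED(status) and
WEXITSTATUS(status) to tell a clean finish from a failure.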
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se