From: Junio C Hamano <gitster@pobox.com>
To: Dmitry Potapov <dpotapov@gmail.com>
Cc: Zygo Blaxell <zblaxell@esightcorp.com>,
Ilari Liusvaara <ilari.liusvaara@elisanet.fi>,
Thomas Rast <trast@student.ethz.ch>,
Jonathan Nieder <jrnieder@gmail.com>,
git@vger.kernel.org
Subject: Re: [PATCH] Teach "git add" and friends to be paranoid
Date: Sat, 20 Feb 2010 11:23:11 -0800
Message-ID: <7v8waniue8.fsf@alter.siamese.dyndns.org>
In-Reply-To: 7v635tkta7.fsf@alter.siamese.dyndns.org
Junio C Hamano <gitster@pobox.com> writes:
> Dmitry Potapov <dpotapov@gmail.com> writes:
>
>> ... I think it is possible
>> to avoid the overhead of being on the safe side in a few common cases.
>> Here is a patch. I have not had time to test it, but the changes appear
>> to be trivial.
>
> Yeah, these are obvious "paranoia not needed" cases.
>
Actually the "if it is smaller than 256k" part is not quite obvious.
>> @@ -2490,7 +2491,7 @@ int index_fd(unsigned char *sha1, int fd, struct stat *st, int write_object,
>>  	size_t size = xsize_t(st->st_size);
>>
>>  	flag = write_object ? INDEX_MEM_WRITE_OBJECT : 0;
>> -	if (!S_ISREG(st->st_mode)) {
>> +	if (!S_ISREG(st->st_mode) || size < 262144) {
>>  		struct strbuf sbuf = STRBUF_INIT;
>>  		if (strbuf_read(&sbuf, fd, 4096) >= 0)
>>  			ret = index_mem(sha1, sbuf.buf, sbuf.len,

INDEX_MEM_PARANOID is never given to index_mem() in this codepath, so
the trade-offs look like this:

 * In non-paranoia mode, your conjecture is that between

   - malloc, read, SHA-1, deflate, and then free; and
   - mmap, SHA-1, deflate and then munmap

   the former is faster for small files that can fit in core without
   thrashing.

 * In paranoia mode, your conjecture is that between

   - malloc, read, SHA-1, deflate, and then free; and
   - mmap, SHA-1, SHA-1 and deflate in chunks, and then munmap

   the former is faster for small files that can fit in core without
   thrashing.

The "mmap" strategy has a larger cost in paranoia mode than in
non-paranoia mode. The "read" strategy, on the other hand, has the same
cost in both modes. If this "read small files" is good for non-paranoia
mode, it is obvious that it is also good (better) for paranoia mode.

Which means that this hunk addresses an unrelated issue. "paranoid
avoidance" falls out naturally as a side effect of doing this, but that is not
the primary effect of this change.
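
To make the two code paths being compared concrete, here is a minimal C
sketch; it is not the actual index_fd() code. toy_hash() is an FNV-1a
stand-in for the SHA-1 and deflate work, and hash_by_read(),
hash_by_mmap(), hash_fd() and SMALL_FILE_LIMIT are hypothetical names.
Only the 262144 cutoff comes from the quoted hunk, and the sketch assumes
a regular file positioned at offset 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Cutoff taken from the quoted hunk (256k); the name is hypothetical. */
#define SMALL_FILE_LIMIT 262144

/* Stand-in for SHA-1 + deflate: FNV-1a. This is NOT what git computes. */
static uint64_t toy_hash(const unsigned char *buf, size_t len)
{
	uint64_t h = 14695981039346656037ULL;

	for (size_t i = 0; i < len; i++) {
		h ^= buf[i];
		h *= 1099511628211ULL;
	}
	return h;
}

/* "read" strategy: malloc, read, hash, free. Returns 0 on success. */
static int hash_by_read(int fd, size_t size, uint64_t *out)
{
	unsigned char *buf = malloc(size ? size : 1);
	size_t got = 0;

	assert(out != NULL);
	if (!buf)
		return -1;
	while (got < size) {
		ssize_t n = read(fd, buf + got, size - got);

		if (n < 0 && errno == EINTR)
			continue;
		if (n < 0) {
			free(buf);
			return -1;
		}
		if (n == 0)
			break;	/* file is shorter than stat() claimed */
		got += (size_t)n;
	}
	*out = toy_hash(buf, got);
	free(buf);
	return 0;
}

/* "mmap" strategy: mmap, hash, munmap. Returns 0 on success. */
static int hash_by_mmap(int fd, size_t size, uint64_t *out)
{
	void *map;

	assert(out != NULL);
	if (!size) {		/* mmap() rejects zero-length mappings */
		*out = toy_hash(NULL, 0);
		return 0;
	}
	map = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
	if (map == MAP_FAILED)
		return -1;
	*out = toy_hash(map, size);
	munmap(map, size);
	return 0;
}

/* Pick a strategy by size, as the hunk does for regular files. */
static int hash_fd(int fd, const struct stat *st, uint64_t *out)
{
	size_t size = (size_t)st->st_size;

	if (size < SMALL_FILE_LIMIT)
		return hash_by_read(fd, size, out);
	return hash_by_mmap(fd, size, out);
}
```

Both strategies should produce the same result for a file that does not
change underneath them; the question raised here is only which one is
cheaper below the cutoff.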
It needs some benchmarking to justify it, I think.
So I'd split this hunk out when queuing.
Thanks.