From: "Kyle J. McKay" <mackyle@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Linus Torvalds" <torvalds@linux-foundation.org>,
"Steffen Prohaska" <prohaska@zib.de>,
"Git Mailing List" <git@vger.kernel.org>,
"Johannes Sixt" <j6t@kdbg.org>,
"John Keeping" <john@keeping.me.uk>,
"Jonathan Nieder" <jrnieder@gmail.com>,
"Torsten Bögershausen" <tboegi@web.de>,
"Eric Sunshine" <sunshine@sunshineco.com>
Subject: Re: [PATCH v4] compat: Fix read() of 2GB and more on Mac OS X
Date: Mon, 19 Aug 2013 14:56:00 -0700 [thread overview]
Message-ID: <7E527329-230E-4954-9942-8BB0935ACE4D@gmail.com> (raw)
In-Reply-To: <xmqqeh9p8ut3.fsf@gitster.dls.corp.google.com>
On Aug 19, 2013, at 10:16, Junio C Hamano wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> So why isn't the patch much more straightforward? Like the attached
>> totally untested one that just limits the read/write size to 8MB
>> (which is totally arbitrary, but small enough to not have any latency
>> issues even on slow disks, and big enough that any reasonable IO
>> subsystem will still get good throughput).
>
> Ahh. OK, not noticing EINVAL unconditionally, but always feed IOs
> in chunks that are big enough for sane systems but small enough for
> broken ones.
>
> That makes sense. Could somebody on MacOS X test this?
I tested this on both i386 (OS X 32-bit intel) and x86_64 (OS X 64-bit
intel).
What I tested:
1. I started with branch pu:
(965adb10 Merge branch 'sg/bash-prompt-lf-in-cwd-test' into pu)
2. I added Steffen's additional test (modified to always run) to t0021:
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index e50f0f7..b92e6cb 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -190,4 +190,16 @@ test_expect_success 'required filter clean
failure' '
test_must_fail git add test.fc
'
+test_expect_success 'filter large file' '
+ git config filter.largefile.smudge cat &&
+ git config filter.largefile.clean cat &&
+ for i in $(test_seq 1 2048); do printf "%1048576d" 1; done >2GB &&
+ echo "2GB filter=largefile" >.gitattributes &&
+ git add 2GB 2>err &&
+ ! test -s err &&
+ rm -f 2GB &&
+ git checkout -- 2GB 2>err &&
+ ! test -s err
+'
+
test_done
3. I verified that the test fails with an unpatched build on both 32-
bit and 64-bit.
4. I applied Linus's unmodified patch to wrapper.c.
5. I tested again. The t0021 test now passes on 64-bit. It still
fails on 32-bit for another reason unrelated to Linus's patch.
It fails when attempting the "git add 2GB" line from the new 'filter
large file' part of the test. The failure with backtrace:
git(16806,0xa095c720) malloc: *** mmap(size=2147487744) failed (error
code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
# NOTE: error code 12 is ENOMEM on OS X
Breakpoint 1, 0x97f634b1 in malloc_error_break ()
(gdb) bt
#0 0x97f634b1 in malloc_error_break ()
#1 0x97f5e49f in szone_error ()
#2 0x97e8b876 in allocate_pages ()
#3 0x97e8c062 in large_and_huge_malloc ()
#4 0x97e831c8 in szone_malloc ()
#5 0x97e82fb8 in malloc_zone_malloc ()
#6 0x97e8c7b2 in realloc ()
#7 0x00128abe in xrealloc (ptr=0x0, size=2147483649) at wrapper.c:100
#8 0x00111a8c in strbuf_grow (sb=0xbfffe634, extra=2147483648) at
strbuf.c:74
#9 0x00112bb9 in strbuf_read (sb=0xbfffe634, fd=6, hint=2548572518)
at strbuf.c:349
#10 0x0009b899 in apply_filter (path=<value temporarily unavailable,
due to optimizations>, src=0x1000000 ' ' <repeats 200 times>...,
len=2147483648, dst=0xbfffe774, cmd=0x402980 "cat") at convert.c:407
#11 0x0009c6f6 in convert_to_git (path=0x4028b4 "2GB", src=0x1000000 '
' <repeats 200 times>..., len=2147483648, dst=0xbfffe774,
checksafe=SAFE_CRLF_WARN) at convert.c:764
#12 0x0010bb38 in index_mem (sha1=0x402330 "", buf=0x1000000,
size=2147483648, type=OBJ_BLOB, path=0x4028b4 "2GB", flags=1) at
sha1_file.c:3044
#13 0x0010bf57 in index_core [inlined] () at /private/var/tmp/src/git/
sha1_file.c:3101
#14 0x0010bf57 in index_fd (sha1=0x402330 "", fd=5, st=0xbfffe900,
type=OBJ_BLOB, path=0x4028b4 "2GB", flags=1) at sha1_file.c:3139
#15 0x0010c05e in index_path (sha1=0x402330 "", path=0x4028b4 "2GB",
st=0xbfffe900, flags=1) at sha1_file.c:3157
#16 0x000e82f4 in add_to_index (istate=0x1a8820, path=0x4028b4 "2GB",
st=0xbfffe900, flags=0) at read-cache.c:665
#17 0x000e87c8 in add_file_to_index (istate=0x1a8820, path=0x4028b4
"2GB", flags=0) at read-cache.c:694
#18 0x0000440a in cmd_add (argc=<value temporarily unavailable, due to
optimizations>, argv=0xbffff584, prefix=0x0) at builtin/add.c:299
#19 0x00002e1f in run_builtin [inlined] () at /private/var/tmp/src/git/
git.c:303
#20 0x00002e1f in handle_internal_command (argc=2, argv=0xbffff584) at
git.c:466
#21 0x000032d4 in run_argv [inlined] () at /private/var/tmp/src/git/
git.c:512
#22 0x000032d4 in main (argc=2, av=0xbfffe28c) at git.c:595
The size 2147487744 is 2GB + 4096 bytes. Apparently git does not
support a filter for a file unless the file can fit entirely into
git's memory space. Normally a single 2GB + 4096 byte allocation
works in an OS X 32-bit process, but something else is apparently
eating up a large portion of the memory space in this case (perhaps an
mmap'd copy?). In any case, if the file being filtered was closer to
4GB in size it would always fail on 32-bit regardless.
The fact that the entire file is read into memory when applying the
filter does not seem like a good thing (see #7-#10 above).
--Kyle
next prev parent reply other threads:[~2013-08-19 21:56 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-17 12:40 [PATCH] xread(): Fix read error when filtering >= 2GB on Mac OS X Steffen Prohaska
2013-08-17 15:27 ` John Keeping
2013-08-17 15:56 ` Torsten Bögershausen
2013-08-17 17:16 ` Johannes Sixt
2013-08-17 18:57 ` Jonathan Nieder
2013-08-17 20:25 ` Kyle J. McKay
2013-08-17 21:23 ` Jonathan Nieder
2013-08-19 6:38 ` [PATCH v2] compat: Fix read() of 2GB and more " Steffen Prohaska
2013-08-19 7:54 ` John Keeping
2013-08-19 8:20 ` Steffen Prohaska
2013-08-19 8:20 ` Johannes Sixt
2013-08-19 8:25 ` Stefan Beller
2013-08-19 8:40 ` Johannes Sixt
2013-08-19 8:28 ` Steffen Prohaska
2013-08-19 8:21 ` [PATCH v3] " Steffen Prohaska
2013-08-19 13:59 ` Eric Sunshine
2013-08-19 16:33 ` Junio C Hamano
2013-08-19 15:41 ` [PATCH v4] " Steffen Prohaska
2013-08-19 16:04 ` Linus Torvalds
2013-08-19 16:37 ` Steffen Prohaska
2013-08-19 17:24 ` Junio C Hamano
2013-08-19 17:16 ` Junio C Hamano
2013-08-19 17:28 ` Linus Torvalds
2013-08-19 21:56 ` Kyle J. McKay [this message]
2013-08-19 22:51 ` Linus Torvalds
2013-08-27 4:59 ` Junio C Hamano
2013-08-20 6:43 ` [PATCH v5 0/2] Fix IO of >=2GB on Mac OS X by limiting IO chunks Steffen Prohaska
2013-08-20 6:43 ` [PATCH v5 1/2] xread, xwrite: Limit size of IO, fixing IO of 2GB and more on Mac OS X Steffen Prohaska
2013-08-20 19:37 ` Junio C Hamano
2013-08-21 19:50 ` Torsten Bögershausen
2013-08-20 6:43 ` [PATCH v5 2/2] Revert "compate/clipped-write.c: large write(2) fails on Mac OS X/XNU" Steffen Prohaska
2013-08-21 13:46 ` [PATCH v6 0/2] Fix IO >= 2GB on Mac, fixed typo Steffen Prohaska
2013-08-21 13:46 ` [PATCH v5 1/2] xread, xwrite: Limit size of IO, fixing IO of 2GB and more on Mac OS X Steffen Prohaska
2013-08-21 13:46 ` [PATCH v5 2/2] Revert "compate/clipped-write.c: large write(2) fails on Mac OS X/XNU" Steffen Prohaska
2013-08-21 15:58 ` [PATCH v6 0/2] Fix IO >= 2GB on Mac, fixed typo Junio C Hamano
2013-08-19 8:27 ` [PATCH v2] compat: Fix read() of 2GB and more on Mac OS X Johannes Sixt
2013-08-19 14:41 ` Torsten Bögershausen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7E527329-230E-4954-9942-8BB0935ACE4D@gmail.com \
--to=mackyle@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=j6t@kdbg.org \
--cc=john@keeping.me.uk \
--cc=jrnieder@gmail.com \
--cc=prohaska@zib.de \
--cc=sunshine@sunshineco.com \
--cc=tboegi@web.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).