git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "Kyle J. McKay" <mackyle@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Linus Torvalds" <torvalds@linux-foundation.org>,
	"Steffen Prohaska" <prohaska@zib.de>,
	"Git Mailing List" <git@vger.kernel.org>,
	"Johannes Sixt" <j6t@kdbg.org>,
	"John Keeping" <john@keeping.me.uk>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Torsten Bögershausen" <tboegi@web.de>,
	"Eric Sunshine" <sunshine@sunshineco.com>
Subject: Re: [PATCH v4] compat: Fix read() of 2GB and more on Mac OS X
Date: Mon, 19 Aug 2013 14:56:00 -0700	[thread overview]
Message-ID: <7E527329-230E-4954-9942-8BB0935ACE4D@gmail.com> (raw)
In-Reply-To: <xmqqeh9p8ut3.fsf@gitster.dls.corp.google.com>

On Aug 19, 2013, at 10:16, Junio C Hamano wrote:

> Linus Torvalds <torvalds@linux-foundation.org> writes:
>
>> So why isn't the patch much more straightforward? Like the attached
>> totally untested one that just limits the read/write size to 8MB
>> (which is totally arbitrary, but small enough to not have any latency
>> issues even on slow disks, and big enough that any reasonable IO
>> subsystem will still get good throughput).
>
> Ahh.  OK, not noticing EINVAL unconditionally, but always feed IOs
> in chunks that are big enough for sane systems but small enough for
> broken ones.
>
> That makes sense.  Could somebody on MacOS X test this?

I tested this on both i386 (OS X 32-bit intel) and x86_64 (OS X 64-bit  
intel).

What I tested:

1. I started with branch pu:
    (965adb10 Merge branch 'sg/bash-prompt-lf-in-cwd-test' into pu)

2. I added Steffen's additional test (modified to always run) to t0021:

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index e50f0f7..b92e6cb 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -190,4 +190,16 @@ test_expect_success 'required filter clean  
failure' '
	test_must_fail git add test.fc
'

+test_expect_success 'filter large file' '
+	git config filter.largefile.smudge cat &&
+	git config filter.largefile.clean cat &&
+	for i in $(test_seq 1 2048); do printf "%1048576d" 1; done >2GB &&
+	echo "2GB filter=largefile" >.gitattributes &&
+	git add 2GB 2>err &&
+	! test -s err &&
+	rm -f 2GB &&
+	git checkout -- 2GB 2>err &&
+	! test -s err
+'
+
test_done

3. I verified that the test fails with an unpatched build on both 32- 
bit and 64-bit.

4. I applied Linus's unmodified patch to wrapper.c.

5. I tested again.  The t0021 test now passes on 64-bit.  It still  
fails on 32-bit for another reason unrelated to Linus's patch.

It fails when attempting the "git add 2GB" line from the new 'filter  
large file' part of the test.  The failure with backtrace:

git(16806,0xa095c720) malloc: *** mmap(size=2147487744) failed (error  
code=12)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug

# NOTE: error code 12 is ENOMEM on OS X

Breakpoint 1, 0x97f634b1 in malloc_error_break ()
(gdb) bt
#0  0x97f634b1 in malloc_error_break ()
#1  0x97f5e49f in szone_error ()
#2  0x97e8b876 in allocate_pages ()
#3  0x97e8c062 in large_and_huge_malloc ()
#4  0x97e831c8 in szone_malloc ()
#5  0x97e82fb8 in malloc_zone_malloc ()
#6  0x97e8c7b2 in realloc ()
#7  0x00128abe in xrealloc (ptr=0x0, size=2147483649) at wrapper.c:100
#8  0x00111a8c in strbuf_grow (sb=0xbfffe634, extra=2147483648) at  
strbuf.c:74
#9  0x00112bb9 in strbuf_read (sb=0xbfffe634, fd=6, hint=2548572518)  
at strbuf.c:349
#10 0x0009b899 in apply_filter (path=<value temporarily unavailable,  
due to optimizations>, src=0x1000000 ' ' <repeats 200 times>...,  
len=2147483648, dst=0xbfffe774, cmd=0x402980 "cat") at convert.c:407
#11 0x0009c6f6 in convert_to_git (path=0x4028b4 "2GB", src=0x1000000 '  
' <repeats 200 times>..., len=2147483648, dst=0xbfffe774,  
checksafe=SAFE_CRLF_WARN) at convert.c:764
#12 0x0010bb38 in index_mem (sha1=0x402330 "", buf=0x1000000,  
size=2147483648, type=OBJ_BLOB, path=0x4028b4 "2GB", flags=1) at  
sha1_file.c:3044
#13 0x0010bf57 in index_core [inlined] () at /private/var/tmp/src/git/ 
sha1_file.c:3101
#14 0x0010bf57 in index_fd (sha1=0x402330 "", fd=5, st=0xbfffe900,  
type=OBJ_BLOB, path=0x4028b4 "2GB", flags=1) at sha1_file.c:3139
#15 0x0010c05e in index_path (sha1=0x402330 "", path=0x4028b4 "2GB",  
st=0xbfffe900, flags=1) at sha1_file.c:3157
#16 0x000e82f4 in add_to_index (istate=0x1a8820, path=0x4028b4 "2GB",  
st=0xbfffe900, flags=0) at read-cache.c:665
#17 0x000e87c8 in add_file_to_index (istate=0x1a8820, path=0x4028b4  
"2GB", flags=0) at read-cache.c:694
#18 0x0000440a in cmd_add (argc=<value temporarily unavailable, due to  
optimizations>, argv=0xbffff584, prefix=0x0) at builtin/add.c:299
#19 0x00002e1f in run_builtin [inlined] () at /private/var/tmp/src/git/ 
git.c:303
#20 0x00002e1f in handle_internal_command (argc=2, argv=0xbffff584) at  
git.c:466
#21 0x000032d4 in run_argv [inlined] () at /private/var/tmp/src/git/ 
git.c:512
#22 0x000032d4 in main (argc=2, av=0xbfffe28c) at git.c:595

The size 2147487744 is 2GB + 4096 bytes.  Apparently git does not  
support a filter for a file unless the file can fit entirely into  
git's memory space.  Normally a single 2GB + 4096 byte allocation  
works in an OS X 32-bit process, but something else is apparently  
eating up a large portion of the memory space in this case (perhaps an  
mmap'd copy?).  In any case, if the file being filtered was closer to  
4GB in size it would always fail on 32-bit regardless.

The fact that the entire file is read into memory when applying the  
filter does not seem like a good thing (see #7-#10 above).

--Kyle

  parent reply	other threads:[~2013-08-19 21:56 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-17 12:40 [PATCH] xread(): Fix read error when filtering >= 2GB on Mac OS X Steffen Prohaska
2013-08-17 15:27 ` John Keeping
2013-08-17 15:56 ` Torsten Bögershausen
2013-08-17 17:16 ` Johannes Sixt
2013-08-17 18:57 ` Jonathan Nieder
2013-08-17 20:25 ` Kyle J. McKay
2013-08-17 21:23   ` Jonathan Nieder
2013-08-19  6:38 ` [PATCH v2] compat: Fix read() of 2GB and more " Steffen Prohaska
2013-08-19  7:54   ` John Keeping
2013-08-19  8:20     ` Steffen Prohaska
2013-08-19  8:20   ` Johannes Sixt
2013-08-19  8:25     ` Stefan Beller
2013-08-19  8:40       ` Johannes Sixt
2013-08-19  8:28     ` Steffen Prohaska
2013-08-19  8:21   ` [PATCH v3] " Steffen Prohaska
2013-08-19 13:59     ` Eric Sunshine
2013-08-19 16:33       ` Junio C Hamano
2013-08-19 15:41     ` [PATCH v4] " Steffen Prohaska
2013-08-19 16:04       ` Linus Torvalds
2013-08-19 16:37         ` Steffen Prohaska
2013-08-19 17:24           ` Junio C Hamano
2013-08-19 17:16         ` Junio C Hamano
2013-08-19 17:28           ` Linus Torvalds
2013-08-19 21:56           ` Kyle J. McKay [this message]
2013-08-19 22:51             ` Linus Torvalds
2013-08-27  4:59               ` Junio C Hamano
2013-08-20  6:43       ` [PATCH v5 0/2] Fix IO of >=2GB on Mac OS X by limiting IO chunks Steffen Prohaska
2013-08-20  6:43         ` [PATCH v5 1/2] xread, xwrite: Limit size of IO, fixing IO of 2GB and more on Mac OS X Steffen Prohaska
2013-08-20 19:37           ` Junio C Hamano
2013-08-21 19:50           ` Torsten Bögershausen
2013-08-20  6:43         ` [PATCH v5 2/2] Revert "compate/clipped-write.c: large write(2) fails on Mac OS X/XNU" Steffen Prohaska
2013-08-21 13:46         ` [PATCH v6 0/2] Fix IO >= 2GB on Mac, fixed typo Steffen Prohaska
2013-08-21 13:46           ` [PATCH v5 1/2] xread, xwrite: Limit size of IO, fixing IO of 2GB and more on Mac OS X Steffen Prohaska
2013-08-21 13:46           ` [PATCH v5 2/2] Revert "compate/clipped-write.c: large write(2) fails on Mac OS X/XNU" Steffen Prohaska
2013-08-21 15:58           ` [PATCH v6 0/2] Fix IO >= 2GB on Mac, fixed typo Junio C Hamano
2013-08-19  8:27   ` [PATCH v2] compat: Fix read() of 2GB and more on Mac OS X Johannes Sixt
2013-08-19 14:41   ` Torsten Bögershausen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7E527329-230E-4954-9942-8BB0935ACE4D@gmail.com \
    --to=mackyle@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    --cc=john@keeping.me.uk \
    --cc=jrnieder@gmail.com \
    --cc=prohaska@zib.de \
    --cc=sunshine@sunshineco.com \
    --cc=tboegi@web.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).