git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Steffen Prohaska <prohaska@zib.de>
To: Junio C Hamano <gitster@pobox.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: git@vger.kernel.org, "Johannes Sixt" <j6t@kdbg.org>,
	"John Keeping" <john@keeping.me.uk>,
	"Jonathan Nieder" <jrnieder@gmail.com>,
	"Kyle J. McKay" <mackyle@gmail.com>,
	"Torsten Bögershausen" <tboegi@web.de>,
	"Eric Sunshine" <sunshine@sunshineco.com>,
	"Steffen Prohaska" <prohaska@zib.de>
Subject: [PATCH v5 1/2] xread, xwrite: Limit size of IO, fixing IO of 2GB and more on Mac OS X
Date: Tue, 20 Aug 2013 08:43:54 +0200	[thread overview]
Message-ID: <1376981035-23284-2-git-send-email-prohaska@zib.de> (raw)
In-Reply-To: <1376981035-23284-1-git-send-email-prohaska@zib.de>

Previously, filtering 2GB or more through an external filter (see test)
failed on Mac OS X 10.8.4 (12E55) for a 64-bit executable with:

    error: read from external filter cat failed
    error: cannot feed the input to external filter cat
    error: cat died of signal 13
    error: external filter cat failed 141
    error: external filter cat failed

The reason was that read() immediately returns with EINVAL if nbyte >=
2GB.  According to POSIX [1], if the value of nbyte passed to read() is
greater than SSIZE_MAX, the result is implementation-defined.  The write
function has the same restriction [2].  Since OS X still supports
running 32-bit executables, the 32-bit limit (SSIZE_MAX = INT_MAX
= 2GB - 1) seems to be also imposed on 64-bit executables under certain
conditions.  For write, the problem has been addressed earlier [6c642a].

This commit addresses the problem for read() and write() by limiting
size of IO chunks unconditionally on all platforms in xread() and
xwrite().  Large chunks only cause problems, like triggering the OS
X bug or causing latencies when killing the process.  Reasonably sized
smaller chunks have no negative impact on performance.

The compat wrapper clipped_write() introduced earlier [6c642a] is not
needed anymore.  It will be reverted in a separate commit.  The new test
catches read and write problems.

Note that 'git add' exits with 0 even if it prints filtering errors to
stderr.  The test, therefore, checks stderr.  'git add' should probably
be changed (sometime in another commit) to exit with nonzero if
filtering fails.  The test could then be changed to use test_must_fail.

Thanks to the following people for suggestions and testing:

    Johannes Sixt <j6t@kdbg.org>
    John Keeping <john@keeping.me.uk>
    Jonathan Nieder <jrnieder@gmail.com>
    Kyle J. McKay <mackyle@gmail.com>
    Linus Torvalds <torvalds@linux-foundation.org>
    Torsten Bögershausen <tboegi@web.de>

[1] http://pubs.opengroup.org/onlinepubs/009695399/functions/read.html
[2] http://pubs.opengroup.org/onlinepubs/009695399/functions/write.html

[6c642a] commit 6c642a878688adf46b226903858b53e2d31ac5c3
    compate/clipped-write.c: large write(2) fails on Mac OS X/XNU

Signed-off-by: Steffen Prohaska <prohaska@zib.de>
---
 t/t0021-conversion.sh | 14 ++++++++++++++
 wrapper.c             | 12 ++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index e50f0f7..b92e6cb 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -190,4 +190,18 @@ test_expect_success 'required filter clean failure' '
 	test_must_fail git add test.fc
 '
 
+test -n "$GIT_TEST_LONG" && test_set_prereq EXPENSIVE
+
+test_expect_success EXPENSIVE 'filter large file' '
+	git config filter.largefile.smudge cat &&
+	git config filter.largefile.clean cat &&
+	for i in $(test_seq 1 2048); do printf "%1048576d" 1; done >2GB &&
+	echo "2GB filter=largefile" >.gitattributes &&
+	git add 2GB 2>err &&
+	! test -s err &&
+	rm -f 2GB &&
+	git checkout -- 2GB 2>err &&
+	! test -s err
+'
+
 test_done
diff --git a/wrapper.c b/wrapper.c
index 6a015de..97e3cf7 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -131,6 +131,14 @@ void *xcalloc(size_t nmemb, size_t size)
 }
 
 /*
+ * Limit size of IO chunks, because huge chunks only cause pain.  OS X 64-bit
+ * buggy, returning EINVAL if len >= INT_MAX; and even in the absense of bugs,
+ * large chunks can result in bad latencies when you decide to kill the
+ * process.
+ */
+#define MAX_IO_SIZE (8*1024*1024)
+
+/*
  * xread() is the same a read(), but it automatically restarts read()
  * operations with a recoverable error (EAGAIN and EINTR). xread()
  * DOES NOT GUARANTEE that "len" bytes is read even if the data is available.
@@ -138,6 +146,8 @@ void *xcalloc(size_t nmemb, size_t size)
 ssize_t xread(int fd, void *buf, size_t len)
 {
 	ssize_t nr;
+	if (len > MAX_IO_SIZE)
+	    len = MAX_IO_SIZE;
 	while (1) {
 		nr = read(fd, buf, len);
 		if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
@@ -154,6 +164,8 @@ ssize_t xread(int fd, void *buf, size_t len)
 ssize_t xwrite(int fd, const void *buf, size_t len)
 {
 	ssize_t nr;
+	if (len > MAX_IO_SIZE)
+	    len = MAX_IO_SIZE;
 	while (1) {
 		nr = write(fd, buf, len);
 		if ((nr < 0) && (errno == EAGAIN || errno == EINTR))
-- 
1.8.4.rc3.5.g4f480ff

  reply	other threads:[~2013-08-20  6:45 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-17 12:40 [PATCH] xread(): Fix read error when filtering >= 2GB on Mac OS X Steffen Prohaska
2013-08-17 15:27 ` John Keeping
2013-08-17 15:56 ` Torsten Bögershausen
2013-08-17 17:16 ` Johannes Sixt
2013-08-17 18:57 ` Jonathan Nieder
2013-08-17 20:25 ` Kyle J. McKay
2013-08-17 21:23   ` Jonathan Nieder
2013-08-19  6:38 ` [PATCH v2] compat: Fix read() of 2GB and more " Steffen Prohaska
2013-08-19  7:54   ` John Keeping
2013-08-19  8:20     ` Steffen Prohaska
2013-08-19  8:20   ` Johannes Sixt
2013-08-19  8:25     ` Stefan Beller
2013-08-19  8:40       ` Johannes Sixt
2013-08-19  8:28     ` Steffen Prohaska
2013-08-19  8:21   ` [PATCH v3] " Steffen Prohaska
2013-08-19 13:59     ` Eric Sunshine
2013-08-19 16:33       ` Junio C Hamano
2013-08-19 15:41     ` [PATCH v4] " Steffen Prohaska
2013-08-19 16:04       ` Linus Torvalds
2013-08-19 16:37         ` Steffen Prohaska
2013-08-19 17:24           ` Junio C Hamano
2013-08-19 17:16         ` Junio C Hamano
2013-08-19 17:28           ` Linus Torvalds
2013-08-19 21:56           ` Kyle J. McKay
2013-08-19 22:51             ` Linus Torvalds
2013-08-27  4:59               ` Junio C Hamano
2013-08-20  6:43       ` [PATCH v5 0/2] Fix IO of >=2GB on Mac OS X by limiting IO chunks Steffen Prohaska
2013-08-20  6:43         ` Steffen Prohaska [this message]
2013-08-20 19:37           ` [PATCH v5 1/2] xread, xwrite: Limit size of IO, fixing IO of 2GB and more on Mac OS X Junio C Hamano
2013-08-21 19:50           ` Torsten Bögershausen
2013-08-20  6:43         ` [PATCH v5 2/2] Revert "compate/clipped-write.c: large write(2) fails on Mac OS X/XNU" Steffen Prohaska
2013-08-21 13:46         ` [PATCH v6 0/2] Fix IO >= 2GB on Mac, fixed typo Steffen Prohaska
2013-08-21 13:46           ` [PATCH v5 1/2] xread, xwrite: Limit size of IO, fixing IO of 2GB and more on Mac OS X Steffen Prohaska
2013-08-21 13:46           ` [PATCH v5 2/2] Revert "compate/clipped-write.c: large write(2) fails on Mac OS X/XNU" Steffen Prohaska
2013-08-21 15:58           ` [PATCH v6 0/2] Fix IO >= 2GB on Mac, fixed typo Junio C Hamano
2013-08-19  8:27   ` [PATCH v2] compat: Fix read() of 2GB and more on Mac OS X Johannes Sixt
2013-08-19 14:41   ` Torsten Bögershausen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1376981035-23284-2-git-send-email-prohaska@zib.de \
    --to=prohaska@zib.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    --cc=john@keeping.me.uk \
    --cc=jrnieder@gmail.com \
    --cc=mackyle@gmail.com \
    --cc=sunshine@sunshineco.com \
    --cc=tboegi@web.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).