From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
"brian m. carlson" <sandals@crustytoothpaste.net>,
"Randall S. Becker" <randall.becker@nexbridge.ca>,
Phillip Wood <phillip.wood@dunelm.org.uk>,
Randall Becker <randall.becker@nexbridge.ca>
Subject: [PATCH] wrapper: properly handle MAX_IO_SIZE in `write_in_full()`
Date: Thu, 09 Apr 2026 14:47:59 +0200 [thread overview]
Message-ID: <20260409-b4-pks-writev-max-io-size-v1-1-81730e8f35df@pks.im> (raw)
Some systems like NonStop set a comparatively small `MAX_IO_SIZE`, which
limits the maximum number of bytes we're allowed to write in a single
call. We already handle this limit properly in `xwrite()`, but we have
recently introduced wrappers for writev(3p) where we don't. This will
cause the syscall to return EINVAL in case somebody passes an iovec
entry to writev(3p) that is larger than `MAX_IO_SIZE`.
Introduce a new function `xwritev()` that is similar to `xwrite()` in
that it handles such platform-specific nuances. The logic is rather
simple: we simply coalesce all iovecs that don't exceed `MAX_IO_SIZE`
and pass those to writev(3p). If the first iovec already exceeds the
limit, we'll instead pass it to `xwrite()`, which handles the limit for
us.
Adapt `writev_in_full()` to use this new wrapper. As this wrapper
already knows to to call writev(3p) in a loop already it doesn't need
any further adjustment.
Reported-by: Randall Becker <randall.becker@nexbridge.ca>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
Hi,
this fixes the issue reported by Randall in [1].
I mostly wanted to get this patch out there so that we can discuss a
proposed fix, but as said in the thread I'm also happy to revise course
and instead set NO_WRITEV on NonStop for now. I think we'll want to
eventually land a fix like the one proposed here though, and at that
point the workaround would not be required anymore.
Thanks!
Patrick
[1]: <00f401dcc6e6$7183c0f0$548b42d0$@nexbridge.com>
---
wrapper.c | 51 +++++++++++++++++++++++++++++++++++++++++++++------
wrapper.h | 1 +
2 files changed, 46 insertions(+), 6 deletions(-)
diff --git a/wrapper.c b/wrapper.c
index be8fa575e6..d989c78b4b 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -323,21 +323,60 @@ ssize_t write_in_full(int fd, const void *buf, size_t count)
return total;
}
+ssize_t xwritev(int fd, struct iovec *iov, int iovcnt)
+{
+ ssize_t bytes_written;
+ size_t total_length;
+ int i;
+
+ /*
+ * We need to make sure that writev(3p) call does not write more than
+ * `MAX_IO_SIZE` many bytes. If we do exceed that limit, we only pass
+ * those iovecs to writev(3p) that sum up to less than the limit.
+ *
+ * If on the other hand the first iovec entry already exceeds this
+ * limit we'll instead use xwrite() to write it, which knows to handle
+ * `MAX_IO_SIZE` for us.
+ */
+ for (i = 0, total_length = 0; i < iovcnt; i++) {
+ if (unsigned_add_overflows(total_length, iov[i].iov_len))
+ break;
+
+ total_length += iov[i].iov_len;
+ if (total_length > MAX_IO_SIZE)
+ break;
+ }
+
+ if (i < iovcnt) {
+ /*
+ * The first entry exceeds MAX_IO_SIZE, so we pass it to
+ * xwrite, which knows to handle this case.
+ */
+ if (!i)
+ return xwrite(fd, iov->iov_base, iov->iov_len);
+ iovcnt = i;
+ }
+
+ bytes_written = writev(fd, iov, iovcnt);
+ if (!bytes_written) {
+ errno = ENOSPC;
+ return -1;
+ }
+
+ return bytes_written;
+}
+
ssize_t writev_in_full(int fd, struct iovec *iov, int iovcnt)
{
ssize_t total_written = 0;
while (iovcnt) {
- ssize_t bytes_written = writev(fd, iov, iovcnt);
- if (bytes_written < 0) {
+ ssize_t bytes_written = xwritev(fd, iov, iovcnt);
+ if (bytes_written <= 0) {
if (errno == EINTR || errno == EAGAIN)
continue;
return -1;
}
- if (!bytes_written) {
- errno = ENOSPC;
- return -1;
- }
total_written += bytes_written;
diff --git a/wrapper.h b/wrapper.h
index 27519b32d1..a6287d7f4d 100644
--- a/wrapper.h
+++ b/wrapper.h
@@ -16,6 +16,7 @@ void *xmmap_gently(void *start, size_t length, int prot, int flags, int fd, off_
int xopen(const char *path, int flags, ...);
ssize_t xread(int fd, void *buf, size_t len);
ssize_t xwrite(int fd, const void *buf, size_t len);
+ssize_t xwritev(int fd, struct iovec *iov, int iovcnt);
ssize_t xpread(int fd, void *buf, size_t len, off_t offset);
int xdup(int fd);
FILE *xfopen(const char *path, const char *mode);
---
base-commit: b15384c06f77bc2d34d0d3623a8a58218313a561
change-id: 20260409-b4-pks-writev-max-io-size-e9b803439ae8
next reply other threads:[~2026-04-09 12:48 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-09 12:47 Patrick Steinhardt [this message]
2026-04-09 15:46 ` [PATCH] wrapper: properly handle MAX_IO_SIZE in `write_in_full()` Ben Knoble
2026-04-09 16:42 ` Junio C Hamano
2026-04-09 20:23 ` Jeff King
2026-04-09 20:40 ` Junio C Hamano
2026-04-09 20:59 ` Jeff King
2026-04-09 21:09 ` Junio C Hamano
2026-04-10 5:19 ` Patrick Steinhardt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
List information: http://vger.kernel.org/majordomo-info.html
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260409-b4-pks-writev-max-io-size-v1-1-81730e8f35df@pks.im \
--to=ps@pks.im \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=peff@peff.net \
--cc=phillip.wood@dunelm.org.uk \
--cc=randall.becker@nexbridge.ca \
--cc=sandals@crustytoothpaste.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://80x24.org/mirrors/git.git
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).