git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: "René Scharfe" <l.s.r@web.de>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: Re: [PATCH] mingw: handle writes to non-blocking pipe
Date: Sun, 14 Aug 2022 17:37:08 +0200	[thread overview]
Message-ID: <c7c6524c-4f02-10f6-1a58-738cef5aecf2@web.de> (raw)
In-Reply-To: <YvVIYyA8Js0WDAMc@coredump.intra.peff.net>

Am 11.08.2022 um 20:20 schrieb Jeff King:
> On Thu, Aug 11, 2022 at 07:35:33PM +0200, René Scharfe wrote:
>
>> OK, but we can't just split anything up as we like: POSIX wants us to
>> keep writes up to PIPE_BUF atomic.  When I read that name I mistakenly
>> thought it was the size of the pipe buffer, but it's a different value.
>> The minimum value according to POSIX is 512 bytes; on Linux it's 4KB.
>>
>> And Windows doesn't seem to define it.  Bash's ulimit -p returns 8,
>> which is in units of 512 bytes, so it's 4KB like on Linux.  But that's
>> apparently not respected by Windows' write.
>>
>> I just realized that we can interprete the situation slightly
>> differently.  Windows has effectively PIPE_BUF = 2^32, which means all
>> writes are atomic.  I don't see POSIX specifying a maximum allowed
>> value, so this must be allowed.  Which means you cannot rely on write
>> performing partial writes.  Makes sense?
>
> Hmm, I'm not sure. POSIX does allow writing a single byte if that's all
> the space there is, but only if the original _request_ was for more than
> PIPE_BUF. Which I guess matches what you're saying.
>
> If the original request is smaller than PIPE_BUF, it does seem to say
> that EAGAIN is the correct output. But it also says:
>
>   There is no exception regarding partial writes when O_NONBLOCK is set.
>   With the exception of writing to an empty pipe, this volume of
>   POSIX.1-2017 does not specify exactly when a partial write is
>   performed since that would require specifying internal details of the
>   implementation. Every application should be prepared to handle partial
>   writes when O_NONBLOCK is set and the requested amount is greater than
>   {PIPE_BUF}, just as every application should be prepared to handle
>   partial writes on other kinds of file descriptors.
>
>   The intent of forcing writing at least one byte if any can be written
>   is to assure that each write makes progress if there is any room in
>   the pipe. If the pipe is empty, {PIPE_BUF} bytes must be written; if
>   not, at least some progress must have been made.
>
> So they are clearly aware of the "we need to make forward progress"
> problem. But how are you supposed to do that if you only have less than
> PIPE_BUF bytes left to write? And likewise, even if it is technically
> legal, having a PIPE_BUF of 2^32 seems like a quality-of-implementation
> issue.
>
> And that doesn't really match what poll() is saying. The response from
> poll() told us we could write without blocking, which implies at least
> PIPE_BUF bytes available. But clearly it isn't available, since the
> write fails (with EAGAIN, but that is equivalent to blocking in POSIX's
> view here).

Turns out that's not the case on Windows: 94f4d01932 (mingw: workaround
for hangs when sending STDIN, 2020-02-17) changed the compatibility
implementation to 'Make `poll()` always reply "writable" for write end
of the pipe.'.

An updated version of the test helper confirms it (patch below) by
reporting on Windows:

chunk size: 1 bytes
 EAGAIN after 8192 bytes
chunk size: 500 bytes
 EAGAIN after 8000 bytes
chunk size: 1000 bytes
 EAGAIN after 8000 bytes
chunk size: 1024 bytes
 EAGAIN after 8192 bytes
chunk size: 100000 bytes
 partial write of 8192 bytes after 0 bytes
 EAGAIN after 8192 bytes

On Debian I get this instead:

chunk size: 1 bytes
 POLLOUT gone after 61441 bytes
 EAGAIN after 65536 bytes
chunk size: 500 bytes
 POLLOUT gone after 60500 bytes
 EAGAIN after 64000 bytes
chunk size: 1000 bytes
 POLLOUT gone after 61000 bytes
 EAGAIN after 64000 bytes
chunk size: 1024 bytes
 POLLOUT gone after 62464 bytes
 EAGAIN after 65536 bytes
chunk size: 100000 bytes
 partial write of 65536 bytes after 0 bytes
 POLLOUT gone after 65536 bytes
 EAGAIN after 65536 bytes

So on Windows the POLLOUT bit is indeed never unset.

> Lawyering aside, I think it would be OK to move forward with cutting up
> writes at least to a size of 512 bytes. Git runs on lots of platforms,
> and none of the code can make an assumption that PIPE_BUF is larger than
> that. I.e., we are reducing atomicity provided by Windows, but that's
> OK.
>
> I don't think that solves our problem fully, though. We might need to
> write 5 bytes, and telling mingw_write() to do so means it's supposed to
> abide by PIPE_BUF conventions. But again, we are in control of the
> calling code here. I don't think there's any reason that we care about
> PIPE_BUF atomicity. Certainly we don't get those atomicity guarantees on
> the socket side, which is where much of our writing happens, and our
> code is prepared to handle partial writes of any size. So we could just
> ignore that guarantee here.
>
>> If we accept that, then we need a special write function for
>> non-blocking pipes that chops the data into small enough chunks.
>
> I'm not sure what "small enough" we can rely on, though. Really it is
> the interplay between poll() and write() that we care about here. We
> would like to know at what point poll() will tell us it's OK to write().
> But we don't know what the OS thinks of that.

Based on the output above I think Linux' poll() won't consider a pipe
writable that has less than PIPE_BUF (4096) available bytes.

> Or maybe we do, since I think poll() is our own compat layer. But it
> just seems to be based on select(). We do seem to use PeekNamedPipe()
> there to look on the reading side, but I don't know if there's an
> equivalent for writing.

This has been elusive so far (see 94f4d01932).

Perhaps we should take the advice about PIPE_NOWAIT in the docs serious
and use overlapping (asynchronous) writes on Windows instead.  This
would mean reimplementing the whole pipe_command() with Windows API
commands, I imagine.

>> Another oddity: t3701 with yesterday's patch finishes fine with -i, but
>> hangs without it (not just when running it via prove).  O_o
>
> Yes, definitely strange. I'd expect weirdness with "-v", for example,
> because of terminal stuff, but "-i" should have no impact on the running
> environment.
It's especially grating because test runs also take very looong.

Avoiding xwrite() in pump_io_round() on top lets the test suite
finish successfully.

René


---
 Makefile                 |  1 +
 t/helper/test-nonblock.c | 73 ++++++++++++++++++++++++++++++++++++++++
 t/helper/test-tool.c     |  1 +
 t/helper/test-tool.h     |  1 +
 4 files changed, 76 insertions(+)
 create mode 100644 t/helper/test-nonblock.c

diff --git a/Makefile b/Makefile
index d9c00cc05d..0bc028ca00 100644
--- a/Makefile
+++ b/Makefile
@@ -751,6 +751,7 @@ TEST_BUILTINS_OBJS += test-lazy-init-name-hash.o
 TEST_BUILTINS_OBJS += test-match-trees.o
 TEST_BUILTINS_OBJS += test-mergesort.o
 TEST_BUILTINS_OBJS += test-mktemp.o
+TEST_BUILTINS_OBJS += test-nonblock.o
 TEST_BUILTINS_OBJS += test-oid-array.o
 TEST_BUILTINS_OBJS += test-oidmap.o
 TEST_BUILTINS_OBJS += test-oidtree.o
diff --git a/t/helper/test-nonblock.c b/t/helper/test-nonblock.c
new file mode 100644
index 0000000000..c9350ce894
--- /dev/null
+++ b/t/helper/test-nonblock.c
@@ -0,0 +1,73 @@
+#include "test-tool.h"
+#include "compat/nonblock.h"
+
+static void fill_pipe(size_t write_len)
+{
+	void *buf = xcalloc(1, write_len);
+	int fds[2];
+	size_t total_written = 0;
+	int last = 0;
+	struct pollfd pfd;
+	int writable = 1;
+
+	if (pipe(fds))
+		die_errno("pipe failed");
+	if (enable_nonblock(fds[1]))
+		die_errno("enable_nonblock failed");
+	pfd.fd = fds[1];
+	pfd.events = POLLOUT;
+
+	printf("chunk size: %"PRIuMAX" bytes\n", write_len);
+	for (;;) {
+		ssize_t written;
+		if (poll(&pfd, 1, 0) < 0) {
+			if (errno == EINTR)
+				continue;
+			die_errno("poll failed");
+		}
+		if (writable && !(pfd.revents & POLLOUT)) {
+			writable = 0;
+			printf(" POLLOUT gone after %"PRIuMAX" bytes\n",
+			       total_written);
+		}
+		written = write(fds[1], buf, write_len);
+		if (written < 0) {
+			switch (errno) {
+			case EAGAIN:
+				printf(" EAGAIN after %"PRIuMAX" bytes\n",
+				       total_written);
+				break;
+			case ENOSPC:
+				printf(" ENOSPC after %"PRIuMAX" bytes\n",
+				       total_written);
+				break;
+			default:
+				printf(" errno %d after %"PRIuMAX" bytes\n",
+				       errno, total_written);
+			}
+			break;
+		} else if (written != write_len)
+			printf(" partial write of %"PRIuMAX" bytes"
+			       " after %"PRIuMAX" bytes\n",
+			       (uintmax_t)written, total_written);
+		if (last)
+			break;
+		if (written > 0)
+			total_written += written;
+		last = !written;
+	};
+
+	close(fds[1]);
+	close(fds[0]);
+	free(buf);
+}
+
+int cmd__nonblock(int argc, const char **argv)
+{
+	fill_pipe(1);
+	fill_pipe(500);
+	fill_pipe(1000);
+	fill_pipe(1024);
+	fill_pipe(100000);
+	return 0;
+}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 318fdbab0c..562d7a9161 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -45,6 +45,7 @@ static struct test_cmd cmds[] = {
 	{ "match-trees", cmd__match_trees },
 	{ "mergesort", cmd__mergesort },
 	{ "mktemp", cmd__mktemp },
+	{ "nonblock", cmd__nonblock },
 	{ "oid-array", cmd__oid_array },
 	{ "oidmap", cmd__oidmap },
 	{ "oidtree", cmd__oidtree },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index bb79927163..d9006a5298 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -36,6 +36,7 @@ int cmd__lazy_init_name_hash(int argc, const char **argv);
 int cmd__match_trees(int argc, const char **argv);
 int cmd__mergesort(int argc, const char **argv);
 int cmd__mktemp(int argc, const char **argv);
+int cmd__nonblock(int argc, const char **argv);
 int cmd__oidmap(int argc, const char **argv);
 int cmd__oidtree(int argc, const char **argv);
 int cmd__online_cpus(int argc, const char **argv);
--
2.37.1.windows.1

  reply	other threads:[~2022-08-14 15:58 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-02  4:13 [RFC/PATCH] pipe_command(): mark stdin descriptor as non-blocking Jeff King
2022-08-02 15:04 ` Junio C Hamano
2022-08-02 15:39 ` Jeff King
2022-08-02 16:16   ` Junio C Hamano
2022-08-03  3:53     ` [PATCH v2] " Jeff King
2022-08-03 16:45       ` René Scharfe
2022-08-03 17:20         ` Jeff King
2022-08-03 21:56           ` René Scharfe
2022-08-05 15:36             ` Jeff King
2022-08-05 21:13               ` René Scharfe
2022-08-07 10:15                 ` René Scharfe
2022-08-07 17:41                   ` Jeff King
2022-08-10  5:39                     ` René Scharfe
2022-08-10 19:53                       ` Jeff King
2022-08-10 22:35                         ` René Scharfe
2022-08-11  8:52                           ` Jeff King
2022-08-10  5:39                     ` [PATCH] mingw: handle writes to non-blocking pipe René Scharfe
2022-08-10  9:07                       ` Johannes Schindelin
2022-08-10 20:02                       ` Jeff King
2022-08-10 22:34                         ` René Scharfe
2022-08-11  8:47                           ` Jeff King
2022-08-11 17:35                             ` René Scharfe
2022-08-11 18:20                               ` Jeff King
2022-08-14 15:37                                 ` René Scharfe [this message]
2022-08-17  5:39                                   ` Jeff King
2022-08-17  6:04                                     ` [PATCH v2 0/6] fix pipe_command() deadlock Jeff King
2022-08-17  6:04                                       ` [PATCH v2 1/6] compat: add function to enable nonblocking pipes Jeff King
2022-08-17 20:23                                         ` Junio C Hamano
2022-08-18  5:41                                           ` Jeff King
2022-08-17  6:05                                       ` [PATCH v2 2/6] nonblock: support Windows Jeff King
2022-08-17  6:06                                       ` [PATCH v2 3/6] git-compat-util: make MAX_IO_SIZE define globally available Jeff King
2022-08-17  6:08                                       ` [PATCH v2 4/6] pipe_command(): avoid xwrite() for writing to pipe Jeff King
2022-08-17  6:09                                       ` [PATCH v2 5/6] pipe_command(): handle ENOSPC when writing to a pipe Jeff King
2022-08-17 18:57                                         ` Junio C Hamano
2022-08-18  5:38                                           ` Jeff King
2022-08-17  6:10                                       ` [PATCH v2 6/6] pipe_command(): mark stdin descriptor as non-blocking Jeff King
2022-08-17  6:20                                       ` [PATCH v2 0/6] fix pipe_command() deadlock Jeff King
2022-08-19 21:19                                       ` René Scharfe
2022-08-20  7:04                                         ` Jeff King
2022-08-07 10:14               ` [PATCH v2] pipe_command(): mark stdin descriptor as non-blocking René Scharfe
2022-08-08 12:55                 ` Johannes Schindelin
2022-08-08 12:59       ` Johannes Schindelin
2022-08-09 13:04         ` Jeff King
2022-08-09 22:10           ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c7c6524c-4f02-10f6-1a58-738cef5aecf2@web.de \
    --to=l.s.r@web.de \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).