git@vger.kernel.org mailing list mirror (one of many)
 help / color / mirror / code / Atom feed
From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Johannes Schindelin <Johannes.Schindelin@gmx.de>
Subject: [PATCH v2] pipe_command(): mark stdin descriptor as non-blocking
Date: Tue, 2 Aug 2022 23:53:00 -0400	[thread overview]
Message-ID: <YunxHOa2sJeEpJxd@coredump.intra.peff.net> (raw)
In-Reply-To: <xmqq5yjahb8u.fsf@gitster.g>

On Tue, Aug 02, 2022 at 09:16:17AM -0700, Junio C Hamano wrote:

> > But I'm not sure what should go on the Windows side of that #ifdef.
> > Unlike some other spots, I don't think we can just make it a noop, or
> > Windows will be subject to the same deadlock (unless for some reason its
> > write() does behave differently?).
> 
> Let them deadlock so that they can fix it, and until then leave it a
> noop?  That may break the CI tests for them so we could hide the
> known-to-be-broken test behind prerequisite to buy them time,
> perhaps?

I was skeptical of including a test, but I was able to come up with a
reproduction that triggers for me on Linux, but isn't too expensive to
keep around.

Much to my surprise (and delight), it passes on the Windows CI with the
noop! So I guess their pipes behave in the way we want by default.

It does feel like a bit of a landmine to have an enable_nonblock()
function which might do nothing. I'm not sure if other descriptor types
might behave differently on Windows. But hopefully the comment above the
function is sufficient to make people think twice.

-- >8 --
Subject: [PATCH] pipe_command(): mark stdin descriptor as non-blocking

Our pipe_command() helper lets you both write to and read from a child
process on its stdin/stdout. It's supposed to work without deadlocks
because we use poll() to check when descriptors are ready for reading or
writing. But there's a bug: if both the data to be written and the data
to be read back exceed the pipe buffer, we'll deadlock.

The issue is that the code assumes that if you have, say, a 2MB buffer
to write and poll() tells you that the pipe descriptor is ready for
writing, that calling:

  write(cmd->in, buf, 2*1024*1024);

will do a partial write, filling the pipe buffer and then returning what
it did write. And that is what it would do on a socket, but not for a
pipe. When writing to a pipe, at least on Linux, it will block waiting
for the child process to read() more. And now we have a potential
deadlock, because the child may be writing back to us, waiting for us to
read() ourselves.

An easy way to trigger this is:

  git -c add.interactive.useBuiltin=true \
      -c interactive.diffFilter=cat \
      checkout -p HEAD~200

The diff against HEAD~200 will be big, and the filter wants to write all
of it back to us (obviously this is a dummy filter, but in the real
world something like diff-highlight would similarly stream back a big
output).

If you set add.interactive.useBuiltin to false, the problem goes away,
because now we're not using pipe_command() anymore (instead, that part
happens in perl). But this isn't a bug in the interactive code at all.
It's the underlying pipe_command() code which is broken, and has been
all along.

We presumably didn't notice because most calls only do input _or_
output, not both. And the few that do both, like gpg calls, may have
large inputs or outputs, but never both at the same time (e.g., consider
signing, which has a large payload but a small signature comes back).

The obvious fix is to put the descriptor into non-blocking mode, and
indeed, that makes the problem go away. Callers shouldn't need to
care, because they never see the descriptor (they hand us a buffer to
feed into it).

Unfortunately O_NONBLOCK isn't available everywhere, especially on
Windows. However, the included test seems to work fine there, which
implies that pipes there do not behave in the same way (they will do the
partial write by default, which is what we want). This is true even if I
size up the diff for a larger pipe buffer (the value chosen here
triggers the deadlock on Linux, but isn't too expensive to keep as a
regression test).

Signed-off-by: Jeff King <peff@peff.net>
---
 Makefile                   |  1 +
 compat/nonblock.c          | 22 ++++++++++++++++++++++
 compat/nonblock.h          | 10 ++++++++++
 run-command.c              | 10 ++++++++++
 t/t3701-add-interactive.sh | 14 ++++++++++++++
 5 files changed, 57 insertions(+)
 create mode 100644 compat/nonblock.c
 create mode 100644 compat/nonblock.h

diff --git a/Makefile b/Makefile
index 1624471bad..78bedf26e0 100644
--- a/Makefile
+++ b/Makefile
@@ -918,6 +918,7 @@ LIB_OBJS += combine-diff.o
 LIB_OBJS += commit-graph.o
 LIB_OBJS += commit-reach.o
 LIB_OBJS += commit.o
+LIB_OBJS += compat/nonblock.o
 LIB_OBJS += compat/obstack.o
 LIB_OBJS += compat/terminal.o
 LIB_OBJS += compat/zlib-uncompress2.o
diff --git a/compat/nonblock.c b/compat/nonblock.c
new file mode 100644
index 0000000000..897c099010
--- /dev/null
+++ b/compat/nonblock.c
@@ -0,0 +1,22 @@
+#include "git-compat-util.h"
+#include "nonblock.h"
+
+#ifdef O_NONBLOCK
+
+int enable_nonblock(int fd)
+{
+	int flags = fcntl(fd, F_GETFL);
+	if (flags < 0)
+		return -1;
+	flags |= O_NONBLOCK;
+	return fcntl(fd, F_SETFL, flags);
+}
+
+#else
+
+int enable_nonblock(int fd)
+{
+	return 0;
+}
+
+#endif
diff --git a/compat/nonblock.h b/compat/nonblock.h
new file mode 100644
index 0000000000..2721f72187
--- /dev/null
+++ b/compat/nonblock.h
@@ -0,0 +1,10 @@
+#ifndef COMPAT_NONBLOCK_H
+#define COMPAT_NONBLOCK_H
+
+/*
+ * Enable non-blocking I/O for the passed-in descriptor. Note that this is a
+ * noop on systems without O_NONBLOCK, like Windows! Use with caution.
+ */
+int enable_nonblock(int fd);
+
+#endif
diff --git a/run-command.c b/run-command.c
index 14f17830f5..ed99503b22 100644
--- a/run-command.c
+++ b/run-command.c
@@ -10,6 +10,7 @@
 #include "config.h"
 #include "packfile.h"
 #include "hook.h"
+#include "compat/nonblock.h"
 
 void child_process_init(struct child_process *child)
 {
@@ -1438,6 +1439,15 @@ int pipe_command(struct child_process *cmd,
 		return -1;
 
 	if (in) {
+		if (enable_nonblock(cmd->in) < 0) {
+			error_errno("unable to make pipe non-blocking");
+			close(cmd->in);
+			if (out)
+				close(cmd->out);
+			if (err)
+				close(cmd->err);
+			return -1;
+		}
 		io[nr].fd = cmd->in;
 		io[nr].type = POLLOUT;
 		io[nr].u.out.buf = in;
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index b354fb39de..01d6ce9c83 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -766,6 +766,20 @@ test_expect_success 'detect bogus diffFilter output' '
 	force_color test_must_fail git add -p <y
 '
 
+test_expect_success 'handle very large filtered diff' '
+	git reset --hard &&
+	# The specific number here is not important, but it must
+	# be large enough that the output of "git diff --color"
+	# fills up the pipe buffer. 10,000 results in ~200k of
+	# colored output.
+	test_seq 10000 >test &&
+	false &&
+	test_config interactive.diffFilter cat &&
+	printf y >y &&
+	force_color git add -p >output 2>&1 <y &&
+	git diff-files --exit-code -- test
+'
+
 test_expect_success 'diff.algorithm is passed to `git diff-files`' '
 	git reset --hard &&
 
-- 
2.37.1.810.g8c5f98b46b



  reply	other threads:[~2022-08-03  3:53 UTC|newest]

Thread overview: 44+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-02  4:13 [RFC/PATCH] pipe_command(): mark stdin descriptor as non-blocking Jeff King
2022-08-02 15:04 ` Junio C Hamano
2022-08-02 15:39 ` Jeff King
2022-08-02 16:16   ` Junio C Hamano
2022-08-03  3:53     ` Jeff King [this message]
2022-08-03 16:45       ` [PATCH v2] " René Scharfe
2022-08-03 17:20         ` Jeff King
2022-08-03 21:56           ` René Scharfe
2022-08-05 15:36             ` Jeff King
2022-08-05 21:13               ` René Scharfe
2022-08-07 10:15                 ` René Scharfe
2022-08-07 17:41                   ` Jeff King
2022-08-10  5:39                     ` René Scharfe
2022-08-10 19:53                       ` Jeff King
2022-08-10 22:35                         ` René Scharfe
2022-08-11  8:52                           ` Jeff King
2022-08-10  5:39                     ` [PATCH] mingw: handle writes to non-blocking pipe René Scharfe
2022-08-10  9:07                       ` Johannes Schindelin
2022-08-10 20:02                       ` Jeff King
2022-08-10 22:34                         ` René Scharfe
2022-08-11  8:47                           ` Jeff King
2022-08-11 17:35                             ` René Scharfe
2022-08-11 18:20                               ` Jeff King
2022-08-14 15:37                                 ` René Scharfe
2022-08-17  5:39                                   ` Jeff King
2022-08-17  6:04                                     ` [PATCH v2 0/6] fix pipe_command() deadlock Jeff King
2022-08-17  6:04                                       ` [PATCH v2 1/6] compat: add function to enable nonblocking pipes Jeff King
2022-08-17 20:23                                         ` Junio C Hamano
2022-08-18  5:41                                           ` Jeff King
2022-08-17  6:05                                       ` [PATCH v2 2/6] nonblock: support Windows Jeff King
2022-08-17  6:06                                       ` [PATCH v2 3/6] git-compat-util: make MAX_IO_SIZE define globally available Jeff King
2022-08-17  6:08                                       ` [PATCH v2 4/6] pipe_command(): avoid xwrite() for writing to pipe Jeff King
2022-08-17  6:09                                       ` [PATCH v2 5/6] pipe_command(): handle ENOSPC when writing to a pipe Jeff King
2022-08-17 18:57                                         ` Junio C Hamano
2022-08-18  5:38                                           ` Jeff King
2022-08-17  6:10                                       ` [PATCH v2 6/6] pipe_command(): mark stdin descriptor as non-blocking Jeff King
2022-08-17  6:20                                       ` [PATCH v2 0/6] fix pipe_command() deadlock Jeff King
2022-08-19 21:19                                       ` René Scharfe
2022-08-20  7:04                                         ` Jeff King
2022-08-07 10:14               ` [PATCH v2] pipe_command(): mark stdin descriptor as non-blocking René Scharfe
2022-08-08 12:55                 ` Johannes Schindelin
2022-08-08 12:59       ` Johannes Schindelin
2022-08-09 13:04         ` Jeff King
2022-08-09 22:10           ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: http://vger.kernel.org/majordomo-info.html

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YunxHOa2sJeEpJxd@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=Johannes.Schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://80x24.org/mirrors/git.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).