Date: Thu, 14 Mar 2024 04:58:48 -0500 (CDT)
From: Carl Edquist
To: Zachary Santer
cc: bug-bash, libc-alpha@sourceware.org
Subject: Re: Examples of concurrent coproc usage?
Message-ID: <88a67f36-2a56-a838-f763-f55b3073bb50@lando.namek.net>
References: <9831afe6-958a-fbd3-9434-05dd0c9b602a@draigBrady.com>
 <317fe0e2-8cf9-d4ac-ed56-e6ebcc2baa55@cs.wisc.edu>
 <8c490a55-598a-adf6-67c2-eb2a6099620a@cs.wisc.edu>
MIME-Version: 1.0
Content-Type: multipart/mixed; BOUNDARY="-1463761075-1181273492-1710188823=:8463"
Content-ID: <926e7c8-985d-70d1-54d3-c6af4396b54@lando.namek.net>
List-Id: Libc-alpha mailing list

This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.

---1463761075-1181273492-1710188823=:8463
Content-Type: text/plain; CHARSET=utf-8; format=flowed
Content-Transfer-Encoding: 8BIT

[My apologies up front for the length of this email. The short story is
I played around with the multi-coproc support: the fd closing seems to
work fine to prevent deadlock, but I found one bug apparently introduced
with multi-coproc support, and one other coproc bug that is not new.]

On Mon, 11 Mar 2024, Zachary Santer wrote:

> Was "RFE: enable buffering on null-terminated data"
>
> On Mon, Mar 11, 2024 at 7:54 AM Carl Edquist wrote:
>>
>> (Kind of a side-note ... bash's limited coprocess handling was a long
>> standing annoyance for me in the past, to the point that I wrote a bash
>> coprocess management library to handle multiple active coprocess and
>> give convenient methods for interaction. Perhaps the trickiest bit
>> about multiple coprocesses open at once (which I suspect is the reason
>> support was never added to bash) is that you don't want the second and
>> subsequent coprocesses to inherit the pipe fds of prior open
>> coprocesses.
>> This can result in deadlock if, for instance, you close
>> your write end to coproc1, but coproc1 continues to wait for input
>> because coproc2 also has a copy of a write end of the pipe to coproc1's
>> input. So you need to be smart about subsequent coprocesses first
>> closing all fds associated with other coprocesses.
>
> https://lists.gnu.org/archive/html/help-bash/2021-03/msg00296.html
> https://lists.gnu.org/archive/html/help-bash/2021-04/msg00136.html

Oh hey! Look at that. Thanks for the links to this thread - I gave them
a read (along with the old thread from 2011-04). I feel a little bad I
missed the 2021 discussion.

> You're on the money, though there is a preprocessor directive you can
> build bash with that will allow it to handle multiple concurrent
> coprocesses without complaining: MULTIPLE_COPROCS=1.

Who knew! Thanks for mentioning it. When I saw that "only one active
coprocess at a time" was _still_ listed in the bugs section in bash 5, I
figured multiple coprocess support had just been abandoned. Chet, that's
cool that you implemented it.

I kind of went all-out on my bash coprocess management library though
(mostly back in 2014-2016) ... It's pretty feature-rich and pleasant to
use -- to the point that I don't think there is any going back to bash's
internal coproc for me, even with multiple coprocess support. I
implemented it with shell functions, so it doesn't rely on compiling
anything or the latest version of bash being present. (I even added
bash3 support for older systems.)

> Chet Ramey's sticking point was that he hadn't seen coprocesses used
> enough in the wild to satisfactorily test that his implementation did in
> fact keep the coproc file descriptors out of subshells.

To be fair, coproc is kind of a niche feature. But I think more people
would play with it if it were less awkward to use and if they felt free
to experiment with multiple coprocs.

By the way, I agree with Chet's exact description of the problems here:

https://lists.gnu.org/archive/html/help-bash/2021-03/msg00282.html

The issue is separate from the stdio buffering discussion; the issue
here is with child processes (and I think not foreground subshells, but
specifically background processes, including coprocesses) inheriting the
shell's fds that are open to pipes connected to an active coprocess.

Not getting a SIGPIPE/write failure results in a coprocess sitting
around longer than it ought to, but it's not obvious (to me) how this
leads to deadlock, since the shell at least has closed its read end of
the pipe to that coprocess, so at least you aren't going to hang trying
to read from it.

On the other hand, a coprocess not seeing EOF will cause deadlock pretty
readily, especially if it processes all its input before producing
output (as with wc, sort, sha1sum). Trying to read from the coprocess
will hang indefinitely if the coprocess is still waiting for input,
which is the case if there is another copy of the write end of its input
pipe open somewhere.

> If you've got examples you can direct him to, I'd really appreciate it.

[My original use cases for multiple coprocesses were (1) for
programmatically interacting with multiple command-line database clients
together, and (2) for talking to multiple interactive command-line game
engines (othello) to play each other.

Perl's IPC::Open2 works, too, but it's easier to experiment on the fly
in bash.

And in general having the freedom to play with multiple coprocesses
helps mock up more complicated pipelines, or even webs of interconnected
processes.]

But you can create a deadlock without doing anything fancy.

Well, *without multi-coproc support*, here's a simple wc example; first
with a single coproc:

    $ coproc WC { wc; }
    $ exec {WC[1]}>&-
    $ read -u ${WC[0]} X
    $ echo $X
    0 0 0

This works as expected.
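
The EOF dependency described above can be reproduced in a script with a single coproc. The sketch below is an illustration only, not part of the original tests: it assumes bash 4.1 or later, and the variable names `stray`, `wc_in`, and `result` are made up. The shell itself holds an extra duplicate of the write end, standing in for the copy that a second coproc's shell would inherit, and `read -t` is used so the demonstration times out instead of hanging.

```shell
#!/usr/bin/env bash
coproc WC { wc; }

# Stand-in for a stray copy of the write end (for example, one inherited
# by another coproc's shell). Holding it in this shell is an assumption
# made to keep the demonstration deterministic.
exec {stray}>&"${WC[1]}"

# Close the shell's original write end, as in the interactive example.
wc_in=${WC[1]}
exec {wc_in}>&-

# wc still has a writer on its stdin, so it has not seen EOF and has
# printed nothing; a plain read would block here indefinitely.
if read -r -t 1 -u "${WC[0]}" X; then
    result="read: $X"
else
    result="timed out"
fi
echo "$result"

# Dropping the last copy of the write end finally lets wc see EOF.
exec {stray}>&-
```

The point of the timed read is only to make the hang observable; in the real failure there is no second step where the stray copy is closed, because it lives in another process.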

But if you try it with a second coproc (again, without multi-coproc
support), the second coproc will inherit copies of the shell's read and
write pipe fds to the first coproc, and the read will hang (as described
above), as the first coproc doesn't see EOF:

    $ coproc WC { wc; }
    $ coproc CAT { cat; }
    $ exec {WC[1]}>&-
    $ read -u ${WC[0]} X

    # HANGS

But, this can be observed even before attempting the read that hangs.

You can 'ps' to see the user shell (bash), the coprocs' shells (bash),
and the coprocs' commands (wc & cat). Then 'ls -l /proc/PID/fd/' to see
what they have open:

- The user shell has its copies of the read & write fds open for both
  coprocs (as it should)

- The coproc commands (wc & cat) each have only a single read & write
  pipe open, on fd 0 & 1 (as they should)

- The first coproc's shell (WC) has only a single read & write pipe
  open, on fd 0 & 1 (as it should)

- The second coproc's shell (CAT) has its own read & write pipes open,
  on fd 0 & 1 (good), but it also has a copy of the user shell's read &
  write pipe fds to the first coproc (WC) open (on fd 60 & 63 in this
  case, which it inherited when forking from the user shell)

(And in general, later coproc shells will have stray copies of the user
shell's r/w ends from all previous coprocs.)

So, you can examine the situation after setting up coprocs, to see if
all the coproc-related processes have just two pipes open (on fd 0 & 1).
If this is the case, I think that suffices to convince me anyway that no
deadlocks related to stray open fds can happen. But if any of them has
other pipes open (inherited from the user shell), that indicates the
problem.

I tried compiling the latest bash with MULTIPLE_COPROCS=1 (version
5.2.21(1)) to test out the multi-coproc support.
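
The manual check with 'ps' and 'ls -l /proc/PID/fd/' described above could be scripted roughly as follows. This is a sketch under a few assumptions: a Linux-style /proc, the `readlink` utility, and a made-up function name `stray_pipe_fds`. The optional second argument (an alternate root directory in place of /proc) is also an invention here, added only so the function can be exercised against a test directory. It reports any pipe on an fd above 2, which is a simplification of "only fd 0 & 1 should be pipes".

```shell
#!/usr/bin/env bash
# stray_pipe_fds PID [PROC_ROOT]
# Print "FD TARGET" for each fd above 2 of process PID that is a pipe.
# PROC_ROOT defaults to /proc; the parameter is hypothetical and exists
# only to make the function testable.
stray_pipe_fds() {
    local pid=$1 root=${2:-/proc} entry fd target
    for entry in "$root/$pid"/fd/*; do
        fd=${entry##*/}
        # Skip entries that vanished or are not symlinks.
        target=$(readlink "$entry") || continue
        if [[ $target == pipe:* && $fd -gt 2 ]]; then
            printf '%s %s\n' "$fd" "$target"
        fi
    done
}
```

For a coproc named CAT, `stray_pipe_fds "$CAT_PID"` inspects the coproc's shell; empty output means no extra pipes were found. The function is meant for coproc shells and commands: the user shell legitimately holds pipes on high fds and would always be listed.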

I tried standing up the above WC and CAT coprocs, together with some
others to check that the behavior looked ok for pipelines also (which I
think was one of Chet's concerns):

    $ coproc WC { wc; }
    $ coproc CAT { cat; }
    $ coproc CAT3 { cat | cat | cat; }
    $ coproc CAT4 { cat | cat | cat | cat; }
    $ coproc CATX { cat ; }

And as far as the fd situation, everything checks out: the user shell
has fds open to all the coprocs, and the coproc shells & coproc commands
(including all the cat's in the pipelines) have only a single read &
write pipe open on fd 0 & 1. So, the multi-coproc code seems to be
closing the shell's copies correctly.

[The examples are boring, but their point is just to investigate the
stray-fd question.]

HOWEVER!!!

Unexpectedly, the new multi-coproc code seems to close the user shell's
end of a coprocess's pipes, once the coprocess has terminated. When
compiled with MULTIPLE_COPROCS=1, this is true even if there is only a
single coproc:

    $ coproc WC { wc; }
    $ exec {WC[1]}>&-
    [1]+  Done                    coproc WC { wc; }

    # WC var gets cleared!!
    # shell's ${WC[0]} is also closed!

    # now, can't do:

    $ read -u ${WC[0]} X
    $ echo $X

I'm attaching a "bad-coproc-log.txt" with more detailed ps & ls output
examining the open fds at each step, to make it clear what's happening.

This is a bug. The shell should not automatically close its read pipe
to a coprocess that has terminated -- it should stay open to read the
final output, and the user should be responsible for closing the read
end explicitly.

This is more obvious for commands that wait until they see EOF before
generating any output (wc, sort, sha1sum). But it's also true for any
command that produces output (filters (sed) or generators (ls)). If the
shell's read end is closed automatically, any final output waiting in
the pipe will be discarded.

It also invites trouble if the shell variable that holds the fds gets
removed unexpectedly when the coprocess terminates. (Suddenly the
variable expands to an empty string.)
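
One way a script might guard against the automatic close described above is to duplicate the read end before closing the write end. This is only a workaround sketch, not something the bash documentation promises: it assumes bash 4.1 or later and assumes the shell closes only the descriptors it recorded in the coproc array, leaving ordinary duplicates alone. The names `wc_in` and `wc_out` are made up for illustration.

```shell
#!/usr/bin/env bash
coproc WC { wc; }

# Take a private duplicate of the read end right away, and remember the
# write fd number, while the WC array is still guaranteed to be set.
exec {wc_out}<&"${WC[0]}"
wc_in=${WC[1]}

printf 'one two\nthree\n' >&"$wc_in"
exec {wc_in}>&-        # wc sees EOF, prints its counts, and exits

sleep 1                # give the shell time to notice the exit

# Even if ${WC[0]} has been closed and WC cleared by now, the duplicate
# keeps the pipe's read side open, so the final output is still there.
read -r -u "$wc_out" lines words bytes
echo "$lines $words $bytes"

exec {wc_out}<&-       # close the read end explicitly when done
```

The design choice here is simply to keep the fds in variables the script owns, so the read end is closed only when the script says so.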

It seems to me that the proper time to clear the coproc variable (if at
all) is after the user has explicitly closed both of the fds. *Or* else
add an option to the coproc keyword to explicitly close the coproc -
which will close both fds and clear the variable.

...

Separately, I consider the following coproc behavior to be weird,
fragile, and broken.

If you fg a coproc, then stop and bg it, it dies. Why? Apparently the
shell abandons the coproc when it is stopped, closes the pipe fds for
it, and clears the fd variable.

    $ coproc CAT { cat; }
    [1] 10391

    $ fg
    coproc CAT { cat; }

    # oops!

    ^Z
    [1]+  Stopped                 coproc CAT { cat; }

    $ echo ${CAT[@]}    # what happened to the fds?

    $ ls -lgo /proc/$$/fd/
    total 0
    lrwx------ 1 64 Mar 14 02:26 0 -> /dev/pts/3
    lrwx------ 1 64 Mar 14 02:26 1 -> /dev/pts/3
    lrwx------ 1 64 Mar 14 02:25 2 -> /dev/pts/3
    lrwx------ 1 64 Mar 14 02:26 255 -> /dev/pts/3

    $ bg
    [1]+ coproc CAT { cat; } &

    $
    [1]+  Done                    coproc CAT { cat; }

    $ # sad user :(

This behavior is not new to the multi-coproc support. But just the same
it seems broken for the shell to automatically close the fds to
coprocesses. That should be done explicitly by the user.

>> Word to the wise: you might encounter this issue (coproc2 prevents
>> coproc1 from seeing its end-of-input) even though you are rigging this
>> up yourself with FIFOs rather than bash's coproc builtin.)
>
> In my case, it's mostly a non-issue, because I fork the - now three -
> background processes before exec'ing automatic fds redirecting to/from
> their FIFO's in the parent process. All the automatic fds get put in an
> array, and I do close them all at the beginning of a subsequent process
> substitution.

That's a nice trick with the shell backgrounding all the coprocesses
before connecting the fifos. But yeah, to make subsequent coprocesses
you do still have to close the copy of the user shell's fds that the
coprocess shell inherits.
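
The two ideas in this exchange (fork the worker before opening fds in the parent, and have any later worker close its inherited copies first) can be sketched with FIFOs as below. This is not Zachary's actual code; the directory layout and the names `wc_in`, `wc_out`, `cat_in`, and `cat_out` are hypothetical, and bash 4.1 or later is assumed for the `{var}` redirections.

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
mkfifo "$dir/wc.in" "$dir/wc.out" "$dir/cat.in" "$dir/cat.out"

# First worker: forked before the parent opens any fds, so it inherits
# nothing extra.
wc <"$dir/wc.in" >"$dir/wc.out" &
exec {wc_in}>"$dir/wc.in" {wc_out}<"$dir/wc.out"

# Second worker: forked while wc_in/wc_out are open in the parent, so it
# closes its inherited copies before running its command.
{ exec {wc_in}>&- {wc_out}<&-; exec cat; } <"$dir/cat.in" >"$dir/cat.out" &
exec {cat_in}>"$dir/cat.in" {cat_out}<"$dir/cat.out"

printf 'one two\nthree\n' >&"$wc_in"
exec {wc_in}>&-        # last write end once the second worker drops its copy

# The timeout is only a safety net for the sketch; without stray copies
# wc answers as soon as it sees EOF.
read -r -t 5 -u "$wc_out" lines words bytes
echo "$lines $words $bytes"

# Close everything explicitly, let the workers finish, and clean up.
exec {wc_out}<&- {cat_in}>&- {cat_out}<&-
wait
rm -r "$dir"
```

If the explicit close in the second worker is left out, cat keeps a copy of the write end of wc's input FIFO and the read only returns by timing out, which is the stray-fd problem in miniature.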

It sounds like you are doing that (nice!), but in any case it requires
some care, and as these stack up it is really handy to have something
manage it all for you.

(Perhaps this is where I ask if you are happy with your solution or if
you would like to try out something wildly more flexible...)

Happy coprocessing! :)

Carl

---1463761075-1181273492-1710188823=:8463
Content-Type: text/plain; CHARSET=US-ASCII; NAME=bad-coproc-log.txt
Content-Transfer-Encoding: 7BIT
Content-ID: <5d273f95-7f65-e1d3-d68b-5587e1bca915@cs.wisc.edu>
Content-Disposition: ATTACHMENT; FILENAME=bad-coproc-log.txt

$ coproc WC { wc; }
[1] 10038

$ ps
  PID TTY          TIME CMD
 9926 pts/3    00:00:00 bash
10038 pts/3    00:00:00 bash
10039 pts/3    00:00:00 wc
10040 pts/3    00:00:00 ps

$ ls -lgo /proc/{$$,10038,10039}/fd/
/proc/10038/fd/:
total 0
lr-x------ 1 64 Mar 14 02:29 0 -> pipe:[81214]
l-wx------ 1 64 Mar 14 02:29 1 -> pipe:[81213]
lrwx------ 1 64 Mar 14 02:28 2 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:29 255 -> /dev/pts/3

/proc/10039/fd/:
total 0
lr-x------ 1 64 Mar 14 02:29 0 -> pipe:[81214]
l-wx------ 1 64 Mar 14 02:29 1 -> pipe:[81213]
lrwx------ 1 64 Mar 14 02:28 2 -> /dev/pts/3

/proc/9926/fd/:
total 0
lrwx------ 1 64 Mar 14 02:26 0 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:26 1 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:25 2 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:26 255 -> /dev/pts/3
l-wx------ 1 64 Mar 14 02:26 60 -> pipe:[81214]
lr-x------ 1 64 Mar 14 02:26 63 -> pipe:[81213]

$ echo ${WC[@]}
63 60

$ exec {WC[1]}>&-
[1]+  Done                    coproc WC { wc; }

$ ps
  PID TTY          TIME CMD
 9926 pts/3    00:00:00 bash
10042 pts/3    00:00:00 ps

$ echo ${WC[@]}

$ ls -lgo /proc/$$/fd/
total 0
lrwx------ 1 64 Mar 14 02:26 0 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:26 1 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:25 2 -> /dev/pts/3
lrwx------ 1 64 Mar 14 02:26 255 -> /dev/pts/3

---1463761075-1181273492-1710188823=:8463--