unofficial mirror of libc-alpha@sourceware.org
From: Carl Edquist <edquist@cs.wisc.edu>
To: Chet Ramey <chet.ramey@case.edu>,
	Martin D Kealey <martin@kurahaupo.gen.nz>
Cc: Zachary Santer <zsanter@gmail.com>, bug-bash <bug-bash@gnu.org>,
	libc-alpha@sourceware.org
Subject: Re: Examples of concurrent coproc usage?
Date: Tue, 9 Apr 2024 10:58:41 -0500 (CDT)
Message-ID: <e56a7b66-f015-d33c-dbd7-70ce4c71fdd8@cs.wisc.edu>
In-Reply-To: <b6ee7832-e927-4817-9582-11de52797b3e@case.edu>

On 4/4/24 7:23 PM, Martin D Kealey wrote:

> I'm somewhat uneasy about having coprocs inaccessible to each other. I 
> can foresee reasonable cases where I'd want a coproc to utilize one or 
> more other coprocs.
>
> In particular, I can see cases where a coproc is written to by one 
> process, and read from by another.
>
> Can we at least have the auto-close behaviour be made optional, so that 
> it can be turned off when we want to do something more sophisticated?

With support for multiple coprocs, auto-closing the fds to other coprocs 
when creating new ones is important in order to avoid deadlocks.

But if you're willing to take on management of those coproc fds yourself, 
you can expose them to new coprocs by making your own copies with exec 
redirections.

This only "kind of" works, though, because for some reason bash seems to
close all pipe fds for external commands in coprocs, even the ones that the
user explicitly copies with exec redirections.

(More on that in a bit.)


On Mon, 8 Apr 2024, Chet Ramey wrote:

> On 4/4/24 7:23 PM, Martin D Kealey wrote:
>> I'm somewhat uneasy about having coprocs inaccessible to each other. I 
>> can foresee reasonable cases where I'd want a coproc to utilize one or 
>> more other coprocs.
>
> That's not the intended purpose,

Just a bit of levity here - I can picture Doc from Back to the Future
exclaiming, "Marty, it's perfect!  You're just not thinking 4th
dimensionally!"

> so I don't think not fixing a bug to accommodate some future 
> hypothetical use case is a good idea. That's why there's a warning 
> message when you try to use more than one coproc -- the shell doesn't 
> keep track of more than one.
>
> If you want two processes to communicate (really three), you might want
> to build with the multiple coproc support and use the shell as the
> arbiter.


For what it's worth, my experience is that coprocesses in bash (rigged up 
by means other than the coproc keyword) become very fun and interesting 
when you allow for the possibility of communication between coprocesses. 
(Most of my use cases for coprocesses fall under this category, actually.)

The most basic commands for tying multiple coprocesses together are tee(1) 
and paste(1), for writing to or reading from multiple coprocesses at once.

You can do this already with process substitutions like

 	tee >(cmd1) >(cmd2)

 	paste <(cmd3) <(cmd4)
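
As a concrete instance of the paste form, here is a minimal sketch.  The two
producers are made-up stand-ins for cmd3 and cmd4, and `merged` is only an
illustrative variable name.

```shell
#!/usr/bin/env bash
# Sketch: paste(1) reads one line from each process substitution in turn
# and joins them with a tab.  Both producers are placeholders.
merged=$(paste <(printf '%s\n' one two | tr a-z A-Z) \
               <(printf '%s\n' 1 2))
printf '%s\n' "$merged"
```

Each output line pairs the n-th line of the first producer with the n-th
line of the second, so paste consumes both streams in lockstep.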


My claim here is that there are uses for this where these commands are all 
separate coprocesses; that is, you'd want to read the output from cmd1 and 
cmd2 separately, and provide input for cmd3 and cmd4 separately.

(I'll try to send some examples in a later email.)


Nevertheless, it's still crucial to keep the shell's existing coprocess fds
out of new coprocesses; otherwise you easily run yourself into deadlock.
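
The underlying hazard can be shown without coproc at all: a pipe reader sees
EOF only once every copy of the write end is closed, so a stray inherited
copy keeps the reader waiting.  The sketch below uses a named pipe as a
stand-in for a coproc's pipe and a background sleep as a stand-in for a later
coprocess.  It assumes that opening a FIFO read-write does not block (true on
Linux); `probe` and the variable names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: EOF arrives only after EVERY copy of a pipe's write end is closed.
# Assumption: a read-write open of a FIFO does not block (Linux behavior).

dir=$(mktemp -d)
fifo=$dir/pipe
mkfifo "$fifo"

exec {w}<>"$fifo"     # stand-in for the write end of a coproc's pipe
exec {r}<"$fifo"      # the matching read end

# Hypothetical helper: report what a short read attempt on fd $1 yields.
probe() {
  local line status
  if read -r -t 0.3 -u "$1" line; then
    echo data
  else
    status=$?         # read returns >128 on timeout, 1 on EOF
    if (( status > 128 )); then echo timeout; else echo eof; fi
  fi
}

# A later background job inherits the shell's open fds, write end included,
# and never uses it (like a new coproc handed fds it does not need).
sleep 2 &
holder=$!

printf 'hello\n' >&"$w"
exec {w}>&-                  # the shell closes ITS copy of the write end

read -r -u "$r" first        # the buffered line is still delivered
while_held=$(probe "$r")     # no EOF yet: the job still holds a write end
wait "$holder"
after_exit=$(probe "$r")     # last write end is gone, so the reader gets EOF

echo "first=$first while_held=$while_held after_exit=$after_exit"

exec {r}<&-
rm -rf "$dir"
```

On Linux this is expected to print
"first=hello while_held=timeout after_exit=eof".  If the holder never exited,
a reader waiting for EOF would wait forever, which is the deadlock that
auto-closing is there to prevent.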


Now, if you built bash with multiple coproc support, I would have expected 
you could still rig this up, by doing the redirection work explicitly 
yourself.  Something like this:

 	coproc UP   { stdbuf -oL tr a-z A-Z; }
 	coproc DOWN { stdbuf -oL tr A-Z a-z; }

 	# make user-managed backup copies of coproc fds
 	exec {up_r}<&${UP[0]} {up_w}>&${UP[1]}
 	exec {down_r}<&${DOWN[0]} {down_w}>&${DOWN[1]}

 	coproc THREEWAY { tee /dev/fd/$up_w  /dev/fd/$down_w; }


But the above doesn't actually work, as it seems that the coproc shell
(THREEWAY) specifically closes all the pipe fds (beyond 0,1,2), even the
user-managed ones explicitly copied with exec.

As a result, you get back errors like this:

 	tee: /dev/fd/11: No such file or directory
 	tee: /dev/fd/13: No such file or directory


That's the case even if you do something more explicit like:

 	coproc UP_AND_OUT { tee /dev/fd/99  99>&$up_w; }

The '99>&$up_w' redirection succeeds, showing that the coproc does have
access to its backup fd $up_w (*), but apparently the shell closes fd 99
(as well as $up_w) before exec'ing the tee command.

Note the coproc shell only does this with pipes; it leaves other user 
managed fds like files or directories alone.

I have no idea why that's the case, and I wonder whether it's intentional
or an oversight.


But anyway, I imagine that if one wants to use multi coproc support (which
requires automatically closing the shell's coproc fds for new coprocs), 
and wants to set up multiple coprocs to communicate amongst themselves, 
then the way to go would be explicit redirections.

(But again, this requires fixing this peculiar behavior where the coproc 
shell closes even the user managed copies of pipe fds before exec'ing 
external commands.)


(*) To prove that the coproc shell does have access to $up_w, we can make
a shell-only replacement for tee(1):  (actually works)

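 	fdtee () {
 	  # copy each line of stdin to every fd given as an argument
 	  local line fd
 	  # IFS= keeps leading and trailing whitespace intact
 	  while IFS= read -r line; do
 	    for fd; do
 	      printf '%s\n' "$line" >&"$fd"
 	    done
 	  done
 	}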

 	coproc UP   { stdbuf -oL tr a-z A-Z; }
 	coproc DOWN { stdbuf -oL tr A-Z a-z; }

 	# make user-managed backup copies of coproc fds
 	exec {up_r}<&${UP[0]} {up_w}>&${UP[1]}
 	exec {down_r}<&${DOWN[0]} {down_w}>&${DOWN[1]}

 	stdout=1
 	coproc THREEWAY { fdtee $stdout $up_w $down_w; }

 	# save these too, for safe keeping
 	exec {tee_r}<&${THREEWAY[0]} {tee_w}>&${THREEWAY[1]}


Then:  (actually works)

 	$ echo 'Greetings!' >&$tee_w
 	$ read -u $tee_r  plain
 	$ read -u $up_r   upped
 	$ read -u $down_r downed
 	$ echo "[$plain] [$upped] [$downed]"
 	[Greetings!] [GREETINGS!] [greetings!]


This is a pretty trivial example just to demonstrate the concept.  But 
once you have the freedom to play with it, you find more interesting, 
useful applications.

Of course, for the above technique to be generally useful, external 
commands need access to these user-managed fds (copied with exec).  (I 
have no idea why the coproc shell closes them.)  The shell is crippled 
when limited to builtins.

(I'll try to tidy up some working examples with my coprocess management 
library this week, for the curious.)


Juicy thread, hey?  I can hardly keep up!  :)

Carl
