On Mon, 11 Mar 2024, Zachary Santer wrote:

> On Mon, Mar 11, 2024 at 7:54 AM Carl Edquist wrote:
>>
>> (In my coprocess management library, I effectively run every coproc
>> with --output=L by default, by eval'ing the output of 'env -i stdbuf
>> -oL env', because most of the time for a coprocess, that's what's
>> wanted/necessary.)
>
> Surrounded by 'set -a' and 'set +a', I guess? Now that's interesting.

Ah, no - I use the 'VAR=VAL command line' syntax so that it's specific
to the command (it's not left exported to the shell). Effectively the
coprocess commands are run with

    LD_PRELOAD=... _STDBUF_O=L command line

This allows running shell functions for the command line, which will
all get the desired stdbuf behavior - because you can't pass a shell
function (within the context of the current shell) as the command to
stdbuf.

As far as I can tell, the stdbuf tool sets LD_PRELOAD (to point to
libstdbuf.so), along with your custom buffering options in
_STDBUF_{I,O,E}, in the environment of the program it runs.

The double-env trick there is just a way to cleanly get exactly the
env vars that stdbuf sets. The values don't change, but since they are
an implementation detail of stdbuf, it's a bit more portable to grab
them this way rather than hard-code them. This is done only once per
shell session to extract the values and save them to a private
variable, and then they are used for the command line as shown above.

Of course, if "command line" starts with "stdbuf --output=0" or
whatever, that will override the new line-buffered default.

You can definitely export it to your shell though, either with
'set -a' like you said, or with the export command. After that,
everything you run should get line-buffered stdio by default.

> I just added that to a script I have that prints lines output by
> another command that it runs, generally a build script, to the
> command line, but updating the same line over and over again. I want
> to see if it updates more continuously like that.
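Putting the double-env trick and the 'VAR=VAL command' form together
looks something like the sketch below. It assumes GNU coreutils stdbuf
and glibc; the variable and function names are illustrative, not from
any real library:

```shell
# Once per session: capture exactly the env vars stdbuf sets
# (LD_PRELOAD=.../libstdbuf.so and _STDBUF_O=L), joined onto one line.
# (Sketch only; assumes an LD_PRELOAD path without spaces.)
_stdbuf_lbuf=$(env -i stdbuf -oL env | tr '\n' ' ')

# A shell function -- something stdbuf(1) itself cannot run:
shout() { tr a-z A-Z; }

# The 'VAR=VAL command' form scopes the vars to this one command line,
# so nothing is left exported in the shell itself:
out=$(printf 'hello\n' | eval "$_stdbuf_lbuf shout")
printf '%s\n' "$out"

# The shell's own environment is untouched:
test -z "${LD_PRELOAD-}" && echo 'LD_PRELOAD not set in shell'
```

Any external command started from inside `shout` inherits the
line-buffering vars for just that invocation.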
So, a lot of times build scripts run a bunch of individual commands.
Each of those commands has an implied flush when it terminates, so you
will get the output from each of them promptly (as each command
completes), even without using stdbuf.

Where things get sloppy is if you add some stuff in a pipeline after
your build script, which results in things getting block-buffered
along the way:

    $ ./build.sh | sed s/what/ever/ | tee build.log

And there you will definitely see a difference:

    sloppy () {
        for x in {1..10}; do sleep .2; echo $x; done |
        sed s/^/:::/ | cat
    }

    {
        echo before:
        sloppy
        echo
        export $(env -i stdbuf -oL env)
        echo after:
        sloppy
    }

> Yeah, there's really no way to break what I'm doing into a standard
> pipeline.

I admit I'm curious what you're up to :)

> Of course, using line-buffered or unbuffered output in this situation
> makes no sense. Where it might be useful in a pipeline is when an
> earlier command in a pipeline might only print things occasionally,
> and you want those things transformed and printed to the command line
> immediately.

Right ... And in that case, losing the performance benefit of a larger
block buffer is a smaller price to pay.

> My assumption is that line-buffering through setbuf(3) was
> implemented for printing to the command line, so its availability to
> stdbuf(1) is just a useful side effect.

Right, stdbuf(1) leverages setbuf(3). setbuf(3) tweaks the buffering
behavior of stdio streams (stdin, stdout, stderr, and anything else
you open with, e.g., fopen(3)). It's not really limited to terminal
applications, but yeah, it makes it easier to ensure that your calls
to printf(3) actually get output after each line (whether that's to a
file or a pipe or a tty), without having to call an explicit fflush(3)
of stdout every time.

stdbuf(1) sets LD_PRELOAD to libstdbuf.so for your program, causing it
to call setbuf(3) at program startup, based on the values of _STDBUF_*
in the environment (which stdbuf(1) also sets).
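Exporting the stdbuf vars shell-wide, as in the demo above, is one
option; the other is to line-buffer just the stage of the pipeline
that sits in the middle. A minimal sketch, assuming GNU coreutils
stdbuf:

```shell
# With stdbuf -oL on the middle stage, each ':::' line crosses both
# pipes as soon as it is produced, instead of waiting for sed's block
# buffer to fill (or for sed to exit):
fixed=$(for x in 1 2 3; do echo $x; done | stdbuf -oL sed 's/^/:::/' | cat)
printf '%s\n' "$fixed"
```

The final output is the same either way; the difference is only *when*
each line arrives at the terminal.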
(That's my read of it anyway.)

> In the BUGS section in the man page for stdbuf(1), we see:
>
>     On GLIBC platforms, specifying a buffer size, i.e., using fully
>     buffered mode will result in undefined operation.

Eheh xD

Oh, I imagine "undefined operation" means something more like
"unspecified" here. stdbuf(1) uses setbuf(3), so the behavior you'll
get should be whatever the setbuf(3) from the libc on your system
does. I think all this means is that the C/POSIX standards are a bit
loose about what is required of setbuf(3) when a buffer size is
specified, and there is room in the standard for it to be interpreted
as only a hint.

> If I'm not mistaken, then buffer modes other than 0 and L don't
> actually work. Maybe I should count my blessings here. I don't know
> what's going on in the background that would explain glibc not
> supporting any of that, or stdbuf(1) implementing features that
> aren't supported on the vast majority of systems where it will be
> installed.

Hey, try it, right? Works for me (on glibc-2.23):

    $ for s in 8k 16k 32k 1M; do
          echo ::: $s :::
          {
              stdbuf -o$s strace -ewrite tr 1 2
          } < /dev/zero 2>&1 > /dev/null | head -3
          echo
      done
    ::: 8k :::
    write(1, "\0\0\0\0\0\0\0\0"..., 8192) = 8192
    write(1, "\0\0\0\0\0\0\0\0"..., 8192) = 8192
    write(1, "\0\0\0\0\0\0\0\0"..., 8192) = 8192

    ::: 16k :::
    write(1, "\0\0\0\0\0\0\0\0"..., 16384) = 16384
    write(1, "\0\0\0\0\0\0\0\0"..., 16384) = 16384
    write(1, "\0\0\0\0\0\0\0\0"..., 16384) = 16384

    ::: 32k :::
    write(1, "\0\0\0\0\0\0\0\0"..., 32768) = 32768
    write(1, "\0\0\0\0\0\0\0\0"..., 32768) = 32768
    write(1, "\0\0\0\0\0\0\0\0"..., 32768) = 32768

    ::: 1M :::
    write(1, "\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
    write(1, "\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
    write(1, "\0\0\0\0\0\0\0\0"..., 1048576) = 1048576

>> It may just be that nobody has actually had a real need for it.
>> (Yet?)
>
> I imagine if anybody has, they just set --output=0 and moved on. Bash
> scripts aren't the fastest thing in the world, anyway.

Ouch. Ouch.
Ouuuuch. :)

While that's true if you're talking about bash itself doing the actual
computation and data processing, the main work of the shell is making
it easy to set up pipelines for other (very fast) programs to pass
their data around.

The stdbuf tool is not meant for the shell! It's meant for those very
fast programs that the shell stands up.

Using stdbuf to tweak a very fast program, causing it to output more
often (at newlines over pipes, rather than at block boundaries), does
slow down those programs somewhat. But as we've discussed, this is
necessary for certain pipelines that have two-way communication
(including coprocesses), or in general any time you want the output
immediately.

What may not be obvious is that the shell does not need to get
involved with writing input for a coprocess or reading its output -
the shell can start other (very fast) programs with input/output
redirected to/from the coprocess pipes to do that processing.

My point earlier, though, was that a null-terminated record buffering
mode, as useful as it sounds on the surface (for null-terminated
paths), may actually be something _nobody_ has ever needed for an
actual (not contrived) workflow.

But then again I say "Yet?" - because, never say never.

Happy line-buffering :)

Carl
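The coprocess point above - the shell only wiring up file descriptors
while other programs do the work - can be sketched like this. It is
bash-specific (coproc) and assumes GNU coreutils stdbuf; all the names
here are made up for illustration:

```shell
#!/usr/bin/env bash
# Line-buffer the coprocess -- without this, tr's replies would sit in
# a block buffer and the reads below would hang:
coproc UPPER { stdbuf -oL tr a-z A-Z; }

# An external program (sed) feeds the coprocess directly; the shell
# does no per-line echo/printf loop of its own:
sed 's/$/!/' <<< $'foo\nbar' >&"${UPPER[1]}"

# Collect the replies while the coprocess is still running:
read -r first  <&"${UPPER[0]}"
read -r second <&"${UPPER[0]}"

# Close our write end so the coprocess sees EOF and exits cleanly:
exec {UPPER[1]}>&-

printf '%s\n' "$first" "$second"
```

Reading before closing the write end avoids racing against bash
reaping the coprocess (which unsets the UPPER fds once it exits).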