From: "Zack Weinberg" <zack@owlfolio.org>
To: "Štěpán Němec" <stepnem@smrk.net>, "Florian Weimer" <fweimer@redhat.com>
Cc: "GNU libc development" <libc-alpha@sourceware.org>
Subject: Re: [PATCH] manual: Drop incorrect statement on PIPE_BUF and blocking writes
Date: Mon, 25 Mar 2024 12:20:14 -0400
Message-ID: <3bbbaba9-bf0a-4df5-b912-d0aa53e5e427@app.fastmail.com>
In-Reply-To: <20240325131356+0100.708751-stepnem@smrk.net>

On Mon, Mar 25, 2024, at 8:13 AM, Štěpán Němec wrote:
>>>  Reading or writing a larger amount of data may not be atomic; for
>>>  example, output data from other processes sharing the descriptor may be
>>> -interspersed.  Also, once @code{PIPE_BUF} characters have been written,
>>> -further writes will block until some characters are read.
>>> +interspersed.
>>
>> Maybe “further may block” instead?  I think the reference to PIPE_BUF
>> and blocking could still be helpful, except that it's not a guarantee,
>> as you correctly point out.

It's not correct to say that a write of 65536 bytes will _never_
block.  Rather, the pipe capacity on Linux is (by default) 65536
bytes, and, if nothing is reading, _any write_ that tries to put a
65537th byte into the pipe will block.  For example, both of these
will wait 1s before printing "all written":

{ dd if=/dev/zero bs=1 count=1 status=none;
  dd if=/dev/zero bs=65536 count=1 status=none;
  echo 'all written' >&2; } |
    { sleep 1; wc -c; }

{ dd if=/dev/zero bs=1 count=1 status=none;
  dd if=/dev/zero bs=65535 count=1 status=none;
  dd if=/dev/zero bs=1 count=1 status=none;
  echo 'all written' >&2; } |
    { sleep 1; wc -c; }
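
For contrast, a single 65536-byte write into an empty pipe should
complete before the reader wakes up.  The following is only a sketch:
it assumes the Linux default capacity of 65536 bytes and GNU dd, and
the temporary log file is a hypothetical helper that just records the
order of events.

```shell
log=$(mktemp)
# Writer: one 65536-byte write, then note that the write finished.
# Reader: sleep first, note when it starts, then drain the pipe.
{ dd if=/dev/zero bs=65536 count=1 status=none;
  echo 'all written' >>"$log"; } |
    { sleep 1; echo 'reader started' >>"$log"; wc -c >/dev/null; }
first=$(head -n 1 "$log")
echo "first event: $first"
rm -f "$log"
```

If the pipe capacity is different on your system, the order of the
two log lines may change.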

I agree that it is weird to talk about this in a section that's
nominally about atomicity.  But I think we shouldn't be calling the
"no interspersed data from other processes" behavior that we're trying
to describe here "atomicity" at all!  Quoting
<https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html>:

# Write requests to a pipe or FIFO shall be handled in the same way as
# a regular file with the following exceptions:
...
# * Write requests of {PIPE_BUF} bytes or less shall not be
#   interleaved with data from other processes doing writes on the
#   same pipe. Writes of greater than {PIPE_BUF} bytes may have data
#   interleaved, on arbitrary boundaries, with writes by other
#   processes, whether or not the O_NONBLOCK flag of the file status
#   flags is set.

This is a weak statement.  It does *not* guarantee "that nothing else
in the system can observe a state in which it is partially complete,"
as the manual currently puts it.  Nor does it guarantee anything
about how much a process reading from the pipe will receive if it
does a larger read than the write.  (To put that another way, if you
write data packets to a pipe, the reader cannot use the return value
of read() to tell how big the packets were.)
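
A rough illustration of that last point, assuming GNU dd as in the
examples above: two separate 4-byte writes are followed by one read of
up to 8 bytes.  Once both writes have completed, the read typically
returns all 8 bytes at once, so the boundary between the two writes is
lost.

```shell
# Two writers each issue one 4-byte write; after they are done, the
# reader issues a single read of up to 8 bytes and counts the result.
n=$( { dd if=/dev/zero bs=4 count=1 status=none;
       dd if=/dev/zero bs=4 count=1 status=none; } |
         { sleep 1; dd bs=8 count=1 status=none | wc -c; } )
echo "one read returned $n bytes"
```

The sleep only gives both writers time to finish before the reader
starts; a reader that needs packet boundaries has to encode them in
the data itself.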

Also, it's not clear to me from what you wrote whether Linux extends
the no-interleaved-data guarantee to writes larger than PIPE_BUF, as
long as they are smaller than the pipe capacity.  If it does, we
should say so only in a way that makes it clear it's not portable to
rely on that.

So I propose the appended revision to pipe.texi instead of what you
proposed.  It moves all this discussion to the beginning of the
chapter and explains everything more thoroughly, and hopefully
also correctly.

zw

diff --git a/manual/pipe.texi b/manual/pipe.texi
index 483c40c5c3..92c1733c75 100644
--- a/manual/pipe.texi
+++ b/manual/pipe.texi
@@ -9,30 +9,58 @@ handled in a first-in, first-out (FIFO) order.  The pipe has no name; it
 is created for one use and both ends must be inherited from the single
 process which created the pipe.
 
+@cindex FIFO
 @cindex FIFO special file
-A @dfn{FIFO special file} is similar to a pipe, but instead of being an
-anonymous, temporary connection, a FIFO has a name or names like any
-other file.  Processes open the FIFO by name in order to communicate
-through it.
+A @dfn{FIFO special file}, commonly shortened to @dfn{FIFO}, is
+similar to a pipe, but instead of being an anonymous, temporary
+connection, a FIFO has a name or names like any other file.
+Processes open the FIFO by name in order to communicate through it.
 
-A pipe or FIFO has to be open at both ends simultaneously.  If you read
-from a pipe or FIFO file that doesn't have any processes writing to it
+A pipe or FIFO has to be open at both ends simultaneously.  If you
+read from a pipe or FIFO that doesn't have any processes writing to it
 (perhaps because they have all closed the file, or exited), the read
 returns end-of-file.  Writing to a pipe or FIFO that doesn't have a
 reading process is treated as an error condition; it generates a
 @code{SIGPIPE} signal, and fails with error code @code{EPIPE} if the
 signal is handled or blocked.
 
-Neither pipes nor FIFO special files allow file positioning.  Both
-reading and writing operations happen sequentially; reading from the
-beginning of the file and writing at the end.
+Neither pipes nor FIFOs allow file positioning.  Both reading and
+writing operations happen sequentially; reading from the beginning of
+the file and writing at the end.
+
+If two or more processes are writing to the same pipe or FIFO, the
+data written by each process may be interleaved arbitrarily with data
+written by the others.  There is only one exception: Each time a
+process makes a call to @code{write}, @code{writev}, or other
+primitive I/O function (@pxref{I/O Primitives}) that writes, in total,
+no more than @code{PIPE_BUF} bytes of data, @emph{that data} will not
+be split by data written by other processes.  But data written by
+other processes could appear immediately before or afterward.
+
+@xref{Limits for Files}, for information about the @code{PIPE_BUF}
+parameter.  Note that @code{PIPE_BUF} is usually smaller than the
+default buffer size used by I/O on streams (i.e.@: @code{BUFSIZ});
+see @ref{Stream Buffering}, for how to control the stream buffer size.
+
+Pipes and FIFOs may have a limit on the amount of data that's been
+written, but not yet read, that they can store.  This limit is called
+the @dfn{capacity} of the pipe or FIFO.  A write that would overfill
+the pipe---put more data into it than its capacity---will block until
+something reads from the pipe (unless the @code{O_NONBLOCK} flag is
+set; @pxref{Operating Modes}).  If the write is no larger than
+@code{PIPE_BUF}, none of the data will enter the pipe until all of it
+can; if the write is larger, there is no guarantee about how much
+data enters the pipe and when.
+
+The capacity must be @emph{at least} @code{PIPE_BUF}.  Often it is
+bigger.  Some systems provide a way to query what the capacity is,
+or to set it for individual pipes and FIFOs.
 
 @menu
 * Creating a Pipe::             Making a pipe with the @code{pipe} function.
 * Pipe to a Subprocess::        Using a pipe to communicate with a
 				 child process.
 * FIFO Special Files::          Making a FIFO special file.
-* Pipe Atomicity::		When pipe (or FIFO) I/O is atomic.
 @end menu
 
 @node Creating a Pipe
@@ -106,6 +134,16 @@ The advantage of using @code{popen} and @code{pclose} is that the
 interface is much simpler and easier to use.  But it doesn't offer as
 much flexibility as using the low-level functions directly.
 
+When using pipes to receive data from a subprocess, either with the
+low-level functions or with @code{popen} and @code{pclose}, you must
+make sure to read all the data @emph{before} you wait for the
+subprocess to complete (by calling @code{pclose}, or any of the
+functions described in @ref{Process Completion}).  This is because,
+if the subprocess writes more data than the pipe's capacity, it will
+block until you read some of it.  If you're waiting for the subprocess
+to complete, you're not doing any reading, so the subprocess will
+never exit, and you'll never read any data---a deadlock condition.
+
 @deftypefun {FILE *} popen (const char *@var{command}, const char *@var{mode})
 @standards{POSIX.2, stdio.h}
 @standards{SVID, stdio.h}
@@ -299,21 +337,3 @@ The directory that would contain the file resides on a read-only file
 system.
 @end table
 @end deftypefun
-
-@node Pipe Atomicity
-@section Atomicity of Pipe I/O
-
-Reading or writing pipe data is @dfn{atomic} if the size of data written
-is not greater than @code{PIPE_BUF}.  This means that the data transfer
-seems to be an instantaneous unit, in that nothing else in the system
-can observe a state in which it is partially complete.  Atomic I/O may
-not begin right away (it may need to wait for buffer space or for data),
-but once it does begin it finishes immediately.
-
-Reading or writing a larger amount of data may not be atomic; for
-example, output data from other processes sharing the descriptor may be
-interspersed.  Also, once @code{PIPE_BUF} characters have been written,
-further writes will block until some characters are read.
-
-@xref{Limits for Files}, for information about the @code{PIPE_BUF}
-parameter.
