unofficial mirror of libc-alpha@sourceware.org
From: Rich Felker <dalias@libc.org>
To: "Richard W.M. Jones" <rjones@redhat.com>
Cc: Florian Weimer <fweimer@redhat.com>,
	glibc list <libc-alpha@sourceware.org>,
	Eric Blake <eblake@redhat.com>,
	"libguestfs@redhat.com" <libguestfs@redhat.com>
Subject: Re: RFC: *scanf vs. overflow
Date: Sat, 23 May 2020 12:21:12 -0400
Message-ID: <20200523162112.GJ1079@brightrain.aerifal.cx>
In-Reply-To: <20200523070654.GO3888@redhat.com>

On Sat, May 23, 2020 at 08:06:54AM +0100, Richard W.M. Jones via Libc-alpha wrote:
> The context to this is that nbdkit uses sscanf to parse simple file
> formats in various places, eg:
> 
> https://github.com/libguestfs/nbdkit/blob/b23f4f53cf71326f1dba481f64f7f182c20fa3dc/plugins/data/format.c#L171-L172
> https://github.com/libguestfs/nbdkit/blob/b23f4f53cf71326f1dba481f64f7f182c20fa3dc/filters/ddrescue/ddrescue.c#L98
> 
> We can only do this safely where we can prove that overflow does not
> matter.

Since overflow here is specified as undefined behavior (UB), it can
never "not matter"; arbitrarily bad side effects are permitted. So it's
only safe to use scanf where the input is *trusted not to contain
overflowing values*.

What would be really nice to fix here is getting the standard to
specify that overflow has behavior like strto* or at least
"unspecified value" rather than "undefined behavior" so that it's safe
to let it overflow in cases where you don't care (e.g. you'll be
consistency-checking the value afterwards anyway).
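
For comparison, here is a minimal sketch of the "let it overflow, then
consistency-check" pattern as it already works with strtol today. The
helper name parse_bounded and the assumption that the caller's limit is
below LONG_MAX are illustrative only; they are not taken from nbdkit.

```c
#include <stdlib.h>

/* Hypothetical helper, not from nbdkit: parse a decimal value that must
 * lie in [0, max], where the caller guarantees max < LONG_MAX. */
static int parse_bounded(const char *s, long max, long *out)
{
    char *end;
    long v = strtol(s, &end, 10);

    if (end == s)
        return -1;          /* no digits were consumed */

    /* On overflow strtol returns LONG_MAX or LONG_MIN (and sets errno to
     * ERANGE), so the saturated value fails this range check just like
     * any other out-of-range input. No separate errno test is needed
     * under the max < LONG_MAX assumption. */
    if (v < 0 || v > max)
        return -1;

    *out = v;
    return 0;
}
```

With sscanf and "%ld" the same overflowing input would be undefined
behavior, so no check after the call could make it safe.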

> In other cases we've had to change sscanf uses to strto* etc
> which is much more difficult to use correctly.  Just look at how much
> code is required to wrap strto* functions to use them safely:
> 
> https://github.com/libguestfs/nbdkit/blob/b23f4f53cf71326f1dba481f64f7f182c20fa3dc/server/public.c#L113-L296

Really that much code is just for the sake of verbose error messages,
and they're not even accurate. "errno!=0" does not mean "could not
parse"; it can also be overflow of a perfectly parseable value. And if
you've already caught errno!=0 then end==str is impossible (dead
code). The last case, not reaching the terminating null byte, is also
likely spurious/wrong; you usually *want* to pick up where strto*
stopped, and the next thing the parser does will catch whether the
characters after the number are valid there or not.
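
A minimal sketch of keeping those cases apart, assuming a hypothetical
parse_long helper and status enum (neither is from nbdkit): "no digits"
and "overflow of a parseable value" get distinct results, and the end
pointer is handed back so the caller's parser continues from there
instead of demanding a terminating null byte.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical status codes for illustration. */
enum parse_status { PARSE_OK, PARSE_NO_DIGITS, PARSE_OVERFLOW };

/* Parse a decimal long at s. On PARSE_OK or PARSE_OVERFLOW, *rest points
 * just past the digits so the caller can keep parsing from there. */
static enum parse_status parse_long(const char *s, long *out,
                                    const char **rest)
{
    char *end;

    errno = 0;                       /* strtol only sets errno on error */
    long v = strtol(s, &end, 10);

    if (end == s)
        return PARSE_NO_DIGITS;      /* nothing that looks like a number */

    *rest = end;
    if (errno == ERANGE)
        return PARSE_OVERFLOW;       /* parseable, but does not fit in long */

    *out = v;
    return PARSE_OK;
}
```

Whether a ',' or a newline after the number is acceptable is then
decided by whatever the caller parses next at *rest.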

strto* do have some annoying design flaws in error reporting, but
they're not really that hard to use right, and much easier than scanf
which just *lacks the reporting channels* for the kind of fine-grained
error reporting you're insisting on doing here.

FWIW this code would also be a lot cleaner as a static inline function
rather than a many-line macro.
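
Roughly what that could look like, as a sketch only: one static inline
core that does the strtoll call and range check, plus thin per-type
wrappers. The names parse_ranged, parse_int and parse_u16 are
hypothetical and the error handling is deliberately terse; this is not
nbdkit's actual interface.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Shared core: parse a decimal integer and check it against [min, max].
 * On success stores the value and where parsing stopped. */
static inline int parse_ranged(const char *s, long long min, long long max,
                               long long *out, const char **rest)
{
    char *end;

    errno = 0;
    long long v = strtoll(s, &end, 10);
    if (end == s || errno == ERANGE || v < min || v > max)
        return -1;
    *out = v;
    *rest = end;
    return 0;
}

/* Thin typed wrappers replace one macro expansion per type. */
static inline int parse_int(const char *s, int *out, const char **rest)
{
    long long v;

    if (parse_ranged(s, INT_MIN, INT_MAX, &v, rest) < 0)
        return -1;
    *out = (int)v;
    return 0;
}

static inline int parse_u16(const char *s, uint16_t *out, const char **rest)
{
    long long v;

    if (parse_ranged(s, 0, UINT16_MAX, &v, rest) < 0)
        return -1;
    *out = (uint16_t)v;
    return 0;
}
```

Types wider than long long (e.g. full-range uint64_t) would need a
second core built on strtoull; that is left out here.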

Rich

