From: Rich Felker <dalias@libc.org>
To: "Richard W.M. Jones" <rjones@redhat.com>
Cc: Florian Weimer <fweimer@redhat.com>,
glibc list <libc-alpha@sourceware.org>,
Eric Blake <eblake@redhat.com>,
"libguestfs@redhat.com" <libguestfs@redhat.com>
Subject: Re: RFC: *scanf vs. overflow
Date: Sat, 23 May 2020 12:21:12 -0400
Message-ID: <20200523162112.GJ1079@brightrain.aerifal.cx>
In-Reply-To: <20200523070654.GO3888@redhat.com>

On Sat, May 23, 2020 at 08:06:54AM +0100, Richard W.M. Jones via Libc-alpha wrote:
> The context to this is that nbdkit uses sscanf to parse simple file
> formats in various places, eg:
>
> https://github.com/libguestfs/nbdkit/blob/b23f4f53cf71326f1dba481f64f7f182c20fa3dc/plugins/data/format.c#L171-L172
> https://github.com/libguestfs/nbdkit/blob/b23f4f53cf71326f1dba481f64f7f182c20fa3dc/filters/ddrescue/ddrescue.c#L98
>
> We can only do this safely where we can prove that overflow does not
> matter.

Since it's specified as UB, it can never "not matter";
arbitrarily bad side effects are permitted. So it's only safe to use
scanf where the input is *trusted not to contain overflowing values*.

What would be really nice to fix here is getting the standard to
specify that overflow behaves like strto*, or at least yields an
"unspecified value" rather than "undefined behavior", so that it's safe
to let it overflow in cases where you don't care (e.g. you'll be
consistency-checking the value afterwards anyway).

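
To illustrate the "don't care, I'll check the value afterwards" case with
strto*, here is a minimal sketch. parse_percent is a hypothetical helper
made up for this example, not something from nbdkit or any libc. It relies
only on strtol returning LONG_MAX or LONG_MIN on overflow, so an ordinary
range check rejects overflowing input without consulting errno.

```c
#include <stdlib.h>

/* Hypothetical helper: parse a percentage in the range 0..100.
 * Returns 0 on success and stores the value in *out, -1 otherwise. */
int parse_percent(const char *str, int *out)
{
    char *end;
    long v = strtol(str, &end, 10);

    if (end == str)             /* no digits were consumed */
        return -1;

    /* On overflow strtol returns LONG_MAX or LONG_MIN (and sets errno to
     * ERANGE).  Both values fall outside 0..100, so the consistency check
     * below rejects them without looking at errno. */
    if (v < 0 || v > 100)
        return -1;

    *out = (int)v;
    return 0;
}
```

The same overflowing input passed to sscanf with %d would be undefined
behavior, so no check after the call could make it safe.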
> In other cases we've had to change sscanf uses to strto* etc
> which is much more difficult to use correctly. Just look at how much
> code is required to wrap strto* functions to use them safely:
>
> https://github.com/libguestfs/nbdkit/blob/b23f4f53cf71326f1dba481f64f7f182c20fa3dc/server/public.c#L113-L296

Really that much code is just for the sake of verbose error messages,
and they're not even accurate. "errno!=0" does not mean "could not
parse"; it can also be overflow of a perfectly parseable value. And if
you've already caught errno!=0 then end==str is impossible (dead
code). The last check, for not stopping at the terminating NUL, is also
likely spurious/wrong; you usually *want* to pick up where strto*
stopped, and the next thing the parser does will catch whether the
characters after the number are valid there or not.

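
As a rough sketch of what "pick up where strto* stopped" can look like,
assume a made-up input format of two decimal numbers separated by a comma.
parse_pair and its return codes are hypothetical, chosen only for
illustration. Note that "no digits" (end == str) and "overflow" (errno ==
ERANGE) are reported as separate conditions, and that the character after
each number is checked by the next step of the parser, not by the number
parsing itself.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical parser for "A,B" where A and B are decimal numbers.
 * Returns 0 on success, -1 on a syntax error, -2 on overflow. */
int parse_pair(const char *s, long *a, long *b)
{
    char *p;

    errno = 0;
    *a = strtol(s, &p, 10);
    if (p == s)
        return -1;              /* no digits: this is the parse error */
    if (errno == ERANGE)
        return -2;              /* parseable, but does not fit in long */

    /* Continue where strtol stopped; the grammar decides whether the
     * next character is valid at this point. */
    if (*p != ',')
        return -1;
    s = p + 1;

    errno = 0;
    *b = strtol(s, &p, 10);
    if (p == s)
        return -1;
    if (errno == ERANGE)
        return -2;

    /* End of input is just the last thing the grammar expects. */
    return *p == '\0' ? 0 : -1;
}
```

Since strtol itself skips leading whitespace and accepts a sign, a stricter
format would need extra checks before each call.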
The strto* functions do have some annoying design flaws in error
reporting, but they're not really that hard to use right, and much easier
than scanf, which just *lacks the reporting channels* for the kind of
fine-grained error reporting you're insisting on doing here.

FWIW this code would also be a lot cleaner as a static inline function
rather than a many-line macro.

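
For what it's worth, a static inline version might look roughly like the
sketch below. This is not nbdkit's actual code; parse_int, its signature,
and the choice of EINVAL/ERANGE as error codes are assumptions made for the
example.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical wrapper, not nbdkit's actual code.  Parses a decimal int
 * from str and stores it in *out.  If endp is non-NULL, *endp is set to
 * where parsing stopped so the caller can continue from there.
 * Returns 0 on success, -1 with errno set otherwise. */
static inline int parse_int(const char *str, char **endp, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(str, &end, 10);
    if (endp)
        *endp = end;

    if (end == str) {
        errno = EINVAL;         /* nothing to parse */
        return -1;
    }
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX) {
        errno = ERANGE;         /* valid number, but does not fit in int */
        return -1;
    }

    *out = (int)v;
    return 0;
}
```

Being an ordinary function, it gets type checking on its arguments and
evaluates each of them exactly once, which a multi-line macro does not
guarantee.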
Rich