From: Eric Blake via Libc-alpha <libc-alpha@sourceware.org>
To: glibc list <libc-alpha@sourceware.org>
Cc: Florian Weimer <fweimer@redhat.com>,
"libguestfs@redhat.com" <libguestfs@redhat.com>
Subject: RFC: *scanf vs. overflow
Date: Fri, 22 May 2020 15:59:14 -0500
Message-ID: <f3e5f1dc-d8cf-4fba-fa7f-97e6e5218660@redhat.com>
It has long been known that the C specification of *scanf() leaves
behavior undefined for things like

    int i;
    sscanf("9999999999999999", "%i", &i);

C11 7.21.6.2 P12
"Matches an optionally signed integer, whose format is the same as
expected for the subject sequence of the strtol function with the value
0 for the base argument."

C11 7.21.6.2 P10
"If this object does not have an appropriate type, or if the result of
the conversion cannot be represented in the object, the behavior is
undefined."

as there is an overflow when consuming the input which matches the
strtol subject sequence but does not fit in the width of an int. On my
Linux system, 'man sscanf' mentions that ERANGE might be set in such a
case, but neither C nor POSIX actually requires this behavior; other
likely behaviors are storing the value mod 2^32 into i, or storing
INT_MAX into i, or ...
This is annoying - the only safe way to parse integers from
untrustworthy sources, where overflow MUST be detected, is to manually
open-code strtol() calls, which can get quite lengthy in comparison to
the concise representations possible with *scanf.
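For comparison, here is a minimal sketch of what such an open-coded
strtol() parse might look like for a single int. The helper name
parse_int_checked and its interface are hypothetical, chosen only for
illustration; it uses base 0 to mirror the "%i" rules quoted above and
treats both "does not fit in long" and "does not fit in int" as failure.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper (not part of any libc): parse an int with the same
 * base-0 rules as "%i", but report overflow as a failure.
 * On success, stores the value in *out and, if end is non-NULL, stores
 * the address of the first unconsumed character in *end. */
static bool parse_int_checked(const char *s, int *out, const char **end)
{
    char *stop;
    long v;

    errno = 0;
    v = strtol(s, &stop, 0);
    if (stop == s)
        return false;               /* no digits consumed */
    if (errno == ERANGE)
        return false;               /* value does not fit in long */
    if (v < INT_MIN || v > INT_MAX)
        return false;               /* fits in long, but not in int */
    *out = (int)v;
    if (end)
        *end = stop;
    return true;
}
```

Each additional field needs its own call plus manual matching of any
literal text in between, which is where the length comes from compared
to a single *scanf format string.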
Would glibc be willing to consider a GNU extension to add an optional
flag character between '%' and the various numeric conversion specifiers
(both integral based on strto*l, and floating point based on strtod),
where we could force *scanf to treat numeric overflow as a matching
failure, rather than undefined behavior? Or even a second flag to
request that *scanf stop consuming characters if the next character in
input would cause overflow in the current specifier, leaving that
character to instead be matched to the remainder of the format string?
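To make the second idea concrete, here is a rough sketch of one possible
reading of "stop consuming before overflow". The function name
scan_uint_prefix is hypothetical, and the sketch handles only unsigned
decimal digits (no sign, no base prefix) to stay short; the actual
semantics of any such flag would be up to the implementation.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: consume decimal digits from s only while the
 * accumulated value still fits in an int. Returns the number of
 * characters consumed (0 means matching failure). The first unconsumed
 * character is left for whatever follows in the format string. */
static size_t scan_uint_prefix(const char *s, int *out)
{
    int v = 0;
    size_t n = 0;

    while (s[n] >= '0' && s[n] <= '9') {
        int d = s[n] - '0';
        if (v > (INT_MAX - d) / 10)
            break;                  /* next digit would overflow: stop */
        v = v * 10 + d;
        n++;
    }
    if (n > 0)
        *out = v;
    return n;
}
```

With a 32-bit int, the input "9999999999999999" would yield 999999999
after nine characters, leaving the remaining seven '9' characters to be
matched by the rest of the format.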
Let's suppose, for the sake of argument, that we add '^' as a request to
force overflow to be a matching error. Then sscanf("9999999999999999",
"%^i", &i) would be well-specified to return 0, rather than returning 1
with an unknown value assigned into i, or doing whatever else other libc
implementations do with the undefined behavior when the ^ is not present.
And if glibc likes the idea of such an extension, and we see an uptick
in applications actually using it, I'd also be happy to champion the
addition of such an extension in POSIX (but the POSIX folks will
definitely want to see existing practice first - both an implementation
and applications that use that implementation). The libguestfs suite of
programs is willing to be an early adopter, if glibc is willing to
pursue adding such a safety valve.
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org