unofficial mirror of libc-alpha@sourceware.org
From: Rich Felker <dalias@libc.org>
To: Paul Eggert <eggert@cs.ucla.edu>
Cc: Florian Weimer <fweimer@redhat.com>,
	glibc list <libc-alpha@sourceware.org>,
	Eric Blake <eblake@redhat.com>,
	"libguestfs@redhat.com" <libguestfs@redhat.com>
Subject: Re: RFC: *scanf vs. overflow
Date: Sat, 23 May 2020 12:45:01 -0400
Message-ID: <20200523164500.GK1079@brightrain.aerifal.cx>
In-Reply-To: <900d665c-40be-bd1b-215a-391cded68d3b@cs.ucla.edu>

On Sat, May 23, 2020 at 09:28:26AM -0700, Paul Eggert wrote:
> On 5/23/20 9:11 AM, Rich Felker wrote:
> 
> > stopping on an initial prefix ... does not admit easily sharing a backend with strto*.
> 
> I don't see why. If the backend has a "stop scanning on integer overflow" flag
> (which it would need to have anyway, to support the proposed behavior), then
> *scanf can use the flag and strto* can not use it.
> 
> Anyway, this is not an issue for glibc, which has no such backend.

It's relevant because you want to propose this for standardization.
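
To make the two behaviors under discussion concrete, here is a minimal
sketch of a digit-scanning routine with a "stop scanning on integer
overflow" flag. Everything in it (the name scan_digits, its parameters,
the fixed 32-bit unsigned target) is hypothetical and for illustration
only; it is not how glibc, musl, or any other libc is structured, and a
real backend also has to handle signs, bases, widths, and pushback on
streams.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical backend: scan leading decimal digits of s into *out.
 * Returns the number of characters consumed.
 * *overflowed is set to 1 if a digit was seen that would exceed UINT32_MAX.
 * stop_on_overflow != 0: stop before the first digit that would overflow,
 *                        so only a non-overflowing prefix is matched.
 * stop_on_overflow == 0: consume every digit and clamp to UINT32_MAX,
 *                        similar in spirit to strtoul. */
size_t scan_digits(const char *s, int stop_on_overflow,
                   uint32_t *out, int *overflowed)
{
    uint32_t v = 0;
    size_t i = 0;

    *overflowed = 0;
    for (; s[i] >= '0' && s[i] <= '9'; i++) {
        uint32_t d = (uint32_t)(s[i] - '0');

        /* v * 10 + d fits exactly when v <= (UINT32_MAX - d) / 10. */
        if (*overflowed || v > (UINT32_MAX - d) / 10) {
            *overflowed = 1;
            if (stop_on_overflow)
                break;          /* leave the remaining digits unread */
            v = UINT32_MAX;     /* clamp, keep consuming digits */
            continue;
        }
        v = v * 10 + d;
    }
    *out = v;
    return i;
}
```

With the flag set, "99999999999" matches only its first nine digits and
the remaining "99" is left in the input for whatever comes next; with
the flag clear, all eleven digits are consumed and the result is
clamped.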

> > that's contrary to the abstract behavior defined for scanf
> > (matching fields syntactically then value conversion)
> 
> That's not really a problem. The abstract behavior already provides for matching
> that is not purely syntactic. For example, string conversion specifiers can
> impose length limits on the match, which means the matching does not rely purely
> on the syntax of the input. It would be easy to say that integer conversion
> specifiers can also impose limits related to integer overflow.

Sure, that's syntax. It's /[^ ]{1,n}/.

Of course for integers you can define a syntax that matches every
non-overflowing value (this is always true for finite matching sets),
but that's nothing like how the function is specified and I don't
think anyone reasonable would classify non-overflow as a syntactic
property.
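
As a small illustration of the width-limit case, the sketch below uses
%3s, which matches one to three non-whitespace characters regardless of
what those characters mean. The function name split_field and the
buffer sizes are made up for the example.

```c
#include <stdio.h>

/* Split input into a field of at most 3 non-whitespace characters
 * followed by the next run of non-whitespace characters (at most 15).
 * Returns the number of fields sscanf assigned. */
int split_field(const char *input, char head[4], char rest[16])
{
    /* The width is part of the pattern: where the first field ends
     * depends only on character count and whitespace, not on the
     * value the characters would convert to. */
    return sscanf(input, "%3s%15s", head, rest);
}
```

Given "abcdef", head receives "abc" and rest receives "def"; the split
point is determined without interpreting the characters.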

> > It's also even *more
> > likely* to break programs that don't expect the behavior than just
> > storing a wrapped or clamped value
> 
> That's not true of the code that I looked at (see the URLs earlier in this
> thread). That code was pretty carefully written and yet still vulnerable to the
> integer-overflow issue.

I don't follow. *Any* use of scanf on untrusted input is "vulnerable
to the integer-overflow issue" in the sense that overflow is UB. This
is not something subtle.

If you mean actually using overflowed values in an unsafe way
(assuming no ballooning effects of UB, just wrong values), I don't see
how it's subtle either. Any value that could be produced via overflow
could also be produced via non-overflowing input, and you have to
validate data either way.
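
A minimal sketch of what "validate either way" can look like with
strtol, which reports overflow through ERANGE instead of leaving it
undefined. The function name parse_port and the 1..65535 range are
assumptions chosen for the example, not anything taken from the code
discussed in this thread.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a decimal port number from s.
 * Returns 0 and stores the value in *out on success, -1 otherwise. */
int parse_port(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);

    if (end == s || *end != '\0')
        return -1;              /* no digits, or trailing junk */
    if (errno == ERANGE)
        return -1;              /* out of range for long: reported, not UB */
    if (v < 1 || v > 65535)
        return -1;              /* application range check, needed regardless */

    *out = (int)v;
    return 0;
}
```

The last check is the one that matters for the argument above: "70000"
does not overflow anything, yet it still has to be rejected by the
caller.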

> > I'm pretty sure the real answer here is just "don't use *scanf for
> > that."
> 
> Absolutely true right now. We are merely talking about (a) what sort of
> implementation behavior is more useful for programs that are currently relying
> on undefined behavior, and (b) what might be the cleanest addition to POSIX
> later, to help improve this mess so that future programmers can use *scanf
> safely in more situations.

This is absolutely not "clean" and I am opposed to it.

Rich


Thread overview: 11+ messages
2020-05-22 20:59 RFC: *scanf vs. overflow Eric Blake via Libc-alpha
2020-05-23  1:16 ` Rich Felker
2020-05-23  3:06   ` Paul Eggert
2020-05-23 16:11     ` Rich Felker
2020-05-23 16:28       ` Paul Eggert
2020-05-23 16:45         ` Rich Felker [this message]
2020-05-23 17:18           ` Paul Eggert
2020-05-26  9:30           ` [Libguestfs] " Richard W.M. Jones via Libc-alpha
2020-05-23  7:06 ` Richard W.M. Jones via Libc-alpha
2020-05-23 15:25   ` Paul Eggert
2020-05-23 16:21   ` Rich Felker
