Date: Sat, 23 May 2020 12:45:01 -0400
From: Rich Felker
To: Paul Eggert
Cc: Florian Weimer, glibc list, Eric Blake, "libguestfs@redhat.com"
Subject: Re: RFC: *scanf vs. overflow
Message-ID: <20200523164500.GK1079@brightrain.aerifal.cx>
In-Reply-To: <900d665c-40be-bd1b-215a-391cded68d3b@cs.ucla.edu>
References: <20200523011614.GE1079@brightrain.aerifal.cx>
 <20200523161143.GI1079@brightrain.aerifal.cx>
 <900d665c-40be-bd1b-215a-391cded68d3b@cs.ucla.edu>
List-Id: Libc-alpha mailing list

On Sat, May 23, 2020 at 09:28:26AM -0700, Paul Eggert wrote:
> On 5/23/20 9:11 AM, Rich Felker wrote:
>
> > stopping on an initial prefix ... does not admit easily sharing a backend with strto*.
>
> I don't see why. If the backend has a "stop scanning on integer overflow" flag
> (which it would need to have anyway, to support the proposed behavior), then
> *scanf can use the flag and strto* can not use it.
>
> Anyway, this is not an issue for glibc, which has no such backend.

It's relevant because you want to propose this for standardization.

> > that's contrary to the abstract behavior defined for scanf
> > (matching fields syntactically then value conversion)
>
> That's not really a problem. The abstract behavior already provides for matching
> that is not purely syntactic. For example, string conversion specifiers can
> impose length limits on the match, which means the matching does not rely purely
> on the syntax of the input. It would be easy to say that integer conversion
> specifiers can also impose limits related to integer overflow.

Sure that's syntax. It's /[^ ]{1,n}"/.
Of course for integers you can define a syntax that matches every
non-overflowing value (this is always true for finite matching sets),
but that's nothing like how the function is specified and I don't
think anyone reasonable would classify non-overflow as a syntactic
property.

> > It's also even *more
> > likely* to break programs that don't expect the behavior than just
> > storing a wrapped or clamped value
>
> That's not true of the code that I looked at (see the URLs earlier in this
> thread). That code was pretty carefully written and yet still vulnerable to the
> integer-overflow issue.

I don't follow. *Any* use of scanf on untrusted input is "vulnerable
to the integer-overflow issue" in the sense that overflow is UB. This
is not something subtle. If you mean actually using overflowed values
in an unsafe way (assuming no ballooning effects of UB, just wrong
values), I don't see how it's subtle either. Any value that could be
produced via overflow could also be produced via non-overflowing
input, and you have to validate data either way.

> > I'm pretty sure the real answer here is just "don't use *scanf for
> > that."
>
> Absolutely true right now. We are merely talking about (a) what sort of
> implementation behavior is more useful for programs that are currently relying
> on undefined behavior, and (b) what might be the cleanest addition to POSIX
> later, to help improve this mess so that future programmers can use *scanf
> safely in more situations.

This is absolutely not "clean" and I am opposed to it.

Rich
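
For illustration only, and not part of the message above: a minimal
sketch of the "don't use *scanf for that" route, assuming the untrusted
input is already available as a NUL-terminated string. The helper name
parse_long and its reject-on-trailing-characters policy are hypothetical
choices made for this example; only strtol, errno and ERANGE are
standard C.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (not from the thread): parse a decimal long from s.
 * Returns 0 and stores the value in *out on success.
 * Returns -1 if no digits were converted, if the value does not fit in
 * a long, or if unparsed characters remain after the number.
 * Note: strtol itself skips leading whitespace and accepts a sign. */
int parse_long(const char *s, long *out)
{
    char *end;
    long v;

    errno = 0;                 /* strtol only sets errno on failure */
    v = strtol(s, &end, 10);

    if (end == s)              /* nothing was converted */
        return -1;
    if (errno == ERANGE)       /* out of range: result was clamped */
        return -1;
    if (*end != '\0')          /* trailing characters after the number */
        return -1;

    *out = v;
    return 0;
}
```

Unlike an out-of-range conversion in *scanf, which is undefined, an
out-of-range input here is reported through ERANGE, so the caller can
reject it. Range checks specific to the application (the "validate data
either way" point) still have to be applied to the returned value.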