From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <bug-gnulib-bounces+normalperson=yhbt.net@gnu.org>
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on dcvr.yhbt.net
X-Spam-Level: 
X-Spam-ASN: AS22989 209.51.188.0/24
X-Spam-Status: No, score=-3.9 required=3.0 tests=AWL,BAYES_00,
	HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED,
	SPF_HELO_NONE,SPF_PASS shortcircuit=no autolearn=ham
	autolearn_force=no version=3.4.2
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by dcvr.yhbt.net (Postfix) with ESMTPS id 349681F4C0
	for <normalperson@yhbt.net>; Sun, 13 Oct 2019 18:33:50 +0000 (UTC)
Received: from localhost ([::1]:41462 helo=lists1p.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.90_1)
	(envelope-from <bug-gnulib-bounces+normalperson=yhbt.net@gnu.org>)
	id 1iJigd-0003p4-Rq
	for normalperson@yhbt.net; Sun, 13 Oct 2019 14:33:47 -0400
Received: from eggs.gnu.org ([2001:470:142:3::10]:55340)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <eggert@cs.ucla.edu>) id 1iJifn-0003jA-81
 for bug-gnulib@gnu.org; Sun, 13 Oct 2019 14:32:56 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <eggert@cs.ucla.edu>) id 1iJifl-000124-RV
 for bug-gnulib@gnu.org; Sun, 13 Oct 2019 14:32:55 -0400
Received: from zimbra.cs.ucla.edu ([131.179.128.68]:36948)
 by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
 (Exim 4.71) (envelope-from <eggert@cs.ucla.edu>) id 1iJifl-0000zY-JD
 for bug-gnulib@gnu.org; Sun, 13 Oct 2019 14:32:53 -0400
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id 436BA1605D8;
 Sun, 13 Oct 2019 11:32:51 -0700 (PDT)
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id Y1mz3Ifp4HcN; Sun, 13 Oct 2019 11:32:50 -0700 (PDT)
Received: from localhost (localhost [127.0.0.1])
 by zimbra.cs.ucla.edu (Postfix) with ESMTP id 490A7160690;
 Sun, 13 Oct 2019 11:32:50 -0700 (PDT)
X-Virus-Scanned: amavisd-new at zimbra.cs.ucla.edu
Received: from zimbra.cs.ucla.edu ([127.0.0.1])
 by localhost (zimbra.cs.ucla.edu [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id 395Uk5kLUCFB; Sun, 13 Oct 2019 11:32:50 -0700 (PDT)
Received: from [192.168.1.9] (cpe-23-242-74-103.socal.res.rr.com
 [23.242.74.103])
 by zimbra.cs.ucla.edu (Postfix) with ESMTPSA id 24E6B1605D8;
 Sun, 13 Oct 2019 11:32:50 -0700 (PDT)
Subject: Re: supporting strings > 2 GB
To: Bruno Haible <bruno@clisp.org>
References: <15256545.f1uGFDiRv1@omega>
 <749e79a7-0c0b-74d9-dbda-9a4676a931d2@cs.ucla.edu> <5098484.EAxBekdZ73@omega>
From: Paul Eggert <eggert@cs.ucla.edu>
Organization: UCLA Computer Science Department
Message-ID: <61d5f74b-f092-db22-6dff-a3a47f6e64ce@cs.ucla.edu>
Date: Sun, 13 Oct 2019 11:32:49 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101
 Thunderbird/60.9.0
MIME-Version: 1.0
In-Reply-To: <5098484.EAxBekdZ73@omega>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [fuzzy]
X-Received-From: 131.179.128.68
X-BeenThere: bug-gnulib@gnu.org
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Gnulib discussion list <bug-gnulib.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/bug-gnulib>,
 <mailto:bug-gnulib-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/bug-gnulib>
List-Post: <mailto:bug-gnulib@gnu.org>
List-Help: <mailto:bug-gnulib-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/bug-gnulib>,
 <mailto:bug-gnulib-request@gnu.org?subject=subscribe>
Cc: bug-gnulib@gnu.org
Errors-To: bug-gnulib-bounces+normalperson=yhbt.net@gnu.org
Sender: "bug-gnulib" <bug-gnulib-bounces+normalperson=yhbt.net@gnu.org>

On 10/13/19 10:38 AM, Bruno Haible wrote:

> The type printf_len_t is meant to allow the user to write code that works with
> and without _PRINTF_LARGE.

By "the user" do you mean a user of an improved POSIX API for printf-like 
functions, or a user of a Gnulib wrapper around the improved POSIX API? If the 
former, I'm not quite following. If the latter, then I do follow; but we need to 
make it clear which part of the change is for the former and which is for the 
latter, if we ever want to change POSIX and/or ISO C.

> 1) It would be wrong to write
> 
>       int ret = printf (...);
> 
>     because without _PRINTF_LARGE this code will truncate the printf result.

For this particular case, portable code could use 'ptrdiff_t' instead of 'int'; 
this would be portable enough as it would work regardless of whether printf is 
old-style or new-style (except on weird platforms where PTRDIFF_MAX < INT_MAX, 
which I don't think we need to worry about).
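
As a rough sketch of what I mean (snprintf stands in for printf so the
result is easy to check, and format_count is a made-up helper name):

```c
#include <stddef.h>
#include <stdio.h>

/* Format N into BUF (of size BUFSIZE) and keep the printf-family
   result in a ptrdiff_t.  Today snprintf returns int, which widens to
   ptrdiff_t without loss; if a new-style function returned a wider
   signed type, this variable would not truncate it either.  */
static ptrdiff_t
format_count (char *buf, size_t bufsize, int n)
{
  ptrdiff_t ret = snprintf (buf, bufsize, "count=%d", n);
  return ret;   /* negative on error, else the length that was needed */
}
```

The caller compares the result against 0 and against its buffer size
as usual; nothing in the calling code depends on which flavor of
printf is underneath.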

> The type and macro allow to write these as
> 
>       printf_len_t ret = printf (...);
> 
>       printf_len_t len;
>       if (len > PRINTF_LEN_MAX)
>         fail ();

Sorry, I don't follow this. I thought PRINTF_LEN_MAX was intended to be the 
maximum value that can be stored into printf_len_t, in which case 'len > 
PRINTF_LEN_MAX' must yield 0. If the intent is something else, then these types 
and/or macros probably need different names, to avoid confusion with 
longstanding naming practice elsewhere.
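
To illustrate the naming convention I have in mind, here is the same
test written with int and INT_MAX. The proposed printf_len_t and
PRINTF_LEN_MAX are not used, since they do not exist yet, and
exceeds_int_max is a made-up name:

```c
#include <limits.h>

/* By longstanding convention FOO_MAX is the largest value of type foo,
   so no object of that type can exceed it.  */
static int
exceeds_int_max (int len)
{
  return len > INT_MAX;   /* always 0, whatever LEN is */
}
```

Some compilers will even warn that the comparison is always false.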

> There is no need to reserve a new length modifier and/or macros like PRIdPRINTF
> and SCNdPRINTF, because the type and macro are only a convenience.

So if I want to print a printf_len_t I must first convert it to intmax_t and 
print that? I don't see the convenience here, but perhaps that's because I don't 
understand the intent of printf_len_t and PRINTF_LEN_MAX.
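
That is, I am assuming one would have to write something like the
following. The typedef is only a placeholder for the proposed type,
and format_len is a made-up helper:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in: the proposed printf_len_t does not exist,
   so ptrdiff_t is assumed here purely for illustration.  */
typedef ptrdiff_t printf_len_t;

/* With no dedicated length modifier or PRI* macro, the portable way
   to print LEN is to convert it to intmax_t and use %jd.  */
static int
format_len (char *buf, size_t bufsize, printf_len_t len)
{
  return snprintf (buf, bufsize, "%jd", (intmax_t) len);
}
```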

>> Would %ln work only for the new *l functions, or would it also work for the
>> already-standard printf functions?
> 
> The existing printf functions are left unchanged: Since the entire result
> may not be longer than INT_MAX bytes, it makes no sense to add provisions
> for returning an index > INT_MAX or using a format directive with width
> or precision > INT_MAX.

printf already has provisions for width or precision > INT_MAX; one can do 
'printf ("%2147483648d", 0)', for example. Such calls are a corner case that 
fails, but that's OK. Attempting to use '**' with old printf could fail in a 
similar way.
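
A small sketch of that corner case, using snprintf with a null buffer
so that no output is actually produced. I am assuming a 32-bit int
here, and the failure is what I would expect from the implementations
I know of rather than something I have checked everywhere;
try_huge_width is a made-up name:

```c
#include <stdio.h>

/* Ask snprintf how long the output of "%2147483648d" would be.  The
   width is INT_MAX + 1 (assuming 32-bit int), so the length cannot be
   represented in the int return value and the call is expected to
   fail with a negative result.  The format is kept in a variable so
   that the compiler is less likely to diagnose it at compile time.  */
static int
try_huge_width (void)
{
  const char *fmt = "%2147483648d";
  return snprintf (NULL, 0, fmt, 0);
}
```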

>> Perhaps it would be simpler if the new *l functions use ptrdiff_t everywhere
>> that the old functions use 'int' for sizes and widths. Then we wouldn't have to
>> worry about '**' vs '*', or about '%ln' versus '%n'. The Gnulib layer could
>> resolve whether the functions are about int or ptrdiff_t.
> 
> But then the valid format strings for the *l functions would not be
> a superset of the valid format strings for the existing *printf functions.

Why a superset? Shouldn't the sets of format strings be the same, so that 
programmers can easily switch back and forth between the two sets of functions? 
For example, if you have code that generates a format string, it would be nicer 
if you could use that same format string regardless of whether you pass it to 
printf or to lprintf.
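
Something along these lines is what I am picturing. Since lprintf does
not exist, my_lsnprintf below is a hypothetical stand-in that merely
wraps vsnprintf (so it does not lift the INT_MAX limit); make_format
and same_result are likewise made-up names:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a new-style function: it accepts the same
   format strings as snprintf but returns a ptrdiff_t.  */
static ptrdiff_t
my_lsnprintf (char *buf, size_t bufsize, const char *fmt, ...)
{
  va_list ap;
  va_start (ap, fmt);
  int ret = vsnprintf (buf, bufsize, fmt, ap);
  va_end (ap);
  return ret;
}

/* Generate a format such as "%08d" at run time.  */
static void
make_format (char *fmt, size_t fmtsize, int width)
{
  snprintf (fmt, fmtsize, "%%0%dd", width);
}

/* Pass one generated format to both functions.  Return 1 if the
   lengths and the resulting strings agree, 0 otherwise.  */
static int
same_result (int width, int value)
{
  char fmt[32], a[64], b[64];
  make_format (fmt, sizeof fmt, width);
  int r1 = snprintf (a, sizeof a, fmt, value);
  ptrdiff_t r2 = my_lsnprintf (b, sizeof b, fmt, value);
  return r1 == r2 && strcmp (a, b) == 0;
}
```

The generator has no need to know which function will consume its
output, which is the property I would like to keep.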

> One of the goals is that programmers can use the new facility just by
> importing the respective gnulib modules and doing
>    #define _PRINTF_LARGE 1
> without reviewing every format string.

Yes, and that goal is furthered by having the two sets of functions accept the 
same format strings.

> Regarding the naming: I'm now tending towards 'lprintf' and 'flprintf',
> to make it look like 'wprintf' and 'fwprintf'.

Yes, that sounds better than the first proposal.