From: Bruno Haible <bruno@clisp.org>
To: bug-gnulib@gnu.org
Subject: supporting strings > 2 GB
Date: Sat, 12 Oct 2019 16:38:49 +0200
Message-ID: <15256545.f1uGFDiRv1@omega>
Hi Paul, Eric,
I'd like to lift the INT_MAX limit on string size for
* the *printf family of functions,
* the wcswidth, mbswidth functions,
as has already been done for large files and regular expressions.
The benefits I expect from that are:
- Support of strings > 2 GB or 4 GB without making applications more complex.
- Since such strings occur rarely, these corner cases of the code are most
often untested. The change would eliminate these untested corners, thus
eliminating a number of bugs.
How was it done for regular expressions?
1) POSIX introduced a type 'regoff_t' that is to be used instead of 'int',
in the context of the regex APIs.
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/regex.h.html
2) glibc introduced a preprocessor define _REGEX_LARGE_OFFSETS.
3) gnulib defines _REGEX_LARGE_OFFSETS to 1.
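To illustrate what this looks like on the caller's side, here is a minimal sketch in which match offsets are kept in 'regoff_t' rather than 'int'. The helper name first_match_length is hypothetical, and whether 'regoff_t' is actually wider than 'int' depends on the platform and on how <regex.h> was configured.

```c
#include <regex.h>
#include <stddef.h>

/* Return the length of the first match of PATTERN in TEXT, or -1 if
   there is no match or PATTERN does not compile.
   (Hypothetical helper, for illustration only.)  */
static ptrdiff_t
first_match_length (const char *pattern, const char *text)
{
  regex_t re;
  regmatch_t match;
  ptrdiff_t length = -1;

  if (regcomp (&re, pattern, REG_EXTENDED) != 0)
    return -1;
  if (regexec (&re, text, 1, &match, 0) == 0)
    {
      /* Offsets are regoff_t, so no truncation to 'int' happens here.  */
      regoff_t start = match.rm_so;
      regoff_t end = match.rm_eo;
      length = end - start;
    }
  regfree (&re);
  return length;
}
```

Code written this way needs no changes when 'regoff_t' becomes a wider type.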
In a similar vein, I think it could be done like this for *printf:
1) Introduce a type 'printf_len_t' that is a signed type, either 'int' or
'ptrdiff_t'. And a constant PRINTF_LEN_MAX accordingly.
2) For each *printf function that returns 'int', define a similar function
*printfl, that returns 'printf_len_t'.
3) Introduce %ln as a printf_len_t alternative to %n.
4) If _PRINTF_LARGE is defined and non-zero, define xxxprintf as an alias
of xxxprintfl (e.g. '#define xxxprintf xxxprintfl').
5) Gnulib defines _PRINTF_LARGE to 1.
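Steps 1, 2, 4 and 5 could be sketched roughly as follows for snprintf. None of this exists today: printf_len_t, PRINTF_LEN_MAX, snprintfl and _PRINTF_LARGE are the proposed names, the choice of 'ptrdiff_t' is an assumption, and the function body is only a stand-in that forwards to vsnprintf, so it still inherits the INT_MAX limit.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Step 5: Gnulib (or the application) turns the feature on.  */
#ifndef _PRINTF_LARGE
# define _PRINTF_LARGE 1
#endif

/* Step 1: a signed length type and its maximum.
   ASSUMPTION: ptrdiff_t is chosen here; a platform could pick 'int'.  */
typedef ptrdiff_t printf_len_t;
#define PRINTF_LEN_MAX PTRDIFF_MAX

/* Step 2: an 'l' variant that returns printf_len_t.
   ASSUMPTION: this stand-in merely forwards to vsnprintf; a real
   implementation would have to format without the 'int' limit.  */
static printf_len_t
snprintfl (char *buf, size_t size, const char *format, ...)
{
  va_list args;
  va_start (args, format);
  int n = vsnprintf (buf, size, format, args);
  va_end (args);
  return n;
}

/* Step 4: existing call sites pick up the 'l' variant by name.  */
#if _PRINTF_LARGE
# undef snprintf
# define snprintf snprintfl
#endif
```

With this in place, a call such as snprintf (buf, sizeof buf, "%s", s) resolves to snprintfl, and its result should be stored in a 'printf_len_t' rather than an 'int'.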
And similarly for wcswidth, with new function wclswidth and macro
_WCSWIDTH_LARGE.
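For the width functions, a minimal sketch of what wclswidth might look like, assuming it returns 'ptrdiff_t' and otherwise keeps the wcswidth conventions (stop at the terminating null or after n wide characters, return -1 for a non-printable character). The return type and the implementation on top of wcwidth are assumptions of this sketch.

```c
#include <stddef.h>
#include <wchar.h>

/* wcwidth is an XSI function; declare it explicitly in case <wchar.h>
   was included without _XOPEN_SOURCE being defined.  */
int wcwidth (wchar_t wc);

/* Hypothetical wide-result variant of wcswidth: the number of screen
   columns needed for at most N wide characters of S, or -1 if S
   contains a non-printable wide character.  */
static ptrdiff_t
wclswidth (const wchar_t *s, size_t n)
{
  ptrdiff_t width = 0;
  for (size_t i = 0; i < n && s[i] != L'\0'; i++)
    {
      int w = wcwidth (s[i]);
      if (w < 0)
        return -1;
      width += w;   /* accumulated in ptrdiff_t, not in 'int' */
    }
  return width;
}
```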
This way, applications could switch from *printf to *printfl at their own
pace, without introducing uncaught overflow bugs at any moment.
Has this already been discussed in the Austin Group, or on the glibc list?
Bruno