* preferring ptrdiff_t to size_t for object counts
From: Paul Eggert @ 2017-06-05 6:45 UTC
To: Gnulib bugs
GNU Emacs has long been using signed types (typically ptrdiff_t) to count
objects. This has the advantage that signed integer overflow can be detected
automatically on some platforms (unfortunately, size_t arithmetic silently wraps
around). I would like to change the Gnulib modules that GNU Emacs uses, to use
this style. The main effect on these modules' non-Emacs users would be:
* They accept ptrdiff_t counts, not size_t counts. Normally sizes are computed
by new functions like xwgrowalloc. When the caller computes sizes by hand, it is
the caller's responsibility to check for integer overflow.
* They report errors via xwalloc_die, not xalloc_die.
I've also changed the modules that GNU grep uses, as a test that this idea works
on non-Emacs applications.
As this is a nontrivial change, I'll post the Gnulib patches first without
installing them, for discussion.
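To make the contrast concrete, here is a minimal sketch of a caller computing a size "by hand" with a signed count. The helper names checked_nbytes and wrapping_nbytes are hypothetical and are not part of Gnulib or of the proposed walloc modules; the sketch only illustrates that a ptrdiff_t product can be range-checked before it is computed, whereas the size_t product wraps without any diagnostic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a Gnulib function): compute N * SIZE as a
   ptrdiff_t byte count.  Return false if either argument is negative
   or if the product would exceed PTRDIFF_MAX; otherwise store the
   product in *RESULT and return true.  */
static bool
checked_nbytes (ptrdiff_t n, ptrdiff_t size, ptrdiff_t *result)
{
  assert (result != NULL);
  if (n < 0 || size < 0)
    return false;
  /* Divide instead of multiplying, so the check itself cannot overflow.  */
  if (size != 0 && n > PTRDIFF_MAX / size)
    return false;
  *result = n * size;
  return true;
}

/* For contrast: size_t multiplication is well defined but wraps
   silently modulo SIZE_MAX + 1.  */
static size_t
wrapping_nbytes (size_t n, size_t size)
{
  return n * size;
}
```

A caller would test the return value of checked_nbytes and report an allocation failure on false, rather than passing a wrapped value on to the allocator.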
* Re: preferring ptrdiff_t to size_t for object counts
From: Bruno Haible @ 2017-06-05 9:57 UTC
To: bug-gnulib; +Cc: Paul Eggert
Hi Paul,
> GNU Emacs has long been using signed types (typically ptrdiff_t) to count
> objects. This has the advantage that signed integer overflow can be detected
> automatically on some platforms (unfortunately, size_t arithmetic silently wraps
> around).
I have one objection, but a big one: The direct use of ptrdiff_t.
Reasons:
1) Like you, I spend time reviewing code other people have written. In these
code reviews, it is important to know whether a variable is known to always
be >= 0 or not.
For example, when we have
    int n = ...;
    for (int i = 0; i < n; i++) ...
I always have to spend brain cycles around the question "what if n < 0?
Does the code still achieve its goal in this case?"
Whereas if the type clearly states the intent to store only values >= 0,
there is no issue; no extra brain cycles required.
2) Standards change, and the considerations behind 'walloc' may also change.
Do you want, 5 or 10 years from now, to go through hundreds of uses of
'ptrdiff_t' and separate those uses with values >= 0 from those with values
that can be negative? I certainly don't want to.
3) GCC has range types for Ada. I would hope that someday it also has range
types for C or C++. Then, it would be very useful to express the fact that
the values are in the range [0..PTRDIFF_MAX], so that GCC can use it for
optimization.
4) For static analysis tools (gnulib now uses coverity in particular), I can
imagine that an unsigned type is easier to work with than a signed type
(i.e. that the tool can make more inferences and therefore detect more bugs
when using unsigned types).
To this end, it is useful to use an unsigned type for those counters /
size_t objects, *just* for the static analysis tool.
To fix all of these issues, I suggest using a typedef'ed type instead. For
example:
    typedef ptrdiff_t wsize_t;
And then use wsize_t everywhere.
This solves problems 1), 2), 3), and 4) (through a #ifdefed definition of
wsize_t).
Yes it means that people reading the code will have to memorize one more type
identifier. But it is to their benefit: they will know the values are >= 0
(see point 1).
Bruno
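The "#ifdefed definition" suggested above might look roughly like the following sketch. The macro name WSIZE_UNSIGNED_FOR_ANALYSIS and the function sum_ints are assumptions made up for illustration; the thread does not say how a static-analysis build would be detected, so a real setup would have to define such a macro itself.

```c
#include <assert.h>
#include <stddef.h>

/* WSIZE_UNSIGNED_FOR_ANALYSIS is a hypothetical macro that a
   static-analysis build would define; normal builds leave it unset
   and get the signed type.  */
#ifdef WSIZE_UNSIGNED_FOR_ANALYSIS
typedef size_t wsize_t;
#else
typedef ptrdiff_t wsize_t;
#endif

/* Sum the first N elements of A.  The wsize_t parameter tells the
   reader that N is a count and is therefore meant to be >= 0, even
   though the underlying type is signed in normal builds.  */
static long
sum_ints (const int *a, wsize_t n)
{
  assert (a != NULL || n == 0);
  long total = 0;
  for (wsize_t i = 0; i < n; i++)
    total += a[i];
  return total;
}
```

The typedef carries the intent (values >= 0) in the source, while the choice of underlying type stays in one place and can change later without touching every use.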
* Re: preferring ptrdiff_t to size_t for object counts
From: Bruno Haible @ 2017-06-05 10:07 UTC
To: bug-gnulib; +Cc: Paul Eggert
Hi Paul,
I'd like to understand how much better this "ptrdiff_t world" is.
> This has the advantage that signed integer overflow can be detected
> automatically on some platforms
You mean "-fsanitize=undefined", right?
Does this also catch the following situations?
a) Pointer subtraction. ISO C11 § J.2 says:
"The behavior is undefined in the following circumstances: ...
The result of subtracting two pointers is not representable in an object
of type ptrdiff_t (6.5.6)."
b) When assigning a 'size_t' value > PTRDIFF_MAX to a 'ptrdiff_t' variable,
is that undefined behaviour? Is that caught by "-fsanitize=undefined"?
Bruno
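On question b), C11 6.3.1.3 appears to make an out-of-range conversion to a signed integer type implementation-defined (or to raise an implementation-defined signal) rather than undefined, so a checker aimed at undefined behaviour need not report it. Code that wants a dependable answer can test the range explicitly before converting. A minimal sketch, where size_to_ptrdiff is a hypothetical helper name and the code assumes PTRDIFF_MAX <= SIZE_MAX, as on common platforms:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert the size_t value N to ptrdiff_t.
   Return false if N is greater than PTRDIFF_MAX; otherwise store the
   converted value in *RESULT and return true.
   Assumes PTRDIFF_MAX <= SIZE_MAX.  */
static bool
size_to_ptrdiff (size_t n, ptrdiff_t *result)
{
  assert (result != NULL);
  if (n > (size_t) PTRDIFF_MAX)
    return false;
  *result = (ptrdiff_t) n;   /* In range, so the value is preserved.  */
  return true;
}
```

With an explicit check like this, the outcome does not depend on what a particular sanitizer option happens to instrument.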
* Re: preferring ptrdiff_t to size_t for object counts
From: Bruno Haible @ 2017-06-07 21:53 UTC
To: bug-gnulib; +Cc: Paul Eggert
I wrote:
> typedef ptrdiff_t wsize_t;
'wsize_t' or 'wcount_t'. I don't really mind the name of the type - as
long as it's a typedef.
Bruno
* Re: preferring ptrdiff_t to size_t for object counts
From: Paul Eggert @ 2017-06-07 22:12 UTC
To: Bruno Haible, bug-gnulib
On 06/07/2017 02:53 PM, Bruno Haible wrote:
> I don't really mind the name of the type - as
> long as it's a typedef.
I've been leaning towards a name that doesn't start with 'w', since the
type is not specific to the walloc module family. The name I'm currently
thinking of is 'in_t', short for "index type". That's an
easy-to-remember name (the type is like 'int', but possibly wider).
One other advantage of having our own signed type is that we can
guarantee that it's at least as wide as int (something that is not true
for ptrdiff_t). That way, some of my current code that says 'MIN
(INT_MAX, PTRDIFF_MAX)' can be simplified to the more-natural INT_MAX.
This is helpful for traditional interfaces that use int counters.
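One way such a guarantee could be expressed is sketched below. The name in_t comes from this message, but the definition is only an illustration and not the actual Gnulib code; IN_T_MAX and count_for_int_api are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Pick the wider of int and ptrdiff_t, so that in_t is never narrower
   than int.  IN_T_MAX is a hypothetical companion macro.  */
#if PTRDIFF_MAX < INT_MAX
typedef int in_t;
# define IN_T_MAX INT_MAX
#else
typedef ptrdiff_t in_t;
# define IN_T_MAX PTRDIFF_MAX
#endif

static_assert (INT_MAX <= IN_T_MAX, "in_t must be at least as wide as int");

/* Clamp the non-negative count N for a traditional interface that
   takes an int.  Because INT_MAX <= IN_T_MAX is guaranteed, the limit
   is plain INT_MAX rather than MIN (INT_MAX, PTRDIFF_MAX).  */
static int
count_for_int_api (in_t n)
{
  return n < INT_MAX ? (int) n : INT_MAX;
}
```

On platforms where ptrdiff_t is already at least as wide as int, this makes in_t the same type as ptrdiff_t, so the extra guarantee costs nothing there.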
* Re: preferring ptrdiff_t to size_t for object counts
From: Bruno Haible @ 2017-06-08 0:36 UTC
To: Paul Eggert; +Cc: bug-gnulib
Hi Paul,
> The name I'm currently
> thinking of is 'in_t', short for "index type". That's an
> easy-to-remember name (the type is like 'int', but possibly wider).
Fine with me.
It doesn't collide: Only very few packages use this identifier 'in_t', and
only in isolated places.
> One other advantage of having our own signed type is that we can
> guarantee that it's at least as wide as int (something that is not true
> for ptrdiff_t). That way, some of my current code that says 'MIN
> (INT_MAX, PTRDIFF_MAX)' can be simplified to the more-natural INT_MAX.
> This is helpful for traditional interfaces that use int counters.
Indeed. (Although portability to Windows 3.1 is not in the focus of gnulib
nor of GNU programs any more.)
Bruno