* Re: [PATCH] regex: port to Gawk on nonstandard platforms
@ 2020-01-26 9:42 arnold
2020-01-27 21:09 ` Paul Eggert
0 siblings, 1 reply; 3+ messages in thread
From: arnold @ 2020-01-26 9:42 UTC (permalink / raw)
To: eggert; +Cc: bug-gnulib
Hi, Paul.
> diff --git a/lib/regex_internal.h b/lib/regex_internal.h
> index 13e15e21e..6d436fde1 100644
> --- a/lib/regex_internal.h
> +++ b/lib/regex_internal.h
> @@ -141,6 +141,9 @@
> #ifndef SSIZE_MAX
> # define SSIZE_MAX ((ssize_t) (SIZE_MAX / 2))
> #endif
> +#ifndef ULONG_WIDTH
> +# define ULONG_WIDTH (CHAR_BIT * sizeof (unsigned long int))
> +#endif
>
> /* The type of indexes into strings. This is signed, not size_t,
> since the API requires indexes to fit in regoff_t anyway, and using
This change is problematic. Further on in regex_internal.h we
have
#define BITSET_WORD_BITS ULONG_WIDTH
And then in places in regcomp.c BITSET_WORD_BITS is tested in
several #if/#elif statements.
Thus on systems that don't provide ULONG_WIDTH, we end up with
expressions in #if/#elif that want to use sizeof.
Needless to say, this fails spectacularly. :-(
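As a rough illustration of the failure mode (not code from gnulib): the
preprocessor evaluates #if expressions before types exist, and any
identifier that is not a macro, sizeof included, is replaced by 0, so a
sizeof-based width cannot be tested in #if. The DEMO_* names below are
hypothetical, and the sketch assumes unsigned long is 32 or 64 bits wide
with no padding bits.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for a sizeof-based width definition.  */
#define DEMO_WORD_BITS_SIZEOF (CHAR_BIT * sizeof (unsigned long int))

/* In ordinary C code the sizeof-based value works fine.  */
static int
demo_word_bits_runtime (void)
{
  return (int) DEMO_WORD_BITS_SIZEOF;
}

/* In a directive it does not.  A line such as
     #if DEMO_WORD_BITS_SIZEOF == 64
   is rejected, because sizeof, unsigned, long and int each become 0
   there and the resulting expression is malformed.  */

/* What #if can evaluate: integer constants such as ULONG_MAX.
   Assumption: unsigned long is either 32 or 64 bits wide.  */
#if ULONG_MAX == 0xffffffff
# define DEMO_WORD_BITS_PP 32
#elif (ULONG_MAX >> 31 >> 1) == 0xffffffff
# define DEMO_WORD_BITS_PP 64
#else
# error "unsupported unsigned long width in this sketch"
#endif
```

On hosts that meet the stated assumption, demo_word_bits_runtime ()
returns the same value as DEMO_WORD_BITS_PP, but only the latter can
appear in an #if/#elif test.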
Can you revert to the original code or to something else that
will compile on systems where ULONG_WIDTH is not defined?
Much thanks,
Arnold
* Re: [PATCH] regex: port to Gawk on nonstandard platforms
2020-01-26 9:42 [PATCH] regex: port to Gawk on nonstandard platforms arnold
@ 2020-01-27 21:09 ` Paul Eggert
2020-01-28 7:41 ` arnold
0 siblings, 1 reply; 3+ messages in thread
From: Paul Eggert @ 2020-01-27 21:09 UTC (permalink / raw)
To: arnold; +Cc: bug-gnulib
[-- Attachment #1: Type: text/plain, Size: 546 bytes --]
On 1/26/20 1:42 AM, arnold@skeeve.com wrote:
> And then in places in regcomp.c BITSET_WORD_BITS is tested in
> several #if/#elif statements.
Ouch, I hadn't noticed that. It's exercised only on non-GCC platforms
that don't support INT_WIDTH etc., which is why I didn't see it in my
testing. I installed the first attached patch, which should fix it.
Thanks for reporting it.
While I was at it I also installed the second attached patch, since the
regex code no longer depends on the limits-h module. This second patch
shouldn't affect Awk.
[-- Attachment #2: 0001-regex-port-to-non-GCC-pre-IEC-60559.patch --]
[-- Type: text/x-patch, Size: 2344 bytes --]
From cc27f179a6f3c17bda8c8bad5fa125864603bae2 Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 27 Jan 2020 13:00:57 -0800
Subject: [PATCH 1/2] regex: port to non-GCC pre-IEC-60559
Problem reported by Arnold Robbins in:
https://lists.gnu.org/r/bug-gnulib/2020-01/msg00154.html
* lib/regex_internal.h (ULONG_WIDTH): Make this usable in #if.
---
ChangeLog | 7 +++++++
lib/regex_internal.h | 17 ++++++++++++++++-
2 files changed, 23 insertions(+), 1 deletion(-)
diff --git a/ChangeLog b/ChangeLog
index a4ea8009b..d3d1942a1 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2020-01-27 Paul Eggert <eggert@cs.ucla.edu>
+
+ regex: port to non-GCC pre-IEC-60559
+ Problem reported by Arnold Robbins in:
+ https://lists.gnu.org/r/bug-gnulib/2020-01/msg00154.html
+ * lib/regex_internal.h (ULONG_WIDTH): Make this usable in #if.
+
2020-01-25 Bruno Haible <bruno@clisp.org>
c32isxdigit: Add tests.
diff --git a/lib/regex_internal.h b/lib/regex_internal.h
index 6d436fde1..8c42586c4 100644
--- a/lib/regex_internal.h
+++ b/lib/regex_internal.h
@@ -142,7 +142,22 @@
# define SSIZE_MAX ((ssize_t) (SIZE_MAX / 2))
#endif
#ifndef ULONG_WIDTH
-# define ULONG_WIDTH (CHAR_BIT * sizeof (unsigned long int))
+# define ULONG_WIDTH REGEX_UINTEGER_WIDTH (ULONG_MAX)
+/* The number of usable bits in an unsigned integer type with maximum
+ value MAX, as an int expression suitable in #if. Cover all known
+ practical hosts. This implementation exploits the fact that MAX is
+ 1 less than a power of 2, and merely counts the number of 1 bits in
+ MAX; "COBn" means "count the number of 1 bits in the low-order n bits". */
+# define REGEX_UINTEGER_WIDTH(max) REGEX_COB128 (max)
+# define REGEX_COB128(n) (REGEX_COB64 ((n) >> 31 >> 31 >> 2) + REGEX_COB64 (n))
+# define REGEX_COB64(n) (REGEX_COB32 ((n) >> 31 >> 1) + REGEX_COB32 (n))
+# define REGEX_COB32(n) (REGEX_COB16 ((n) >> 16) + REGEX_COB16 (n))
+# define REGEX_COB16(n) (REGEX_COB8 ((n) >> 8) + REGEX_COB8 (n))
+# define REGEX_COB8(n) (REGEX_COB4 ((n) >> 4) + REGEX_COB4 (n))
+# define REGEX_COB4(n) (!!((n) & 8) + !!((n) & 4) + !!((n) & 2) + ((n) & 1))
+# if ULONG_MAX / 2 + 1 != 1ul << (ULONG_WIDTH - 1)
+# error "ULONG_MAX out of range"
+# endif
#endif
/* The type of indexes into strings. This is signed, not size_t,
--
2.24.1
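The bit-counting idea in the patch above can be tried in isolation with
a reduced sketch like the following. It is not the gnulib code: the
DEMO_* names are hypothetical, and the chain stops at 64 bits where the
patch goes up to 128. It relies on the same observation as the patch,
namely that ULONG_MAX is 1 less than a power of 2, so counting its 1
bits yields the width. The split shift (>> 31 >> 1) mirrors the patch;
presumably it keeps each individual shift count small, since shifting
by the full width of the type would be undefined.

```c
#include <assert.h>
#include <limits.h>

/* Count the 1 bits in the low-order 4, 8, ..., 64 bits of n.
   Each level adds the count for the upper half to the count for the
   lower half.  */
#define DEMO_COB4(n)  (!!((n) & 8) + !!((n) & 4) + !!((n) & 2) + !!((n) & 1))
#define DEMO_COB8(n)  (DEMO_COB4 ((n) >> 4) + DEMO_COB4 (n))
#define DEMO_COB16(n) (DEMO_COB8 ((n) >> 8) + DEMO_COB8 (n))
#define DEMO_COB32(n) (DEMO_COB16 ((n) >> 16) + DEMO_COB16 (n))
#define DEMO_COB64(n) (DEMO_COB32 ((n) >> 31 >> 1) + DEMO_COB32 (n))

/* Width of unsigned long, assuming it is at most 64 bits wide.  */
#define DEMO_ULONG_WIDTH DEMO_COB64 (ULONG_MAX)

/* Only integer constants and operators are involved, so the result
   can be tested by the preprocessor.  */
#if DEMO_ULONG_WIDTH != 32 && DEMO_ULONG_WIDTH != 64
# error "unexpected unsigned long width in this sketch"
#endif

/* The same macro is also an ordinary constant expression.  */
static int
demo_ulong_width (void)
{
  return DEMO_ULONG_WIDTH;
}
```

The macro expands to a fairly large expression, but it is evaluated
entirely at preprocessing or compile time.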
[-- Attachment #3: 0002-regex-remove-limits-h-dependency.patch --]
[-- Type: text/x-patch, Size: 1478 bytes --]
From 55cb9de6ff5f6d382da4efe6c47a0fad5b00c4cf Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Mon, 27 Jan 2020 13:07:22 -0800
Subject: [PATCH 2/2] regex: remove limits-h dependency
* modules/regex (Depends-on): Remove limits-h, since the
code no longer depends on ULONG_WIDTH already being defined.
---
ChangeLog | 4 ++++
modules/regex | 1 -
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/ChangeLog b/ChangeLog
index d3d1942a1..a861f4996 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,9 @@
2020-01-27 Paul Eggert <eggert@cs.ucla.edu>
+ regex: remove limits-h dependency
+ * modules/regex (Depends-on): Remove limits-h, since the
+ code no longer depends on ULONG_WIDTH already being defined.
+
regex: port to non-GCC pre-IEC-60559
Problem reported by Arnold Robbins in:
https://lists.gnu.org/r/bug-gnulib/2020-01/msg00154.html
diff --git a/modules/regex b/modules/regex
index bd38cd2d4..9d77df7ae 100644
--- a/modules/regex
+++ b/modules/regex
@@ -22,7 +22,6 @@ builtin-expect [test $ac_use_included_regex = yes]
intprops [test $ac_use_included_regex = yes]
langinfo [test $ac_use_included_regex = yes]
libc-config [test $ac_use_included_regex = yes]
-limits-h [test $ac_use_included_regex = yes]
lock [test "$ac_cv_gnu_library_2_1:$ac_use_included_regex" = no:yes]
memcmp [test $ac_use_included_regex = yes]
memmove [test $ac_use_included_regex = yes]
--
2.24.1
* Re: [PATCH] regex: port to Gawk on nonstandard platforms
2020-01-27 21:09 ` Paul Eggert
@ 2020-01-28 7:41 ` arnold
0 siblings, 0 replies; 3+ messages in thread
From: arnold @ 2020-01-28 7:41 UTC (permalink / raw)
To: eggert, arnold; +Cc: bug-gnulib
Paul Eggert <eggert@cs.ucla.edu> wrote:
> On 1/26/20 1:42 AM, arnold@skeeve.com wrote:
> > And then in places in regcomp.c BITSET_WORD_BITS is tested in
> > several #if/#elif statements.
>
> Ouch, I hadn't noticed that. It's exercised only on non-GCC platforms
> that don't support INT_WIDTH etc., which is why I didn't see it in my
> testing. I installed the first attached patch, which should fix it.
> Thanks for reporting it.
>
> While I was at it I also installed the second attached patch, since the
> regex code no longer depends on the limits-h module. This second patch
> shouldn't affect Awk.
Much thanks for the fix. I have pulled it into gawk and we'll see
what my testers report.
Thanks,
Arnold