From: Bruno Haible
To: bug-gnulib@gnu.org
Subject: Re: vasnwprintf: Port to older platforms without swprintf
Date: Tue, 21 Mar 2023 17:52:43 +0100
Message-ID: <7410120.GJh79HuArf@nimes>
In-Reply-To: <15428020.lVVuGzaMjS@nimes>
References: <15428020.lVVuGzaMjS@nimes>
MIME-Version: 1.0
Content-Transfer-Encoding: 7Bit
Content-Type: text/plain; charset="us-ascii"
List-Id: Gnulib discussion list

Yesterday I did:

> (VASNPRINTF): In this case, implement %ls and %lc directly. Adjust a
> couple of #if conditions. For the conversion from TCHAR_T[] to
> DCHAR_T[], use mbsrtowcs.

Oops, this was buggy:
  - The %lc implementation needs to ignore the precision.
  - The conversion from TCHAR_T[] to DCHAR_T[] must not stop at NUL bytes.
    It needs to convert the entire output of snprintf.

The new unit tests for the %c and %lc directives trigger these bugs.
Fixed as follows.


2023-03-21  Bruno Haible

	vasnwprintf: Fix for older platforms without swprintf.
	* lib/vasnprintf.c (VASNPRINTF): In the %lc handling, ignore the
	precision. Convert the snprintf result to a wchar_t[] not by
	mbsrtowcs, but by a loop that does not stop at NUL characters.
	* tests/test-vasnwprintf-posix.c (test_function): Add more tests for
	the %c and %lc directives.
	* modules/vasnwprintf (Depends-on): Add mbrtowc. Remove mbsrtowcs.

diff --git a/lib/vasnprintf.c b/lib/vasnprintf.c
index 0b349f4b55..bd13002e98 100644
--- a/lib/vasnprintf.c
+++ b/lib/vasnprintf.c
@@ -2432,8 +2432,6 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                    Instead, just copy the argument wchar_t[] to the result.  */
                 int flags = dp->flags;
                 size_t width;
-                int has_precision;
-                size_t precision;
 
                 width = 0;
                 if (dp->width_start != dp->width_end)
@@ -2464,59 +2462,62 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                       }
                   }
 
-                has_precision = 0;
-                precision = 6;
-                if (dp->precision_start != dp->precision_end)
-                  {
-                    if (dp->precision_arg_index != ARG_NONE)
-                      {
-                        int arg;
-
-                        if (!(a.arg[dp->precision_arg_index].type == TYPE_INT))
-                          abort ();
-                        arg = a.arg[dp->precision_arg_index].a.a_int;
-                        /* "A negative precision is taken as if the precision
-                            were omitted."  */
-                        if (arg >= 0)
-                          {
-                            precision = arg;
-                            has_precision = 1;
-                          }
-                      }
-                    else
-                      {
-                        const FCHAR_T *digitp = dp->precision_start + 1;
-
-                        precision = 0;
-                        while (digitp != dp->precision_end)
-                          precision = xsum (xtimes (precision, 10), *digitp++ - '0');
-                        has_precision = 1;
-                      }
-                  }
-
                 {
-                  const wchar_t *arg;
+                  const wchar_t *ls_arg;
                   wchar_t lc_arg[1];
                   size_t characters;
 
                   if (dp->conversion == 's')
                     {
-                      arg = a.arg[dp->arg_index].a.a_wide_string;
+                      int has_precision;
+                      size_t precision;
+
+                      has_precision = 0;
+                      precision = 6;
+                      if (dp->precision_start != dp->precision_end)
+                        {
+                          if (dp->precision_arg_index != ARG_NONE)
+                            {
+                              int arg;
+
+                              if (!(a.arg[dp->precision_arg_index].type == TYPE_INT))
+                                abort ();
+                              arg = a.arg[dp->precision_arg_index].a.a_int;
+                              /* "A negative precision is taken as if the precision
+                                  were omitted."  */
+                              if (arg >= 0)
+                                {
+                                  precision = arg;
+                                  has_precision = 1;
+                                }
+                            }
+                          else
+                            {
+                              const FCHAR_T *digitp = dp->precision_start + 1;
+
+                              precision = 0;
+                              while (digitp != dp->precision_end)
+                                precision = xsum (xtimes (precision, 10), *digitp++ - '0');
+                              has_precision = 1;
+                            }
+                        }
+
+                      ls_arg = a.arg[dp->arg_index].a.a_wide_string;
 
                       if (has_precision)
                         {
                           /* Use only at most PRECISION wide characters, from
                              the left.  */
-                          const wchar_t *arg_end;
+                          const wchar_t *ls_arg_end;
 
-                          arg_end = arg;
+                          ls_arg_end = ls_arg;
                           characters = 0;
                           for (; precision > 0; precision--)
                             {
-                              if (*arg_end == 0)
+                              if (*ls_arg_end == 0)
                                 /* Found the terminating null wide character.  */
                                 break;
-                              arg_end++;
+                              ls_arg_end++;
                               characters++;
                             }
                         }
@@ -2524,17 +2525,14 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                         {
                           /* Use the entire string, and count the number of wide
                              characters.  */
-                          characters = local_wcslen (arg);
+                          characters = local_wcslen (ls_arg);
                         }
                     }
                   else /* dp->conversion == 'c' */
                     {
                       lc_arg[0] = (wchar_t) a.arg[dp->arg_index].a.a_wide_char;
-                      arg = lc_arg;
-                      if (has_precision && precision == 0)
-                        characters = 0;
-                      else
-                        characters = 1;
+                      ls_arg = lc_arg;
+                      characters = 1;
                     }
 
                   {
@@ -2550,7 +2548,7 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
 
                     if (characters > 0)
                       {
-                        DCHAR_CPY (result + length, arg, characters);
+                        DCHAR_CPY (result + length, ls_arg, characters);
                         length += characters;
                       }
 
@@ -5854,21 +5852,59 @@ VASNPRINTF (DCHAR_T *resultbuf, size_t *lengthp,
                     tmpsrc = tmp;
 # endif
 # if WIDE_CHAR_VERSION
-                    const TCHAR_T *tmpsrc2;
+                    /* Convert tmpsrc[0..count-1] to a freshly allocated
+                       wide character array.  */
                     mbstate_t state;
 
-                    tmpsrc2 = tmpsrc;
                     memset (&state, '\0', sizeof (mbstate_t));
-                    tmpdst_len = mbsrtowcs (NULL, &tmpsrc2, 0, &state);
-                    if (tmpdst_len == (size_t) -1)
-                      goto fail_with_errno;
+                    tmpdst_len = 0;
+                    {
+                      const TCHAR_T *src = tmpsrc;
+                      size_t srclen = count;
+
+                      for (; srclen > 0; tmpdst_len++)
+                        {
+                          /* Parse the next multibyte character.  */
+                          size_t ret = mbrtowc (NULL, src, srclen, &state);
+                          if (ret == (size_t)(-2) || ret == (size_t)(-1))
+                            goto fail_with_EILSEQ;
+                          if (ret == 0)
+                            ret = 1;
+                          src += ret;
+                          srclen -= ret;
+                        }
+                    }
+
                     tmpdst =
                       (wchar_t *) malloc ((tmpdst_len + 1) * sizeof (wchar_t));
                     if (tmpdst == NULL)
                       goto out_of_memory;
-                    tmpsrc2 = tmpsrc;
+
                     memset (&state, '\0', sizeof (mbstate_t));
-                    (void) mbsrtowcs (tmpdst, &tmpsrc2, tmpdst_len + 1, &state);
+                    {
+                      DCHAR_T *destptr = tmpdst;
+                      size_t len = tmpdst_len;
+                      const TCHAR_T *src = tmpsrc;
+                      size_t srclen = count;
+
+                      for (; srclen > 0; destptr++, len--)
+                        {
+                          /* Parse the next multibyte character.  */
+                          size_t ret = mbrtowc (destptr, src, srclen, &state);
+                          if (ret == (size_t)(-2) || ret == (size_t)(-1))
+                            /* Should already have been caught in the first
+                               loop, above.  */
+                            abort ();
+                          if (ret == 0)
+                            ret = 1;
+                          src += ret;
+                          srclen -= ret;
+                        }
+                      /* By the way tmpdst_len was computed, len should now
+                         be 0.  */
+                      if (len != 0)
+                        abort ();
+                    }
 # else
                     tmpdst =
                       DCHAR_CONV_FROM_ENCODING (locale_charset (),
diff --git a/modules/vasnwprintf b/modules/vasnwprintf
index 109c39f1c5..91c4ca64ed 100644
--- a/modules/vasnwprintf
+++ b/modules/vasnwprintf
@@ -35,7 +35,7 @@ errno
 memchr
 assert-h
 wchar
-mbsrtowcs
+mbrtowc
 wmemcpy
 wmemset
 
diff --git a/tests/test-vasnwprintf-posix.c b/tests/test-vasnwprintf-posix.c
index c609a104ff..e53c6a33f3 100644
--- a/tests/test-vasnwprintf-posix.c
+++ b/tests/test-vasnwprintf-posix.c
@@ -3940,6 +3940,26 @@ test_function (wchar_t * (*my_asnwprintf) (wchar_t *, size_t *, const wchar_t *,
     free (result);
   }
 
+  { /* Precision is ignored.  */
+    size_t length;
+    wchar_t *result =
+      my_asnwprintf (NULL, &length,
+                     L"%.0c %d", (unsigned char) 'x', 33, 44, 55);
+    ASSERT (wcscmp (result, L"x 33") == 0);
+    ASSERT (length == wcslen (result));
+    free (result);
+  }
+
+  { /* NUL character.  */
+    size_t length;
+    wchar_t *result =
+      my_asnwprintf (NULL, &length,
+                     L"a%cz %d", '\0', 33, 44, 55);
+    ASSERT (wmemcmp (result, L"a\0z 33\0", 6 + 1) == 0);
+    ASSERT (length == 6);
+    free (result);
+  }
+
 #if HAVE_WCHAR_T
   static wint_t L_x = (wchar_t) 'x';
 
@@ -3982,6 +4002,24 @@ test_function (wchar_t * (*my_asnwprintf) (wchar_t *, size_t *, const wchar_t *,
     ASSERT (length == wcslen (result));
     free (result);
   }
+
+  { /* Precision is ignored.  */
+    size_t length;
+    wchar_t *result =
+      my_asnwprintf (NULL, &length, L"%.0lc %d", L_x, 33, 44, 55);
+    ASSERT (wcscmp (result, L"x 33") == 0);
+    ASSERT (length == wcslen (result));
+    free (result);
+  }
+
+  { /* NUL character.  */
+    size_t length;
+    wchar_t *result =
+      my_asnwprintf (NULL, &length, L"a%lcz %d", (wint_t) L'\0', 33, 44, 55);
+    ASSERT (wmemcmp (result, L"a\0z 33\0", 6 + 1) == 0);
+    ASSERT (length == 6);
+    free (result);
+  }
 #endif
 
   /* Test the support of the 'b' conversion specifier for binary output of