bug-gnulib@gnu.org mirror (unofficial)
 help / color / mirror / Atom feed
From: Bruno Haible <bruno@clisp.org>
To: bug-gnulib@gnu.org, Corinna Vinschen <vinschen@redhat.com>
Subject: Re: [PATCH] Do not decorate symbols as dllexport on Cygwin
Date: Mon, 06 Feb 2023 17:36:36 +0100	[thread overview]
Message-ID: <10831993.icLtTst4D9@nimes> (raw)
In-Reply-To: <Y+EX8FEm68UrzaDU@calimero.vinschen.de>

Hi Corinna,

Sorry, I wanted to reply sooner.

> > May I ask what's the idea to provide a thread-safe setlocale?  It was
> > never defined as thread-safe and POSIX explicitely mentions that.  Any
> > application expecting to call setlocale thread-safe is broken by design.

The 'recode' package includes a shared library 'librecode', and shared
libraries are supposed to be multithread-safe.

Multithread-safe code is supposed to be able to obey the current locale,
that is,
  - if uselocale(NULL) != LC_GLOBAL_LOCALE, uselocale()
  - if uselocale(NULL) == LC_GLOBAL_LOCALE, the global locale, that was
    last set through setlocale().

For example:
  - sprintf() needs to know the LC_CTYPE and LC_NUMERIC categories of the
    current locale, in order to transform wide strings via %ls or to
    format numbers via %f or %g.
  - iconv() needs to know the LC_CTYPE category of the current locale,
    for determining the appropriate transliterations.
  - strftime() needs to know the LC_TIME category of the current locale.
  - gettext() needs to know the LC_MESSAGES category of the current locale.
All these functions are supposed to be writable in application code
(if, for whatever reason, the sprintf, iconv, strftime, gettext functions
in libc are not considered adequate).

When POSIX specifies that applications cannot call setlocale() when there
are multiple threads, this implies that in order to *inspect* the locale
they need to do so before spawning the threads, and cache the value in a
global variable, for later use. But this is not the application architecture
that is in use when there are shared libraries. Shared libraries commonly
don't have an initialization hook by which the application informs the
library about the current locale for the rest of the execution of the
process.

So, conventional wisdom is to use setlocale(category, NULL). But this
faces MT-safety problems. On some platforms the problem is that the
setlocale(_, NULL) calls themselves will trash each other's data structures
and thus provoke a crash. On other platforms the problem is that the return
value produced in one thread is being clobbered by the second thread, and
thus the first thread has no chance to copy the return value to a safe
area in due time.

For Cygwin, the problem until yesterday was the second one, for
category == LC_ALL only.

> > It should use the newlocale/duplocale/uselocale/freelocale API instead,
> > isn't it?

Even code that pays attention to the per-thread locale has to be prepared
for the case that uselocale(NULL) returns LC_GLOBAL_LOCALE. In this case,
the caller needs to fetch the values from the global locale. There is no
locale_t object that could be queried in this case.

> Ahhh, I finally see what's going on.  The problem is not thread-safety
> as such, but thread-safety when reading the value of the LC_ALL category.

Yup, exactly.

> Glibc's setlocale isn't entirely thread-safe either, but there's a
> difference:
> 
> - GLibc creates the global strings returned by setlocale(LC_xxx, NULL)
>   at the time the locale data is changed.  All setlocale(LC_xxx, NULL)
>   calls only return pointer to strings created earlier.

Yes. Thus each thread can be sure to be able to inspect the return value.
(Assuming that there is no call to setlocale that *changes* the global
locale. This is forbidden by POSIX, and reasonable applications don't
do this.)

> - Cygwin or, better, newlib, also return a pointer to global strings.
>   However, while the global strings for the specific categories are
>   created when the locale is changed, the string returned for LC_ALL
>   gets created on the fly when setlocale(LC_ALL, NULL) is called.
... and gets overwritten by another thread, since the buffer is not
per-thread.
>   That's why test-setlocale_null-mt-all fails almost immediately.
> 
> I created a patch to newlib's setlocale to tweak the LC_ALL string
> each time the locale is changed, while setlocale(LC_ALL, NULL)
> just returns the already prepared string.
> 
> https://cygwin.com/git/?p=newlib-cygwin.git;a=commitdiff;h=23e49b18ce39
> 
> This patch will be in the next Cygwin release 3.4.6.

Thanks a lot! I'm adjusting the Gnulib code now, accordingly.

Bruno





  reply	other threads:[~2023-02-06 16:37 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-05 19:43 [PATCH] Do not decorate symbols as dllexport on Cygwin Corinna Vinschen
2023-02-05 20:41 ` Bruno Haible
2023-02-05 20:48   ` Bruno Haible
2023-02-05 21:22   ` Corinna Vinschen
2023-02-06 15:08     ` Corinna Vinschen
2023-02-06 16:36       ` Bruno Haible [this message]
2023-02-06 17:37       ` Bruno Haible
2023-02-06 20:04         ` Corinna Vinschen
2023-02-06 20:38           ` Bruno Haible
2023-02-06 20:43             ` Corinna Vinschen
2023-02-06 20:50             ` Reuben Thomas
2023-02-07 13:22               ` Corinna Vinschen
2023-02-07 15:22                 ` Corinna Vinschen
2023-02-10 14:11                   ` Reuben Thomas
2023-02-10 14:21                     ` Bruno Haible
2023-02-11 11:36                       ` Reuben Thomas
2023-02-11 12:06                         ` Corinna Vinschen
2023-02-11 12:29                           ` Reuben Thomas
2023-02-11 12:40                             ` Corinna Vinschen
2023-02-17  9:30                               ` Reuben Thomas
2023-02-17  9:58                                 ` Corinna Vinschen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://lists.gnu.org/mailman/listinfo/bug-gnulib

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=10831993.icLtTst4D9@nimes \
    --to=bruno@clisp.org \
    --cc=bug-gnulib@gnu.org \
    --cc=vinschen@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).