* glibc 2.26 deadlock with __resolv_conf_detach
From: Douglas Jacobsen @ 2019-09-19 7:55 UTC (permalink / raw)
To: libc-alpha
Hello,
We've found a possible deadlock in glibc 2.26 as shipped with
SLES 15. The scenario is that some vendor software runs a
multithreaded daemon, which fork()s; the child then spawns a thread,
waits for that thread to terminate (pthread_join()), and finally
takes action such as calling execve().
The specific stack trace we see in the deadlock is:
Thread 2 (Thread 0x2aaacca09700 (LWP 140473)):
#0 0x00002aaaac483b9c in __lll_lock_wait_private () from /lib64/libc.so.6
#1 0x00002aaaac498148 in get_locked_global () from /lib64/libc.so.6
#2 0x00002aaaac499139 in __resolv_conf_detach () from /lib64/libc.so.6
#3 0x00002aaaac4e4aba in res_thread_freeres () from /lib64/libc.so.6
#4 0x00002aaaac4e4a62 in __libc_thread_freeres () from /lib64/libc.so.6
#5 0x00002aaaac16758e in start_thread () from /lib64/libpthread.so.0
#6 0x00002aaaac476a2f in clone () from /lib64/libc.so.6
The other thread is just waiting to join this one. I don't
understand why the static "lock" variable in resolv_conf.c is
deadlocking here, unless there is some connection to that earlier
fork(), or some other mechanism I'm not aware of (which is
very possible).
In any case, it looks like sometime after glibc 2.26 was released, a
further update (124e0258) was made to this code to explicitly order
how these freeres functions are called. Was this done to address
this kind of deadlock scenario, or for a different reason?
commit 124e025864bb39732c71fc60c1443d5680881a0a
Author: Florian Weimer <fweimer@redhat.com>
Date: Tue Jun 26 15:13:54 2018 +0200
Run thread shutdown functions in an explicit order
This removes the __libc_thread_subfreeres hook in favor of explict
calls.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Thanks,
Doug
* Re: glibc 2.26 deadlock with __resolv_conf_detach
From: Andreas Schwab @ 2019-09-19 8:34 UTC (permalink / raw)
To: Douglas Jacobsen; +Cc: libc-alpha
On Sep 19 2019, Douglas Jacobsen <dmjacobsen@lbl.gov> wrote:
> The scenario we've uncovered is that some vendor software runs a
> multithreaded daemon, which then fork()s, and spawns a thread,
In the forked child? If a multi-threaded process calls fork, the child
may only call async-signal-safe functions.
Andreas.
--
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
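As a rough illustration of the async-signal-safe constraint mentioned above, one commonly suggested alternative to fork() followed by arbitrary work in the child is to let the C library create the new process with posix_spawn(), so that no application code (and no new thread) runs in a forked copy of a multithreaded process. The sketch below is not taken from the vendor software or from glibc; the helper name spawn_and_wait and the use of /bin/sh are assumptions made only for this example.

```c
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run `/bin/sh -c command` without executing any
 * application code in a forked child.  Returns the command's exit
 * status, or -1 if spawning or waiting failed. */
int spawn_and_wait(const char *command)
{
    pid_t pid;
    int status;
    char *const argv[] = { "sh", "-c", (char *)command, NULL };

    /* posix_spawn() creates the process and execs the program in one
     * library call; the caller never runs code between fork and exec. */
    if (posix_spawn(&pid, "/bin/sh", NULL, NULL, argv, environ) != 0)
        return -1;

    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status);
}
```

Whatever the child-side thread did in the original design would have to move either into the parent before spawning or into the program being executed; this sketch only shows the process-creation step.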
* Re: glibc 2.26 deadlock with __resolv_conf_detach
From: Douglas Jacobsen @ 2019-09-19 12:40 UTC (permalink / raw)
To: Andreas Schwab; +Cc: libc-alpha
On Thu, Sep 19, 2019 at 1:34 AM Andreas Schwab <schwab@suse.de> wrote:
>
> On Sep 19 2019, Douglas Jacobsen <dmjacobsen@lbl.gov> wrote:
>
> > The scenario we've uncovered is that some vendor software runs a
> > multithreaded daemon, which then fork()s, and spawns a thread,
>
> In the forked child? If a multi-threaded process calls fork, the child
> may only call async-signal-safe functions.
>
Yes, the forked child creates and joins a thread prior to
execve(), and then the child deadlocks. This code did not cause
problems on SLES12sp3, so it appears to be a new problem, but I
do understand your point about async-signal-safety here. Would
the mechanism of failure here be that the lock variable in
glibc/resolv/resolv_conf.c is already locked in the parent process at
the time of fork() - or whenever the lookup is initially done from the
parent memory?
Thanks,
Doug
* Re: glibc 2.26 deadlock with __resolv_conf_detach
From: Andreas Schwab @ 2019-09-19 13:02 UTC (permalink / raw)
To: Douglas Jacobsen; +Cc: libc-alpha
On Sep 19 2019, Douglas Jacobsen <dmjacobsen@lbl.gov> wrote:
> Would the mechanism of failure here be that the lock variable in
> glibc/resolv/resolv_conf.c is already locked in the parent process at
> the time of fork()
Yes, that is the most likely cause. Since fork only duplicates the
calling thread, there will be nothing left to release the lock.
Andreas.
--
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
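The mechanism described in the reply above can be sketched in a few lines of C. This is only an illustration under stated assumptions: demo_lock is a hypothetical stand-in for the static lock in resolv/resolv_conf.c, the helper names are invented for the example, and the child uses pthread_mutex_trylock() (which is not itself on the async-signal-safe list) purely so that the demonstration reports the state of the lock instead of hanging the way pthread_mutex_lock() would.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the static lock in resolv/resolv_conf.c. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t lock_taken;    /* posted once the helper thread owns demo_lock */
static sem_t release_lock;  /* posted when the helper thread may unlock */

static void *holder(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_lock);
    sem_post(&lock_taken);
    sem_wait(&release_lock);
    pthread_mutex_unlock(&demo_lock);
    return NULL;
}

/* Returns 1 if the forked child found demo_lock already held,
 * 0 if it could take the lock, and -1 on error. */
int lock_held_in_child(void)
{
    pthread_t t;
    int status = 0;
    int result = -1;

    if (sem_init(&lock_taken, 0, 0) != 0 || sem_init(&release_lock, 0, 0) != 0)
        return -1;
    if (pthread_create(&t, NULL, holder, NULL) != 0)
        return -1;
    sem_wait(&lock_taken);   /* another thread now owns demo_lock */

    pid_t pid = fork();      /* only the calling thread exists in the child */
    if (pid == 0) {
        /* The owner thread was not copied, so nobody in the child can
         * ever unlock demo_lock; a blocking lock here would never return. */
        int rc = pthread_mutex_trylock(&demo_lock);
        _exit(rc == EBUSY ? 1 : 0);
    }
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = WEXITSTATUS(status);

    sem_post(&release_lock); /* let the parent's helper thread finish */
    pthread_join(t, NULL);
    sem_destroy(&lock_taken);
    sem_destroy(&release_lock);
    return result;
}
```

Calling lock_held_in_child() from a small main() and printing the result shows whether the child inherited the lock in its locked state. In the reported trace, the blocking equivalent of this situation would be get_locked_global() waiting in __lll_lock_wait_private().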
* Re: glibc 2.26 deadlock with __resolv_conf_detach
From: Douglas Jacobsen @ 2019-09-19 14:27 UTC (permalink / raw)
To: Andreas Schwab; +Cc: libc-alpha
Thanks, Andreas. I appreciate your help and responses. If we have
further reason to think this might be a glibc bug, I'll update here
again.
-Doug
On Thu, Sep 19, 2019 at 6:02 AM Andreas Schwab <schwab@suse.de> wrote:
>
> On Sep 19 2019, Douglas Jacobsen <dmjacobsen@lbl.gov> wrote:
>
> > Would the mechanism of failure here be that the lock variable in
> > glibc/resolv/resolv_conf.c is already locked in the parent process at
> > the time of fork()
>
> Yes, that is the most likely cause. Since fork only duplicates the
> calling thread, there will be nothing left to release the lock.
>
> Andreas.
>
> --
> Andreas Schwab, SUSE Labs, schwab@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."