unofficial mirror of libc-alpha@sourceware.org
* glibc 2.26 deadlock with __resolv_conf_detach
@ 2019-09-19  7:55 Douglas Jacobsen
  2019-09-19  8:34 ` Andreas Schwab
  0 siblings, 1 reply; 5+ messages in thread
From: Douglas Jacobsen @ 2019-09-19  7:55 UTC (permalink / raw)
  To: libc-alpha

Hello,

We've found a possible deadlock in glibc 2.26 that is shipped with
SLES 15.  The scenario we've uncovered is that some vendor software
runs a multithreaded daemon, which then fork()s, and spawns a thread,
then waits for that thread to terminate (pthread_join()), and finally
takes action like execve()ing.
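
For illustration, the sequence described above can be sketched in C roughly as follows. This is a reconstruction from the description only, not the vendor's code: the names fork_thread_join_exec and child_worker, the empty thread body, and the /bin/sh command are all hypothetical, and the calling process is assumed to already have other threads running.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical thread body; the real software presumably does work here. */
static void *child_worker(void *arg)
{
    (void)arg;
    return NULL;
}

/* Fork, then in the child: create a thread, join it, and exec.
   Returns the child's exit status, or -1 on error.  */
int fork_thread_join_exec(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        pthread_t tid;

        /* Thread exit runs libc's per-thread cleanup, which is where the
           reported stack trace is blocked.  */
        if (pthread_create(&tid, NULL, child_worker, NULL) != 0)
            _exit(126);
        pthread_join(tid, NULL);

        char *const argv[] = { "sh", "-c", "exit 0", NULL };
        char *const envp[] = { NULL };
        execve("/bin/sh", argv, envp);
        _exit(127);              /* only reached if execve failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

When no libc-internal lock happens to be held at the moment of fork(), this sketch completes normally, which would be consistent with a problem that only shows up intermittently.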

The specific stack trace we see in the deadlock is:

Thread 2 (Thread 0x2aaacca09700 (LWP 140473)):
#0  0x00002aaaac483b9c in __lll_lock_wait_private () from /lib64/libc.so.6
#1  0x00002aaaac498148 in get_locked_global () from /lib64/libc.so.6
#2  0x00002aaaac499139 in __resolv_conf_detach () from /lib64/libc.so.6
#3  0x00002aaaac4e4aba in res_thread_freeres () from /lib64/libc.so.6
#4  0x00002aaaac4e4a62 in __libc_thread_freeres () from /lib64/libc.so.6
#5  0x00002aaaac16758e in start_thread () from /lib64/libpthread.so.0
#6  0x00002aaaac476a2f in clone () from /lib64/libc.so.6

The other thread is just waiting on this one to join.  I'm fairly
unclear as to why the static "lock" variable in resolv_conf.c is
deadlocking here, unless there is some connection to that earlier
fork(), or there is some other mechanism I do not understand (which is
very possible).

In any case, it looks like sometime after glibc 2.26 was released, a
further update (124e0258) was made to this code to explicitly order
how these freeres functions were called.  Was this done to address
this kind of deadlock scenario or for a different reason?

commit 124e025864bb39732c71fc60c1443d5680881a0a
Author: Florian Weimer <fweimer@redhat.com>
Date:   Tue Jun 26 15:13:54 2018 +0200

    Run thread shutdown functions in an explicit order

    This removes the __libc_thread_subfreeres hook in favor of explicit
    calls.

    Reviewed-by: Carlos O'Donell <carlos@redhat.com>



Thanks,
Doug


* Re: glibc 2.26 deadlock with __resolv_conf_detach
  2019-09-19  7:55 glibc 2.26 deadlock with __resolv_conf_detach Douglas Jacobsen
@ 2019-09-19  8:34 ` Andreas Schwab
  2019-09-19 12:40   ` Douglas Jacobsen
  0 siblings, 1 reply; 5+ messages in thread
From: Andreas Schwab @ 2019-09-19  8:34 UTC (permalink / raw)
  To: Douglas Jacobsen; +Cc: libc-alpha

On Sep 19 2019, Douglas Jacobsen <dmjacobsen@lbl.gov> wrote:

> The scenario we've uncovered is that some vendor software runs a
> multithreaded daemon, which then fork()s, and spawns a thread,

In the forked child?  If a multi-threaded process calls fork, the child
may only call async-signal-safe functions.
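
A minimal sketch of a child that stays within that rule, assuming the work can be moved into the exec'd program: after fork() the child calls only execve() and _exit(), both of which POSIX lists as async-signal-safe. The helper name spawn_and_wait is hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork and immediately exec PATH in the child; wait for it in the parent.
   Returns the child's exit status, or -1 on error.  */
int spawn_and_wait(const char *path, char *const argv[], char *const envp[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: no malloc, no stdio, no thread creation, just exec.  */
        execve(path, argv, envp);
        _exit(127);              /* only reached if execve failed */
    }

    /* Parent: retry waitpid if a signal interrupts it.  */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller might pass "/bin/sh" with argv { "sh", "-c", "exit 7", NULL } and an empty environment, and would then get the shell's exit status back.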

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


* Re: glibc 2.26 deadlock with __resolv_conf_detach
  2019-09-19  8:34 ` Andreas Schwab
@ 2019-09-19 12:40   ` Douglas Jacobsen
  2019-09-19 13:02     ` Andreas Schwab
  0 siblings, 1 reply; 5+ messages in thread
From: Douglas Jacobsen @ 2019-09-19 12:40 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: libc-alpha

On Thu, Sep 19, 2019 at 1:34 AM Andreas Schwab <schwab@suse.de> wrote:
>
> On Sep 19 2019, Douglas Jacobsen <dmjacobsen@lbl.gov> wrote:
>
> > The scenario we've uncovered is that some vendor software runs a
> > multithreaded daemon, which then fork()s, and spawns a thread,
>
> In the forked child?  If a multi-threaded process calls fork, the child
> may only call async-signal-safe functions.
>

Yes, the forked child is generating and joining a thread prior to
execve(), and then the child deadlocks.  This code was not generating
problems in SLES12sp3 so it has the appearance of a new problem, but I
do understand what you're saying about async-signal-safe here.  Would
the mechanism of failure here be that the lock variable in
glibc/resolv/resolv_conf.c is already locked in the parent process at
the time of fork() - or whenever the lookup is initially done from the
parent memory?

Thanks,
Doug


* Re: glibc 2.26 deadlock with __resolv_conf_detach
  2019-09-19 12:40   ` Douglas Jacobsen
@ 2019-09-19 13:02     ` Andreas Schwab
  2019-09-19 14:27       ` Douglas Jacobsen
  0 siblings, 1 reply; 5+ messages in thread
From: Andreas Schwab @ 2019-09-19 13:02 UTC (permalink / raw)
  To: Douglas Jacobsen; +Cc: libc-alpha

On Sep 19 2019, Douglas Jacobsen <dmjacobsen@lbl.gov> wrote:

> Would the mechanism of failure here be that the lock variable in
> glibc/resolv/resolv_conf.c is already locked in the parent process at
> the time of fork()

Yes, that is the most likely cause.  Since fork only duplicates the
calling thread, there will be nothing left to release the lock.
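
The effect can be reproduced in isolation with an ordinary mutex standing in for the static lock in resolv_conf.c. The sketch below is only an illustration under that assumption; demo_lock, holder, and child_sees_lock_held are hypothetical names. The child uses pthread_mutex_trylock() so that the demonstration reports the stale lock instead of hanging, although strictly speaking that call is not async-signal-safe either.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t lock_taken;      /* posted once holder() owns demo_lock */
static sem_t release_lock;    /* posted when holder() may unlock */

/* Second thread: takes the lock and keeps it across the fork.  */
static void *holder(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_lock);
    sem_post(&lock_taken);
    sem_wait(&release_lock);
    pthread_mutex_unlock(&demo_lock);
    return NULL;
}

/* Returns 1 if the forked child found demo_lock still locked,
   0 if it could take the lock, -1 on error.  */
int child_sees_lock_held(void)
{
    pthread_t tid;
    int status = 0;
    pid_t pid;

    sem_init(&lock_taken, 0, 0);
    sem_init(&release_lock, 0, 0);
    if (pthread_create(&tid, NULL, holder, NULL) != 0)
        return -1;
    sem_wait(&lock_taken);    /* the other thread now owns demo_lock */

    pid = fork();             /* only the calling thread is duplicated */
    if (pid == 0) {
        /* Child: the mutex memory was copied in its locked state, but the
           thread that would unlock it does not exist here.  */
        int rc = pthread_mutex_trylock(&demo_lock);
        _exit(rc == EBUSY ? 1 : 0);
    }

    if (pid > 0)
        waitpid(pid, &status, 0);

    /* Let the holder thread finish in the parent.  */
    sem_post(&release_lock);
    pthread_join(tid, NULL);

    if (pid < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With a default mutex on glibc, child_sees_lock_held() is expected to return 1. A blocking pthread_mutex_lock() at the same point in the child would wait forever.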

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


* Re: glibc 2.26 deadlock with __resolv_conf_detach
  2019-09-19 13:02     ` Andreas Schwab
@ 2019-09-19 14:27       ` Douglas Jacobsen
  0 siblings, 0 replies; 5+ messages in thread
From: Douglas Jacobsen @ 2019-09-19 14:27 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: libc-alpha

Thanks, Andreas.  I appreciate your help and responses.  If we have
further reason to think this might be a glibc bug, I'll update here
again.

-Doug

