* [PATCH] support: Kill process group for test failure
@ 2020-02-20 14:34 Adhemerval Zanella
2020-02-20 14:53 ` Carlos O'Donell
2020-02-20 15:04 ` Florian Weimer
0 siblings, 2 replies; 6+ messages in thread
From: Adhemerval Zanella @ 2020-02-20 14:34 UTC (permalink / raw)
To: libc-alpha
Some testcases that create multiple subprocesses might abort or exit
prior waiting for their children. In such case, support_test_main
does not try to kill the spawned test process group (as in the
test timeout case).
On example that we are observing in internal tests is when
malloc/tst-mallocfork2 fails to fork in the signal handling (due
either maximum number of process or other non expected failure).
This patch kill the process group in the case of failed execution,
similar on how it is done on timeout.
Checked on x86_64-linux-gnu.
---
support/support_test_main.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/support/support_test_main.c b/support/support_test_main.c
index e3f0bf15f2..ac9f710fb7 100644
--- a/support/support_test_main.c
+++ b/support/support_test_main.c
@@ -459,6 +459,9 @@ support_test_main (int argc, char **argv, const struct test_config *config)
/* Process terminated normaly without timeout etc. */
if (WIFEXITED (status))
{
+ /* It is expected that a successful test execution handles all its
+ children. */
+
if (config->expected_status == 0)
{
if (config->expected_signal == 0)
@@ -486,6 +489,10 @@ support_test_main (int argc, char **argv, const struct test_config *config)
/* Process was killed by timer or other signal. */
else
{
+ /* Kill the whole process group if test process aborts or exits prior
+ waiting for them. */
+ kill (-test_pid, SIGKILL);
+
if (config->expected_signal == 0)
{
printf ("Didn't expect signal from child: got `%s'\n",
--
2.17.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] support: Kill process group for test failure
2020-02-20 14:34 [PATCH] support: Kill process group for test failure Adhemerval Zanella
@ 2020-02-20 14:53 ` Carlos O'Donell
2020-02-20 15:14 ` Carlos O'Donell
2020-02-20 15:04 ` Florian Weimer
1 sibling, 1 reply; 6+ messages in thread
From: Carlos O'Donell @ 2020-02-20 14:53 UTC (permalink / raw)
To: Adhemerval Zanella, libc-alpha
On 2/20/20 9:34 AM, Adhemerval Zanella wrote:
> Some testcases that create multiple subprocesses might abort or exit
> prior waiting for their children. In such case, support_test_main
> does not try to kill the spawned test process group (as in the
> test timeout case).
>
> On example that we are observing in internal tests is when
> malloc/tst-mallocfork2 fails to fork in the signal handling (due
> either maximum number of process or other non expected failure).
>
> This patch kill the process group in the case of failed execution,
> similar on how it is done on timeout.
LGTM.
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
> Checked on x86_64-linux-gnu.
> ---
> support/support_test_main.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/support/support_test_main.c b/support/support_test_main.c
> index e3f0bf15f2..ac9f710fb7 100644
> --- a/support/support_test_main.c
> +++ b/support/support_test_main.c
> @@ -459,6 +459,9 @@ support_test_main (int argc, char **argv, const struct test_config *config)
> /* Process terminated normaly without timeout etc. */
> if (WIFEXITED (status))
> {
> + /* It is expected that a successful test execution handles all its
> + children. */
OK.
> +
> if (config->expected_status == 0)
> {
> if (config->expected_signal == 0)
> @@ -486,6 +489,10 @@ support_test_main (int argc, char **argv, const struct test_config *config)
> /* Process was killed by timer or other signal. */
> else
> {
> + /* Kill the whole process group if test process aborts or exits prior
> + waiting for them. */
> + kill (-test_pid, SIGKILL);
OK. Send negation of process group id to kill the whole process group. The
process group id is the same as the id of the process that created the group
so test_pid is the right value.
Notes:
- Should we be using test_pgid to make this clear?
- Should this cleanup be refactored a bit to avoid duplication from
signal_handler() and support_test_main() e.g. kill_process_group ()
which runs kill looks for errors prints diagnostic etc.
> +
> if (config->expected_signal == 0)
> {
> printf ("Didn't expect signal from child: got `%s'\n",
>
--
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] support: Kill process group for test failure
2020-02-20 14:34 [PATCH] support: Kill process group for test failure Adhemerval Zanella
2020-02-20 14:53 ` Carlos O'Donell
@ 2020-02-20 15:04 ` Florian Weimer
2020-02-20 15:12 ` Carlos O'Donell
1 sibling, 1 reply; 6+ messages in thread
From: Florian Weimer @ 2020-02-20 15:04 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> Some testcases that create multiple subprocesses might abort or exit
> prior waiting for their children. In such case, support_test_main
> does not try to kill the spawned test process group (as in the
> test timeout case).
Does this actually work? Is the process group preserved if a process is
reparented to init?
Thanks,
Florian
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] support: Kill process group for test failure
2020-02-20 15:04 ` Florian Weimer
@ 2020-02-20 15:12 ` Carlos O'Donell
2020-02-20 16:50 ` Adhemerval Zanella
0 siblings, 1 reply; 6+ messages in thread
From: Carlos O'Donell @ 2020-02-20 15:12 UTC (permalink / raw)
To: Florian Weimer; +Cc: Adhemerval Zanella, GNU C Library
On Thu, Feb 20, 2020 at 10:05 AM Florian Weimer <fweimer@redhat.com> wrote:
>
> * Adhemerval Zanella:
>
> > Some testcases that create multiple subprocesses might abort or exit
> > prior waiting for their children. In such case, support_test_main
> > does not try to kill the spawned test process group (as in the
> > test timeout case).
>
> Does this actually work? Is the process group preserved if a process is
> reparented to init?
No, you are right.
There is a race too which I didn't see.
Once you waitpid the pid and pgid might be free for reuse and we can't
guarantee this will work.
Cheers,
Carlos.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] support: Kill process group for test failure
2020-02-20 14:53 ` Carlos O'Donell
@ 2020-02-20 15:14 ` Carlos O'Donell
0 siblings, 0 replies; 6+ messages in thread
From: Carlos O'Donell @ 2020-02-20 15:14 UTC (permalink / raw)
To: Adhemerval Zanella, GNU C Library
On Thu, Feb 20, 2020 at 9:53 AM Carlos O'Donell <codonell@redhat.com> wrote:
>
> On 2/20/20 9:34 AM, Adhemerval Zanella wrote:
> > Some testcases that create multiple subprocesses might abort or exit
> > prior waiting for their children. In such case, support_test_main
> > does not try to kill the spawned test process group (as in the
> > test timeout case).
> >
> > On example that we are observing in internal tests is when
> > malloc/tst-mallocfork2 fails to fork in the signal handling (due
> > either maximum number of process or other non expected failure).
> >
> > This patch kill the process group in the case of failed execution,
> > similar on how it is done on timeout.
>
> LGTM.
>
> Reviewed-by: Carlos O'Donell <carlos@redhat.com>
I'm withdrawing my reviewed-by here, since there is a race.
Florian highlighted that the children are all going to be reparented
to init and that therefore we can't catch them anymore.
The only plausible solution here is to use the controlling terminal to
kill the orphan children.
> > Checked on x86_64-linux-gnu.
> > ---
> > support/support_test_main.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/support/support_test_main.c b/support/support_test_main.c
> > index e3f0bf15f2..ac9f710fb7 100644
> > --- a/support/support_test_main.c
> > +++ b/support/support_test_main.c
> > @@ -459,6 +459,9 @@ support_test_main (int argc, char **argv, const struct test_config *config)
> > /* Process terminated normaly without timeout etc. */
> > if (WIFEXITED (status))
> > {
> > + /* It is expected that a successful test execution handles all its
> > + children. */
>
> OK.
>
> > +
> > if (config->expected_status == 0)
> > {
> > if (config->expected_signal == 0)
> > @@ -486,6 +489,10 @@ support_test_main (int argc, char **argv, const struct test_config *config)
> > /* Process was killed by timer or other signal. */
> > else
> > {
> > + /* Kill the whole process group if test process aborts or exits prior
> > + waiting for them. */
> > + kill (-test_pid, SIGKILL);
>
> OK. Send negation of process group id to kill the whole process group. The
> process group id is the same as the id of the process that created the group
> so test_pid is the right value.
>
> Notes:
> - Should we be using test_pgid to make this clear?
> - Should this cleanup be refactored a bit to avoid duplication from
> signal_handler() and support_test_main() e.g. kill_process_group ()
> which runs kill looks for errors prints diagnostic etc.
>
> > +
> > if (config->expected_signal == 0)
> > {
> > printf ("Didn't expect signal from child: got `%s'\n",
> >
>
>
> --
> Cheers,
> Carlos.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] support: Kill process group for test failure
2020-02-20 15:12 ` Carlos O'Donell
@ 2020-02-20 16:50 ` Adhemerval Zanella
0 siblings, 0 replies; 6+ messages in thread
From: Adhemerval Zanella @ 2020-02-20 16:50 UTC (permalink / raw)
To: Carlos O'Donell, Florian Weimer; +Cc: GNU C Library
On 20/02/2020 12:12, Carlos O'Donell wrote:
> On Thu, Feb 20, 2020 at 10:05 AM Florian Weimer <fweimer@redhat.com> wrote:
>>
>> * Adhemerval Zanella:
>>
>>> Some testcases that create multiple subprocesses might abort or exit
>>> prior waiting for their children. In such case, support_test_main
>>> does not try to kill the spawned test process group (as in the
>>> test timeout case).
>>
>> Does this actually work? Is the process group preserved if a process is
>> reparented to init?
>
> No, you are right.
>
> There is a race too which I didn't see.
>
> Once you waitpid the pid and pgid might be free for reuse and we can't
> guarantee this will work.
>
I tested by explicit injecting faulty fork calls with:
diff --git a/malloc/tst-mallocfork2.c b/malloc/tst-mallocfork2.c
index 0602a94895..c14cf9ef41 100644
--- a/malloc/tst-mallocfork2.c
+++ b/malloc/tst-mallocfork2.c
@@ -65,10 +65,13 @@ static volatile sig_atomic_t progress_indicator = 1;
static void
sigusr1_handler (int signo)
{
+ static int count = 0;
sigusr1_received = 1;
/* Perform a fork with a trivial subprocess. */
pid_t pid = fork ();
+ if (++count == 100)
+ pid = -1;
if (pid == -1)
{
write_message ("error: fork\n");
And without killing the process groups I see:
azanella 18876 6236 18874 7869 0 13:41 pts/0 00:00:00 ./elf/ld-linux-x86-64.so.2 --library-path .:./math:./elf:./dlfcn:./nss:./nis:./rt:./resolv:./mathvec:./support:./crypt:./nptl malloc/tst-mallocfork2
azanella 18878 6236 18874 7869 0 13:41 pts/0 00:00:00 ./elf/ld-linux-x86-64.so.2 --library-path .:./math:./elf:./dlfcn:./nss:./nis:./rt:./resolv:./mathvec:./support:./crypt:./nptl malloc/tst-mallocfork2
azanella 18879 6236 18874 7869 0 13:41 pts/0 00:00:00 ./elf/ld-linux-x86-64.so.2 --library-path .:./math:./elf:./dlfcn:./nss:./nis:./rt:./resolv:./mathvec:./support:./crypt:./nptl malloc/tst-mallocfork2
azanella 18880 6236 18874 7869 0 13:41 pts/0 00:00:00 ./elf/ld-linux-x86-64.so.2 --library-path .:./math:./elf:./dlfcn:./nss:./nis:./rt:./resolv:./mathvec:./support:./crypt:./nptl malloc/tst-mallocfork2
azanella 18881 6236 18874 7869 0 13:41 pts/0 00:00:00 ./elf/ld-linux-x86-64.so.2 --library-path .:./math:./elf:./dlfcn:./nss:./nis:./rt:./resolv:./mathvec:./support:./crypt:./nptl malloc/tst-mallocfork2
So the process group is still preserved even when it is reparented
to init (6236 is systemd in my case). In any case, as Carlos pointed
out the issue is the possible race when the process fails, but it
has not spawn any process and thus its group id might be reused.
I would aiming to add a more generic solution, but it seems that
the testcase itself would need to handle it on the abort situations.
^ permalink raw reply related [flat|nested] 6+ messages in thread
end of thread, other threads:[~2020-02-20 16:51 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-20 14:34 [PATCH] support: Kill process group for test failure Adhemerval Zanella
2020-02-20 14:53 ` Carlos O'Donell
2020-02-20 15:14 ` Carlos O'Donell
2020-02-20 15:04 ` Florian Weimer
2020-02-20 15:12 ` Carlos O'Donell
2020-02-20 16:50 ` Adhemerval Zanella
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).