[LTP] [PATCH] Terminate leftover subprocesses when main test process crashes

Jan Stancek jstancek@redhat.com
Fri Feb 11 08:03:03 CET 2022


On Fri, Feb 11, 2022 at 7:48 AM Li Wang <liwang@redhat.com> wrote:
>
>
>
> On Fri, Feb 11, 2022 at 12:18 AM Martin Doucha <mdoucha@suse.cz> wrote:
>>
>> When the main test process crashes or gets killed e.g. by OOM killer,
>> the watchdog process currently does not clean up any remaining child
>> processes. Fix this by sending SIGKILL to the test process group when
>> the watchdog process gets notified that the main test process has exited
>> for any reason.
>
>
>>
>> --- a/lib/tst_test.c
>> +++ b/lib/tst_test.c
>> @@ -1399,6 +1399,13 @@ static void sigint_handler(int sig LTP_ATTRIBUTE_UNUSED)
>>         }
>>  }
>>
>> +static void sigchild_handler(int sig LTP_ATTRIBUTE_UNUSED)
>> +{
>> +       /* If the test process is dead, send SIGKILL to its children */
>> +       if (kill(test_pid, 0))
>> +               kill(-test_pid, SIGKILL);
>> +}
>> +
>>  unsigned int tst_timeout_remaining(void)
>>  {
>>         static struct timespec now;
>> @@ -1481,6 +1488,7 @@ static int fork_testrun(void)
>>                 tst_disable_oom_protection(0);
>>                 SAFE_SIGNAL(SIGALRM, SIG_DFL);
>>                 SAFE_SIGNAL(SIGUSR1, SIG_DFL);
>> +               SAFE_SIGNAL(SIGCHLD, SIG_DFL);
>>                 SAFE_SIGNAL(SIGINT, SIG_DFL);
>>                 SAFE_SETPGID(0, 0);
>>                 testrun();
>> @@ -1560,6 +1568,7 @@ void tst_run_tcases(int argc, char *argv[], struct tst_test *self)
>>
>>         SAFE_SIGNAL(SIGALRM, alarm_handler);
>>         SAFE_SIGNAL(SIGUSR1, heartbeat_handler);
>> +       SAFE_SIGNAL(SIGCHLD, sigchild_handler);
>
>
> Do we really need to set up this signal handler for SIGCHLD?

I had the same question.

>
> We already call 'SAFE_WAITPID(test_pid, &status, 0)' in the library
> process (lib_pid), which relies on SIGCHLD as well. This handler would
> also be called every time the test exits normally.
>
> Maybe it is better to just add a kill() call to clean up the remaining
> descendants if the main test process exits with an abnormal status.
>
> e.g.
>
> --- a/lib/tst_test.c
> +++ b/lib/tst_test.c
> @@ -1503,6 +1503,8 @@ static int fork_testrun(void)
>         if (WIFEXITED(status) && WEXITSTATUS(status))
>                 return WEXITSTATUS(status);
>
> +       kill(-test_pid, SIGKILL);

Could we skip the call if forks_child == 0?

> +
>         if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) {
>                 tst_res(TINFO, "If you are running on slow machine, "
>                                "try exporting LTP_TIMEOUT_MUL > 1");
>
> --
> Regards,
> Li Wang
>
> --
> Mailing list info: https://lists.linux.it/listinfo/ltp
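
To make the idea concrete, here is a minimal standalone sketch of the
"wait, then kill the process group" approach, including the forks_child
check. It is not LTP code: spawn_crashing_test() and reap_and_cleanup()
are hypothetical helpers, and the forks_child parameter only stands in
for the forks_child flag mentioned above. It assumes the test process
has made itself a process group leader, as SAFE_SETPGID(0, 0) does in
fork_testrun().

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a crashing test: becomes a process group
 * leader, forks one child that blocks forever, then dies from SIGKILL
 * (similar to being picked by the OOM killer).
 * Returns the PID of the "test" process, or -1 if fork() failed.
 */
static pid_t spawn_crashing_test(void)
{
	pid_t pid = fork();

	if (pid != 0)
		return pid;

	/* Same effect as SAFE_SETPGID(0, 0): new group, PGID == our PID */
	setpgid(0, 0);

	if (fork() == 0) {
		/* Leftover subprocess, inherits the process group */
		for (;;)
			pause();
	}

	raise(SIGKILL);
	_exit(1); /* not reached */
}

/*
 * Hypothetical helper: wait for the test process, then send SIGKILL to
 * whatever is left in its process group.
 * Returns 1 if leftovers were signalled, 0 if nothing had to be done
 * (forks_child == 0 or the group was already empty), -1 on error.
 */
static int reap_and_cleanup(pid_t test_pid, int forks_child, int *status)
{
	pid_t ret;

	do {
		ret = waitpid(test_pid, status, 0);
	} while (ret == -1 && errno == EINTR);

	if (ret != test_pid)
		return -1;

	/* A test that never forks cannot leave subprocesses behind */
	if (!forks_child)
		return 0;

	/* Negative PID targets every process in the group */
	if (kill(-test_pid, SIGKILL) == 0)
		return 1;

	/* ESRCH just means no process was left in the group */
	return errno == ESRCH ? 0 : -1;
}
```

A caller would pass the PID returned by spawn_crashing_test() to
reap_and_cleanup() with forks_child set to 1, and could then inspect the
status with WIFSIGNALED() and WTERMSIG() as fork_testrun() already does.
No SIGCHLD handler is involved, so nothing extra runs when the test
exits normally and forks_child is 0.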


