[LTP] [PATCH] Terminate leftover subprocesses when main test process crashes
Li Wang
liwang@redhat.com
Fri Feb 11 07:47:37 CET 2022
On Fri, Feb 11, 2022 at 12:18 AM Martin Doucha <mdoucha@suse.cz> wrote:
> When the main test process crashes or gets killed e.g. by OOM killer,
> the watchdog process currently does not clean up any remaining child
> processes. Fix this by sending SIGKILL to the test process group when
> the watchdog process gets notified that the main test process has exited
> for any reason.
>
> --- a/lib/tst_test.c
> +++ b/lib/tst_test.c
> @@ -1399,6 +1399,13 @@ static void sigint_handler(int sig LTP_ATTRIBUTE_UNUSED)
> }
> }
>
> +static void sigchild_handler(int sig LTP_ATTRIBUTE_UNUSED)
> +{
> + /* If the test process is dead, send SIGKILL to its children */
> + if (kill(test_pid, 0))
> + kill(-test_pid, SIGKILL);
> +}
> +
> unsigned int tst_timeout_remaining(void)
> {
> static struct timespec now;
> @@ -1481,6 +1488,7 @@ static int fork_testrun(void)
> tst_disable_oom_protection(0);
> SAFE_SIGNAL(SIGALRM, SIG_DFL);
> SAFE_SIGNAL(SIGUSR1, SIG_DFL);
> + SAFE_SIGNAL(SIGCHLD, SIG_DFL);
> SAFE_SIGNAL(SIGINT, SIG_DFL);
> SAFE_SETPGID(0, 0);
> testrun();
> @@ -1560,6 +1568,7 @@ void tst_run_tcases(int argc, char *argv[], struct tst_test *self)
>
> SAFE_SIGNAL(SIGALRM, alarm_handler);
> SAFE_SIGNAL(SIGUSR1, heartbeat_handler);
> + SAFE_SIGNAL(SIGCHLD, sigchild_handler);
>
Do we really need to set up this signal handler for SIGCHLD?
The library process (lib_pid) already calls 'SAFE_WAITPID(test_pid, &status, 0)',
which waits for the same child exit that SIGCHLD reports.
The handler would also be called every time the test exits normally.

Maybe it is better to just add a kill() call that cleans up the remaining
descendants if the main test process exits with an abnormal status,
e.g.:
--- a/lib/tst_test.c
+++ b/lib/tst_test.c
@@ -1503,6 +1503,8 @@ static int fork_testrun(void)
if (WIFEXITED(status) && WEXITSTATUS(status))
return WEXITSTATUS(status);
+ kill(-test_pid, SIGKILL);
+
if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) {
tst_res(TINFO, "If you are running on slow machine, "
"try exporting LTP_TIMEOUT_MUL > 1");
--
Regards,
Li Wang
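
The cleanup idea above can be tried outside LTP with a small standalone
sketch. This is not LTP code: demo_cleanup() is a hypothetical helper, and
the pipe is only there so the parent can observe that the leftover process
is really gone. It assumes the test process puts itself into its own
process group, as fork_testrun() does with SAFE_SETPGID(0, 0).

```c
#include <errno.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical standalone demo, not part of lib/tst_test.c.
 * Returns 0 if the leftover subprocess was killed, -1 otherwise.
 */
int demo_cleanup(void)
{
	int pipefd[2];
	int status;
	char c;
	ssize_t n;
	pid_t test_pid;

	if (pipe(pipefd))
		return -1;

	test_pid = fork();
	if (test_pid < 0)
		return -1;

	if (!test_pid) {
		/* "main test process": becomes leader of its own group */
		setpgid(0, 0);
		close(pipefd[0]);

		if (!fork()) {
			/* leftover subprocess: inherits the group and the
			 * write end of the pipe, then sleeps forever */
			for (;;)
				pause();
		}

		/* simulate the main test process being killed, e.g. by OOM */
		raise(SIGKILL);
		_exit(1);
	}

	close(pipefd[1]);

	if (waitpid(test_pid, &status, 0) != test_pid)
		return -1;

	/* test_pid is reaped, but its process group may still have members */
	if (kill(-test_pid, SIGKILL) && errno != ESRCH)
		return -1;

	/* read() returns 0 (EOF) only once every write end is closed,
	 * i.e. once the leftover subprocess is dead */
	n = read(pipefd[0], &c, 1);
	close(pipefd[0]);

	return n == 0 ? 0 : -1;
}
```

Calling demo_cleanup() from main() should return 0. Without the
kill(-test_pid, SIGKILL) line the read() would block, because the sleeping
subprocess keeps the write end of the pipe open. ESRCH is tolerated since
the group is simply empty when nothing was left behind.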