[LTP] [PATCH] Terminate leftover subprocesses when main test process crashes

Li Wang liwang@redhat.com
Fri Feb 11 12:01:39 CET 2022


On Fri, Feb 11, 2022 at 6:34 PM Li Wang <liwang@redhat.com> wrote:

>
>
> On Fri, Feb 11, 2022 at 5:17 PM Martin Doucha <mdoucha@suse.cz> wrote:
>
>> On 11. 02. 22 7:47, Li Wang wrote:
>> > On Fri, Feb 11, 2022 at 12:18 AM Martin Doucha <mdoucha@suse.cz> wrote:
>> >     @@ -1560,6 +1568,7 @@ void tst_run_tcases(int argc, char *argv[],
>> >     struct tst_test *self)
>> >
>> >             SAFE_SIGNAL(SIGALRM, alarm_handler);
>> >             SAFE_SIGNAL(SIGUSR1, heartbeat_handler);
>> >     +       SAFE_SIGNAL(SIGCHLD, sigchild_handler);
>> >
>> >
>> > Do we really need to set up this signal handler for SIGCHLD?
>> >
>> > We have already called 'SAFE_WAITPID(test_pid, &status, 0)'
>> > in the library process (lib_pid), which relies on SIGCHLD as well.
>> > And this handler will be called every time, even when the test exits
>> > normally.
>> >
>> > Maybe it is better to just add a kill call to clean up the remaining
>> > descendants if the main test process exits with an abnormal status.
>> >
>> > e.g.
>> >
>> > --- a/lib/tst_test.c
>> > +++ b/lib/tst_test.c
>> > @@ -1503,6 +1503,8 @@ static int fork_testrun(void)
>> >         if (WIFEXITED(status) && WEXITSTATUS(status))
>> >                 return WEXITSTATUS(status);
>> >
>> > +       kill(-test_pid, SIGKILL);
>>
>> This will not work because at this point, the child process was already
>> destroyed by waitpid() and all its remaining children were moved under
>> PID 1 (init). The only place where the grandchildren are still reachable
>> this way is in SIGCHLD handler while the dead child process still exists
>> in zombie state.
>
>
> Signal delivery is asynchronous, so setting up a SIGCHLD handler
> cannot 100% guarantee that the library process responds in time
> either.
>
> Even though the descendants of test_pid are moved under PID 1 (init),
> kill(-test_pid, SIGKILL) still works well for killing them. That is
> because the dead child process still exists until the kernel reclaims
> all of its parents.
>


I added a 5-second sleep before sending SIGKILL in the library process
and modified test_children_cleanup.c to print the ppid every second
to verify this:

# ./test_children_cleanup
tst_test.c:1452: TINFO: Timeout per run is 0h 00m 10s
test_children_cleanup.c:20: TINFO: Main process 173236 starting
test_children_cleanup.c:39: TINFO: Forked child 173238
test_children_cleanup.c:33: TINFO: ppid is 173236
test_children_cleanup.c:33: TINFO: ppid is 1
test_children_cleanup.c:33: TINFO: ppid is 1
test_children_cleanup.c:33: TINFO: ppid is 1
test_children_cleanup.c:33: TINFO: ppid is 1
tst_test.c:1502: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
tst_test.c:1504: TBROK: Test killed! (timeout?)

Summary:
passed   0
failed   0
broken   1
skipped  0
warnings 0
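
The same behaviour can be reproduced outside LTP with a minimal
standalone sketch. It assumes the main test process is the leader of its
own process group (done here with setpgid(0, 0)); in that case the group
ID stays valid for as long as any member is alive, even after the leader
has been reaped and the survivor has been reparented. The function name
orphan_reachable_via_pgid() is hypothetical and not part of the LTP
library.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not LTP API.
 * Returns 1 if kill(-pgid, SIGKILL) still reached an orphaned grandchild
 * after the group leader had been reaped by waitpid(), 0 if the signal
 * could not be delivered, -1 on setup error.
 */
int orphan_reachable_via_pgid(void)
{
    int pfd[2];
    int status;
    char c;
    pid_t leader, orphan;

    if (pipe(pfd) < 0)
        return -1;

    leader = fork();
    if (leader < 0)
        return -1;

    if (leader == 0) {
        /* Stand-in for the main test process: become group leader. */
        close(pfd[0]);
        setpgid(0, 0);

        orphan = fork();
        if (orphan == 0) {
            /* Grandchild: inherits the pgid and keeps the pipe's
             * write end open until it dies. */
            sleep(30);
            _exit(0);
        }

        /* Simulate an abnormal exit that leaves the grandchild behind. */
        _exit(1);
    }

    /* Stand-in for the library process. */
    close(pfd[1]);

    /* Reap the leader; the grandchild is now reparented. */
    if (waitpid(leader, &status, 0) < 0)
        return -1;

    /* The process group still exists because the grandchild is in it. */
    if (kill(-leader, SIGKILL) < 0)
        return 0;

    /* EOF on the pipe means the last writer (the grandchild) is gone. */
    while (read(pfd[0], &c, 1) > 0)
        ;
    close(pfd[0]);

    return 1;
}
```

Calling orphan_reachable_via_pgid() from a small main() and printing the
result is enough to try it. The pipe is only used to detect the
grandchild's death, since the caller cannot waitpid() on a process that
is not its own child.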
=======

--- a/lib/newlib_tests/test_children_cleanup.c
+++ b/lib/newlib_tests/test_children_cleanup.c
@@ -28,7 +28,11 @@ static void run(void)

        /* Start child that will outlive the main test process */
        if (!child_pid) {
-               sleep(30);
+               int i;
+               for (i = 0; i < 30; i++) {
+                       tst_res(TINFO, "ppid is %d", getppid());
+                       sleep(1);
+               }
                return;
        }

diff --git a/lib/tst_test.c b/lib/tst_test.c
index 84ce0a5d3..6f2d93611 100644
--- a/lib/tst_test.c
+++ b/lib/tst_test.c
@@ -1503,6 +1503,9 @@ static int fork_testrun(void)
        if (WIFEXITED(status) && WEXITSTATUS(status))
                return WEXITSTATUS(status);

+       sleep(5);
+       kill(-test_pid, SIGKILL);
+
        if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) {
                tst_res(TINFO, "If you are running on slow machine, "
                               "try exporting LTP_TIMEOUT_MUL > 1");


-- 
Regards,
Li Wang