[LTP] [RFC] Shell API timeout sleep orphan processes

Joerg Vehlow lkml@jv-coder.de
Tue May 4 10:04:30 CEST 2021


Hi Petr,
>> The kill code is not working as expected, because it only kills the shell
>> process spawned by "sleep $sec && _tst_kill_test &".
>> We are running single ltp tests using robot framework and robot waits until
>> all processes of session have finished.
> Interesting. Do you mean $_tst_setup_timer_pid from _tst_setup_timer was left
> running if the test does not timeout? Because I was not able to find it.
Oops, there was a bug in my command. Redirecting the output of the test 
to /dev/null does not trigger the long delay.
Please try with: time sh -c './timeout02.sh | cat'
Sorry for that...

The line "sleep $sec && _tst_kill_test &" spawns two processes:
sleep and a shell process that is (probably) forked from the running 
shell. The pid returned by $! is the pid of this shell.
When the timeout process is killed, only this shell process is killed, 
not the sleep. That is also where the slowdown comes from.

However, this might be shell implementation specific. The behavior is 
the same at least for busybox sh and, I think, dash and bash.
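
A minimal sketch to reproduce this outside of LTP, assuming a ps that 
supports -A and -o (procps does; other implementations may differ). 
"true" is only a placeholder for _tst_kill_test, and the variable names 
are made up for illustration:

```shell
#!/bin/sh
# Sketch: $! names the background shell, not the sleep it forked.
# "true" stands in for _tst_kill_test (placeholder, not LTP code).
sleep 5 && true &
timer_pid=$!
sleep 1   # give the background shell time to fork its sleep child

# Look for a sleep whose parent is the background shell.
sleep_pid=$(ps -A -o pid= -o ppid= -o comm= |
    awk -v p="$timer_pid" '$2 == p && $3 == "sleep" { print $1 }')

# Kill only the pid from $!, as described above.
kill "$timer_pid"
wait "$timer_pid" 2>/dev/null || true

# Check whether the sleep outlived its parent shell.
if [ -n "$sleep_pid" ] && kill -0 "$sleep_pid" 2>/dev/null; then
    orphan=yes
    kill "$sleep_pid"   # clean up the orphan ourselves
else
    orphan=no
fi
echo "timer shell: $timer_pid, sleep: ${sleep_pid:-none}, orphaned: $orphan"
```

If the shell behaves as described, this reports "orphaned: yes". A shell 
that handles the background list differently may report "no".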

> Interesting slowdown. It looks to me it's exit $ret in final _tst_do_exit()
> takes so much time. I have no idea why, but it was here before 25ad54dba
> ("tst_test.sh: Run cleanup also after test timeout").
I think what actually consumes the time is the sleep process, which 
still has stdout open, so the cat at the end of the pipe does not see 
EOF until sleep exits.
Redirecting the output of sleep to /dev/null fixes the hang, but 
the orphaned sleep process still lingers.
Try "sleep $sec >/dev/null && _tst_kill_test &"
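
The effect of the open stdout can be shown in isolation. This is only a 
sketch with a plain background sleep instead of the LTP timer, and the 
variable names are made up for illustration. Timing uses whole seconds 
from date +%s, so it is coarse:

```shell
#!/bin/sh
# Case 1: the background sleep inherits stdout (the pipe), so cat
# only sees EOF once sleep exits, about 2 seconds later.
start=$(date +%s)
sh -c 'sleep 2 & echo done' | cat >/dev/null
inherit_secs=$(( $(date +%s) - start ))

# Case 2: sleep's stdout goes to /dev/null, so cat sees EOF as soon
# as the inner sh exits. The sleep itself keeps running in the
# background until its 2 seconds are up.
start=$(date +%s)
sh -c 'sleep 2 >/dev/null & echo done' | cat >/dev/null
redirect_secs=$(( $(date +%s) - start ))

echo "inherited stdout: ${inherit_secs}s, redirected stdout: ${redirect_secs}s"
```

The second case returns right away, but the sleep is still there 
afterwards, which matches the ps output below.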

$ ps; time sh -c 'PATH="$PWD:$PWD/../../../testcases/lib/:$PATH" ./timeout02.sh | cat' ; ps
     PID TTY          TIME CMD
    2352 pts/5    00:00:00 bash
   19981 pts/5    00:00:00 ps
timeout02 1 TINFO: timeout per run is 0h 0m 2s
timeout02 1 TPASS: timeout 2 set (LTP_TIMEOUT_MUL='1')

Summary:
passed   1
failed   0
broken   0
skipped  0
warnings 0

real    0m0,013s
user    0m0,012s
sys    0m0,005s
     PID TTY          TIME CMD
    2352 pts/5    00:00:00 bash
   19998 pts/5    00:00:00 sleep
   20001 pts/5    00:00:00 ps

>> The only way to fix this really portable I can think of is moving the
>> timeout code (including the logic in _tst_kill_test) into c code. This way
>> there would only be one binary, that can be killed flawlessly.
> Maybe set -m would be enough. But sure, rewriting C is usually the best approach
> for shell problems, we use quite a lot of C helpers for shell already.
I will send the patch. If this introduces any new issues, we can still 
switch to a C-based implementation.

Jörg

