[LTP] [RFC] Shell API timeout sleep orphan processes
Petr Vorel
pvorel@suse.cz
Tue May 4 08:52:17 CEST 2021
Hi Joerg,
[ Cc: Cyril and Li ]
> I am looking into getting rid of our custom patches for ltp.
> One of these patches fixes the problem that the timeout sleep process is
> orphaned if the test does not time out.
> The kill code is not working as expected, because it only kills the shell
> process spawned by "sleep $sec && _tst_kill_test &".
> We are running single LTP tests using Robot Framework, and robot waits until
> all processes of the session have finished.
Interesting. Do you mean that $_tst_setup_timer_pid from _tst_setup_timer is
left running if the test does not time out? I was not able to find it.
> This can also be seen by piping the output of a test run into cat (e.g. with
> timeout02.sh from newlib_test/shell):
> $ time sh -c './timeout02.sh >/dev/null | cat'
> timeout02 1 TINFO: timeout per run is 0h 0m 2s
> timeout02 1 TPASS: timeout 2 set (LTP_TIMEOUT_MUL='1')
> [snip]
> real 0m2,011s
> The test does nothing, and completes in < 100ms. This can be seen without
> piping through cat:
> time sh -c 'PATH="$PWD:$PWD/../../../testcases/lib/:$PATH" ./timeout02.sh'
> timeout02 1 TINFO: timeout per run is 0h 0m 2s
> timeout02 1 TPASS: timeout 2 set (LTP_TIMEOUT_MUL='1')
> [snip]
> real 0m0,010s
Interesting slowdown. It looks to me like it's the exit $ret in the final
_tst_do_exit() that takes so much time. I have no idea why, but it was already
there before 25ad54dba ("tst_test.sh: Run cleanup also after test timeout").
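One way to reproduce a similar delay outside of LTP, as a minimal sketch only.
The inner script is a hypothetical stand-in that mimics the
"sleep $sec && _tst_kill_test &" pattern, it is not the tst_test.sh code. The
assumption here is that sleep is an external command and that a leftover sleep
keeps the write end of the pipe open, so cat cannot see EOF until it exits:

```shell
#!/bin/sh
# Hypothetical reproduction; not the tst_test.sh code.

start=$(date +%s)
sh -c '
	sleep 3 && echo "timeout" &   # background subshell, which in turn runs sleep
	timer_pid=$!                  # PID of that subshell, not of sleep
	sleep 1                       # the "test" finishes well before the timer
	kill "$timer_pid"             # stops the subshell only; sleep keeps running
	echo "test done"
' | cat
end=$(date +%s)

# Elapsed wall-clock time of the whole pipeline, in whole seconds.
elapsed=$((end - start))
echo "pipeline took ${elapsed}s"
```

If the assumption holds, the pipeline should take roughly as long as the sleep
(about 3 seconds) although the inner script is done after about 1 second.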
> I am not sure what the best approach for fixing these sleep orphans is. Our
> patch uses "set -m" around the start of the timer. This makes most
> shells create a new process group, but it did not work in zsh.
> The killing of the timeout process is then changed to kill the process
> group (kill -- -$_tst_setup_timer_pid).
> This works fine at least for some shells.
Please do send the patch. "set -m" is supported by dash and busybox sh, IMHO it's
safe to use it.
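A minimal sketch of the approach as I understand it from the description above,
not the actual patch. The timer command and the variable names timer_pid and
killed are hypothetical. The fallback to the plain PID is my assumption, for
the case where the shell does not put the job into its own process group (some
shells may refuse to enable job control without a controlling terminal):

```shell
#!/bin/sh
# Hypothetical stand-in for the library's timer; not the tst_test.sh code.

set -m 2>/dev/null              # job control: a background job gets its own process group
sleep 3 && echo "timeout" &
timer_pid=$!                    # PID of the background subshell (group leader if -m worked)
set +m

sleep 1                         # give the subshell time to start sleep

# A negative PID addresses the process group whose ID is timer_pid.
# If no such group exists, fall back to signalling the subshell alone.
if kill -- "-$timer_pid" 2>/dev/null; then
	killed=group
else
	kill "$timer_pid" 2>/dev/null || true
	killed=pid
fi
wait "$timer_pid" 2>/dev/null || true
echo "timer stopped via $killed"
```

When the group kill succeeds, both the subshell and its sleep are gone. In the
fallback case the sleep is left behind until it expires, which is the original
problem.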
> The only way I can think of to fix this in a really portable way is moving
> the timeout code (including the logic in _tst_kill_test) into C code. This way
> there would only be one binary, which can be killed cleanly.
Maybe set -m would be enough. But sure, rewriting in C is usually the best approach
for shell problems, we use quite a lot of C helpers for shell already.
> Do you have any other idea or do you think this "bug" is not relevant enough
> to be fixed?
Kind regards,
Petr
> Thanks,
> Joerg