[LTP] [PATCH v2] tst_test: using SIGTERM to terminate process
Petr Vorel
pvorel@suse.cz
Thu Sep 16 00:40:28 CEST 2021
Hi Li, all,
> We'd better avoid using SIGINT for terminating processes because
> it behaves differently depending on the shell.
> From Joerg Vehlow's test:
> - bash does not seem to care about SIGINT delivery to background
> processes, but can be blocked using trap
> - zsh ignores SIGINT for background processes by default, but can be
> allowed using trap
> - dash and busybox sh ignore the signal to background processes, and
> this cannot be changed with trap
> This patch covers the situations below:
> 1. SIGINT (Ctrl+C) terminates the main process, which does cleanup
> correctly before a timeout
> 2. The test finishes normally and retrieves the _tst_timeout_process in the
> background via SIGTERM (sent by _tst_cleanup_timer)
> 3. The test times out and _tst_kill_test sends SIGTERM to
> terminate all processes, and the main process does cleanup work
> 4. The test times out but processes are still alive after _tst_kill_test
> sends SIGTERM, so SIGKILL is sent to the whole group
> 5. The test is terminated by SIGTERM unexpectedly (e.g. system shutdown or process
> manager) and does cleanup work as well
> Co-authored-by: Joerg Vehlow <joerg.vehlow@aox-tech.de>
> Signed-off-by: Li Wang <liwang@redhat.com>
> Reviewed-by: Joerg Vehlow <joerg.vehlow@aox-tech.de>
...
> +++ b/testcases/lib/tst_test.sh
> @@ -21,7 +21,8 @@ export TST_LIB_LOADED=1
> . tst_security.sh
> # default trap function
> -trap "tst_brk TBROK 'test interrupted or timed out'" INT
> +trap "tst_brk TBROK 'test interrupted'" INT
> +trap "unset _tst_setup_timer_pid; tst_brk TBROK 'test terminated'" TERM
FYI this commit (merged as 4a6b8a697 ("tst_test: using SIGTERM to terminate process"))
broke the net_stress_interface tests, particularly the tst_require_cmds() call (which
calls tst_brk TCONF):
# ./if-addr-adddel.sh -c ifconfig
if-addr-adddel 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
if-addr-adddel 1 TINFO: add local addr 10.0.0.2/24
if-addr-adddel 1 TINFO: add local addr fd00:1:1:1::2/64
if-addr-adddel 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
if-addr-adddel 1 TINFO: add remote addr 10.0.0.1/24
if-addr-adddel 1 TINFO: add remote addr fd00:1:1:1::1/64
if-addr-adddel 1 TINFO: Network config (local -- remote):
if-addr-adddel 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
if-addr-adddel 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
if-addr-adddel 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
if-addr-adddel 1 TINFO: timeout per run is 0h 5m 0s
if-addr-adddel 1 TCONF: 'ifconfig' not found
=> waits till timeout
if-addr-adddel 1 TBROK: Test timed out, sending SIGTERM! If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
if-addr-adddel 1 TWARN: test terminated
Debugging shows it hangs in wait in _tst_cleanup_timer():
kill -TERM $_tst_setup_timer_pid 2>/dev/null
wait $_tst_setup_timer_pid 2>/dev/null
because kill does not kill the test.
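To illustrate the kill-then-wait pattern in isolation, here is a minimal sketch. It assumes a hypothetical background child that ignores SIGTERM (a stand-in only, not the real LTP timer process); in that case wait blocks until the child exits on its own:

```shell
#!/bin/sh
# Hypothetical stand-in for a background process that ignores SIGTERM.
(trap '' TERM; sleep 3) &
pid=$!

# Give the child time to install its trap before signalling it.
sleep 1

start=$(date +%s)
kill -TERM "$pid" 2>/dev/null   # signal is ignored by the child
wait "$pid" 2>/dev/null         # blocks until the child's sleep finishes
end=$(date +%s)

elapsed=$((end - start))
echo "wait blocked for about ${elapsed}s"
```

If the child honoured SIGTERM, wait would return almost immediately; here it should block for roughly the remainder of the sleep.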
The problem seems to be that the unset does not actually work:
trap "unset _tst_setup_timer_pid; tst_brk TBROK 'test terminated'" TERM
It seems to be setup-specific, because I discovered this on SLES with
both bash and dash. On current Debian testing it works with both bash
and dash. I checked the shopt output on both but don't see anything obvious, so it
must be something else.
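One way to check on a given setup whether an unset inside a TERM trap takes effect is a small standalone script like the one below. This is only a sketch: timer_pid and result are hypothetical names, not the library's variables, and the script sends SIGTERM to itself rather than going through tst_test.sh:

```shell
#!/bin/sh
# Hypothetical variable standing in for the timer PID.
timer_pid=12345
result=unknown

# The handler unsets the variable, then records whether it is still set.
trap 'unset timer_pid; result=${timer_pid:-unset}' TERM

# Send SIGTERM to this shell; the trap runs before the next command.
kill -TERM $$

echo "timer_pid after trap: $result"
```

Where unset behaves as expected inside the trap, result ends up as the string "unset"; any other value would point at the handler not seeing or not clearing the variable.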
Kind regards,
Petr
> _tst_do_exit()
> {
> @@ -439,9 +440,9 @@ _tst_kill_test()
> {
> local i=10
> - trap '' INT
> - tst_res TBROK "Test timeouted, sending SIGINT! If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1"
> - kill -INT -$pid
> + trap '' TERM
> + tst_res TBROK "Test timed out, sending SIGTERM! If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1"
> + kill -TERM -$pid
> tst_sleep 100ms
> while kill -0 $pid >/dev/null 2>&1 && [ $i -gt 0 ]; do