[LTP] [PATCH 1/2 v2] tst_test: Fail the test subprocess cannot be killed
Li Wang
liwang@redhat.com
Thu Jun 28 11:41:43 CEST 2018
Hi Cyril,
On Wed, Jun 27, 2018 at 11:22 PM, Cyril Hrubis <chrubis@suse.cz> wrote:
> If there are any leftover children the main test process will likely be
> killed while sleeping in wait(). That is because all child processes are
> either waited explicitly by the test code or implicitly by the test
> library.
>
> We also send SIGKILL to the whole process group, so if one of the
> children continues to live for long enough that very likely means that
> it ended up stuck in the kernel.
>
> So if there are any processes left within the process group we created
> once the process group leader, i.e. the main test process, has been
> waited for, we loop for a short while to give the init daemon a chance
> to reap the process after it has been reparented. If that does not
> happen within a few seconds, we declare the process to be stuck in the
> kernel.
>
> Signed-off-by: Cyril Hrubis <chrubis@suse.cz>
> CC: Eric Biggers <ebiggers3@gmail.com>
> ---
> lib/tst_test.c | 17 ++++++++++++++++-
> 1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/lib/tst_test.c b/lib/tst_test.c
> index 80808854e..329168a24 100644
> --- a/lib/tst_test.c
> +++ b/lib/tst_test.c
> @@ -1,5 +1,5 @@
> /*
> - * Copyright (c) 2015-2016 Cyril Hrubis <chrubis@suse.cz>
> + * Copyright (c) 2015-2018 Cyril Hrubis <chrubis@suse.cz>
> *
> * This program is free software: you can redistribute it and/or modify
> * it under the terms of the GNU General Public License as published by
> @@ -1047,6 +1047,21 @@ static int fork_testrun(void)
> alarm(0);
> SAFE_SIGNAL(SIGINT, SIG_DFL);
>
> + unsigned int sleep = 100;
> + unsigned int retries = 0;
> +
> + while (kill(-test_pid, 0) == 0) {
I'm a little worried about this loop. Imagine that process_A (test_pid)
exists, so kill(-test_pid, 0) returns 0 the first time and we enter the
while loop. If process_A exits while we are sleeping and the system
reuses test_pid for another process_B, we will keep looping and will
very probably report TFAIL by mistake (with the stack of process_B
dumped to the LTP user in PATCH 2/2).
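To make the discussion concrete, here is a minimal standalone sketch of what
the kill(-pgid, 0) probe reports before and after the only member of a
process group is killed and reaped. The helper names pgroup_alive() and
demo_pgroup_probe() are made up for illustration and are not part of the LTP
library. The sketch only shows the probe semantics; it does not reproduce the
PID reuse scenario itself.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if at least one process exists in the
 * process group pgid, 0 if the group has no members (ESRCH), and -1 on
 * any other error (e.g. EPERM).
 */
int pgroup_alive(pid_t pgid)
{
	if (kill(-pgid, 0) == 0)
		return 1;

	return errno == ESRCH ? 0 : -1;
}

/*
 * Hypothetical demo: fork a child into its own process group, probe the
 * group, kill and reap the child, then probe again.
 * Returns 0 if the probe said "alive" before and "gone" after.
 */
int demo_pgroup_probe(void)
{
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		setpgid(0, 0);	/* child becomes leader of a new group */
		pause();	/* wait until the parent kills us */
		_exit(0);
	}

	/* Set the group from the parent too, so we do not race the child. */
	setpgid(pid, pid);

	int before = pgroup_alive(pid);

	kill(-pid, SIGKILL);	/* signal the whole group, as the library does */
	waitpid(pid, NULL, 0);	/* reap the only member of the group */

	int after = pgroup_alive(pid);

	return (before == 1 && after == 0) ? 0 : 1;
}
```

Calling demo_pgroup_probe() from a small main() should return 0 when the
child is the only member of its group. The patch relies on the second probe
failing with ESRCH once every member of the group is gone.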
> +
> + usleep(sleep);
> + sleep*=2;
> +
> + if (retries++ <= 14)
> + continue;
> +
> + tst_res(TFAIL, "Test process child stuck in the kernel!");
> + tst_brk(TFAIL, "Congratulation, likely test hit a kernel bug.");
> + }
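As a rough check of how long this loop waits before it gives up, the sketch
below replays only the arithmetic of the hunk above: sleep, double the
interval, stop once `retries++ <= 14` is false. The function name
total_backoff_usec() is hypothetical and the real usleep() calls are replaced
by a running sum, so the result is a lower bound that ignores scheduling
delays and the cost of kill() itself.

```c
/*
 * Hypothetical helper mirroring the timing of the quoted loop.
 * first_sleep_usec corresponds to the initial value of `sleep` (100 in
 * the patch) and max_retries to the constant in `retries++ <= 14`.
 * Returns the total number of microseconds slept before giving up.
 */
unsigned long total_backoff_usec(unsigned int first_sleep_usec,
				 unsigned int max_retries)
{
	unsigned int sleep_usec = first_sleep_usec;
	unsigned int retries = 0;
	unsigned long total = 0;

	for (;;) {
		total += sleep_usec;	/* stands in for usleep(sleep) */
		sleep_usec *= 2;

		if (retries++ <= max_retries)
			continue;

		return total;		/* the patch reports TFAIL here */
	}
}
```

With the values from the patch, total_backoff_usec(100, 14) performs 16
sleeps and returns 6553500, i.e. roughly 6.5 seconds, which matches the
"few seconds" mentioned in the commit message.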
> +
> if (WIFEXITED(status) && WEXITSTATUS(status))
> return WEXITSTATUS(status);
>
> --
> 2.13.6
>
>
> --
> Mailing list info: https://lists.linux.it/listinfo/ltp
--
Regards,
Li Wang