[LTP] [RFC][DO_NOT_MERGE][PATCH v2 1/1] netstress: Fix race between SETSID() and exit(0)

Petr Vorel pvorel@suse.cz
Mon Feb 21 14:13:36 CET 2022


Hi Cyril,

> Hi!
> Uff, I found the root cause after debugging for a while.
Thanks a lot!

> The network tests rely a lot on passing data between processes via files
> in a local directory, and .needs_checkpoints causes the test to run in
> a different directory, because checkpoints need a backing file
> that is mapped into the process memory. So after setting
> .needs_checkpoints the client was storing the file in a different
> directory.

> This should be a minimal fix:

> diff --git a/testcases/lib/tst_net.sh b/testcases/lib/tst_net.sh
> index 047686dc3..891472c8a 100644
> --- a/testcases/lib/tst_net.sh
> +++ b/testcases/lib/tst_net.sh
> @@ -715,7 +715,7 @@ tst_netload()
>         fi

>         s_opts="${cs_opts}${s_opts}-R $s_replies -B $TST_TMPDIR"
> -       c_opts="${cs_opts}${c_opts}-a $c_num -r $((c_requests / run_cnt)) -d $rfile"
> +       c_opts="${cs_opts}${c_opts}-a $c_num -r $((c_requests / run_cnt)) -d $PWD/$rfile"

>         tst_res_ TINFO "run server 'netstress $s_opts'"
>         tst_res_ TINFO "run client 'netstress -l $c_opts' $run_cnt times"


Yes, this looks like enough. Do you want me to merge this proposal with
this change added? Or will you send a patch, or just merge the fix yourself?
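
For the record, the root cause described above can be reproduced outside LTP.
Below is a minimal sketch, not LTP code: the directory names shell_tmp and
c_tmp and the file name result.dat are made up for illustration, and the
subshell with cd only stands in for a client that runs in its own temporary
directory. It shows that a relative file name resolves against the writer's
working directory, while a path built from $PWD beforehand does not.

```shell
#!/bin/sh
# Hypothetical stand-alone demo, not taken from tst_net.sh or netstress.c.
set -e

base=$(mktemp -d)
mkdir "$base/shell_tmp" "$base/c_tmp"
cd "$base/shell_tmp"        # working directory of the shell test
rfile="result.dat"          # relative name, as passed to -d before the fix

# The writer changes directory first, so the relative name lands in c_tmp.
(cd "$base/c_tmp" && echo 42 > "$rfile")
[ -e "$rfile" ] && rel=found || rel=missing
rm -f "$base/c_tmp/$rfile"

# Expanding $PWD in the shell test makes the path independent of the
# writer's working directory (same idea as "-d $PWD/$rfile").
abs="$PWD/$rfile"
(cd "$base/c_tmp" && echo 42 > "$abs")
[ -e "$rfile" ] && abs_result=found || abs_result=missing

echo "relative: $rel, absolute: $abs_result"

cd /
rm -rf "$base"
```

The important detail is that $PWD is expanded by the shell test itself, before
the other process changes directory.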

> However, the debugging took longer than I wanted since the network
> tests are such a mess. The server exits with TBROK (which looks like
> it's expected behavior), only half of the server log is printed on a
> failure, etc. These really deserve some cleanup...
I'd say specifically tst_netload() (in tst_net.sh) and netstress.c deserve
cleanup. Also, as we have noticed several times, shell tests tend to be buggy,
especially in combination with C tests. But I'm not sure whether it is
feasible to write everything in C.

Kind regards,
Petr

