[LTP] [RFC][DO_NOT_MERGE][PATCH v2 1/1] netstress: Fix race between SETSID() and exit(0)

Petr Vorel pvorel@suse.cz
Fri Feb 18 17:48:45 CET 2022


******* DO NOT MERGE *******

There is a race between the SETSID() and exit(0) in move_to_background()
caused by "Killed the leftover descendant processes" introduced in
72b172867 ("Terminate leftover subprocesses when main test process
crashes").

If the main test process calls exit(0) before the newly forked child
has managed to call SETSID(), the child is killed by the test library
because it is still in the old process group. Therefore use
TST_CHECKPOINT_{WAIT,WAKE}() to avoid the race.

Link: https://lore.kernel.org/ltp/Yg+RXbUTOxK56iZa@pevik/

Suggested-by: Cyril Hrubis <chrubis@suse.cz>
Signed-off-by: Petr Vorel <pvorel@suse.cz>
---
This patch somehow causes the server to think that the "client asks to
terminate" in server_fn():

void *server_fn(void *cfd)
{
	...
	/* client asks to terminate */
	if (recv_msg[0] == start_fin_byte) {
		tst_res(TINFO, "client asks to terminate");
		goto out;
	}

This is really strange, because the server shouldn't do anything
before exit(). I must be missing something here.
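
For reference, the ordering the patch tries to enforce can be sketched
outside LTP. This is only an illustration, assuming a plain pipe as a
stand-in for TST_CHECKPOINT_{WAIT,WAKE}(); the function name
fork_and_wait_for_setsid() is hypothetical and is not part of netstress
or the LTP library. The parent does not return (and so cannot exit)
until the child has reported that setsid() completed:

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that moves into its own session and
 * return its PID only after the child confirmed that setsid() finished.
 * Returns -1 on failure.
 */
static pid_t fork_and_wait_for_setsid(void)
{
	int fds[2];
	char token = 0;
	ssize_t n;
	pid_t pid;

	if (pipe(fds) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: leave the parent's session and process group first. */
		close(fds[0]);
		if (setsid() < 0)
			_exit(1);

		/* "Wake" the parent: from now on it may exit safely. */
		if (write(fds[1], &token, 1) != 1)
			_exit(1);
		close(fds[1]);

		/* Stand-in for the real background work. */
		pause();
		_exit(0);
	}

	/* Parent: "wait" until the child reports that setsid() is done. */
	close(fds[1]);
	n = read(fds[0], &token, 1);
	close(fds[0]);

	if (n != 1) {
		/* Child died before signalling; reap it. */
		waitpid(pid, NULL, 0);
		return -1;
	}

	return pid;
}
```

A caller would invoke fork_and_wait_for_setsid() and call exit(0) only
after it returns a valid PID, which mirrors TST_CHECKPOINT_WAIT(0) being
placed before exit(0) in move_to_background().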

Kind regards,
Petr

tcp_ipsec 1 TINFO: timeout per run is 0h 5m 0s
tcp_ipsec 1 TINFO: run server 'netstress -D ltp_ns_veth1 -R 10 -B /tmp/LTP_tcp_ipsec.9uqaI9HX3i'
tcp_ipsec 1 TINFO: run client 'netstress -l -H 10.0.0.1 -n 100 -N 100 -D ltp_ns_veth2 -a 2 -r 100 -d tst_netload.res' 5 times

=> PROBLEM HERE
netstress.c:588: TINFO: client asks to terminate
netstress.c:644: TBROK: Server closed

tst_test.c:1457: TINFO: Timeout per run is 0h 05m 00s
netstress.c:901: TINFO: connection: addr '10.0.0.1', port '34989'
netstress.c:902: TINFO: client max req: 100
netstress.c:903: TINFO: clients num: 2
netstress.c:908: TINFO: client msg size: 100
netstress.c:909: TINFO: server msg size: 100
netstress.c:823: TINFO: tcp_tw_reuse is already set
netstress.c:953: TINFO: TCP client is using old TCP API.
netstress.c:795: TINFO: '/proc/sys/net/ipv4/tcp_fastopen' is 1
netstress.c:476: TINFO: Running the test over IPv4
netstress.c:503: TINFO: total time '4' ms
netstress.c:521: TPASS: test completed

Summary:
passed   1
failed   0
broken   0
skipped  0
warnings 0
tcp_ipsec 1 TFAIL: can't read tst_netload.res
tcp_ipsec 1 TINFO: AppArmor enabled, this may affect test results
tcp_ipsec 1 TINFO: it can be disabled with TST_DISABLE_APPARMOR=1 (requires super/root)
tcp_ipsec 1 TINFO: loaded AppArmor profiles: none

 testcases/network/netstress/netstress.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/testcases/network/netstress/netstress.c b/testcases/network/netstress/netstress.c
index 0914c65bd4..6c9e83112e 100644
--- a/testcases/network/netstress/netstress.c
+++ b/testcases/network/netstress/netstress.c
@@ -713,11 +713,15 @@ static void server_cleanup(void)
 
 static void move_to_background(void)
 {
-	if (SAFE_FORK())
+	if (SAFE_FORK()) {
+		TST_CHECKPOINT_WAIT(0);
 		exit(0);
+	}
 
 	SAFE_SETSID();
 
+	TST_CHECKPOINT_WAKE(0);
+
 	close(STDIN_FILENO);
 	SAFE_OPEN("/dev/null", O_RDONLY);
 	close(STDOUT_FILENO);
@@ -1024,4 +1028,5 @@ static struct tst_test test = {
 		{"B:", &server_bg, "Run in background, arg is the process directory"},
 		{}
 	},
+	.needs_checkpoints = 1,
 };
-- 
2.35.1


