[LTP] TI-RPC test failures; network configuration related?
Chuck Lever III
chuck.lever@oracle.com
Thu Sep 12 17:50:06 CEST 2024
> On Aug 29, 2024, at 3:35 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
>
> For a while now my nightly "runltp -f net.tirpc_tests" have
> thrown a bunch of failures but I haven't had time to look
> into it until now. Without modification, about half of the
> client test programs segfault.
>
> Here's a sample test failure. I instrumented the
> tirpc_clnt_destroy test case and the rpc_tests.sh script as
> shown below, but I still don't understand why clnt_create(3t)
> is failing.
>
> Seems to occur on all recent versions of Fedora with stock
> kernels or custom-built kernels.
>
>
> [root@cel-tirpc ltp]# testcases/bin/rpc_test.sh -s tirpc_svc_2 -c tirpc_clnt_destroy
> rpc_test 1 TINFO: Running: rpc_test.sh -s tirpc_svc_2 -c tirpc_clnt_destroy
> rpc_test 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
> rpc_test 1 TINFO: add local addr 10.0.0.2/24
> rpc_test 1 TINFO: add local addr fd00:1:1:1::2/64
> rpc_test 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
> rpc_test 1 TINFO: add remote addr 10.0.0.1/24
> rpc_test 1 TINFO: add remote addr fd00:1:1:1::1/64
> rpc_test 1 TINFO: Network config (local -- remote):
> rpc_test 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
> rpc_test 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
> rpc_test 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
> rpc_test 1 TINFO: timeout per run is 0h 5m 0s
> rpc_test 1 TINFO: check registered RPC with rpcinfo
> rpc_test 1 TINFO: registered RPC:
> program vers proto port service
> 100000 4 tcp 111 portmapper
> 100000 3 tcp 111 portmapper
> 100000 2 tcp 111 portmapper
> 100000 4 udp 111 portmapper
> 100000 3 udp 111 portmapper
> 100000 2 udp 111 portmapper
> 100024 1 udp 46925 status
> 100024 1 tcp 60195 status
> 100005 1 udp 20048 mountd
> 100005 1 tcp 20048 mountd
> 100005 2 udp 20048 mountd
> 100005 2 tcp 20048 mountd
> 100005 3 udp 20048 mountd
> 100005 3 tcp 20048 mountd
> 100003 3 tcp 2049 nfs
> 100003 4 tcp 2049 nfs
> 100227 3 tcp 2049 nfs_acl
> 100003 3 udp 2049 nfs
> 100227 3 udp 2049 nfs_acl
> 100021 1 udp 33304 nlockmgr
> 100021 3 udp 33304 nlockmgr
> 100021 4 udp 33304 nlockmgr
> 100021 1 tcp 42895 nlockmgr
> 100021 3 tcp 42895 nlockmgr
> 100021 4 tcp 42895 nlockmgr
> 10 1 udp 59751
>
> # Note above: the test RPC program (536875000) does not
> # appear in the rpcinfo output. That makes me suspect
> # the network namespace configuration on this guest is
> # somehow incorrect.
This is a red herring.
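For completeness, one way to ask the portmapper directly whether the test
program is registered is a small check like the sketch below. This is only a
sketch: the program number 536875000 and the address come from the failing
command above, the version number 1 is a guess, and pmap_getport() is simply
the easiest query to write; it is not how the LTP tests themselves probe
registration.

/*
 * Sketch: ask the portmapper at the address from the failing command
 * whether the test RPC program is registered over UDP.  Build against
 * libtirpc, e.g.:
 *   gcc -I/usr/include/tirpc -o checkreg checkreg.c -ltirpc
 */
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <rpc/rpc.h>
#include <rpc/pmap_clnt.h>

int main(void)
{
        struct sockaddr_in pmap_addr;
        unsigned short port;

        memset(&pmap_addr, 0, sizeof(pmap_addr));
        pmap_addr.sin_family = AF_INET;
        pmap_addr.sin_port = htons(111);        /* rpcbind/portmapper */
        inet_pton(AF_INET, "10.0.0.2", &pmap_addr.sin_addr);

        /* program number from the failing command; version 1 is a guess */
        port = pmap_getport(&pmap_addr, 536875000UL, 1UL, IPPROTO_UDP);
        if (port == 0) {
                fprintf(stderr, "not registered (or rpcbind unreachable)\n");
                return 1;
        }
        printf("registered on port %u\n", port);
        return 0;
}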
> rpc_test 1 TINFO: using libtirpc: yes
> traceroute to 10.0.0.2 (10.0.0.2), 30 hops max, 60 byte packets
> 1 cel-tirpc (10.0.0.2) 0.501 ms 0.438 ms 0.392 ms
> Kernel IP routing table
> Destination Gateway Genmask Flags MSS Window irtt Iface
> 0.0.0.0 192.168.122.1 0.0.0.0 UG 0 0 0 enp1s0
> 10.0.0.0 0.0.0.0 255.255.255.0 U 0 0 0 ltp_ns_veth2
> 192.168.122.0 0.0.0.0 255.255.255.0 U 0 0 0 enp1s0
Or, using "ip a" instead of traceroute:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host noprefixroute
valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether 52:54:00:16:65:ac brd ff:ff:ff:ff:ff:ff
inet 192.168.122.67/24 brd 192.168.122.255 scope global dynamic noprefixroute enp1s0
valid_lft 3020sec preferred_lft 3020sec
inet6 fe80::1a2a:ab8f:ac06:39aa/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: ltp_ns_veth2@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether f2:91:ad:ed:2a:af brd ff:ff:ff:ff:ff:ff link-netns ltp_ns
inet 10.0.0.2/24 scope global ltp_ns_veth2
valid_lft forever preferred_lft forever
inet6 fd00:1:1:1::2/64 scope global nodad
valid_lft forever preferred_lft forever
inet6 fe80::f091:adff:feed:2aaf/64 scope link proto kernel_ll
valid_lft forever preferred_lft forever
I see ltp_ns_veth2, but not ltp_ns_veth1. Shouldn't both appear?
I've never dealt with veth before, so I'm swimming in the deep end.
> rpc_test 1 TFAIL: tirpc_clnt_destroy 10.0.0.2 536875000 failed unexpectedly
>
> # I changed tirpc_clnt_destroy to display the following
> # information instead of segfaulting. clnt_create()
> # returns NULL and sets the library's rpc_createerr.
>
> rpc_createerr.cf_stat=12
> error: No route to host
> 2
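The guard that replaces the segfault amounts to roughly the sketch below:
clnt_create() returns NULL on failure and records the reason in the global
rpc_createerr, which clnt_pcreateerror() formats. (If I'm reading
<rpc/clnt_stat.h> correctly, cf_stat 12 is RPC_SYSTEMERROR, which fits the
"No route to host" message.) The version number and the "udp" netid below
are placeholder assumptions, not what the LTP test actually passes.

/*
 * Sketch of the NULL check: on failure clnt_create() returns NULL and
 * fills in the global rpc_createerr.  Program number and host come from
 * the test run above; version 1 and the "udp" netid are placeholder
 * assumptions.  Build against libtirpc, e.g.:
 *   gcc -I/usr/include/tirpc -o clnt_check clnt_check.c -ltirpc
 */
#include <stdio.h>
#include <rpc/rpc.h>

int main(void)
{
        CLIENT *clnt;

        clnt = clnt_create("10.0.0.2", 536875000UL, 1UL, "udp");
        if (clnt == NULL) {
                fprintf(stderr, "rpc_createerr.cf_stat=%d\n",
                        rpc_createerr.cf_stat);
                clnt_pcreateerror("clnt_create");
                return 1;
        }

        /* ... the real test would exercise clnt_destroy() here ... */
        clnt_destroy(clnt);
        return 0;
}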
> rpc_test 2 TINFO: SELinux enabled in enforcing mode, this may affect test results
> rpc_test 2 TINFO: it can be disabled with TST_DISABLE_SELINUX=1 (requires super/root)
> rpc_test 2 TINFO: install seinfo to find used SELinux profiles
> rpc_test 2 TINFO: loaded SELinux profiles: none
>
> Summary:
> passed 0
> failed 1
> broken 0
> skipped 0
> warnings 0
>
> --
> Chuck Lever
>
>
--
Chuck Lever