[LTP] TI-RPC test failures; network configuration related?

Chuck Lever III chuck.lever@oracle.com
Thu Sep 12 17:50:06 CEST 2024



> On Aug 29, 2024, at 3:35 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
> 
> For a while now my nightly "runltp -f net.tirpc_tests" runs
> have thrown a bunch of failures, but I haven't had time to look
> into it until now. Without modification, about half of the
> client test programs segfault.
> 
> Here's a sample test failure. I instrumented the
> tirpc_clnt_destroy test case and the rpc_test.sh script as
> shown below, but I still don't understand why clnt_create(3t)
> is failing.
> 
> Seems to occur on all recent versions of Fedora with stock
> kernels or custom-built kernels.
> 
> 
> [root@cel-tirpc ltp]# testcases/bin/rpc_test.sh -s tirpc_svc_2 -c tirpc_clnt_destroy
> rpc_test 1 TINFO: Running: rpc_test.sh -s tirpc_svc_2 -c tirpc_clnt_destroy
> rpc_test 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
> rpc_test 1 TINFO: add local addr 10.0.0.2/24
> rpc_test 1 TINFO: add local addr fd00:1:1:1::2/64
> rpc_test 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
> rpc_test 1 TINFO: add remote addr 10.0.0.1/24
> rpc_test 1 TINFO: add remote addr fd00:1:1:1::1/64
> rpc_test 1 TINFO: Network config (local -- remote):
> rpc_test 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
> rpc_test 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
> rpc_test 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
> rpc_test 1 TINFO: timeout per run is 0h 5m 0s
> rpc_test 1 TINFO: check registered RPC with rpcinfo
> rpc_test 1 TINFO: registered RPC:
>   program vers proto   port  service
>    100000    4   tcp    111  portmapper
>    100000    3   tcp    111  portmapper
>    100000    2   tcp    111  portmapper
>    100000    4   udp    111  portmapper
>    100000    3   udp    111  portmapper
>    100000    2   udp    111  portmapper
>    100024    1   udp  46925  status
>    100024    1   tcp  60195  status
>    100005    1   udp  20048  mountd
>    100005    1   tcp  20048  mountd
>    100005    2   udp  20048  mountd
>    100005    2   tcp  20048  mountd
>    100005    3   udp  20048  mountd
>    100005    3   tcp  20048  mountd
>    100003    3   tcp   2049  nfs
>    100003    4   tcp   2049  nfs
>    100227    3   tcp   2049  nfs_acl
>    100003    3   udp   2049  nfs
>    100227    3   udp   2049  nfs_acl
>    100021    1   udp  33304  nlockmgr
>    100021    3   udp  33304  nlockmgr
>    100021    4   udp  33304  nlockmgr
>    100021    1   tcp  42895  nlockmgr
>    100021    3   tcp  42895  nlockmgr
>    100021    4   tcp  42895  nlockmgr
>        10    1   udp  59751
> 
> # Note above: the test RPC program (536875000) does not
> # appear in the rpcinfo output. That makes me suspect
> # the network namespace configuration on this guest is
> # somehow incorrect.

This is a red herring.


> rpc_test 1 TINFO: using libtirpc: yes
> traceroute to 10.0.0.2 (10.0.0.2), 30 hops max, 60 byte packets
> 1  cel-tirpc (10.0.0.2)  0.501 ms  0.438 ms  0.392 ms
> Kernel IP routing table
> Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
> 0.0.0.0         192.168.122.1   0.0.0.0         UG        0 0          0 enp1s0
> 10.0.0.0        0.0.0.0         255.255.255.0   U         0 0          0 ltp_ns_veth2
> 192.168.122.0   0.0.0.0         255.255.255.0   U         0 0          0 enp1s0

Or, using "ip a" instead of traceroute:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:16:65:ac brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.67/24 brd 192.168.122.255 scope global dynamic noprefixroute enp1s0
       valid_lft 3020sec preferred_lft 3020sec
    inet6 fe80::1a2a:ab8f:ac06:39aa/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: ltp_ns_veth2@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether f2:91:ad:ed:2a:af brd ff:ff:ff:ff:ff:ff link-netns ltp_ns
    inet 10.0.0.2/24 scope global ltp_ns_veth2
       valid_lft forever preferred_lft forever
    inet6 fd00:1:1:1::2/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::f091:adff:feed:2aaf/64 scope link proto kernel_ll
       valid_lft forever preferred_lft forever

I see ltp_ns_veth2, but not ltp_ns_veth1. Shouldn't both appear?

I've never dealt with veth before, so I'm swimming in the deep end.
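That said, ltp_ns_veth2 is reported above with "link-netns ltp_ns",
so perhaps its peer (ltp_ns_veth1) was simply moved into the ltp_ns
network namespace and "ip a" in the initial namespace won't list it.
I'm guessing something like "ip netns exec ltp_ns ip a" would show
the other end, but I haven't verified that.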


> rpc_test 1 TFAIL: tirpc_clnt_destroy 10.0.0.2 536875000 failed unexpectedly
> 
> # I changed tirpc_clnt_destroy to display the following
> # information instead of segfaulting. clnt_create()
> # returns NULL and sets the library's rpc_createerr.
> 
> rpc_createerr.cf_stat=12
> error: No route to host
> 2
> rpc_test 2 TINFO: SELinux enabled in enforcing mode, this may affect test results
> rpc_test 2 TINFO: it can be disabled with TST_DISABLE_SELINUX=1 (requires super/root)
> rpc_test 2 TINFO: install seinfo to find used SELinux profiles
> rpc_test 2 TINFO: loaded SELinux profiles: none
> 
> Summary:
> passed   0
> failed   1
> broken   0
> skipped  0
> warnings 0
> 
> --
> Chuck Lever
> 
> 
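For reference, the instrumentation mentioned above boils down to
something like the sketch below. This is a paraphrase, not the exact
change; the version number and "udp" netid are placeholders, and the
unmodified test simply dereferences the handle that clnt_create(3t)
returns. cf_stat 12 is RPC_SYSTEMERROR in the clnt_stat enum, which
matches the "No route to host" errno.

/*
 * Minimal reproduction of the clnt_create() failure path, roughly
 * what the instrumented tirpc_clnt_destroy now reports.
 *
 * Build (assuming the libtirpc development package is installed):
 *   gcc -o clnt_create_check clnt_create_check.c \
 *       $(pkg-config --cflags --libs libtirpc)
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <rpc/rpc.h>

int main(int argc, char *argv[])
{
        CLIENT *clnt;
        unsigned long prognum;

        if (argc < 3) {
                fprintf(stderr, "usage: %s <server> <prognum>\n",
                        argv[0]);
                return 2;
        }
        prognum = strtoul(argv[2], NULL, 10);

        /*
         * clnt_create() returns NULL on failure and records the
         * reason in the global rpc_createerr. The version and netid
         * here are guesses, not necessarily what the LTP test passes.
         */
        clnt = clnt_create(argv[1], prognum, 1, "udp");
        if (clnt == NULL) {
                printf("rpc_createerr.cf_stat=%d\n",
                       rpc_createerr.cf_stat);
                printf("error: %s\n",
                       strerror(rpc_createerr.cf_error.re_errno));
                return 2;
        }

        clnt_destroy(clnt);
        return 0;
}

What puzzles me is that the library reports EHOSTUNREACH for 10.0.0.2
even though traceroute reaches that same address without trouble.
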

--
Chuck Lever



