[LTP] TI-RPC test failures; network configuration related?
Xinjian Ma (Fujitsu)
maxj.fnst@fujitsu.com
Thu Sep 26 04:02:38 CEST 2024
Hi Chuck
> -----Original Message-----
> From: Chuck Lever III <chuck.lever@oracle.com>
> Sent: September 26, 2024 4:39
> To: Chen, Hanxiao/陈 晗霄 <chenhx.fnst@fujitsu.com>
> Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>; ltp@lists.linux.it; Ma,
> Xinjian/马 新建 <maxj.fnst@fujitsu.com>; Steve Dickson
> <SteveD@redhat.com>
> Subject: Re: TI-RPC test failures; network configuration related?
>
>
>
> > On Sep 25, 2024, at 6:00 AM, Hanxiao Chen (Fujitsu) <chenhx.fnst@fujitsu.com> wrote:
> >
> >
> >
> >> -----Original Message-----
> >> From: ltp <ltp-bounces+chenhx.fnst=fujitsu.com@lists.linux.it> On Behalf Of
> >> Chuck Lever III via ltp
> >> Sent: September 12, 2024 23:50
> >> To: ltp@lists.linux.it
> >> Subject: Re: [LTP] TI-RPC test failures; network configuration related?
> >>
> >>
> >>
> >>> On Aug 29, 2024, at 3:35 PM, Chuck Lever III <chuck.lever@oracle.com> wrote:
> >>>
> >>> For a while now my nightly "runltp -f net.tirpc_tests" runs have thrown
> >>> a bunch of failures, but I haven't had time to look into them until now.
> >>> Without modification, about half of the client test programs
> >>> segfault.
> >>>
> >>> Here's a sample test failure. I instrumented the tirpc_clnt_destroy
> >>> test case and the rpc_tests.sh script as shown below, but I still
> >>> don't understand why clnt_create(3t) is failing.
> >>>
> >
> > Hi, Chuck
> >
> > I can reproduce this issue on my CentOS Stream 10 machine with upstream LTP.
> > libtirpc-1.3.5-0.el10.x86_64
> > rpcbind-1.2.7-2.el10.x86_64
> >
> > In my limited investigation, it looks like libtirpc returns NULL when
> > LTP tries to create a client.
> >
> > 937 __rpcb_findaddr_timed(program, version, nconf, host, clpp, tp) ...
> > 1023 CLNT_CONTROL(client, CLSET_VERS, (char *)(void *)&vers);
> > 1024 clnt_st = CLNT_CALL(client, (rpcproc_t)RPCBPROC_GETADDR,
> > 1025 (xdrproc_t) xdr_rpcb, (char *)(void *)&parms,
> > 1026 (xdrproc_t) xdr_wrapstring, (char *)(void *)&ua, *tp);
> >
> > After the call that ends on line 1026, ua is an empty string ("").
> >
> > 1027 switch (clnt_st) {
> > 1028 case RPC_SUCCESS:
> > 1029 if ((ua == NULL) || (ua[0] == 0)) {
> > 1030 /* address unknown */
> > 1031 rpc_createerr.cf_stat = RPC_PROGNOTREGISTERED;
> > 1032 goto error;
> > 1033 }
> >
> > Maybe rpcbproc_getaddr_com in rpcbind is broken?
>
> The program is registered on one of the veth interfaces, and rpcinfo works
> there. The test program is running on another veth, and it can't see the first
> veth at all (no route to host), so clnt_create(3) fails.
>
> There is some kind of configuration problem on my test system. Was traveling
> last week, but I have some time to look at it again now.
>
>
> > Hi, Ma
> >
> > Can you fix the tirpc test cases so that LTP no longer segfaults?
>
> All the RPC test programs assume that libtirpc will return a non-NULL clnt, and
> simply proceed to call CLNT_DESTROY, which segfaults in these error cases.
>
> If the test configuration is not correct, the API returns NULL and sets cf_stat. It
> would be helpful to display the cf_stat error in those cases, and skip
> CLNT_DESTROY.
Got it. I will send patches to fix the segfaults in LTP.
Best regards
Ma