[LTP] LTP nfslock01 test failing on NFS v3 (lockd: cannot monitor 10.0.0.2)

NeilBrown neilb@suse.de
Tue Jan 18 23:13:35 CET 2022


On Wed, 19 Jan 2022, Nikita Yushchenko wrote:
> 18.01.2022 18:26, Petr Vorel wrote:
> > Hi all,
> > 
> > this is a test failure posted by Nikita Yushchenko [1]. LTP NFS test nfslock01
> > looks to be failing on NFS v3:
> > 
> > "not unsharing /var makes AF_UNIX socket for host's rpcbind to become available
> > inside ltpns. Then, at nfs3 mount time, kernel creates an instance of lockd for
> > ltpns, and ports for that instance leak to host's rpcbind and overwrite ports
> > for lockd already active for root namespace. This breaks nfs3 file locking."
> 
> What exactly happens is:
> 
> The test runs 'mount' in a non-root netns, trying to mount a directory from the root netns of the same host via NFSv3.
> 
> (Part of) the call chain inside the kernel:
> 
> nfs_try_get_tree()
>   nfs3_create_server()
>    nfs_create_server()
>     nfs_init_server()
>      nfs_start_lockd()
>       nlmclnt_init()
>        lockd_up()
>         svc_bind()
>          svc_rpcb_setup()
>           rpcb_create_local()
> 
> ... and at this point it tries AF_UNIX connection to /var/run/rpcbind.sock
> 
> AF_UNIX is not netns-aware.
> So it connects to host's rpcbind.
> And it overwrites the ports registered in the host's rpcbind by the lockd instance for the root namespace. From this
> point on, the lockd instance for the root namespace is no longer reachable (it still listens, but nobody can
> learn its ports). Thus NFS locks don't work.
> 
> I'm not sure what is the correct behavior here.
> 
> Maybe rpcb_create_local() should detect that it is not in the root netns, and only try an AF_INET connection to
> localhost in that case.

That would be simple and might be sensible.  If changing the AF_UNIX
path to "/run/rpcbind.sock" isn't sufficient, then testing for the
root_ns is probably the best second option.
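
To make that second option concrete, here is a rough sketch of the
decision it implies. It is a userspace Python model, not kernel code:
the names Transport and rpcbind_transports are made up for
illustration, and the caller is assumed to already know whether it is
in the root netns. The only behaviour it takes from the thread is that
a non-root netns would skip AF_UNIX and try only AF_INET to localhost.

```python
from enum import Enum
from typing import List


class Transport(Enum):
    """Ways a client could reach the local rpcbind (illustrative only)."""
    AF_UNIX = "AF_UNIX socket on the filesystem"
    AF_INET = "AF_INET connection to localhost"


def rpcbind_transports(in_root_netns: bool) -> List[Transport]:
    """Return the transports to try, in order (hypothetical helper).

    Assumption: in the root netns the AF_UNIX socket is tried first,
    with AF_INET as the fallback.
    """
    if in_root_netns:
        return [Transport.AF_UNIX, Transport.AF_INET]
    # AF_UNIX path sockets are not netns-aware, so from a non-root
    # netns the path may lead to the host's rpcbind. Only use the
    # loopback of the current namespace.
    return [Transport.AF_INET]
```

With this model, rpcbind_transports(False) never includes AF_UNIX, so a
lockd started for a non-root netns could not register its ports with
the host's rpcbind through the shared socket path.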

Thanks,
NeilBrown

