[LTP] LTP nfslock01 test failing on NFS v3 (lockd: cannot monitor 10.0.0.2)

Wed Jan 19 06:28:47 CET 2022

19.01.2022 08:26, Nikita Yushchenko wrote:
>> Big picture is - lockd tries to be per-netns, but lockd isn't standalone, it depends on rpcbind, and 
>> rpcbind isn't guaranteed to be per-netns.
>>
>> One can argue that it is not kernel's job to provide per-netns rpcbind.
>>
>> Still, the current situation is - by default, doing an nfs mount from within netns B immediately 
>> breaks lockd serving nfs mounts exported from different netns A. "By default" = "as long as nfsmount 
>> process executed in netns B is also in a different mount namespace that has RPCBIND_SOCK_PATHNAME not 
>> pointing to AF_UNIX socket instance owned by rpcbind serving netns A.
>>
>> Although in LTP's 'nfslock01' test the "non working locking" is reproduced on the same mount that 
>> triggered the breakage, the breakage is not limited to that mount. Since that mount operation in netns 
>> B, any client of nfs exports from netns A will get locking broken - including clients running on 
>> different physical hosts.
>>
>> I'd say that using AF_UNIX connection from lockd to rpcbind does not play well with per-netns lockd.
>>
>> Solution to use AF_UNIX connection to rpcbind only for lockd serving root netns, and using AF_INET 
>> otherwise - looks more sane.
> 
> Btw, not sure (did not test) what will happen if nfs server will be similarly started in netns B.  Will 
> it hijack requests addressed to nfs server running in netns A?

No it won't "hijack"...  because in will still listen inside netns B only.  But, if ports in rpcbind get 
overwritten in the similar manner, nfs server running in netns A will become no longer reachable.