[LTP] [PATCH 1/1] nfslock01.sh: Don't test on NFS v3 on TCP
Jeff Layton
jlayton@kernel.org
Tue May 2 14:25:46 CEST 2023
On Tue, 2023-05-02 at 09:59 +0200, Petr Vorel wrote:
> nfs_flock (run via nfslock01.sh) is known to fail on NFS v3 [1]:
>
> not unsharing /var makes AF_UNIX socket for host's rpcbind to become
> available inside ltp_ns. Then, at NFS v3 mount time, kernel creates
> an instance of lockd for ltp_ns, and ports for that instance leak to
> host's rpcbind and overwrite ports for lockd already active for root
> namespace. This breaks nfs3 file locking.
>
Yeccchhh...that is pretty nasty.

rpcbind was obviously written in a time before namespaces were even a
thought to anyone. I wonder if there is something we can do in rpcbind
itself to guard against these sorts of shenanigans? Probably not, I
guess...

Is /var shared between namespaces in this test for some particular
reason?

> Before bd512e733 ("nfs_flock: fail the test if lock/unlock ops fail")
> it ran indefinitely with "unhandled error -107":
> [ 2840.099565] lockd: cannot monitor 10.0.0.2
> [ 2840.109353] lockd: cannot monitor 10.0.0.2
> [ 2843.286811] xs_tcp_setup_socket: connect returned unhandled error -107
> [ 2850.198791] xs_tcp_setup_socket: connect returned unhandled error -107
>
> bd512e733 caused an early abort (therefore only "cannot monitor 10.0.0.2"
> appears).
>
> Although there is a suggestion for how to fix the problem in the kernel [2]:
>
> > Maybe rpcb_create_local() shall detect that it is not in root
> > netns, and only try AF_INET connection to localhost in that case.
>
> That would be simple and might be sensible. IF changing the AF_UNIX
> path to "/run/rpcbind.sock" isn't sufficient, then testing for the
> root_ns is probably the best second option.
>
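To make the "not in root netns" check concrete, here is a rough userspace
sketch of the same idea. It is only an illustration: the kernel side would
presumably compare against init_net rather than look at /proc, and the
helper names below (same_namespace, in_init_netns) are made up for this
example. It also assumes PID 1 lives in the initial network namespace,
which is not true inside a container with its own PID namespace, and that
the caller is allowed to stat /proc/1/ns/net (usually root only).

```python
import os


def same_namespace(path_a: str, path_b: str) -> bool:
    """Return True if both paths refer to the same namespace object.

    Two handles under /proc/<pid>/ns/ name the same namespace exactly
    when their device and inode numbers match.
    """
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def in_init_netns(pid: str = "self") -> bool:
    """Guess whether `pid` runs in the initial network namespace.

    Assumption: PID 1 is in the initial netns and /proc/1/ns/net is
    readable by the caller (typically needs root).
    """
    return same_namespace(f"/proc/{pid}/ns/net", "/proc/1/ns/net")
```

Running in_init_netns() as root inside ltp_ns should then tell you whether a
process there would count as "root netns" under this definition.
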
Was it determined that changing the location of the socket wasn't
sufficient to fix this? FWIW, my Fedora 38 machine seems to listen on
that socket already:
[Socket]
ListenStream=/run/rpcbind.sock
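
For anyone who wants to check their own box, something like the sketch
below should show which rpcbind socket paths actually exist. The candidate
list is an assumption on my part: /run/rpcbind.sock is the path from the
unit above, and /var/run/rpcbind.sock is the path I'd expect the kernel to
try given the /var discussion. The helper name unix_socket_paths is made up
for this example. Note that it only checks that a socket file is present,
not that anything is listening on it.

```python
import os
import stat

# Assumption: the two paths under discussion. On many distros /var/run is
# a symlink to /run, so both entries may resolve to the same socket.
CANDIDATES = ("/var/run/rpcbind.sock", "/run/rpcbind.sock")


def unix_socket_paths(candidates=CANDIDATES):
    """Return the candidate paths that exist and are AF_UNIX socket files."""
    found = []
    for path in candidates:
        try:
            mode = os.stat(path).st_mode
        except OSError:
            # Missing path or no permission: treat it as "not there".
            continue
        if stat.S_ISSOCK(mode):
            found.append(path)
    return found


if __name__ == "__main__":
    for path in unix_socket_paths():
        # realpath shows whether two candidates are really the same socket.
        print(path, "->", os.path.realpath(path))
```

Running it both on the host and inside ltp_ns would show whether the host's
socket is still visible from the namespace.
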
--
Jeff Layton <jlayton@kernel.org>