[LTP] [RFC] [PATCH] netns: Fix race in virtual interface bringup

Li Wang liwang@redhat.com
Fri Nov 17 07:09:13 CET 2017


Hi Dan,

On Fri, Nov 10, 2017 at 4:38 AM, Dan Rue <dan.rue@linaro.org> wrote:
> Symptoms (+ command, error):
>     netns_comm_ip_ipv6_ioctl:
>         + ip netns exec tst_net_ns1 ping6 -q -c2 -I veth1 fd00::2
>         connect: Cannot assign requested address
>
>     netns_comm_ip_ipv6_netlink:
>         + ip netns exec tst_net_ns0 ping6 -q -c2 -I veth0 fd00::3
>         connect: Cannot assign requested address
>
>     netns_comm_ns_exec_ipv6_ioctl:
>         + ns_exec 6689 net ping6 -q -c2 -I veth0 fd00::3
>         connect: Cannot assign requested address
>
>     netns_comm_ns_exec_ipv6_netlink:
>         + ns_exec 6891 net ping6 -q -c2 -I veth0 fd00::3
>         connect: Cannot assign requested address
>
> The error is coming from ping6, which is trying to get an IP address for
> veth0 (due to -I veth0), but cannot. Waiting for two seconds fixes the
> test in my testcases. 1 second is not long enough.
>
> dmesg shows the following during the test:
>
>     [Nov 7 15:39] LTP: starting netns_comm_ip_ipv6_ioctl (netns_comm.sh ip ipv6 ioctl)
>     [  +0.302401] IPv6: ADDRCONF(NETDEV_UP): veth0: link is not ready
>     [  +0.048059] IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
>
> Signed-off-by: Dan Rue <dan.rue@linaro.org>
> ---
>
> We've periodically hit this problem across many arm64 kernels and boards, and
> it seems to be caused by "ping6" running before the virtual interface is
> actually ready. "sleep 2" works around the issue and proves that it is a race
> condition, but I would prefer something faster and deterministic. Please
> suggest a better implementation.

Just FYI:

I'm not a networking expert, but one method, which I copied from the
ltp/numa test, is to split the 2-second sleep into many short polling
intervals and stop waiting as soon as the interface is ready.

Something like:

--- a/testcases/kernel/containers/netns/netns_helper.sh
+++ b/testcases/kernel/containers/netns/netns_helper.sh
@@ -240,6 +240,22 @@ netns_ip_setup()
                tst_brkm TBROK "unable to add device veth1 to the separate network namespace"
 }

+wait_for_set_ip()
+{
+       local dev=$1
+       local retries=200
+
+       while [ $retries -gt 0 ]; do
+               dmesg -c | grep -q "IPv6: ADDRCONF(NETDEV_CHANGE): $dev: link becomes ready"
+               if [ $? -eq 0 ]; then
+                       break
+               fi
+
+               retries=$((retries-1))
+               tst_sleep 10ms
+       done
+}
+
 ##
 # Enables virtual ethernet devices and assigns IP addresses for both
 # of them (IPv4/IPv6 variant is decided by netns_setup() function).
@@ -285,6 +301,9 @@ netns_set_ip()
                        tst_brkm TBROK "enabling veth1 device failed"
                ;;
        esac
+
+       wait_for_set_ip veth0
+       wait_for_set_ip veth1
 }

 netns_ns_exec_cleanup()
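
The same polling idea can also be written as a generic helper that takes
the readiness check as a command, so the loop does not depend on dmesg
output. This is only a rough sketch: wait_until is a hypothetical name,
not an existing LTP function, and plain sleep stands in for tst_sleep
(fractional delays such as 0.01 assume a sleep that accepts them, e.g.
GNU coreutils).

```shell
# wait_until: hypothetical helper, not part of LTP.
# Runs COMMAND repeatedly until it succeeds or the retry budget runs out.
# Usage: wait_until RETRIES DELAY COMMAND [ARGS...]
wait_until()
{
	local retries=$1
	local delay=$2
	shift 2

	while [ "$retries" -gt 0 ]; do
		# Stop polling as soon as the condition holds
		if "$@"; then
			return 0
		fi

		retries=$((retries - 1))
		sleep "$delay"
	done

	# Retry budget exhausted; let the caller decide how to report it
	return 1
}
```

For example, "wait_until 200 0.01 some_check veth0" would poll for at
most about 2 seconds, where some_check is whatever condition turns out
to be the right one. One candidate, assuming the delay is caused by the
IPv6 address still being tentative, would be to check that
"ip -6 addr show dev veth0 tentative" prints nothing. Unlike the patch
above, the helper returns non-zero on timeout, so the caller can fail
the test with TBROK instead of silently continuing.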


>
> Also, is it correct that "ifconfig veth0 up" returns before the interface is
> actually ready?
>
> See also this isolated test script:
> https://gist.github.com/danrue/7b76bbcbc23a6296030b7295650b69f3
>
>  testcases/kernel/containers/netns/netns_helper.sh | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/testcases/kernel/containers/netns/netns_helper.sh b/testcases/kernel/containers/netns/netns_helper.sh
> index a95cdf206..99172c0c0 100755
> --- a/testcases/kernel/containers/netns/netns_helper.sh
> +++ b/testcases/kernel/containers/netns/netns_helper.sh
> @@ -285,6 +285,7 @@ netns_set_ip()
>                         tst_brkm TBROK "enabling veth1 device failed"
>                 ;;
>         esac
> +       sleep 2
>  }
>
>  netns_ns_exec_cleanup()
> --
> 2.13.6
>
>
> --
> Mailing list info: https://lists.linux.it/listinfo/ltp



-- 
Li Wang
liwang@redhat.com

