Test environment
- Two hosts, A (10.25.151.100) and B (10.25.151.101), can reach each other, and Docker is installed on both
# docker -v
Docker version 1.13.1, build b2f74b2/1.13.1
- etcd is installed on host A to synchronize data
# etcd -v
2019-03-26 22:46:28.933879 W | pkg/flags: flag "-v" is no longer supported - ignoring.
2019-03-26 22:46:28.934207 I | etcdmain: etcd Version: 3.3.11
2019-03-26 22:46:28.934221 I | etcdmain: Git SHA: 2cf9e51
2019-03-26 22:46:28.934228 I | etcdmain: Go Version: go1.10.3
2019-03-26 22:46:28.934234 I | etcdmain: Go OS/Arch: linux/amd64
2019-03-26 22:46:28.934242 I | etcdmain: setting maximum number of CPUs to 1, total number of available CPUs is 1
2019-03-26 22:46:28.934256 W | etcdmain: no data-dir provided, using default data-dir ./default.etcd
2019-03-26 22:46:28.938159 C | etcdmain: listen tcp 127.0.0.1:2380: bind: address already in use
- flannel is installed and its network range is configured
# flanneld -version
0.7.1
# etcdctl mk /atomic.io/network/config '{ "Network": "182.48.0.0/16" }'
{ "Network": "182.48.0.0/16" }
- On host A, both docker0 and flannel0 have been assigned IP addresses
# ifconfig flannel0 | grep 182
inet 182.48.56.0 netmask 255.255.0.0 destination 182.48.56.0
# ifconfig docker0 | grep 182
inet 182.48.56.1 netmask 255.255.255.0 broadcast 0.0.0.0
- On host A, a container named "centos" is started from the docker.io/centos image and obtains an IP address
# ifconfig | grep 182
inet 182.48.56.2 netmask 255.255.255.0 broadcast 0.0.0.0
- On host B, both docker0 and flannel0 have been assigned IP addresses
# ifconfig flannel0 | grep 182
inet 182.48.72.0 netmask 255.255.0.0 destination 182.48.72.0
# ifconfig docker0 | grep 182
inet 182.48.72.1 netmask 255.255.255.0 broadcast 0.0.0.0
- On host B, a container named "centos" is started from the docker.io/centos image and obtains an IP address
# ifconfig | grep 182
inet 182.48.72.2 netmask 255.255.255.0 broadcast 0.0.0.0
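The addresses listed above can be cross-checked against the flannel configuration: every flannel0, docker0, and container address should fall inside 182.48.0.0/16, and the two containers should sit in different per-host subnets. The following is a minimal POSIX sh sketch of that check. The helper names `ip_to_int` and `in_cidr` are made up for illustration, and the /24 per-host subnet size is inferred from the docker0 netmask 255.255.255.0 rather than read from flannel itself.

```shell
#!/bin/sh
# Hypothetical helpers (not part of docker, etcd, or flannel).

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    _old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its four octets
    IFS=$_old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed if address $1 lies inside CIDR block $2, e.g. 182.48.0.0/16.
in_cidr() {
    _addr=$(ip_to_int "$1")
    _net=$(ip_to_int "${2%/*}")
    _prefix=${2#*/}
    _mask=$(( (4294967295 << (32 - $_prefix)) & 4294967295 ))
    [ $(( $_addr & $_mask )) -eq $(( $_net & $_mask )) ]
}

# Addresses taken from the ifconfig output above.
for ip in 182.48.56.0 182.48.56.1 182.48.56.2 182.48.72.0 182.48.72.1 182.48.72.2; do
    if in_cidr "$ip" 182.48.0.0/16; then
        echo "$ip is inside the flannel network 182.48.0.0/16"
    else
        echo "$ip is OUTSIDE the flannel network 182.48.0.0/16"
    fi
done

# The two containers should land in different per-host /24 subnets.
if in_cidr 182.48.72.2 182.48.56.0/24; then
    echo "both containers share host A's subnet"
else
    echo "containers are in different subnets, so traffic between them must be routed via flannel0"
fi
```

If any address fell outside the configured network, the problem would lie in address assignment rather than in packet forwarding, so this check helps narrow down where to look.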
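The problem
In one sentence: in UDP mode, containers on different nodes cannot reach each other by IP through Flannel
Troubleshooting
- Flannel uses its default UDP mode
- From the container on host A, continuously ping the IP address of the container on host B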
[root@976483e7ea80 /]# ping 182.48.72.2
PING 182.48.72.2 (182.48.72.2) 56(84) bytes of data.
- A capture on host A's docker0 shows the packets are still correct at this point: the source IP is 182.48.56.2
# tcpdump -i docker0 -enn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on docker0, link-type EN10MB (Ethernet), capture size 262144 bytes
22:54:37.359239 02:42:b6:30:38:02 > 02:42:33:38:70:51, ethertype IPv4 (0x0800), length 98: 182.48.56.2 > 182.48.72.2: ICMP echo request, id 18, seq 90, length 64
22:54:38.359238 02:42:b6:30:38:02 > 02:42:33:38:70:51, ethertype IPv4 (0x0800), length 98: 182.48.56.2 > 182.48.72.2: ICMP echo request, id 18, seq 91, length 64
- However, a capture on host A's flannel0 shows that the source IP has been rewritten to 182.48.56.0!
# tcpdump -i flannel0 -enn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on flannel0, link-type RAW (Raw IP), capture size 262144 bytes
22:57:05.359275 ip: 182.48.56.0 > 182.48.72.2: ICMP echo request, id 18, seq 238, length 64
22:57:06.359272 ip: 182.48.56.0 > 182.48.72.2: ICMP echo request, id 18, seq 239, length 64
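- A capture on host B's flannel0 shows the same packets as in the previous step
- A capture on host B's docker0, however, shows that the ICMP echo requests have disappeared
- Since the container on host B never receives the ICMP echo requests, it naturally sends no ICMP echo replies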
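The comparison between the docker0 and flannel0 captures can also be done mechanically instead of by eye. The sketch below extracts the source address of each ICMP echo request from `tcpdump -enn` output and reports whether it changed between the two interfaces. The function name `echo_request_sources` is hypothetical; the two input lines are copied from the captures above.

```shell
#!/bin/sh
# Hypothetical helper: read `tcpdump -enn` output on stdin and print the
# distinct source IPs of all ICMP echo requests.
echo_request_sources() {
    awk '/ICMP echo request/ {
        # Find the ">" whose right-hand side is an IPv4 address followed by ":";
        # this skips the MAC-address pair that -e prints on Ethernet links.
        for (i = 2; i < NF; i++)
            if ($i == ">" && $(i + 1) ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+:$/)
                print $(i - 1)
    }' | sort -u
}

# One line from each capture on host A, copied from the output above.
docker0_line='22:54:37.359239 02:42:b6:30:38:02 > 02:42:33:38:70:51, ethertype IPv4 (0x0800), length 98: 182.48.56.2 > 182.48.72.2: ICMP echo request, id 18, seq 90, length 64'
flannel0_line='22:57:05.359275 ip: 182.48.56.0 > 182.48.72.2: ICMP echo request, id 18, seq 238, length 64'

docker0_src=$(printf '%s\n' "$docker0_line" | echo_request_sources)
flannel0_src=$(printf '%s\n' "$flannel0_line" | echo_request_sources)

if [ "$docker0_src" != "$flannel0_src" ]; then
    echo "source rewritten between docker0 and flannel0: $docker0_src -> $flannel0_src"
else
    echo "source unchanged: $docker0_src"
fi
```

On a live host the same function could be fed directly, for example `tcpdump -i flannel0 -enn -c 5 icmp | echo_request_sources`.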
One more thing
- If the container on host A instead pings host B's docker0 address, 182.48.72.1, the ping succeeds
[root@976483e7ea80 /]# ping 182.48.72.1
PING 182.48.72.1 (182.48.72.1) 56(84) bytes of data.
64 bytes from 182.48.72.1: icmp_seq=1 ttl=61 time=0.539 ms
64 bytes from 182.48.72.1: icmp_seq=2 ttl=61 time=0.708 ms
- A capture on host B's flannel0 shows that the source IP is again rewritten to 182.48.56.0
# tcpdump -i flannel0 -enn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on flannel0, link-type RAW (Raw IP), capture size 262144 bytes
23:01:07.454285 ip: 182.48.56.0 > 182.48.72.1: ICMP echo request, id 20, seq 4, length 64
23:01:07.455040 ip: 182.48.72.1 > 182.48.56.0: ICMP echo reply, id 20, seq 4, length 64
- Host B's flannel0 shows the same packets as in the previous step
- Host B's docker0 still shows no packets
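The two experiments differ in one respect: 182.48.72.1 is an address owned by host B itself, so the request is delivered locally, while 182.48.72.2 belongs to a container, so host B has to forward the packet from flannel0 to docker0. That makes host B's forwarding path a natural next place to look. The captures above do not establish the cause; as an assumption to verify, a FORWARD chain whose default policy is DROP would produce this pattern. The sketch below reads the output of `iptables -S FORWARD` and prints the chain's default policy. The helper name `forward_policy` is hypothetical, and the sample input is illustrative, not output captured from host B.

```shell
#!/bin/sh
# Hypothetical helper: read `iptables -S FORWARD` output on stdin and
# print the default policy of the FORWARD chain (ACCEPT or DROP).
forward_policy() {
    awk '$1 == "-P" && $2 == "FORWARD" { print $3 }'
}

# On host B (as root) this would be run as:
#   iptables -S FORWARD | forward_policy
#   sysctl net.ipv4.ip_forward
# The input below is an illustrative sample, NOT output captured from host B.
sample='-P FORWARD DROP'
policy=$(printf '%s\n' "$sample" | forward_policy)

if [ "$policy" = "DROP" ]; then
    echo "FORWARD policy is DROP: packets from flannel0 to docker0 need an explicit ACCEPT rule"
else
    echo "FORWARD policy is $policy: look elsewhere, e.g. at the NAT rules"
fi
```

If the policy turns out to be ACCEPT and `net.ipv4.ip_forward` is 1, the forwarding assumption is ruled out and the rewritten source address seen on flannel0 is the next thing to examine.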