# - Multi-network access scenario: flow table design overview
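In the multi-network access scenario, when network resources are configured on the cloud platform controller (for example creating a subnet, creating a port, or attaching a port), the cloud platform controller on the DPU responds to these operations and programs the corresponding flow tables.
The flow table design resembles the Linux kernel networking stack: it provides L2 and L3 forwarding and access to local services, and it has hook points similar to the netfilter framework that implement port and MAC binding, egress and ingress rate limiting, security groups (stateful supported), ACLs, and so on. **Protocols such as IPv6 are also supported.**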
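## - Flow table pipeline structure
The pipeline of the multi-network access scenario flow tables can be roughly divided into the following phases:
- phase1: classify by source port type (vxlan port or local port)
- phase2: egress rate limiting, covering BPS and PPS
- phase3: port binding check
- phase4: egress security group (stateful supported)
- phase5: access to other local services, such as DHCP, ARP spoofing for HAVIP, and the metadata service
- phase6: L2 forwarding, either to a local port or out through the tunnel
- phase7: L3 forwarding: reach the gateway and look up the route, rewriting the packet the way a router does
- phase8: L2 forwarding again after the route lookup, now with the new VNI and subnet
- phase9: ingress security group
- phase10: ingress rate limiting, BPS and PPS
- phase11: output on the final egress port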
----
## Key fields
##### Registers
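1. reg3: ct_label
2. reg5: tun_id, the tunnel ID of the large L2 domain
3. reg6: in-port ofport number
4. reg7: out-port ofport number
5. reg8: unknown
6. reg9: VM ID
7. reg11: ct_mark, which can be used to implement a stateful firewall and similar features
8. metadata: VPC + subnet
**Other registers have not been observed in use so far.**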
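> In real flow syntax, actions mostly refer to registers and match fields with the NXM_NX_REG and OXM_OF_ prefixes. These are simply different naming schemes for the same fields (NXM is the Nicira extension, OXM is the standard OpenFlow extensible match format); for example, reg5 is equivalent to NXM_NX_REG5, and in_port is equivalent to OXM_OF_IN_PORT.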
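
The register list says metadata carries "VPC + subnet". A minimal sketch of how such a value could be packed and unpacked is shown below. The 32-bit split is an assumption inferred from values that appear in the flow dumps of this document (for example `0xd54dd00000001` next to `reg5=0xd54dd`); the function names are hypothetical and the controller may use a different layout.

```python
from typing import Tuple

# Assumption: the subnet ID occupies the low 32 bits of the 64-bit metadata
# field and the VNI sits above it. Inferred from sample values only.
SUBNET_BITS = 32


def split_metadata(metadata: int) -> Tuple[int, int]:
    """Return (vni, subnet_id) decoded from an OpenFlow metadata value."""
    vni = metadata >> SUBNET_BITS
    subnet_id = metadata & ((1 << SUBNET_BITS) - 1)
    return vni, subnet_id


def make_metadata(vni: int, subnet_id: int) -> int:
    """Pack a VNI and subnet ID back into a metadata value."""
    return (vni << SUBNET_BITS) | subnet_id


if __name__ == "__main__":
    vni, subnet_id = split_metadata(0xD54DD00000001)
    print(hex(vni), subnet_id)
```

Under this assumption, the same-subnet example uses subnet 1 and the cross-subnet example rewrites metadata to subnet 2 while the VNI stays `0xd54dd`.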
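##### OVS conntrack states
> This section covers the most commonly used ct states and how OVS handles them. OVS matches connection state through the ct_state match field and performs the related operations in the ct action. Each ct_state flag is a single bit that indicates whether that state is set.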
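1. +trk: set for every packet that has been through the ct module
2. +new: after entering ct, no existing connection is found, so a new one is created; set together with +trk
3. +est: set once packets have been seen in both directions of the connection; mutually exclusive with +new
4. +rel: related to another existing session, for example an ICMP unreachable message, or the control session and data session of FTP or iperf
5. +rpl: reply packets, that is, packets in the reverse direction of the connection
6. +inv: invalid ct state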
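
To make the bit-flag semantics concrete, here is a small sketch that parses a ct_state match string such as `+new-est-rel-inv+trk` and checks it against the flags set on a packet. It only models the six flags listed above; the helper names are hypothetical and this is not how OVS itself evaluates matches.

```python
import re

# Only the flags described in this document; OVS defines further ones.
CT_FLAGS = ("new", "est", "rel", "rpl", "inv", "trk")


def parse_ct_state(spec):
    """Parse '+new-est+trk' into (must_be_set, must_be_clear) sets."""
    required, forbidden = set(), set()
    for sign, flag in re.findall(r"([+-])\s*([a-z]+)", spec):
        if flag not in CT_FLAGS:
            raise ValueError("unknown ct_state flag: " + flag)
        (required if sign == "+" else forbidden).add(flag)
    return required, forbidden


def ct_state_matches(spec, packet_flags):
    """True if every '+' flag is set and every '-' flag is clear."""
    required, forbidden = parse_ct_state(spec)
    flags = set(packet_flags)
    return required <= flags and not (forbidden & flags)


if __name__ == "__main__":
    first_packet = {"trk", "new"}
    print(ct_state_matches("+new-est-rel-inv+trk", first_packet))
    print(ct_state_matches("-new+est-rel-inv+trk", first_packet))
```

Flags that a match string does not mention (for example rpl in the two strings above) are wildcarded, which is why a reply packet can still hit the `+est` flow.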
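##### OVS conntrack action
1. table: the table the packet jumps to after the ct action completes
2. commit: commits the connection once a +trk packet has matched on ct_state, that is, moves the session from the unconfirmed table to the confirmed table
3. zone: the ct context; sessions in different zones are completely isolated from each other
4. nat: performs forward SNAT, DNAT, and so on, and can also perform reverse NAT
5. exec([action][,action…]): applies modifications to the ct session, such as setting ct_mark
6. force: forces the commit and rebuilds the session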
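
The interaction of `zone`, `commit`, and the resulting ct_state can be illustrated with a toy model. This is a deliberately simplified sketch, not the OVS or kernel conntrack implementation: the class name, the use of bare host names as flow keys, and the state rules are assumptions made for illustration. The zone is derived from the low 16 bits of a register, mirroring `zone=NXM_NX_REG6[0..15]` in the flows shown later.

```python
class ToyConntrack:
    """Toy connection tracker keyed by (zone, unordered endpoint pair)."""

    def __init__(self):
        # (zone, key) -> {"orig": (src, dst), "seen_reply": bool}
        self._conns = {}

    @staticmethod
    def zone_from_reg(reg_value):
        """Mirror zone=NXM_NX_REGn[0..15]: keep only the low 16 bits."""
        return reg_value & 0xFFFF

    @staticmethod
    def _key(src, dst):
        return tuple(sorted((src, dst)))

    def lookup(self, zone, src, dst):
        """Model ct(table=...,zone=...): return the ct_state flags."""
        conn = self._conns.get((zone, self._key(src, dst)))
        if conn is None:
            return {"trk", "new"}
        if (src, dst) != conn["orig"]:
            conn["seen_reply"] = True
            return {"trk", "est", "rpl"}
        return {"trk", "est"} if conn["seen_reply"] else {"trk", "new"}

    def commit(self, zone, src, dst):
        """Model ct(commit,...): remember the connection in this zone."""
        self._conns.setdefault(
            (zone, self._key(src, dst)),
            {"orig": (src, dst), "seen_reply": False},
        )


if __name__ == "__main__":
    ct = ToyConntrack()
    zone = ToyConntrack.zone_from_reg(0x4)
    print(sorted(ct.lookup(zone, "vm-a", "vm-b")))
    ct.commit(zone, "vm-a", "vm-b")
    print(sorted(ct.lookup(zone, "vm-b", "vm-a")))
```

Because the zone is part of the key, the same pair of endpoints looked up in another zone is still reported as new, which is the isolation property described in item 3.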
----
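## Flow table implementation analysis
###### VPC implementation
> A VXLAN overlay provides the large L2 domain. ARP flooding is avoided by answering ARP locally, and metadata and reg5 are used in the flow tables to isolate subnets and VPCs.
###### L2 access
> For access within the large L2 domain, cross-node ARP requests (including requests for the gateway MAC) are answered locally. A local lookup then decides whether the packet leaves through the tunnel or goes to a representor port on this node. If the destination MAC is on this node, the packet is sent to the corresponding representor port.
> If the destination MAC is not local, the packet is encapsulated with the corresponding vtep and tunid and sent out of the vxlan port.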
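
The local-versus-tunnel decision described above can be sketched as a lookup keyed by (VNI, destination MAC). The table contents, the `VXLAN_OFPORT` constant, and the VTEP address `192.0.2.70` are placeholders chosen for illustration (the real VTEP addresses are masked in this document); only the shape of the decision follows the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

VXLAN_OFPORT = 2  # assumption: ofport number of the vxlan port


@dataclass
class L2Decision:
    out_port: int                   # value that would be written to reg7
    tun_id: Optional[int] = None    # set only for tunnelled packets
    tun_dst: Optional[str] = None   # remote VTEP, set only for tunnelled packets


def l2_lookup(
    vni: int,
    dst_mac: str,
    local_macs: Dict[Tuple[int, str], int],
    remote_macs: Dict[Tuple[int, str], str],
) -> Optional[L2Decision]:
    """Pick a local representor port or a VXLAN tunnel for (vni, dst_mac)."""
    key = (vni, dst_mac)
    if key in local_macs:
        return L2Decision(out_port=local_macs[key])
    if key in remote_macs:
        return L2Decision(VXLAN_OFPORT, tun_id=vni, tun_dst=remote_macs[key])
    return None  # no entry: this sketch treats the packet as unmatched


if __name__ == "__main__":
    local = {(0xD54DD, "fa:16:3e:17:4b:9d"): 4}
    remote = {(0xD54DD, "fa:16:3e:46:09:25"): "192.0.2.70"}
    print(l2_lookup(0xD54DD, "fa:16:3e:46:09:25", local, remote))
```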
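###### L3 access
> For L3 access, the L2 lookup finds that the destination MAC is the MAC address of the subnet gateway. The pipeline then pre-checks whether the TTL requires the packet to be dropped and checks whether any ACL applies. If nothing blocks the packet, it looks up the destination prefix while matching on the source VNI, sets the tunid to the new VNI, sets the source MAC to the gateway MAC of the new VNI's subnet, sets the destination MAC to the MAC of the destination IP, and jumps back to the L2 table for lookup. The return path is similar.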
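
The routing step described above amounts to a TTL check followed by a header rewrite. A minimal sketch under stated assumptions: packets are plain dictionaries, the route table is keyed by (VNI, destination IP), and the address `10.0.11.4` is a hypothetical stand-in for the masked address in this document. The MAC and metadata values are taken from the cross-subnet example.

```python
from typing import Dict, Optional, Tuple

# (vni, destination IP) -> (new metadata, MAC of destination IP, gateway MAC)
RouteTable = Dict[Tuple[int, str], Tuple[int, str, str]]


def route(packet: dict, routes: RouteTable) -> Optional[dict]:
    """Return the rewritten packet, or None if it must be dropped."""
    if packet["ttl"] <= 1:
        return None  # TTL of 0 or 1 cannot be forwarded
    entry = routes.get((packet["vni"], packet["ip_dst"]))
    if entry is None:
        return None  # no route in this sketch means drop
    metadata, dst_mac, gateway_mac = entry
    routed = dict(packet)
    routed.update(
        metadata=metadata,      # switch to the destination subnet
        eth_dst=dst_mac,        # was the gateway MAC before routing
        eth_src=gateway_mac,    # gateway of the destination subnet
        ttl=packet["ttl"] - 1,  # same effect as dec_ttl
    )
    return routed


if __name__ == "__main__":
    routes = {
        (0xD54DD, "10.0.11.4"): (
            0xD54DD00000002, "fa:16:3e:74:20:c6", "fa:16:3e:c4:ed:57"),
    }
    pkt = {"vni": 0xD54DD, "metadata": 0xD54DD00000001, "ip_dst": "10.0.11.4",
           "eth_src": "fa:16:3e:17:4b:9d", "eth_dst": "fa:16:3e:ec:22:0d",
           "ttl": 64}
    print(route(pkt, routes))
```

After the rewrite the packet carries the real destination MAC, so the second L2 lookup can resolve it like any same-subnet packet.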
###### Rate limiting
> BPS and PPS rate limiting in both the egress and ingress directions is implemented with meter instances.
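
As background for what a BPS or PPS limit does, the sketch below shows a generic token bucket. This is a common textbook technique, named here plainly as a stand-in: the document does not describe how the meters are implemented internally, and the class and parameter names are hypothetical. For a PPS limit the cost is one token per packet; for a BPS limit the cost would be the packet size in bits.

```python
class TokenBucket:
    """Generic token bucket; time is passed in explicitly for determinism."""

    def __init__(self, rate, burst):
        self.rate = float(rate)      # tokens added per second
        self.burst = float(burst)    # maximum number of stored tokens
        self.tokens = float(burst)   # start with a full bucket
        self.last = 0.0              # timestamp of the previous update

    def allow(self, now, cost=1.0):
        """Return True if a packet costing `cost` tokens may pass at `now`."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # over the limit: a drop band would discard the packet


if __name__ == "__main__":
    pps = TokenBucket(rate=2, burst=2)
    print([pps.allow(0.0) for _ in range(3)])
```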
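###### Security groups (stateful supported)
> Stateless security groups are implemented at port granularity; stateful security groups are also supported by using ct state.
###### ACL
> Matches on source, destination, protocol number, and so on, then allows or denies the packet.
###### NAT
> Access to some local services uses NAT. NAT in OVS is implemented through different parameters of the ct action.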
----
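## Main access scenarios
- Same VPC, same subnet, same host
- Same VPC, same subnet, across hosts
- Same VPC, across subnets, same host
- Same VPC, across subnets, across hosts
- Cross-VPC L3 access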
# Access scenario examples
### Same subnet, cross-node access
##### Topology
###### **Node1 (xx.xx.10.6) accesses Node2 (xx.xx.10.4)**
###### Node1:
IP: xx.xx.10.6
MAC: fa:16:3e:17:4b:9d
Representor port: port-xxxxxq2py2
vtep: xx.xx.40.67
###### Node2:
IP: xx.xx.10.4
MAC: fa:16:3e:46:09:25
Representor port: port-yyyyyv66x7
vtep: xx.xx.40.70
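###### Flow tables: **Node1 (sender)**
###### ARP handling
**table=xx**
All ARP requests are answered locally (**ARP is not analyzed again below**). The responder works by rewriting sha, spa, tha, tpa, arp_op, and related fields.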
cookie=0x170a30c1320ce4af, table=xx, priority=100,arp,metadata=0x47d100000000,arp_tpa=xx.xx.100.7,arp_op=1 actions=move:NXM_OF_ETH_SRC[]->NXM_OF_ETH_DST[],move:NXM_NX_ARP_SHA[]->NXM_NX_ARP_THA[],move:NXM_OF_ARP_SPA[]->NXM_OF_ARP_TPA[],set_field:fa:16:3e:0c:02:73->eth_src,set_field:2->arp_op,set_field:xx.xx.100.7->arp_spa,set_field:fa:16:3e:0c:02:73->arp_sha,IN_PORT
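
The action list of the flow above can be read as a recipe for turning an ARP request into its reply. The sketch below mirrors those actions one by one on a dictionary; the function name and the dictionary representation are assumptions for illustration, and `10.0.100.x` stands in for the masked addresses.

```python
def arp_reply(request: dict, responder_mac: str) -> dict:
    """Build the ARP reply that the responder flow would send back."""
    if request["arp_op"] != 1:
        raise ValueError("only ARP requests (arp_op=1) are answered")
    return {
        "eth_dst": request["eth_src"],   # move eth_src -> eth_dst
        "arp_tha": request["arp_sha"],   # move arp_sha -> arp_tha
        "arp_tpa": request["arp_spa"],   # move arp_spa -> arp_tpa
        "eth_src": responder_mac,        # set_field ...->eth_src
        "arp_op": 2,                     # set_field:2->arp_op (reply)
        "arp_spa": request["arp_tpa"],   # the IP address that was asked for
        "arp_sha": responder_mac,        # set_field ...->arp_sha
    }


if __name__ == "__main__":
    req = {"eth_src": "fa:16:3e:17:4b:9d", "eth_dst": "ff:ff:ff:ff:ff:ff",
           "arp_op": 1, "arp_sha": "fa:16:3e:17:4b:9d",
           "arp_spa": "10.0.100.5", "arp_tha": "00:00:00:00:00:00",
           "arp_tpa": "10.0.100.7"}
    print(arp_reply(req, "fa:16:3e:0c:02:73"))
```

The final `IN_PORT` action in the flow sends this reply back out of the port the request arrived on, so no flooding is needed.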
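###### IP handling
**table=0** Traffic entry point: classify by ingress port and set the related registers. reg5 is the VNI, reg6 is the ingress port number, reg9 is the VM ID, and metadata is tunid + subnetid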
cookie=0x170a32cec8c3e4c1, priority=100,in_port="port-xxxxxq2py2" actions=set_field:0xd54dd->reg5,set_field:0x4->reg6,set_field:0x64->reg9,write_metadata:0xd54dd00000001,goto_table:1
**table=1** All IP packets jump to rate-limit processing
cookie=0x170a380bc44c5c23, table=1, priority=50 actions=goto_table:5
**table=5** Egress BPS rate limit; no rate-limit rule is configured, so it does not apply here
cookie=0x170a380bc44c5b6b, table=5, priority=100 actions=goto_table:6
**table=6** Egress PPS rate limit; no rate-limit rule is configured
cookie=0x170a380bc44c5b7f, table=6, priority=100 actions=goto_table:10
**table=10** Bind port and MAC; reg6 is the ofport number of the port, and dl_src is the source MAC of the host-side interface
cookie=0x170a32cec8c3e501, table=10, priority=1000,ip,reg6=0x4,dl_src=fa:16:3e:17:4b:9d actions=goto_table:20
**table=20** Egress pre-CT: ICMP packets are sent into ct; the zone is derived from the ofport number of the source port
cookie=0x170a380bc44c5c9f, table=20, priority=58000,icmp actions=ct(table=25,zone=NXM_NX_REG6[0..15])
**table=25** Egress ct-state match: the zone is selected by port number and the flow matches ct state +new+trk; if the zone and state match, commit to confirm the connection
cookie=0x170a32cec8c3e4ed, table=25, priority=39799,ct_state=+new-est-rel-inv+trk,ip,reg6=0x4 actions=ct(commit,table=30,zone=NXM_NX_REG6[0..15])
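**table=30** Checks whether the packet targets a local service (see the full set of flows in table 30 for the services); does not apply to this ICMP case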
cookie=0x170a3983468be7c9, table=30, priority=50 actions=goto_table:60
**table=60** Select the tunnel encapsulation by reg5 (VNI) and destination MAC, and set the output port to vxlan1; the port mapping can be checked with ovs-ofctl show br-int
cookie=0x170a3983468be931, table=60, priority=100,reg5=0xd54dd,dl_dst=fa:16:3e:46:09:25 actions=set_field:0xd54dd->tun_id,set_field:xx.xx.40.70->tun_dst,set_field:0x2->reg7,goto_table:80
**table=80** Store the VM ID according to the output port
cookie=0x170a3983468beaf9, table=80, priority=1000,reg7=0x4 actions=set_field:0x64->reg9,goto_table:81
**table=81** svc probe; does not apply, skipped
cookie=0x170a3983468be853, table=81, priority=100 actions=goto_table:85
**table=85** Ingress BPS; does not apply
cookie=0x170a3983468be84b, table=85, priority=100 actions=goto_table:86
**table=86** Ingress PPS; does not apply
cookie=0x170a3983468be8db, table=86, priority=100 actions=goto_table:90
**table=90** Send the packet out of the output port; in this case the vxlan port
cookie=0x170a3983468be837, table=90, priority=1000 actions=output:NXM_NX_REG7[]
###### Flow tables: **Node2 (receiver)**
**table=0** Cross-node receive: packets always arrive on the vxlan port
cookie=0x170a277782792425, priority=1000,in_port=vxlan1 actions=goto_table:50
**table=50** Match the tunnel and destination MAC, and set the registers for the receive-side pipeline; metadata is tunid + subnetid, reg5 is the VNI
cookie=0x170a2777827925f9, table=50, priority=50,tun_id=0xd54dd,dl_dst=fa:16:3e:46:09:25 actions=set_field:0xd54dd00000001->metadata,set_field:0xd54dd->reg5,resubmit(,30)
**table=30** Local-service check; does not apply
cookie=0x170a2777827923e7, table=30, priority=50 actions=goto_table:60
**table=60** Match tunid and MAC; L2 forwarding lookup
cookie=0x170a277782792601, table=60, priority=100,reg5=0xd54dd,dl_dst=fa:16:3e:46:09:25 actions=set_field:0xd54dd->tun_id,set_field:0x6->reg7,goto_table:70
**table=70**
cookie=0x170a277782792393, table=70, priority=58000,icmp actions=ct(table=75,zone=NXM_NX_REG7[0..15])
**table=75** Match ct state: the first packet matches +trk+new, later packets match +trk+est
**First packet, matches +trk+new**
cookie=0x170a2777827925eb, duration=33902.506s, table=75, n_packets=9144, n_bytes=895680, idle_age=4, priority=39819,ct_state=+new-est-rel-inv+trk,ip,reg7=0x6 actions=ct(commit,table=80,zone=NXM_NX_REG7[0..15])
**Later packets, match +trk+est**
cookie=0x170a2777827923ed, table=75, priority=60000,ct_state=-new+est-rel-inv+trk actions=goto_table:80
**table=80** Match the output port and send the packet out of it, here the interface with ofport=6
cookie=0x170a27778279262d, table=80, priority=1000,reg7=0x6 actions=set_field:0x64->reg9,goto_table:81
**table=81** probe svc; does not apply
cookie=0x170a2777827924a3, table=81, priority=100 actions=goto_table:85
**table=85**
cookie=0x170a277782792431, table=85, priority=100 actions=goto_table:86
**table=86**
cookie=0x170a27778279241d, table=86, priority=100 actions=goto_table:90
**table=90**
cookie=0x170a27778279239d, table=90, priority=1000 actions=output:NXM_NX_REG7[]
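
The table-by-table walk above can be reproduced from a dump with a few lines of code. The sketch below extracts, for each flow line, its table number and the next table named by `goto_table`, `resubmit`, or the `table=` argument of `ct(...)`. It assumes the `ovs-ofctl dump-flows` style seen in this document, where table 0 flows carry no `table=` key; the helper name is hypothetical and only the first jump in an action list is reported.

```python
import re
from typing import Optional, Tuple

_NEXT_TABLE = re.compile(
    r"goto_table:(\d+)|resubmit\(,(\d+)\)|ct\([^)]*\btable=(\d+)"
)


def parse_flow(line: str) -> Tuple[int, Optional[int]]:
    """Return (table, next_table) for one dumped flow line."""
    match_part, _, actions = line.partition(" actions=")
    found = re.search(r"\btable=(\d+)", match_part)
    table = int(found.group(1)) if found else 0  # table 0 is left implicit
    jump = _NEXT_TABLE.search(actions)
    if jump is None:
        return table, None  # terminal flow, for example an output action
    next_table = next(g for g in jump.groups() if g is not None)
    return table, int(next_table)


SAMPLE = [
    "cookie=0x170a277782792425, priority=1000,in_port=vxlan1 actions=goto_table:50",
    "cookie=0x170a2777827925f9, table=50, priority=50,tun_id=0xd54dd,"
    "dl_dst=fa:16:3e:46:09:25 actions=set_field:0xd54dd00000001->metadata,"
    "set_field:0xd54dd->reg5,resubmit(,30)",
    "cookie=0x170a277782792393, table=70, priority=58000,icmp "
    "actions=ct(table=75,zone=NXM_NX_REG7[0..15])",
    "cookie=0x170a27778279239d, table=90, priority=1000 "
    "actions=output:NXM_NX_REG7[]",
]

if __name__ == "__main__":
    for flow in SAMPLE:
        print(parse_flow(flow))
```

Chaining these pairs gives the path of a packet through the pipeline, which is useful for checking a dump against the phase list at the top of this document.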
### Scenario: cross-subnet access
##### Topology
###### **Node1 (xx.xx.10.6) accesses Node2 (xx.xx.11.4)**
###### Node1:
IP: xx.xx.10.6
MAC: fa:16:3e:17:4b:9d
Next hop: xx.xx.10.1 (fa:16:3e:ec:22:0d)
Representor port: port-xxxxxq2py2 (the representor of ens4)
vtep: xx.xx.40.67
###### Node2:
IP: xx.xx.11.4
MAC: fa:16:3e:74:20:c6
Next hop: xx.xx.11.1 (fa:16:3e:c4:ed:57)
Representor port: port-2zbgfw4f26
vtep: xx.xx.40.70
###### Flow tables: **Node1 (sender)**
**table=0** Traffic entry point: classify by ingress port and set the related registers; reg5 is the VNI and reg6 is the ingress port
cookie=0x170a48c6658cc733, priority=100,in_port="port-xxxxxq2py2" actions=set_field:0xd54dd->reg5,set_field:0x4->reg6,set_field:0x64->reg9,write_metadata:0xd54dd00000001,goto_table:1
**table=1** All IP packets jump to rate-limit processing
cookie=0x170a48c6658cc42d, table=1, priority=50 actions=goto_table:5
**table=5** Egress BPS rate limit; no rate-limit rule is configured, so it does not apply here
cookie=0x170a48c6658cc435, table=5, priority=100 actions=goto_table:6
**table=6** Egress PPS rate limit; no rate-limit rule is configured
cookie=0x170a48c6658cc4d5, table=6, priority=100 actions=goto_table:10
**table=10** Bind port and MAC; reg6 is the ofport number of the port, and dl_src is the source MAC of the host-side interface
cookie=0x170a48c6658cc74d, duration=1362.918s, table=10, n_packets=13516, n_bytes=1469823, idle_age=1, priority=1000,ip,reg6=0x4,dl_src=fa:16:3e:17:4b:9d actions=goto_table:20
**table=20**
cookie=0x170a48c6658cc4b5, table=20, priority=58000,icmp actions=ct(table=25,zone=NXM_NX_REG6[0..15])
**table=25**
**First packet**
cookie=0x170a48c6658cc73b, table=25, priority=39799,ct_state=+new-est-rel-inv+trk,ip,reg6=0x4 actions=ct(commit,table=30,zone=NXM_NX_REG6[0..15])
**Later packets**
cookie=0x170a48c6658cc4fb, table=25, priority=60000,ct_state=-new+est-rel-inv+trk actions=goto_table:30
**table=30**
cookie=0x170a48c6658cc4a5, table=30, priority=50 actions=goto_table:60
**table=60**
cookie=0x170a48c6658cc5fb, table=60, priority=100,metadata=0xd54dd00000001,dl_dst=fa:16:3e:ec:22:0d actions=goto_table:100
**table=100** L3 forwarding entry; the destination IP is not local, so go to pre-routing
cookie=0x170a48c6658cc4e1, table=100, priority=50,ip actions=goto_table:110
**table=110** Pre-routing: drop the packet if its TTL is 0 or 1, otherwise continue
cookie=0x170a4b2b174d5e1d, table=110, priority=100 actions=goto_table:120
**table=120** ACL match; no rules, skipped
cookie=0x170a4b2b174d5e1f, table=120, priority=50 actions=goto_table:130
**table=130** The destination falls in xx.xx.0.0/16, so continue to the specific-route lookup
cookie=0x170a4b2b174d5f1d, duration=180.589s, table=130, n_packets=4860, n_bytes=733806, idle_age=1, priority=10016,ip,metadata=0xd54dd00000001,nw_dst=xx.xx.0.0/16 actions=goto_table:140
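**table=140** Specific-route lookup: select the large L2 domain by reg5 and find the route by destination IP; then update metadata for the new subnet, set the destination MAC to the MAC of the destination IP (it was the gateway MAC before), set the source MAC to the gateway MAC of the destination subnet, decrement the TTL by 1, and jump to postrouting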
cookie=0x170a4b2b174d5ed3, table=140, priority=100,ip,reg5=0xd54dd,nw_dst=xx.xx.11.4 actions=set_field:0xd54dd00000002->metadata,set_field:fa:16:3e:74:20:c6->eth_dst,set_field:fa:16:3e:c4:ed:57->eth_src,dec_ttl,goto_table:160
**table=160** postrouting has no actions; skipped
cookie=0x170a4b2b174d5d6b, table=160, priority=50 actions=resubmit(,170)
**table=170** After the route lookup, do L2 forwarding again, that is, find the output port by destination MAC
cookie=0x170a4b2b174d5db3, table=170, priority=50 actions=resubmit(,30)
**table=30** Not a local-service access; go straight to the MAC table lookup
cookie=0x170a4c013de71d19, table=30, priority=50 actions=goto_table:60
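**table=60** Encapsulate for the tunnel by VNI and destination MAC (the real MAC of the destination IP); note the value assigned to reg7 here, which is the output port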
cookie=0x170a4c013de71e81, table=60, priority=100,reg5=0xd54dd,dl_dst=fa:16:3e:74:20:c6 actions=set_field:0xd54dd->tun_id,set_field:xx.xx.40.70->tun_dst,set_field:0x2->reg7,goto_table:80
**table=80**
cookie=0x170a4c013de71d77, table=80, priority=1000,reg7=0x2 actions=output:vxlan1
###### Flow tables: **Node2 (receiver)**
**table=0**
cookie=0x170d7535048f9acb, priority=1000,in_port=vxlan1 actions=goto_table:50
**table=50** l3 lookup
cookie=0x170d7535048f9cc3, table=50, priority=50,tun_id=0xd54dd,dl_dst=fa:16:3e:c4:ed:57 actions=set_field:0xd54dd00000002->metadata,set_field:0xd54dd->reg5,goto_table:140
**table=140** Look up the destination gateway
cookie=0x170d7535048f9cd9, table=140, priority=100,ip,reg5=0xd54dd,nw_dst=xx.xx.11.4 actions=set_field:0xd54dd00000002->metadata,set_field:fa:16:3e:74:20:c6->eth_dst,set_field:fa:16:3e:c4:ed:57->eth_src,dec_ttl,goto_table:160
**table=160** postrouting
cookie=0x170d7535048f9acf, table=160, priority=50 actions=resubmit(,170)
**table=170** ingress acl
cookie=0x170d7535048f9b29, table=170, priority=50 actions=resubmit(,30)
**table=30**
cookie=0x170d7535048f9afd, table=30, priority=50 actions=goto_table:60
**table=60**
cookie=0x170d7535048f9e0d, table=60, priority=100,reg5=0xd54dd,dl_dst=fa:16:3e:74:20:c6 actions=set_field:0xd54dd->tun_id,set_field:0x7->reg7,goto_table:70
**table=70**
cookie=0x170d7535048f9bd5, table=70, priority=58000,tcp actions=ct(table=75,zone=NXM_NX_REG7[0..15])
**table=75**
cookie=0x170d7535048f9df5, table=75, priority=39819,ct_state=+new-est-rel-inv+trk,ip,reg7=0x6 actions=ct(commit,table=80,zone=NXM_NX_REG7[0..15])
**table=80**
cookie=0x170d7535048f9dfb, table=80, priority=1000,reg7=0x6 actions=set_field:0x64->reg9,goto_table:81
**table=81**
cookie=0x170d7535048f9b91, table=81, priority=100 actions=goto_table:85
**table=85**
cookie=0x170d7535048f9be1, table=85, priority=100 actions=goto_table:86
**table=86**
cookie=0x170d7535048f9b49, table=86, priority=100 actions=goto_table:90
**table=90**
cookie=0x170d7535048f9bef, table=90, priority=1000 actions=output:NXM_NX_REG7[]
### Scenario: access to local services
**Request** Access to the metadata endpoint
cookie=0x170b616612e8b709, table=10, priority=2000,tcp,nw_dst=169.254.169.254,tp_dst=8000 actions=goto_table:30
cookie=0x170b616612e8b781, table=45, priority=100,tcp,nw_dst=169.254.169.254,tp_dst=8000 actions=set_field:fa:16:3e:25:fd:7e->eth_dst,set_field:8111->tcp_dst,goto_table:46
cookie=0x170b616612e8b759, table=46, priority=2000,tcp,nw_dst=169.254.169.254,tp_dst=8111 actions=move:NXM_NX_REG6[]->NXM_OF_IP_SRC[],set_field:128.0.0.0/16->ip_src,output:1
**Return path**
cookie=0x170b616612e8b859, priority=100,in_port=1 actions=goto_table:47
cookie=0x170b616612e8bd73, table=47, priority=100,tcp,nw_dst=128.0.0.2,tp_src=8111 actions=set_field:169.254.169.254->ip_src,set_field:xx.xx.100.5->ip_dst,set_field:8000->tcp_src,output:"port-17icfhnrgo"
### Scenario: security group, port-based stateless security group
**table=70**
cookie=0x170b616612e8bdc1, table=70, priority=39800,ip,reg7=0x4 actions=set_field:0x46->reg8,goto_table:200
> cookie=0x170b616612e8bdc5, table=70, priority=39800,ipv6,reg7=0x4 actions=set_field:0x46->reg8,goto_table:200
### Scenario: port-based stateful security group
cookie=0x170b616612e8becb, table=75, priority=39800,ct_state=+new-est-rel-inv+trk,ip,reg7=0xd actions=set_field:0x4b->reg8,goto_table:200
> cookie=0x170b616612e8bf11, table=75, priority=39800,ct_state=+new-est-rel-inv+trk,ipv6,reg7=0xd actions=set_field:0x4b->reg8,goto_table:200
### Scenario: NAT
cookie=0x170b616612e8bf69, table=44, priority=2000,ct_state=+new-est-rel-inv+trk,tcp,reg6=0x9,tp_dst=20048 actions=encap(tcp_option(tlv(254,0x0a156b03000047d1))),ct(commit,table=80,nat(src=xx.xx.9.207,random))
cookie=0x170b616612e8c0f1, table=44, priority=2000,ct_state=+new-est-rel-inv+trk,tcp6,reg6=0xa,tp_dst=20048 actions=encap(tcp_option(tlv(254,0x010000000007000000123d2100080041))),ct(commit,table=80,nat(src=240e:108:4:200:1:2:0:70f,random))
### Scenario: rate limiting
table=85, priority=1000,reg9=0x64 actions=meter:101,goto_table:86
table=86, priority=1000,reg9=0x64 actions=meter:102,goto_table:90
### Scenario: ACL
table=170,priority=55533,icmp,metadata=0x2076370000000a,nw_src=10.2.2.11,nw_dst=10.2.1.11 actions=resubmit(,30)
table=170, priority=24535,icmp6,metadata=0x1cc23500000000 actions=set_field:0xaa->reg8,goto_table:200