
A Brief Guide to Enabling SR-IOV on Mellanox NICs

2023-03-29 09:27:39

Environment preparation

1. Make sure VT-d and SR-IOV are enabled in the BIOS.

2. Enable the IOMMU in the operating system.

vim /etc/default/grub

Add the following to the GRUB_CMDLINE_LINUX field:

intel_iommu=on iommu=pt

Apply the configuration (the grub.cfg path below is the one used by a UEFI-booted CentOS system):

grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg
reboot

After the reboot, dmesg shows that the IOMMU is enabled:

[root@localhost ~]#  dmesg | grep IOMMU
[    0.125709] DMAR: IOMMU enabled
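As a complementary check, the kernel command line can be inspected directly. The following is a minimal sketch rather than part of the original procedure: the function name has_iommu_args is hypothetical, and it assumes the two arguments were added exactly as written above.

```shell
# Return success if the given kernel command line file contains both
# intel_iommu=on and iommu=pt as separate arguments.
# has_iommu_args is a hypothetical helper name, not a standard tool.
has_iommu_args() {
    local cmdline_file="${1:-/proc/cmdline}"
    local want
    for want in intel_iommu=on iommu=pt; do
        # Put one argument per line, then look for an exact whole-line match.
        tr -s ' \t' '\n\n' < "$cmdline_file" | grep -qxF -- "$want" || return 1
    done
}
```

Called without arguments, it reads /proc/cmdline of the running kernel, for example: has_iommu_args && echo "IOMMU arguments present".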

Enable SR-IOV in firmware and configure NUM_OF_VFS

1. Start the Mellanox Software Tools (MST) driver

This prepares for the firmware configuration steps that follow.

[root@localhost ~]# mst start
Starting MST (Mellanox Software Tools) driver set
Loading MST PCI module - Success
Loading MST PCI configuration module - Success
Create devices
-W- Missing "lsusb" command, skipping MTUSB devices detection
Unloading MST PCI module (unused) - Success

 

List the MST devices and pick the one to use. The rest of this guide uses mt4125_pciconf1 (PCI address 0000:98:00.0) as the example.

[root@localhost ~]# mst status
MST modules:
------------
MST PCI module is not loaded
MST PCI configuration module loaded
 
MST devices:
------------
/dev/mst/mt4125_pciconf0 - PCI configuration cycles access.
domain:bus:dev.fn=0000:32:00.0 addr.reg=88 data.reg=92 cr_bar.gw_offset=-1
Chip revision is: 00
/dev/mst/mt4125_pciconf1 - PCI configuration cycles access.
domain:bus:dev.fn=0000:98:00.0 addr.reg=88 data.reg=92 cr_bar.gw_offset=-1
Chip revision is: 00
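When several adapters are installed, picking the MST device by PCI address can be scripted. The sketch below assumes that mst status prints the layout shown above (a /dev/mst/... line followed by a domain:bus:dev.fn=... line); the function name mst_dev_for_pci is hypothetical.

```shell
# Print the MST device whose PCI address matches $1.
# Reads "mst status" output on stdin and assumes the layout shown in
# this article; mst_dev_for_pci is a hypothetical helper name.
mst_dev_for_pci() {
    awk -v pci="$1" '
        /^\/dev\/mst\// { dev = $1 }          # remember the current device
        $1 ~ /^domain:bus:dev\.fn=/ {
            split($1, kv, "=")                # kv[2] is the PCI address
            if (kv[2] == pci) print dev
        }'
}
```

A possible invocation is: mst status | mst_dev_for_pci 0000:98:00.0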

2. Query and modify the SR-IOV configuration

Query the SR-IOV configuration:

mlxconfig -d /dev/mst/mt4125_pciconf1 q

Enable SR-IOV:

mlxconfig -d /dev/mst/mt4125_pciconf1 set SRIOV_EN=1

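Set the firmware VF count NUM_OF_VFS, for example to 5. Note that this parameter is the upper limit on the number of VFs, not the number of VFs actually allocated.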

mlxconfig -d /dev/mst/mt4125_pciconf1 set NUM_OF_VFS=5

Reboot the machine after changing these settings so that they take effect.
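To confirm the values before rebooting, a single parameter can be extracted from the query output. This is only a sketch: the article does not show the output of the query command, so the code assumes each parameter appears on its own line as a name followed by its value, separated by whitespace. The function name mlxconfig_value is hypothetical.

```shell
# Print the value column for one parameter from query output on stdin.
# ASSUMPTION: each parameter is printed as "<NAME> <value>" on one line;
# the query output format is not shown in this article.
# mlxconfig_value is a hypothetical helper name.
mlxconfig_value() {
    awk -v name="$1" '$1 == name { print $2; exit }'
}
```

A possible invocation is: mlxconfig -d /dev/mst/mt4125_pciconf1 q | mlxconfig_value NUM_OF_VFS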

 

Configure the VFs

1. Check the PCI devices and how they map to network interfaces

[root@localhost ~]# lspci -D | grep Mellanox
0000:32:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
0000:32:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
0000:98:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
0000:98:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]

 

[root@localhost ~]# ibdev2netdev -v
0000:32:00.0 mlx5_0 (MT4125 - MCX621102AN-ADAT) ConnectX-6 Dx EN adapter card, 25GbE, Dual-port SFP28, PCIe 4.0 x8, No Crypto                                                                                                fw 22.34.1002 port 1 (DOWN  ) ==> ens3f0np0 (Down)
0000:32:00.1 mlx5_1 (MT4125 - MCX621102AN-ADAT) ConnectX-6 Dx EN adapter card, 25GbE, Dual-port SFP28, PCIe 4.0 x8, No Crypto                                                                                                fw 22.34.1002 port 1 (DOWN  ) ==> ens3f1np1 (Down)
0000:98:00.0 mlx5_2 (MT4125 - MCX621102AN-ADAT) ConnectX-6 Dx EN adapter card, 25GbE, Dual-port SFP28, PCIe 4.0 x8, No Crypto                                                                                                fw 22.34.1002 port 1 (ACTIVE) ==> ens6f0np0 (Up)
0000:98:00.1 mlx5_3 (MT4125 - MCX621102AN-ADAT) ConnectX-6 Dx EN adapter card, 25GbE, Dual-port SFP28, PCIe 4.0 x8, No Crypto                                                                                                fw 22.34.1002 port 1 (DOWN  ) ==> ens6f1np1 (Down)

2. Check the firmware VF limit

[root@localhost~]# cat /sys/class/net/ens6f0np0/device/sriov_totalvfs
5

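3. Set the number of VFs

Use the PCI address, interface name, or device name of the target NIC obtained above to set the number of VFs. The four methods below are equivalent; each creates 5 VFs.

Method 1:

[root@localhost~]# echo 5 > /sys/bus/pci/devices/0000:98:00.0/sriov_numvfs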
[root@localhost~]# cat /sys/bus/pci/devices/0000\:98\:00.0/sriov_numvfs
5

Method 2:

[root@localhost~]# echo 5 > /sys/class/net/ens6f0np0/device/sriov_numvfs
[root@localhost~]# cat /sys/class/net/ens6f0np0/device/sriov_numvfs
5

Method 3:

[root@localhost~]# echo 5 > /sys/class/infiniband/mlx5_2/device/mlx5_num_vfs
[root@localhost~]# cat /sys/class/infiniband/mlx5_2/device/mlx5_num_vfs
5

Method 4:

[root@localhost~]# echo 5 > /sys/class/net/ens6f0np0/device/mlx5_num_vfs
[root@localhost~]# cat /sys/class/net/ens6f0np0/device/mlx5_num_vfs
5

Note: the VF count is not persistent; it is lost after a reboot.
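Because the setting has to be reapplied after every boot, it can help to wrap Method 1 in a small function that also checks the firmware limit. The following is a sketch under stated assumptions: set_numvfs is a hypothetical name, and its first argument is the PCI device directory in sysfs. It writes 0 before switching between two non-zero counts, since the kernel rejects a direct change from one non-zero value to another.

```shell
# Set the number of VFs for a PF, given its sysfs device directory.
# set_numvfs is a hypothetical helper name.
set_numvfs() {
    local dev_dir="$1" want="$2"
    local total current
    total=$(cat "$dev_dir/sriov_totalvfs") || return 1
    if [ "$want" -gt "$total" ]; then
        echo "requested $want VFs, but the limit is $total" >&2
        return 1
    fi
    current=$(cat "$dev_dir/sriov_numvfs") || return 1
    # Nothing to do if the requested count is already active.
    if [ "$current" -eq "$want" ]; then
        return 0
    fi
    # Release the existing VFs first when changing a non-zero count.
    if [ "$current" -ne 0 ]; then
        echo 0 > "$dev_dir/sriov_numvfs"
    fi
    echo "$want" > "$dev_dir/sriov_numvfs"
}
```

A possible invocation, run as root from a boot-time script, is: set_numvfs /sys/bus/pci/devices/0000:98:00.0 5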

4. Check that the VFs were created

[root@localhost~]# lspci -D | grep Mellanox
0000:32:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
0000:32:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
0000:98:00.0 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
0000:98:00.1 Ethernet controller: Mellanox Technologies MT2892 Family [ConnectX-6 Dx]
0000:98:00.2 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0000:98:00.3 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0000:98:00.4 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0000:98:00.5 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
0000:98:00.6 Ethernet controller: Mellanox Technologies ConnectX Family mlx5Gen Virtual Function
[root@localhost~]# ibdev2netdev -v
0000:32:00.0 mlx5_0 (MT4125 - MCX621102AN-ADAT) ConnectX-6 Dx EN adapter card, 25GbE, Dual-port SFP28, PCIe 4.0 x8, No Crypto                                                                                                fw 22.34.1002 port 1 (DOWN  ) ==> ens3f0np0 (Down)
0000:32:00.1 mlx5_1 (MT4125 - MCX621102AN-ADAT) ConnectX-6 Dx EN adapter card, 25GbE, Dual-port SFP28, PCIe 4.0 x8, No Crypto                                                                                                fw 22.34.1002 port 1 (DOWN  ) ==> ens3f1np1 (Down)
0000:98:00.0 mlx5_2 (MT4125 - MCX621102AN-ADAT) ConnectX-6 Dx EN adapter card, 25GbE, Dual-port SFP28, PCIe 4.0 x8, No Crypto                                                                                                fw 22.34.1002 port 1 (ACTIVE) ==> ens6f0np0 (Up)
0000:98:00.1 mlx5_3 (MT4125 - MCX621102AN-ADAT) ConnectX-6 Dx EN adapter card, 25GbE, Dual-port SFP28, PCIe 4.0 x8, No Crypto                                                                                                fw 22.34.1002 port 1 (DOWN  ) ==> ens6f1np1 (Down)
0000:98:00.2 mlx5_4 (MT4126 - NA)  fw 22.34.1002 port 1 (ACTIVE) ==> ens6f0v0 (Up)
0000:98:00.3 mlx5_5 (MT4126 - NA)  fw 22.34.1002 port 1 (ACTIVE) ==> ens6f0v1 (Up)
0000:98:00.4 mlx5_6 (MT4126 - NA)  fw 22.34.1002 port 1 (ACTIVE) ==> ens6f0v2 (Up)
0000:98:00.5 mlx5_7 (MT4126 - NA)  fw 22.34.1002 port 1 (ACTIVE) ==> ens6f0v3 (Up)
0000:98:00.6 mlx5_8 (MT4126 - NA)  fw 22.34.1002 port 1 (ACTIVE) ==> ens6f0v4 (Up)
[root@localhost ~]#  ip link show ens6f0np0
6: ens6f0np0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 08:c0:eb:4b:90:e8 brd ff:ff:ff:ff:ff:ff
    vf 0     link/ether 00:00:00:00:00:00, spoof checking off, link-state auto, trust off, query_rss off
    vf 1     link/ether 00:00:00:00:00:00, spoof checking off, link-state auto, trust off, query_rss off
    vf 2     link/ether 00:00:00:00:00:00, spoof checking off, link-state auto, trust off, query_rss off
    vf 3     link/ether 00:00:00:00:00:00, spoof checking off, link-state auto, trust off, query_rss off
    vf 4     link/ether 00:00:00:00:00:00, spoof checking off, link-state auto, trust off, query_rss off

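At this point 5 VFs have been created: mlx5_4 through mlx5_8, with VF indexes 0 to 4 and PCI functions 0000:98:00.2 to 0000:98:00.6.

Details for each device, such as state and GUIDs, can also be viewed with the following commands.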
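The mapping from VF index to PCI function can also be read from sysfs, where the kernel exposes one virtfnN symbolic link per VF under the PF's device directory. The sketch below lists that mapping; list_vfs is a hypothetical name, and the PF directory is passed in as an argument.

```shell
# Print "<vf index> <vf PCI address>" for every VF of a PF.
# $1 is the PF device directory in sysfs; list_vfs is a hypothetical name.
list_vfs() {
    local pf_dir="$1" link idx
    for link in "$pf_dir"/virtfn*; do
        [ -L "$link" ] || continue            # skip if the glob matched nothing
        idx="${link##*virtfn}"                # text after "virtfn" is the index
        printf '%s %s\n' "$idx" "$(basename "$(readlink "$link")")"
    done | sort -n
}
```

A possible invocation is: list_vfs /sys/bus/pci/devices/0000:98:00.0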

[root@localhost ~]# ibstatus mlx5_2
Infiniband device 'mlx5_2' port 1 status:
        default gid:     fe80:0000:0000:0000:0ac0:ebff:fe4b:90e8
        base lid:        0x0
        sm lid:          0x0
        state:           4: ACTIVE
        phys state:      5: LinkUp
        rate:            25 Gb/sec (1X EDR)
        link_layer:      Ethernet
[root@localhost ~]# ibstat -d mlx5_2
CA 'mlx5_2'
        CA type: MT4125
        Number of ports: 1
        Firmware version: 22.34.1002
        Hardware version: 0
        Node GUID: 0x08c0eb03004b90e8
        System image GUID: 0x08c0eb03004b90e8
        Port 1:
                State: Active
                Physical state: LinkUp
                Rate: 25
                Base lid: 0
                LMC: 0
                SM lid: 0
                Capability mask: 0x00010000
                Port GUID: 0x0ac0ebfffe4b90e8
                Link layer: Ethernet

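5. Configure the VFs

Based on the VF state queried in the previous step, fill in whatever information each VF is missing, as needed.

All VFs are configured the same way; the steps below use mlx5_4 as the example.

5.1 Configure the MAC address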

echo 0000:98:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind
ip link set ens6f0np0 vf 0 mac 00:22:33:44:55:66
echo 0000:98:00.2 > /sys/bus/pci/drivers/mlx5_core/bind

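Where:

a. The VF must be unbound from the driver before the change and bound again afterwards.

b. In the ip link set command, "ens6f0np0" is the interface name of the PF, "vf 0" selects the VF with index 0 under that PF, and "00:22:33:44:55:66" is the MAC address being assigned.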
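Since the same three commands are repeated for every VF, they can be generated from the PF interface name and the VF index. The sketch below only prints the commands so they can be reviewed before being run as root. The function name vf_set_mac_commands and its optional fourth argument (an alternative sysfs root, useful for dry runs) are hypothetical.

```shell
# Print the unbind / set MAC / bind commands for one VF.
# Args: PF interface, VF index, MAC address, optional sysfs root.
# vf_set_mac_commands is a hypothetical helper name.
vf_set_mac_commands() {
    local pf_if="$1" vf_idx="$2" mac="$3" sysfs="${4:-/sys}"
    local link vf_pci
    # The PF exposes each VF as a virtfnN link to the VF's PCI device.
    link="$sysfs/class/net/$pf_if/device/virtfn$vf_idx"
    [ -L "$link" ] || { echo "no such VF: $link" >&2; return 1; }
    vf_pci=$(basename "$(readlink "$link")")
    cat <<EOF
echo $vf_pci > /sys/bus/pci/drivers/mlx5_core/unbind
ip link set $pf_if vf $vf_idx mac $mac
echo $vf_pci > /sys/bus/pci/drivers/mlx5_core/bind
EOF
}
```

A possible invocation is: vf_set_mac_commands ens6f0np0 0 00:22:33:44:55:66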

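Once configured, the MAC address is visible in the output of ip link show.

After the MAC address is set, the GUID (globally unique identifier) should be configured automatically and can be queried with ibstat, so no separate configuration is needed. If the query returns nothing, or the default GUID needs to be changed, see 5.2.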

[root@localhost ~]# ibstat -d mlx5_4 |grep GUID
        Node GUID: 0x002233fffe445577
        System image GUID: 0x08c0eb03004b90e8
                Port GUID: 0x022233fffe445577
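The outputs in this article suggest that the port GUID follows the MAC address in modified EUI-64 form: ff:fe is inserted in the middle and the second-lowest bit of the first byte is flipped. For the PF, MAC 08:c0:eb:4b:90:e8 corresponds to Port GUID 0x0ac0ebfffe4b90e8. The sketch below reproduces that arithmetic. It is inferred from the outputs shown here rather than taken from driver documentation, and the function name mac_to_port_guid is hypothetical.

```shell
# Convert a colon-separated MAC address to a modified EUI-64 value:
# flip bit 0x02 of the first byte and insert fffe in the middle.
# mac_to_port_guid is a hypothetical helper name.
mac_to_port_guid() {
    local old_ifs="$IFS"
    IFS=:
    set -- $1                     # split the MAC into six octets
    IFS="$old_ifs"
    [ "$#" -eq 6 ] || return 1
    printf '%02x%s%sfffe%s%s%s\n' "$(( 0x$1 ^ 0x02 ))" "$2" "$3" "$4" "$5" "$6"
}
```

For example, mac_to_port_guid 08:c0:eb:4b:90:e8 prints 0ac0ebfffe4b90e8, matching the PF's Port GUID shown earlier.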

5.2 Configure the GUID

For example, to reconfigure the Node GUID:

echo 08:c0:eb:03:00:4b:90:e9 > /sys/class/infiniband/mlx5_2/device/sriov/0/node
echo 0000:98:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind
echo 0000:98:00.2 > /sys/bus/pci/drivers/mlx5_core/bind

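Where:

a. The VF must be unbound and bound again after the change.

b. In the Node GUID command, "08:c0:eb:03:00:4b:90:e9" is the Node GUID being assigned. It can generally be derived from the PF's Node GUID; here it is the PF's Node GUID plus 1. In the path, "mlx5_2" is the device name of the PF, "0" selects the VF with index 0 under that PF, and "node" indicates that the Node GUID is being set.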
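The "PF Node GUID plus an offset" convention can be computed rather than typed by hand. The sketch below takes the PF Node GUID as printed by ibstat (with the 0x prefix) and an offset, and prints the colon-separated form written to the node file. The function name vf_node_guid is hypothetical, and the code assumes the GUID is below 2^63 so that it fits the shell's signed 64-bit arithmetic.

```shell
# Print PF Node GUID + offset in colon-separated byte form.
# Args: PF GUID with 0x prefix, integer offset.
# ASSUMPTION: the GUID is below 2^63 (fits signed 64-bit shell arithmetic).
# vf_node_guid is a hypothetical helper name.
vf_node_guid() {
    local pf_guid="$1" offset="$2" hex
    hex=$(printf '%016x' "$(( $pf_guid + $offset ))") || return 1
    # Insert a colon after every byte, then drop the trailing colon.
    printf '%s\n' "$hex" | sed 's/../&:/g; s/:$//'
}
```

For example, vf_node_guid 0x08c0eb03004b90e8 1 prints 08:c0:eb:03:00:4b:90:e9, the value used above.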

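5.3 Add a namespace and configure an IP address

For multiple VFs on the same machine to communicate with each other, both of the following must hold:

a. Each VF is isolated in its own namespace.

b. The VFs are configured with IP addresses on the same subnet.

Otherwise, if the VFs only need to communicate with other machines, give each VF an IP address on a different subnet; no namespace isolation is needed in that case.

5.3.1 Create a namespace

Add a namespace named ns1.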

ip netns add ns1

5.3.2 Add the interface ens6f0v0 to namespace ns1

ip link set ens6f0v0 netns ns1

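ens6f0v0 is now removed from the original namespace, so ip link run there no longer lists it. Whether RDMA tools such as the ibverbs utilities still show the device from the original namespace depends on the RDMA subsystem's namespace mode. lspci is not affected by network namespaces and still lists the PCI function.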

5.3.3 Configure the IP address

Start a process in namespace ns1; use the exit command to leave it.

ip netns exec ns1 bash

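This opens a bash shell inside namespace ns1, where the NIC can be configured with an IP address and other settings. For example, to assign the IP address 200.1.1.93 to the VF device ens6f0v0: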

ifconfig ens6f0v0 200.1.1.93 netmask 255.255.255.0

The VF can now send and receive packets normally. The other VFs are configured in the same way.
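Steps 5.3.1 to 5.3.3 can be repeated per VF without an interactive shell. The sketch below prints the commands for one VF so they can be reviewed and then run as root. It uses ip addr add with a prefix length in place of the ifconfig line above (netmask 255.255.255.0 corresponds to /24), and the function name vf_netns_commands is hypothetical.

```shell
# Print the commands that place one VF interface in its own namespace
# and assign it an address.
# Args: namespace name, VF interface name, address in CIDR form.
# vf_netns_commands is a hypothetical helper name.
vf_netns_commands() {
    local ns="$1" ifname="$2" cidr="$3"
    cat <<EOF
ip netns add $ns
ip link set $ifname netns $ns
ip netns exec $ns ip addr add $cidr dev $ifname
ip netns exec $ns ip link set $ifname up
EOF
}
```

A possible invocation is: vf_netns_commands ns1 ens6f0v0 200.1.1.93/24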
