
GPU Passthrough Configuration

2023-12-13 06:26:00

Installing and deploying GPU passthrough

1) Enable IOMMU on the KVM host

vi /etc/default/grub
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="rd.lvm.lv=centos/root rd.lvm.lv=centos/swap rhgb quiet amd_iommu=on"
GRUB_DISABLE_RECOVERY="true"

For an AMD CPU, append amd_iommu=on to GRUB_CMDLINE_LINUX; for an Intel CPU, append intel_iommu=on instead.
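The choice between the two parameters can be derived from the CPU vendor string. The following is a minimal sketch, assuming the vendor is read from the vendor_id field of /proc/cpuinfo; the function names iommu_param_for_vendor and detect_cpu_vendor are hypothetical helpers, not part of any tool used in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a CPU vendor string to the matching kernel parameter.
iommu_param_for_vendor() {
    case "$1" in
        AuthenticAMD) echo "amd_iommu=on" ;;
        GenuineIntel) echo "intel_iommu=on" ;;
        *) echo "unknown CPU vendor: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print the vendor_id of the first CPU in /proc/cpuinfo.
detect_cpu_vendor() {
    awk -F': ' '/^vendor_id/ { print $2; exit }' /proc/cpuinfo 2>/dev/null
}

vendor="$(detect_cpu_vendor || true)"
if param="$(iommu_param_for_vendor "$vendor")"; then
    echo "Append to GRUB_CMDLINE_LINUX: $param"
fi
```

The script only prints the parameter; editing /etc/default/grub is still done by hand as shown above.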

2) Disable the nouveau driver

vi /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
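A quick way to confirm that the blacklist entry is in place is to look for it in the file. This is a rough sketch; is_blacklisted is a hypothetical helper name, and the /sys/module check relies on the kernel exposing a directory there for each loaded module.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if FILE contains a "blacklist MODULE" line.
is_blacklisted() {
    local module="$1" conf="$2"
    grep -Eq "^[[:space:]]*blacklist[[:space:]]+${module}[[:space:]]*$" "$conf" 2>/dev/null
}

conf=/etc/modprobe.d/blacklist-nouveau.conf
if is_blacklisted nouveau "$conf"; then
    echo "nouveau is blacklisted in $conf"
else
    echo "no blacklist entry for nouveau in $conf"
fi

# A loaded module has a directory under /sys/module; after the reboot
# in the next step, nouveau should no longer have one.
if [ -d /sys/module/nouveau ]; then
    echo "nouveau is currently loaded"
fi
```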

 

3) Regenerate the GRUB configuration and reboot to apply it

grub2-mkconfig -o /boot/grub2/grub.cfg
reboot
Check that the IOMMU is enabled (AMD hosts log these messages with the AMD-Vi prefix)
dmesg | grep -E "DMAR|IOMMU|AMD-Vi"
Check that nouveau is disabled
dmesg | grep -i nouveau
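Besides the kernel log, an active IOMMU shows up as numbered directories under /sys/kernel/iommu_groups. The sketch below counts them; count_iommu_groups is a hypothetical helper, and its optional directory argument exists only so the function can be tried against a test directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the group directories under the given root
# (defaults to /sys/kernel/iommu_groups).
count_iommu_groups() {
    local root="${1:-/sys/kernel/iommu_groups}"
    local count=0 dir
    for dir in "$root"/*/; do
        # An unmatched glob stays literal and fails the -d test.
        if [ -d "$dir" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

groups="$(count_iommu_groups)"
if [ "$groups" -gt 0 ]; then
    echo "IOMMU is active: $groups groups found"
else
    echo "No IOMMU groups found; check the kernel parameter and firmware settings"
fi
```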


4) Load the vfio-pci driver and bind it to the devices

 
modprobe vfio-pci
Every device in the GPU's IOMMU group must be added to /etc/modprobe.d/vfio.conf.
List the groups and their devices with:
for iommu_group in $(ls -dv /sys/kernel/iommu_groups/*/); do
    echo "IOMMU group $(basename "$iommu_group")"
    for device in $(ls -1 "$iommu_group"/devices/); do
        echo -n $'\t'
        lspci -nns "$device"
    done
done
Find the GPU's group in the output and add each device's vendor ID and device ID to /etc/modprobe.d/vfio.conf:
...
IOMMU group 2
        00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge [1022:1482]
        00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Starship/Matisse GPP Bridge [1022:1483]
        07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1650 SUPER] [10de:2187] (rev a1)
        07:00.1 Audio device [0403]: NVIDIA Corporation TU116 High Definition Audio Controller [10de:1aeb] (rev a1)
        07:00.2 USB controller [0c03]: NVIDIA Corporation TU116 USB 3.1 Host Controller [10de:1aec] (rev a1)
        07:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 USB Type-C UCSI Controller [10de:1aed] (rev a1)
...
vi /etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:2187,10de:1aeb,10de:1aec,10de:1aed,1022:1482,1022:1483
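Copying the IDs by hand is error-prone, so the ids= value can also be assembled from lspci -nn output. The following is a sketch only; vfio_ids_from_lspci is a hypothetical helper, and the sample input holds just the four GPU functions from the listing above, whereas the article's own line also includes the two bridge IDs from the same group.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "lspci -nn" lines on stdin and print the
# [vendor:device] pairs as one comma-separated list.
vfio_ids_from_lspci() {
    grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | sed 's/[][]//g' | paste -sd, -
}

# Sample input copied from the IOMMU group listing in this article.
ids="$(vfio_ids_from_lspci <<'EOF'
07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1650 SUPER] [10de:2187] (rev a1)
07:00.1 Audio device [0403]: NVIDIA Corporation TU116 High Definition Audio Controller [10de:1aeb] (rev a1)
07:00.2 USB controller [0c03]: NVIDIA Corporation TU116 USB 3.1 Host Controller [10de:1aec] (rev a1)
07:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 USB Type-C UCSI Controller [10de:1aed] (rev a1)
EOF
)"

# prints: options vfio-pci ids=10de:2187,10de:1aeb,10de:1aec,10de:1aed
echo "options vfio-pci ids=$ids"
```

On a real host the function would be fed from lspci directly, for example lspci -nn -s 07:00 | vfio_ids_from_lspci.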
 
 
Then run:
dracut --force
reboot
Check whether the devices are bound
dmesg | grep -i vfio
 
 
[root@dev /]# lspci -nnk -d 10de:
07:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU116 [GeForce GTX 1650 SUPER] [10de:2187] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3852]
        Kernel driver in use: vfio-pci
        Kernel modules: nouveau
07:00.1 Audio device [0403]: NVIDIA Corporation TU116 High Definition Audio Controller [10de:1aeb] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3852]
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel
07:00.2 USB controller [0c03]: NVIDIA Corporation TU116 USB 3.1 Host Controller [10de:1aec] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3852]
        Kernel driver in use: vfio-pci
07:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU116 USB Type-C UCSI Controller [10de:1aed] (rev a1)
        Subsystem: Micro-Star International Co., Ltd. [MSI] Device [1462:3852]
        Kernel driver in use: vfio-pci
        Kernel modules: i2c_nvidia_gpu
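Reading the "Kernel driver in use" lines by eye works for four functions, but the same check can be scripted. A minimal sketch, assuming the lspci -nnk layout shown above (device lines start at column 0, detail lines are indented); devices_not_on_vfio is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "lspci -nnk" output on stdin and print the
# address of every device whose driver in use is not vfio-pci
# (including devices with no driver bound at all).
devices_not_on_vfio() {
    awk '
        /^[0-9a-f]/ {
            if (dev != "" && drv != "vfio-pci") print dev
            dev = $1; drv = ""
        }
        /Kernel driver in use:/ { drv = $NF }
        END { if (dev != "" && drv != "vfio-pci") print dev }
    '
}

# Only query the host when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    lspci -nnk -d 10de: | devices_not_on_vfio
fi
```

An empty result means every listed NVIDIA function is held by vfio-pci.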
 
 
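Some devices may fail to bind automatically and have to be bound by hand. For example, if the USB controller does not bind, run the following commands:
echo -n "0000:07:00.2" > /sys/bus/pci/drivers/xhci_hcd/unbind
echo -n "0000:07:00.2" > /sys/bus/pci/drivers/vfio-pci/bind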
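The manual rebind can be generalized so that the current driver does not have to be known in advance. This is a sketch under assumptions: rebind_to_vfio is a hypothetical helper, the optional second argument (the sysfs PCI root) is there only to allow a dry run against a test directory, and the device ID is assumed to be already listed in the vfio-pci ids= option as configured earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rebind_to_vfio ADDRESS [SYSFS_PCI_ROOT]
# Unbinds the device from whichever driver currently holds it,
# then binds it to vfio-pci. Needs root on a real host.
rebind_to_vfio() {
    local dev="$1" root="${2:-/sys/bus/pci}"
    local current="$root/devices/$dev/driver"

    # The "driver" symlink exists only while a driver is bound.
    if [ -d "$current" ]; then
        printf '%s' "$dev" > "$current/unbind" || return 1
    fi
    printf '%s' "$dev" > "$root/drivers/vfio-pci/bind"
}

# Example (run as root):
#   rebind_to_vfio 0000:07:00.2
```

printf '%s' writes the address without a trailing newline, which is what echo -n does in the commands above.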
刘****健
This article is from the author's personal column: K8s