Fixing the CUDA initialization error raised by torch.cuda.is_available()

2024-03-11 01:33:00

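My colleagues and I have run into this problem several times. The root cause is a version mismatch between Fabric Manager and the NVIDIA driver. The steps below are simple, but they have proven very practical for fixing this CUDA initialization error.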

Running python -c "import torch; torch.cuda.is_available()" produces the following warning:

UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 802: system not yet initialized ...
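To confirm that a machine is hitting this specific error and not some other CUDA failure, the check can be scripted. The following is a minimal sketch, not part of the original procedure: it runs the same one-liner in a fresh interpreter and looks for the Error 802 text in stderr. The function names and the ERROR_802_MARKER constant are made up for illustration, and the sketch assumes the warning is written to stderr, which is where Python warnings go by default.

```python
import subprocess
import sys

# Substring copied from the warning quoted in this article.
ERROR_802_MARKER = "Error 802: system not yet initialized"


def is_error_802(stderr_text: str) -> bool:
    """Return True if the text contains the Error 802 initialization warning."""
    return ERROR_802_MARKER in stderr_text


def run_cuda_check() -> str:
    """Run the article's one-liner in a fresh interpreter and return its stderr."""
    result = subprocess.run(
        [sys.executable, "-c", "import torch; torch.cuda.is_available()"],
        capture_output=True,
        text=True,
    )
    return result.stderr
```

On an affected machine, is_error_802(run_cuda_check()) should return True. If it returns False while CUDA is still unavailable, the cause is likely something other than what this article covers.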

Root cause: the installed nvidia-fabric-manager version does not match the NVIDIA driver version.
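The mismatch can be checked directly by comparing the two version strings. This is a rough sketch under a few assumptions: nvidia-smi is on the PATH, the system is RPM-based (as the rhel8 repository used in this article suggests), and Fabric Manager was installed under the package name nvidia-fabric-manager. The helper names are hypothetical.

```python
import subprocess


def driver_version() -> str:
    """Driver version as reported by nvidia-smi (taken from the first GPU listed)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.splitlines()[0].strip()


def fabric_manager_version() -> str:
    """Version of the installed nvidia-fabric-manager RPM (package name is an assumption)."""
    out = subprocess.run(
        ["rpm", "-q", "--queryformat", "%{VERSION}", "nvidia-fabric-manager"],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.strip()


def versions_match(driver: str, fabric_manager: str) -> bool:
    """The two components are expected to carry exactly the same version string."""
    return driver.strip() == fabric_manager.strip()
```

Calling versions_match(driver_version(), fabric_manager_version()) on the affected host should return False if the root cause described here applies. Both subprocess helpers raise CalledProcessError if the tool reports a failure, for example when the package is not installed at all.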

Solution:

1. Run nvidia-smi to check the driver version.

2. Then find the nvidia-fabric-manager package with the same version on the official NVIDIA site.

Look for it here: XXXXs://developer.download.nvidia.cn/compute/cuda/repos/rhel8/x86_64/
Install: yum install nvidia-fabric-manager-(replace with your version)-1.x86_64.rpm
Start the service:

systemctl enable nvidia-fabricmanager
systemctl restart nvidia-fabricmanager
systemctl status nvidia-fabricmanager
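The install and start steps above can be backed by two small helpers. This is a sketch, not a definitive implementation: the first builds the RPM file name from the driver version using the naming pattern in the install command, and the second asks systemd whether the service is running. The function names are hypothetical, and the "-1.x86_64.rpm" suffix is taken from this article, so it may not hold for every release or architecture.

```python
import subprocess


def fabric_manager_rpm_name(driver_version: str) -> str:
    """Build the RPM file name using the naming pattern from the install step."""
    return f"nvidia-fabric-manager-{driver_version.strip()}-1.x86_64.rpm"


def fabric_manager_is_active() -> bool:
    """True if systemd reports the nvidia-fabricmanager unit as active."""
    result = subprocess.run(
        ["systemctl", "is-active", "--quiet", "nvidia-fabricmanager"],
    )
    return result.returncode == 0
```

For example, with a driver version of 535.104.05 (an illustrative value, not one from this article), fabric_manager_rpm_name returns nvidia-fabric-manager-535.104.05-1.x86_64.rpm. Once fabric_manager_is_active() returns True, rerun the torch.cuda.is_available() check to confirm the warning is gone.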

Author: l****n
From the author's column: AI-llama大模型,go语言开发 (AI LLaMA large models, Go development)