Disable the nouveau driver
- Add it to the blacklist
$ echo "blacklist nouveau" >> /etc/modprobe.d/blacklist.conf
- Back up the initramfs
$ mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
Rebuild the initramfs
$ dracut -v /boot/initramfs-$(uname -r).img $(uname -r) --force
Reboot and confirm that nouveau has been disabled (no output from lsmod means it is no longer loaded)
$ reboot
$ lsmod | grep nouveau
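The manual check above can be turned into a small scripted test. This is a minimal sketch: the function name `nouveau_loaded` is hypothetical, and it only assumes that lsmod prints the module name in the first column.

```shell
# Hypothetical helper: reads lsmod-style text on stdin and succeeds
# only if a module named exactly "nouveau" appears in the first column.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Usage on the real machine:
#   if lsmod | nouveau_loaded; then
#       echo "nouveau is still loaded"
#   else
#       echo "nouveau is disabled"
#   fi
```

Matching on the first column avoids false positives from other modules that merely list nouveau in their "Used by" column.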
Install the GPU driver
1. Check the GPU model
$ yum install -y pciutils
$ lspci | grep NVIDIA
The output looks like this:
3b:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
d8:00.0 3D controller: NVIDIA Corporation GP102GL [Tesla P40] (rev a1)
This shows that the GPU model is Tesla P40.
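If you want the model name without reading the lspci lines by eye, something like the following may help. It is a sketch only: `gpu_models` is a hypothetical name, and it assumes the model appears in square brackets as in the output shown here.

```shell
# Hypothetical helper: reads lspci-style lines on stdin and prints each
# distinct bracketed NVIDIA model name once.
gpu_models() {
    grep NVIDIA | sed -n 's/.*\[\(.*\)\].*/\1/p' | sort -u
}

# Usage on the real machine:
#   lspci | gpu_models
```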
2. Download the driver
Choose the driver according to your CUDA version. The author's CUDA version is as follows:
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
The driver download URL is:
https://cn.download.nvidia.com/tesla/418.226.00/NVIDIA-Linux-x86_64-418.226.00.run
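To make the "choose the driver by CUDA version" rule concrete, here is a rough sketch of a version comparison. The function name `version_ge` is hypothetical, it relies on GNU `sort -V`, and the minimum of 418.39 for CUDA 10.1 is an assumption taken from NVIDIA's compatibility notes (it also matches the driver version embedded in the CUDA installer's file name); check the release notes for your own toolkit.

```shell
# version_ge A B: succeeds when dotted version A >= B (relies on GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: 418.39 is the minimum Linux driver for CUDA 10.1.
min_driver="418.39"
driver="418.226.00"

if version_ge "$driver" "$min_driver"; then
    echo "driver $driver satisfies the CUDA 10.1 minimum ($min_driver)"
else
    echo "driver $driver is older than $min_driver"
fi
```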
- Download CUDA
https://developer.nvidia.com/cuda-toolkit-archive
Files the author needs to download:
- Confirm the files
At this point the files on hand are:
- NVIDIA-Linux-x86_64-418.226.00.run: the GPU driver installer
- cuda_10.1.105_418.39_linux.run: the CUDA installer
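Before installing, it can be worth confirming that both installers are actually present. The following is a minimal sketch; `check_installers` is a hypothetical helper, not part of any NVIDIA tooling.

```shell
# Hypothetical helper: check_installers DIR FILE...
# Prints found/missing for each file and returns non-zero if any is missing.
check_installers() {
    dir="$1"; shift
    missing=0
    for f in "$@"; do
        if [ -f "$dir/$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Usage in the download directory:
#   check_installers . NVIDIA-Linux-x86_64-418.226.00.run cuda_10.1.105_418.39_linux.run
```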
- Install dependencies
$ yum -y install kernel-devel kernel-doc kernel-headers gcc gcc-c++
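- Install the driver
$ chmod +x NVIDIA-Linux-x86_64-418.226.00.run
$ ./NVIDIA-Linux-x86_64-418.226.00.run --kernel-source-path=/usr/src/kernels/$(uname -r)
During installation, answer the interactive prompts as they appear.
Once installation finishes, test the GPU driver: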
[root@localhost tmp]# nvidia-smi
Thu Aug 11 02:48:19 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.226.00   Driver Version: 418.226.00   CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P40           Off  | 00000000:3B:00.0 Off |                    0 |
| N/A   27C    P0    49W / 250W |      0MiB / 22919MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P40           Off  | 00000000:D8:00.0 Off |                    0 |
| N/A   25C    P0    49W / 250W |      0MiB / 22919MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
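For a scripted version of this check, nvidia-smi can also emit CSV via `--query-gpu` and `--format=csv,noheader`. The sketch below assumes that output has the form `name, driver_version` per line; `drivers_match` is a hypothetical helper.

```shell
# Hypothetical helper: drivers_match EXPECTED_VERSION
# Reads "name, driver_version" CSV lines on stdin and succeeds only if
# at least one GPU is listed and every line reports the expected driver.
drivers_match() {
    awk -F', ' -v want="$1" '
        { n++; if ($2 != want) bad++ }
        END { if (n > 0 && bad == 0) exit 0; exit 1 }
    '
}

# Usage on the real machine:
#   nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
#       | drivers_match 418.226.00 && echo "driver OK on all GPUs"
```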
- Install CUDA
$ chmod +x cuda_10.1.105_418.39_linux.run
$ ./cuda_10.1.105_418.39_linux.run --silent
Add CUDA to PATH
$ echo "export PATH=\$PATH:/usr/local/cuda-10.1/bin" >> /etc/profile
$ . /etc/profile
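Note that running the `echo ... >>` command more than once adds the same export line to /etc/profile each time. One possible way to keep it idempotent is sketched below; `append_once` is a hypothetical helper name.

```shell
# Hypothetical helper: append_once LINE FILE
# Appends LINE to FILE only if that exact line is not already present.
append_once() {
    line="$1"; file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (as root); single quotes keep $PATH unexpanded in the file:
#   append_once 'export PATH=$PATH:/usr/local/cuda-10.1/bin' /etc/profile
```

The same helper would also work for the `blacklist nouveau` line in /etc/modprobe.d/blacklist.conf.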
Check the CUDA version
$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:17_PST_2019
Cuda compilation tools, release 10.1, V10.1.105
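If a script needs the release number rather than the full banner, it can be pulled out of the last line. This is a sketch that assumes the `release X.Y,` wording shown in the output here; `cuda_release` is a hypothetical name.

```shell
# Hypothetical helper: reads `nvcc -V` output on stdin and prints the
# release number (for example 10.1) from the "release X.Y," line.
cuda_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Usage on the real machine:
#   nvcc -V | cuda_release
```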
Test
$ conda create -n yourEnv python=3.6 numpy pandas
$ conda activate yourEnv
$ conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
Check whether CUDA is available
$ python
>>> import torch
>>> print(torch.cuda.is_available())
True