To use GPU acceleration with PyTorch on CentOS, complete the following steps:
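Install the NVIDIA driver. The nvidia-driver-latest-dkms package is normally served from NVIDIA's CUDA repository rather than EPEL, so if yum cannot find it, add that repository first (see the CUDA Toolkit step):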
sudo yum install epel-release
sudo yum install dkms
sudo yum install nvidia-driver-latest-dkms
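Before moving on, it can help to confirm that the driver is actually visible. The following is a minimal sketch, assuming the driver package placed `nvidia-smi` on the PATH; the helper name `query_gpus` is hypothetical and not part of any NVIDIA tooling. An empty result may simply mean the machine has not been rebooted since the driver was installed.

```python
import shutil
import subprocess


def query_gpus():
    """Return one 'name, driver_version' string per GPU, or [] if none is visible."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # driver utilities are not on PATH
    result = subprocess.run(
        [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return []  # nvidia-smi exists but cannot talk to the driver
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = query_gpus()
    if gpus:
        for gpu in gpus:
            print(f"Found GPU: {gpu}")
    else:
        print("No NVIDIA GPU or driver detected")
```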
Install the CUDA Toolkit:
wget https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-10.2.89-1.x86_64.rpm
sudo rpm -i cuda-repo-rhel7-10.2.89-1.x86_64.rpm
sudo yum clean all
sudo yum install cuda-10-2  # the plain "cuda" metapackage may pull a newer release than 10.2
echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
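Once the PATH is updated, the toolkit version can be checked against the 10.2 release that the later steps expect. This is a rough sketch, assuming `nvcc --version` prints a line containing `release X.Y`; the function names `parse_nvcc_release` and `installed_cuda_release` are illustrative only.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract the 'X.Y' release number from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", text)
    return match.group(1) if match else None


def installed_cuda_release():
    """Return the CUDA release reported by nvcc, or None if nvcc is not on PATH."""
    exe = shutil.which("nvcc")
    if exe is None:
        return None
    output = subprocess.run([exe, "--version"], capture_output=True, text=True).stdout
    return parse_nvcc_release(output)


if __name__ == "__main__":
    release = installed_cuda_release()
    if release is None:
        print("nvcc not found; check that /usr/local/cuda/bin is on PATH")
    else:
        print(f"CUDA toolkit release: {release}")
```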
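Install cuDNN. The archive is not available through yum; download the cuDNN build for CUDA 10.2 from the NVIDIA Developer site (registration required), then extract it and copy the files into the CUDA directory: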
tar -xzvf cudnn-10.2-linux-x64-v8.0.5.39.tgz
sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
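To check that the headers landed where expected, one option is to read the version macros from the copied header. The sketch below assumes cuDNN 8 keeps `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` in `cudnn_version.h` under `/usr/local/cuda/include`; the helper name `parse_cudnn_version` is hypothetical.

```python
import re
from pathlib import Path

# Assumed location, based on the copy commands above
HEADER = Path("/usr/local/cuda/include/cudnn_version.h")


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) from the header text, or None if a macro is missing."""
    parts = []
    for name in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(
            rf"^\s*#define\s+CUDNN_{name}\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


if __name__ == "__main__":
    if HEADER.exists():
        print("cuDNN version:", parse_cudnn_version(HEADER.read_text()))
    else:
        print(f"{HEADER} not found; cuDNN headers may not be installed")
```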
Install NCCL (optional; only relevant for multi-GPU or distributed training):
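Install PyTorch. The cu102 wheels were last published for PyTorch 1.12.1, so pin the versions; without pins, pip may pick a newer build from PyPI that targets a different CUDA version:
pip install torch==1.12.1+cu102 torchvision==0.13.1+cu102 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu102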
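Since the wheel is meant to match CUDA 10.2 (cu102), it may be worth confirming which CUDA version the installed build was compiled for. A small sketch follows; the function name `describe_torch_build` is made up for illustration, and it returns early if PyTorch is not importable.

```python
import importlib.util


def describe_torch_build():
    """Summarize the installed PyTorch build, or report that it is missing."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch

    return {
        "installed": True,
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If `built_for_cuda` does not read `10.2`, pip most likely resolved a wheel from a different index than the one intended.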
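Verify the installation. Create a file named test_gpu.py with the following contents:
import torch
# Use the GPU when PyTorch can see one, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
# Create two tensors, move them to the selected device, and add them there
x = torch.rand(5, 3).to(device)
y = torch.rand(5, 3).to(device)
z = x + y
print(z)
Then run the script:
python test_gpu.py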
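After completing these steps, you can use GPU acceleration with PyTorch on CentOS. When training deep learning models, remember to move both the model and the data to the GPU, for example with the .to(device) method.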
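As an illustration of moving both the model and the data to the device, here is a minimal sketch of a single training step. The tiny linear model, the random data, and the helper name `train_step` are placeholders chosen for this example, not part of the original instructions.

```python
import torch
from torch import nn

# Fall back to the CPU so the sketch also runs on machines without a GPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# The model's parameters must live on the same device as its inputs
model = nn.Linear(3, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()


def train_step(inputs, targets):
    """Run one optimization step and return the loss as a Python float."""
    inputs = inputs.to(device)    # move the batch to the model's device
    targets = targets.to(device)
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()
    return loss.item()


# Placeholder batch of 8 random samples with 3 features each
loss = train_step(torch.rand(8, 3), torch.rand(8, 1))
print(f"Loss on {device}: {loss:.4f}")
```

Forgetting either `.to(device)` call typically raises a runtime error about tensors being on different devices, which is usually the first thing to check when a script fails on the GPU but runs on the CPU.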