Deploying a PyTorch model on Ubuntu can follow the best practices below, which cover environment setup, model conversion, deployment, and optimization.
Install system dependencies with apt:
sudo apt update && sudo apt install python3 python3-pip build-essential
Manage virtual environments with conda:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
conda create -n pytorch_env python=3.8 -y
conda activate pytorch_env
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
Or use pip (the CUDA version must be specified):
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
Verify the installation:
import torch
print(torch.__version__, torch.cuda.is_available())
Save the model weights to a .pt or .pth file:
torch.save(model.state_dict(), 'model.pth')
Convert the model to TorchScript:
scripted_model = torch.jit.script(model)
scripted_model.save('model_scripted.pt')
Export the model to ONNX:
torch.onnx.export(model, input_tensor, 'model.onnx', opset_version=11)
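The difference between the two PyTorch formats shows up at load time: a state_dict file such as model.pth needs the model class again, while a TorchScript file such as model_scripted.pt carries its own code. The following is a minimal sketch of that round trip, assuming a hypothetical TinyNet class in place of the real model and a temporary directory in place of the real paths.

```python
import os
import tempfile

import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the real model."""

    def __init__(self) -> None:
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.fc(x))


def roundtrip_outputs(workdir: str):
    """Save weights and a TorchScript copy, reload both, and run one input."""
    torch.manual_seed(0)
    model = TinyNet().eval()
    example = torch.randn(1, 4)

    # Weights only: the class definition is needed again at load time.
    weights_path = os.path.join(workdir, "model.pth")
    torch.save(model.state_dict(), weights_path)
    restored = TinyNet()
    restored.load_state_dict(torch.load(weights_path, map_location="cpu"))
    restored.eval()

    # TorchScript: the file carries code and weights, so no class is needed.
    scripted_path = os.path.join(workdir, "model_scripted.pt")
    torch.jit.script(model).save(scripted_path)
    scripted = torch.jit.load(scripted_path, map_location="cpu")

    with torch.no_grad():
        return model(example), restored(example), scripted(example)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        eager, from_weights, from_script = roundtrip_outputs(tmp)
    print(torch.allclose(eager, from_weights), torch.allclose(eager, from_script))
```

Comparing the reloaded outputs against the original model is a cheap sanity check to run before shipping either file to a server.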
import torch
from model import MyModel
model = MyModel()
model.load_state_dict(torch.load('model.pth'))
model.eval()
# Example input (adjust to match the model)
input_data = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    output = model(input_data)
print(output)
Install Flask:
pip install flask
Create app.py:
from flask import Flask, request, jsonify
import torch
from model import MyModel
app = Flask(__name__)
# model.pth holds a state_dict, so build the model first and then load the weights
model = MyModel()
model.load_state_dict(torch.load('model.pth', map_location=torch.device('cpu')))
model.eval()
@app.route('/predict', methods=['POST'])
def predict():
    data = request.json['input']
    input_tensor = torch.tensor(data, dtype=torch.float32).unsqueeze(0)
    with torch.no_grad():
        output = model(input_tensor)
    return jsonify(output.tolist())
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
Start the service with python app.py and call it over HTTP. To run it with multiple worker processes, use gunicorn:
pip install gunicorn
gunicorn -w 4 -b 0.0.0.0:5000 app:app
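The /predict route reads a JSON body with an input key and returns the model output as a JSON list. A client call could be sketched as follows, using only the standard library; the address 127.0.0.1:5000 and the tiny nested-list payload are assumptions for illustration, and a real payload must match the shape the model expects.

```python
import json
import urllib.request


def build_predict_request(url: str, pixels: list) -> urllib.request.Request:
    """Build a POST request whose JSON body has the "input" key the service reads."""
    body = json.dumps({"input": pixels}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(url: str, pixels: list, timeout: float = 10.0):
    """Send the request and decode the JSON list returned by the service."""
    request = build_predict_request(url, pixels)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical payload: a 3x2x2 nested list standing in for a CxHxW image.
    sample = [[[0.0, 0.1], [0.2, 0.3]] for _ in range(3)]
    req = build_predict_request("http://127.0.0.1:5000/predict", sample)
    # Only the request is built here; call predict(...) once the server is running.
    print(req.get_method(), req.full_url, len(req.data))
```

Keeping request construction separate from sending makes the payload format easy to check without a running server.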
Create a Dockerfile:
FROM python:3.8-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["gunicorn", "-w", "4", "-b", "0.0.0.0:5000", "app:app"]
Build and run the image:
docker build -t pytorch-model-server .
docker run -p 5000:5000 pytorch-model-server
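Install TorchServe and the model archiver:
pip install torchserve torch-model-archiver
Package the model (an eager-mode state_dict such as model.pth also needs the model definition passed via --model-file; model.py is assumed here):
torch-model-archiver --model-name my_model --version 1.0 --model-file model.py --serialized-file model.pth --handler image_classifier
Start the server:
torchserve --start --model-store ./ --models my_model.mar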
Check that a GPU is available with torch.cuda.is_available(), then move the model and its inputs to the selected device:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
input_tensor = input_tensor.to(device)
Combine several inputs into one batch for inference:
batch_input = torch.cat([input1, input2, input3], dim=0)
with torch.no_grad():
    batch_output = model(batch_input)
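Device placement and batching can be combined in one helper. The sketch below is one possible arrangement and makes several assumptions: each sample is a tensor without a batch dimension (so torch.stack is used where the snippet above uses torch.cat on inputs that already have one), and predict_in_batches, the toy model, and the batch size are hypothetical names and values.

```python
import torch
from torch import nn


def predict_in_batches(model: nn.Module, samples: list, batch_size: int = 32) -> torch.Tensor:
    """Run inference over per-sample tensors in fixed-size batches."""
    if not samples:
        raise ValueError("samples must not be empty")
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = model.to(device).eval()
    outputs = []
    with torch.no_grad():
        for start in range(0, len(samples), batch_size):
            # stack adds the batch dimension: N tensors of shape (C,) -> (N, C)
            batch = torch.stack(samples[start:start + batch_size]).to(device)
            # Move results back to the CPU so they can be concatenated and serialized.
            outputs.append(model(batch).cpu())
    return torch.cat(outputs, dim=0)


if __name__ == "__main__":
    # Hypothetical toy model and data, only to make the sketch runnable.
    toy_model = nn.Linear(8, 3)
    toy_samples = [torch.randn(8) for _ in range(10)]
    result = predict_in_batches(toy_model, toy_samples, batch_size=4)
    print(result.shape)
```

A fixed batch size keeps memory use bounded; the right value depends on the model and the available GPU memory.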
Use requirements.txt to pin dependency versions and avoid environment conflicts:
pip freeze > requirements.txt
Use the logging module to record service logs, and combine Prometheus and Grafana to monitor model performance.
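As a starting point for service logging, a decorator can record the latency of every prediction call with the standard logging module. This is only a sketch: the logger name model_service, the log format, and fake_predict are assumptions, and exporting metrics to Prometheus would need an additional client library that is not shown here.

```python
import functools
import logging
import time

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("model_service")
logger.setLevel(logging.INFO)


def log_latency(func):
    """Log how long each call takes, and log the traceback if it fails."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            logger.exception("%s failed", func.__name__)
            raise
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            logger.info("%s took %.2f ms", func.__name__, elapsed_ms)

    return wrapper


@log_latency
def fake_predict(values):
    """Hypothetical stand-in for the model call."""
    return [v * 2 for v in values]


if __name__ == "__main__":
    print(fake_predict([1, 2, 3]))
```

In the Flask service, the same decorator could wrap the function that runs the model, so each request leaves a latency line in the log.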