A Practical Guide to Deploying and Monitoring PyTorch Models on Ubuntu
1. Environment Setup and Model Export
- Install PyTorch (CPU build): pip install torch torchvision torchaudio
- Install with CUDA 11.7 wheels: pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117
- Verify the install: python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
- Export to TorchScript: scripted = torch.jit.script(model); scripted.save("model.pt")
- Export to ONNX: prepare a dummy_input, then run torch.onnx.export(model, dummy_input, "model.onnx")
- Optional optimizations: mixed precision with torch.cuda.amp.autocast() + GradScaler, quantization via torch.quantization (dynamic/static), and pruning via torch.nn.utils.prune

2. Deployment Paths and Examples
- Self-hosted FastAPI service: call model.eval(), handle pre-/post-processing in the endpoint, and run under Gunicorn/Uvicorn with multiple workers or async handlers.
  - pip install fastapi uvicorn[standard]
  - Load the model once at startup: model = MyModel(); model.load_state_dict(torch.load("model.pth", map_location="cpu")); model.eval()
  - Run inference under with torch.no_grad(): to avoid building autograd graphs.
  - Start the server: uvicorn app:app --host 0.0.0.0 --port 5000
- TorchServe:
  - pip install torchserve torch-model-archiver torch-workflow-archiver
  - Package the model: torch-model-archiver --model-name mnist --version 1.0 --model-file mnist.py --serialized-file mnist_cnn.pt --handler mnist_handler.py
  - Run in Docker: docker run --rm -it -p 3000:8080 -p 3001:8081 pytorch/torchserve
- ONNX Runtime: load model.onnx with onnxruntime for inference; combine it with an async framework or a batching queue to raise throughput.
- Docker packaging, e.g. a minimal Dockerfile:

      FROM python:3.10-slim
      WORKDIR /app
      COPY requirements.txt .
      RUN pip install -r requirements.txt
      COPY . .
      CMD ["gunicorn", "app:app", "-b", "0.0.0.0:5000", "--timeout", "120"]

  Build and run: docker build -t my-pytorch-app . && docker run -p 5000:5000 my-pytorch-app

3. Runtime Monitoring and Observability
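A small sketch of sampling CPU% and RSS memory with psutil (assumes `pip install psutil`); the 0.1 s sampling window and the field names are arbitrary choices, and the returned dict is meant to be written to your logs or metrics pipeline.

```python
from typing import Optional

import psutil

def sample_process_stats(pid: Optional[int] = None) -> dict:
    """Return CPU% and RSS memory (MiB) for one process (default: this one)."""
    proc = psutil.Process(pid)  # pid=None means the current process
    with proc.oneshot():        # batch the underlying /proc reads
        cpu = proc.cpu_percent(interval=0.1)  # % CPU over a short window
        rss_mib = proc.memory_info().rss / (1024 * 1024)
    return {"cpu_percent": cpu, "rss_mib": round(rss_mib, 1)}

if __name__ == "__main__":
    print(sample_process_stats())  # e.g. call this every N seconds from a timer
```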
- GPU: watch -n 1 nvidia-smi (VRAM, utilization, temperature, power draw)
- CPU/memory: htop or top; use psutil to record CPU% and RSS memory for logging or metrics.
- TensorBoard: pip install tensorboard, then use torch.utils.tensorboard:
- Log scalars with SummaryWriter(log_dir="runs").add_scalar("Loss/train", loss, step) and view them with tensorboard --logdir=runs
- Netron: pip install netron; then netron model.onnx renders the network topology and tensor dimensions in the browser.
- summary(model, input_size=(B, C, H, W)) (from torchinfo) prints parameter counts and per-layer information.
- Profile hot spots with torch.profiler:

      from torch.profiler import profile, ProfilerActivity

      with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
                   record_shapes=True, profile_memory=True) as prof:
          outputs = model(inputs)
      print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))

4. Launch Checklist and Common Issues
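One item such a checklist typically includes is a smoke test run before traffic is switched over. A minimal sketch, assuming a TorchScript file "model.pt" and a hypothetical (1, 4) input producing 2 output features; both are placeholders for your model's real shapes:

```python
import torch

def smoke_test(path: str, input_shape=(1, 4), out_features=2) -> bool:
    """Load an exported TorchScript model and sanity-check one forward pass."""
    model = torch.jit.load(path, map_location="cpu")
    model.eval()
    x = torch.randn(*input_shape)  # dummy input in the expected shape
    with torch.no_grad():
        y = model(x)
    # Fail fast if the output shape is wrong or contains NaN/Inf.
    assert y.shape == (input_shape[0], out_features), y.shape
    assert torch.isfinite(y).all(), "non-finite values in model output"
    return True
```

Running this in CI or as a container health check catches broken exports and shape mismatches before they reach production.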