A Practical Approach to Docker Monitoring on CentOS
1. Quick Start and Built-in Tools
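As a quick-start sketch for this section, the one-shot output of docker stats can be filtered for containers above a CPU threshold. The helper name flag_high_cpu and the two-column input format are assumptions made for illustration; they are not part of Docker itself.

```shell
# flag_high_cpu LIMIT
# Hypothetical helper: reads lines of the form "<name> <cpu%>" on stdin
# and prints the names whose CPU percentage is strictly above LIMIT.
flag_high_cpu() {
  awk -v limit="$1" '{
    cpu = $2
    sub(/%$/, "", cpu)                 # strip the trailing percent sign
    if (cpu + 0 > limit + 0) print $1  # force a numeric comparison
  }'
}
```

One possible invocation: docker stats --no-stream --format '{{.Name}} {{.CPUPerc}}' | flag_high_cpu 80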
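docker stats (or, for specific containers, docker stats <container name or ID>) shows live CPU, memory, network, and block I/O usage per container. It is suited to ad hoc troubleshooting and routine checks.
sudo journalctl -u docker.service -f follows the Docker daemon log in real time, which helps pinpoint startup failures, restarts, and similar problems.
curl http://localhost:8080 tests whether the cAdvisor page is reachable once cAdvisor has been deployed as described in the next section.
2. General-Purpose Monitoring with Prometheus + Grafana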
# docker-compose.yml: cAdvisor, Prometheus, and Grafana
version: "3.8"
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.47.2
    container_name: cadvisor
    restart: unless-stopped
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker:/var/lib/docker:ro
      - /dev/disk:/dev/disk:ro
    ports:
      - "8080:8080"
  prometheus:
    image: prom/prometheus:v2.55.0
    container_name: prometheus
    restart: unless-stopped
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
    ports:
      - "9090:9090"
  grafana:
    image: grafana/grafana:11.2.0
    container_name: grafana
    restart: unless-stopped
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      # Replace with your own strong password before deploying.
      - GF_SECURITY_ADMIN_PASSWORD=StrongPass!
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana
volumes:
  grafana_data:
# prometheus.yml: mounted into the prometheus container by the Compose file
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
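With the scrape configuration above in place, the container metrics collected from cAdvisor can be queried through the Prometheus HTTP API. The following is a minimal sketch rather than a definitive recipe: the helper names cpu_rate_query and prom_query are hypothetical, and it assumes that Prometheus is reachable at http://localhost:9090 (the port published in the Compose file) and that containers are identified by cAdvisor's name label.

```shell
# cpu_rate_query NAME [WINDOW]
# Builds a PromQL expression for the rate of CPU seconds consumed by one
# container, based on cAdvisor's container_cpu_usage_seconds_total counter.
# WINDOW defaults to 5m.
cpu_rate_query() {
  printf 'rate(container_cpu_usage_seconds_total{name="%s"}[%s])\n' "$1" "${2:-5m}"
}

# prom_query EXPR [BASE_URL]
# Sends an instant query to the Prometheus HTTP API (/api/v1/query).
# The default base URL is an assumption matching the published port 9090.
prom_query() {
  curl -sG "${2:-http://localhost:9090}/api/v1/query" --data-urlencode "query=$1"
}
```

For example, prom_query "$(cpu_rate_query grafana)" would return the current CPU rate of the grafana container as JSON, provided the stack is running.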
3. Enabling the Docker Remote API and Firewall Essentials
Add the following to /etc/docker/daemon.json:
{
  "hosts": ["unix:///var/run/docker.sock", "tcp://0.0.0.0:2375"]
}
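Once the daemon is running with a TCP listener like the one above, the Docker Engine API can also serve per-container resource statistics over HTTP. The sketch below assumes an unencrypted endpoint on port 2375; the helper names container_stats_url and fetch_container_stats are hypothetical.

```shell
# container_stats_url HOST CONTAINER [PORT]
# Builds the Engine API URL for a one-shot stats sample of a container.
# stream=false asks for a single sample instead of a continuous stream.
container_stats_url() {
  printf 'http://%s:%s/containers/%s/stats?stream=false\n' "$1" "${3:-2375}" "$2"
}

# fetch_container_stats HOST CONTAINER
# Fetches the sample as JSON; only works when the API is reachable without TLS.
fetch_container_stats() {
  curl -s "$(container_stats_url "$1" "$2")"
}
```

For example, fetch_container_stats <DOCKER_HOST> cadvisor would request one stats sample for the cadvisor container.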
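sudo systemctl restart docker restarts the daemon so that the new hosts setting takes effect. On systemd-based installs, dockerd fails to start if the unit file also passes -H on its command line, because hosts is then specified both as a flag and in daemon.json.
sudo firewall-cmd --permanent --zone=trusted --add-port=2375/tcp && sudo firewall-cmd --reload opens the port in firewalld.
curl http://<DOCKER_HOST>:2375/containers/json verifies that the API responds (if TLS is enabled, the client must be given the certificates). Port 2375 is unencrypted and unauthenticated, so expose it only on a trusted network.
4. Overlay Network and Low-Level Network Monitoring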
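For the overlay traffic covered in this section, a capture filter can narrow tcpdump down to VXLAN packets. This sketch assumes Docker's default overlay data port, UDP 4789; the helper name vxlan_filter and the interface eth0 in the usage note are placeholders.

```shell
# vxlan_filter [PORT]
# Prints a tcpdump/pcap filter expression matching VXLAN-encapsulated traffic.
# 4789 is the default VXLAN data port for Docker overlay networks; pass another
# port if the swarm was initialised with a custom data path port.
vxlan_filter() {
  printf 'udp port %s\n' "${1:-4789}"
}
```

One possible invocation on the host's physical interface: sudo tcpdump -i eth0 -nn "$(vxlan_filter)"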
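sudo iptables -L -v -n lists the iptables rules together with their packet and byte counters.
sudo nft list ruleset prints the complete nftables ruleset.
sudo tcpdump -i docker0 -nn captures traffic on the docker0 bridge; alternatively, capture on the vxlan interface to analyze overlay encapsulation and cross-host communication in detail.
5. Optional Approaches and Extensions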