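Kubernetes Resource Management Tips on Linux
1 Core Mechanisms and Their Relationship to the Linux Kernel
2 Key Points for Pod and Container Resource Configuration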
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:
          memory: "128Mi"
          cpu: "250m"
        limits:
          memory: "256Mi"
          cpu: "500m"
---
spec:
  containers:
    - name: gpu-app
      image: nvidia/cuda:12.2-base
      resources:
        requests:
          nvidia.com/gpu: "1"
        limits:
          nvidia.com/gpu: "1"
  nodeSelector:
    nvidia.com/gpu.product: nvidia-a100
3 Namespaces and Multi-Tenant Boundaries
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: dev
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
    pods: "20"
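---
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: dev
spec:
  limits:
    - type: Container
      # LimitRange entries are keyed by plain resource names (cpu, memory),
      # not by the requests.* / limits.* names used in ResourceQuota.
      default:          # limits applied when a container declares none
        cpu: "200m"
        memory: "128Mi"
      defaultRequest:   # requests applied when a container declares none
        cpu: "100m"
        memory: "64Mi"
      max:              # largest value a container may set
        cpu: "1"
        memory: "512Mi"
      min:              # smallest value a container may set
        cpu: "50m"
        memory: "32Mi"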
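A ResourceQuota can cap object counts in a namespace as well as compute resources. The following is a minimal sketch, assuming the same dev namespace as above; the name team-object-quota and all of the numbers are illustrative assumptions, not values from this article.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-object-quota   # hypothetical name, chosen for illustration
  namespace: dev
spec:
  hard:
    # Object-count limits: creation is rejected once the count is reached.
    configmaps: "50"               # illustrative value
    secrets: "50"                  # illustrative value
    persistentvolumeclaims: "10"   # illustrative value
```

Once such a quota is in place, a request that would exceed any of these counts is rejected by the API server at admission time, so the limits act as a hard ceiling rather than an alert.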
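Use kubectl describe resourcequota <name> -n <ns> to check quota usage; set object-count limits for each team to prevent unchecked creation of ConfigMaps, Secrets, PVCs, and similar objects.
4 Scheduling and Topology-Aware Optimization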
5 Monitoring, Autoscaling, and Node-Level Guarantees
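Use kubectl top nodes and kubectl top pods for a quick view of resource usage; combine Prometheus and Grafana for dashboards and alerting to pinpoint the root causes of CPU throttling, memory pressure, OOMKilled containers, and similar problems.
--system-reserved=cpu=500m,memory=1Gi
--kube-reserved=cpu=200m,memory=512Mi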
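The --system-reserved and --kube-reserved settings are kubelet options, and the same reservations can be written in a kubelet configuration file instead of as command-line flags. The sketch below assumes the kubelet is started with a configuration file (passed via its --config flag); the values simply mirror the flags above.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Resources set aside for OS daemons (sshd, systemd, and so on).
systemReserved:
  cpu: "500m"
  memory: "1Gi"
# Resources set aside for Kubernetes node components (kubelet, container runtime).
kubeReserved:
  cpu: "200m"
  memory: "512Mi"
```

Both reservations are subtracted from node capacity when the kubelet computes the allocatable resources it advertises to the scheduler, so Pods cannot be scheduled into the reserved share.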