To troubleshoot a Kubernetes (k8s) cluster running on CentOS, you can work through the following steps:
kubectl commands:
kubectl get nodes
kubectl get pods --all-namespaces
kubectl get services --all-namespaces
kubectl get events --sort-by=.metadata.creationTimestamp
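The overview commands above print every node and pod, which gets noisy on a larger cluster. As a minimal sketch, the two helper functions below (hypothetical names, not part of kubectl) filter that output down to the entries that need attention. They assume the default column layout of `kubectl get ... --no-headers`, where STATUS is the second column for nodes and the fourth column for pods listed with `--all-namespaces`.

```shell
# Print nodes whose STATUS is anything other than plain "Ready"
# (this also flags cordoned nodes such as "Ready,SchedulingDisabled").
not_ready_nodes() {
  kubectl get nodes --no-headers | awk '$2 != "Ready" {print $1, $2}'
}

# Print namespace/name and status for pods that are neither Running nor Completed.
unhealthy_pods() {
  kubectl get pods --all-namespaces --no-headers |
    awk '$4 != "Running" && $4 != "Completed" {print $1 "/" $2, $4}'
}
```

Running `not_ready_nodes` and `unhealthy_pods` with no arguments gives a short list of candidates to inspect with `kubectl describe`.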
kubectl describe node <node-name>
kubectl top nodes
kubectl describe pod <pod-name> -n <namespace>
kubectl logs <pod-name> -n <namespace>
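When a pod is crash-looping, the logs of the previous container instance are often more useful than the current ones. The sketch below bundles the describe output and both sets of logs into one file. The function name, the output file naming, and the 200-line tail are illustrative assumptions; `--previous` and `--tail` are standard `kubectl logs` flags.

```shell
# Collect describe output and logs for one pod into <outdir>/<namespace>-<pod>.txt.
# Usage: collect_pod_diagnostics <namespace> <pod-name> [outdir]
collect_pod_diagnostics() {
  ns=$1
  pod=$2
  out=${3:-.}/$ns-$pod.txt
  {
    echo "== describe =="
    kubectl describe pod "$pod" -n "$ns"
    echo "== logs (current container) =="
    kubectl logs "$pod" -n "$ns" --tail=200
    echo "== logs (previous container) =="
    # Fails harmlessly when the container has never restarted.
    kubectl logs "$pod" -n "$ns" --previous --tail=200 ||
      echo "(no previous container)"
  } > "$out" 2>&1
  # Print the path of the file that was written.
  echo "$out"
}
```

For example, `collect_pod_diagnostics default web-1 /tmp` writes `/tmp/default-web-1.txt`.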
kubectl describe service <service-name> -n <namespace>
kubectl get pods -n <ingress-controller-namespace>
kubectl get networkpolicy -n <namespace>
kubectl run -it --rm --image=busybox:1.28 netcat -- nc -zv <node-ip> <port>
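A Service with no endpoints cannot route traffic, which usually means its selector matches no ready pods. As a rough sketch, the hypothetical helper below lists such Services in a namespace. It assumes the default `kubectl get endpoints` table, where an empty endpoint list is shown as `<none>` in the second column.

```shell
# Print the names of Services in a namespace that currently have no endpoints.
# Usage: services_without_endpoints <namespace>
services_without_endpoints() {
  kubectl get endpoints -n "$1" --no-headers |
    awk '$2 == "<none>" {print $1}'
}
```

For any name it prints, compare the selector shown by `kubectl describe service` with the labels on the pods that are expected to back it.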
kubectl get pv
kubectl get pvc
kubectl get sc
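Storage problems most often show up as a PersistentVolumeClaim stuck in `Pending`. The sketch below filters the PVC list down to claims that are not `Bound`. The function name is hypothetical, and it assumes the default table layout in which STATUS is the third column when `--all-namespaces` is used.

```shell
# Print namespace/name and status for every PVC that is not Bound.
unbound_pvcs() {
  kubectl get pvc --all-namespaces --no-headers |
    awk '$3 != "Bound" {print $1 "/" $2, $3}'
}
```

A claim reported here can then be inspected with `kubectl describe pvc <pvc-name> -n <namespace>`, whose events normally explain why binding or provisioning failed.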
kubectl get pods -n kube-system | grep controller
systemctl status kube-scheduler
journalctl -u kube-apiserver -f
etcdctl member list
etcdctl endpoint health
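On a TLS-secured etcd the bare etcdctl commands above are usually rejected unless the endpoint and client certificates are supplied. The wrapper below is a sketch of one way to do that. The `etcd_ctl` name and the `ETCD_ENDPOINTS` and `ETCD_PKI` variables are made up for this example, and the default paths assume a kubeadm-style layout under /etc/kubernetes/pki/etcd; adjust them to wherever your cluster keeps its certificates.

```shell
# Assumed defaults (kubeadm-style layout); override via the environment if needed.
ETCD_ENDPOINTS=${ETCD_ENDPOINTS:-https://127.0.0.1:2379}
ETCD_PKI=${ETCD_PKI:-/etc/kubernetes/pki/etcd}

# Run any etcdctl v3 subcommand with the endpoint and TLS flags filled in.
# ETCDCTL_API=3 is only required by older etcdctl builds and is harmless on newer ones.
etcd_ctl() {
  ETCDCTL_API=3 etcdctl \
    --endpoints="$ETCD_ENDPOINTS" \
    --cacert="$ETCD_PKI/ca.crt" \
    --cert="$ETCD_PKI/server.crt" \
    --key="$ETCD_PKI/server.key" \
    "$@"
}
```

With this in place, `etcd_ctl member list` and `etcd_ctl endpoint health` correspond to the two commands above.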
yum update -y 'kubernetes*'
systemctl restart kubelet
systemctl restart kube-proxy
systemctl restart kube-apiserver
systemctl restart kube-controller-manager
systemctl restart kube-scheduler
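Not every cluster runs all of these components as systemd services. On kubeadm-provisioned clusters, for instance, the API server, controller manager, and scheduler typically run as static pods, so only kubelet has a unit to restart. The sketch below restarts a unit only if it exists and then reports whether it came back up. The function name is hypothetical; `systemctl cat` and `systemctl is-active --quiet` are standard subcommands.

```shell
# Restart a systemd unit if it is defined on this host, then report its state.
# Usage: restart_if_present <unit>
restart_if_present() {
  unit=$1
  # "systemctl cat" exits non-zero when the unit is not defined.
  if ! systemctl cat "$unit" >/dev/null 2>&1; then
    echo "$unit: no such systemd unit, skipping"
    return 0
  fi
  systemctl restart "$unit"
  if systemctl is-active --quiet "$unit"; then
    echo "$unit: active"
  else
    echo "$unit: failed to start"
    return 1
  fi
}
```

A loop such as `for u in kubelet kube-proxy kube-apiserver kube-controller-manager kube-scheduler; do restart_if_present "$u"; done` covers the same services as the commands above. If a unit fails to start, `journalctl -u <unit>` is the next place to look.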
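Working through the steps above gives you a systematic way to troubleshoot a Kubernetes cluster on CentOS. Depending on the specific problem, you may need to combine several of these steps for a deeper analysis.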