A Guide to Managing Kubernetes on Linux
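Before installing, prepare every node: stop and disable the firewall (systemctl stop firewalld && systemctl disable firewalld), disable SELinux (sed -i 's/enforcing/disabled/' /etc/selinux/config && setenforce 0), turn off swap (swapoff -a, then comment out the swap line in /etc/fstab), and map host names to IP addresses by editing /etc/hosts. The firewalld and SELinux commands apply to RHEL-family systems; the package commands that follow assume Ubuntu.
Install Docker:
sudo apt update && sudo apt install -y apt-transport-https ca-certificates curl software-properties-common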
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt update && sudo apt install -y docker-ce
sudo systemctl enable --now docker
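The swap step in the node preparation is easy to get wrong by hand, so it may help to script it. The following is a minimal sketch, not part of the original procedure: the function names disable_swap_entries and active_swap_entries are hypothetical helpers, and the sketch assumes a standard whitespace-separated fstab layout in which the third field is the file system type.

```shell
#!/usr/bin/env bash
# Comment out every active swap entry in an fstab-style file.
# A backup copy is written next to the file with a .bak suffix.
disable_swap_entries() {
  local fstab="$1"
  # Match uncommented lines whose third field (file system type) is "swap".
  sed -i.bak -E 's/^[[:space:]]*([^#[:space:]][^[:space:]]*[[:space:]]+[^[:space:]]+[[:space:]]+swap[[:space:]].*)$/#\1/' "$fstab"
}

# Print how many swap entries are still active (uncommented) in the file.
active_swap_entries() {
  awk '$1 !~ /^#/ && $3 == "swap" { n++ } END { print n + 0 }' "$1"
}
```

On a real node this would be run as root, for example `swapoff -a && disable_swap_entries /etc/fstab`, followed by `active_swap_entries /etc/fstab` to confirm that the count is 0.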
Install kubeadm (the officially recommended bootstrap tool), kubelet (the node agent), and kubectl (the command-line client):
sudo apt update && sudo apt install -y apt-transport-https curl
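sudo mkdir -p -m 755 /etc/apt/keyrings
# The legacy apt.kubernetes.io repository has been shut down; use pkgs.k8s.io instead (v1.28 shown here, substitute the minor version you want)
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list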
sudo apt update && sudo apt install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl # hold the packages so routine apt upgrades do not change the cluster version
On the Master node, run the initialization command (this example sets the Pod network CIDR explicitly):
sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address=<Master_IP>
When initialization finishes, configure kubectl as the output instructs:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
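Initializing the Master prints a kubeadm join command that contains a token and the CA certificate hash. Run that command on each Worker node to add it to the cluster: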
sudo kubeadm join <Master_IP>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>
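Bootstrap tokens expire after 24 hours by default, so the join command printed at init time may no longer work when a Worker is added later. A fresh one can be printed on the Master with `sudo kubeadm token create --print-join-command`. The sketch below adds a small sanity check for values pasted into the join command; the function names is_bootstrap_token and is_ca_cert_hash are hypothetical helpers, not kubeadm features.

```shell
#!/usr/bin/env bash
# Return 0 if the argument has the shape of a kubeadm bootstrap token:
# 6 characters, a dot, then 16 characters, all lowercase letters or digits.
is_bootstrap_token() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}

# Return 0 if the argument has the shape of a discovery hash:
# the literal prefix "sha256:" followed by 64 lowercase hex digits.
is_ca_cert_hash() {
  printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}
```

These checks only validate the format. Whether a token is still valid is decided by the cluster, and `kubeadm token list` on the Master shows the remaining lifetime.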
Kubernetes relies on a network plugin for Pod-to-Pod communication. Calico is a common choice and is suitable for production:
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml
Verify that the network plugin is healthy:
kubectl get pods -n kube-system # check that the network plugin Pods are in the Running state
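Scanning the Pod list by eye gets tedious on larger clusters. As a rough sketch, the output can be filtered so that only problem Pods remain; unhealthy_pods is a hypothetical helper name, and the filter assumes the default column order of `kubectl get pods --no-headers` for a single namespace (NAME, READY, STATUS, ...).

```shell
#!/usr/bin/env bash
# Read `kubectl get pods --no-headers` output on stdin and print
# NAME, READY and STATUS for every Pod that is not healthy:
# either STATUS is not Running, or not all containers are ready.
# Pods in the Completed state (finished Jobs) are ignored.
unhealthy_pods() {
  awk '{
    if ($3 == "Completed") next
    split($2, ready, "/")
    if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
  }'
}
```

Typical usage would be `kubectl get pods -n kube-system --no-headers | unhealthy_pods`; empty output means every Pod in the namespace is Running with all containers ready.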
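Check node status (every node should report Ready):
kubectl get nodes
Check the health of the control-plane components (componentstatuses has been deprecated since v1.19 and may report incomplete results):
kubectl get componentstatuses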
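Because componentstatuses is deprecated, the API server's readiness endpoint is a reasonable alternative: `kubectl get --raw='/readyz?verbose'` prints one line per check, prefixed with `[+]` for a passing check and `[-]` for a failing one. The sketch below extracts the failing check names; failed_checks is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read the verbose output of the API server /readyz (or /livez) endpoint
# on stdin and print the name of every check marked as failed ("[-]").
failed_checks() {
  sed -n -E 's/^\[-\]([^ ]+).*/\1/p'
}
```

Typical usage would be `kubectl get --raw='/readyz?verbose' | failed_checks`; no output means no check reported a failure.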
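Deploy Nginx with a Deployment controller, for example:
kubectl create deployment nginx --image=nginx:latest
Expose it as a NodePort Service (reachable from outside the cluster at a node IP plus the assigned port):
kubectl expose deployment nginx --port=80 --type=NodePort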
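kubectl get pods # check Pod status
kubectl get svc # check Service status
kubectl describe pod <pod_name> # show Pod details (useful for troubleshooting)
kubectl scale deployment nginx --replicas=3 # scale the Deployment to 3 replicas
kubectl delete deployment nginx # delete the Deployment
kubectl logs <pod_name> # view the Pod's logs
kubectl exec -it <pod_name> -- /bin/bash # open a shell inside the Pod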
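The imperative commands above are convenient for experiments, but a declarative manifest is easier to review and re-apply. The following sketch prints a Deployment and a NodePort Service roughly equivalent to the create, expose, and scale commands; render_nginx_manifest is a hypothetical helper, and the default image tag nginx:1.25 is an assumption chosen to avoid the moving latest tag.

```shell
#!/usr/bin/env bash
# Print a Deployment plus NodePort Service manifest for Nginx.
# $1 = replica count (default 3), $2 = image (default nginx:1.25, an assumed tag).
render_nginx_manifest() {
  local replicas="${1:-3}"
  local image="${2:-nginx:1.25}"
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: ${replicas}
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: ${image}
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
EOF
}
```

Typical usage would be `render_nginx_manifest 3 | kubectl apply -f -`. Re-running it with a different replica count has the same effect as kubectl scale.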
Use Prometheus together with Grafana to monitor cluster performance:
kubectl apply -f https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/master/bundle.yaml
kubectl apply -f https://raw.githubusercontent.com/grafana/grafana/master/deploy/kubernetes/deployment.yaml
kubectl apply -f https://raw.githubusercontent.com/grafana/grafana/master/deploy/kubernetes/datasource.yaml
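Log in to Grafana (default credentials admin/admin) and import a Kubernetes monitoring dashboard (for example, ID 3119).
Use the EFK stack (Elasticsearch + Fluentd + Kibana) to collect and analyze logs: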
kubectl apply -f https://raw.githubusercontent.com/elastic/elasticsearch-operator/master/deploy/deployment.yaml
kubectl apply -f https://raw.githubusercontent.com/fluent/fluentd-kubernetes-daemonset/master/fluentd-daemonset-elasticsearch.yaml
kubectl apply -f https://raw.githubusercontent.com/elastic/kibana/master/deploy/kubernetes/elasticsearch-kibana.yaml
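In Kibana, create an index pattern (for example, logstash-*) and browse the logs.
Use Role and RoleBinding objects to restrict which resources a user can access. For example, create a pod-reader Role that allows reading Pods in the default namespace, and bind it to a user: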
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User
  name: alice # user name (the user must already be able to authenticate to the cluster)
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
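Once the Role and RoleBinding are applied, the effective permissions can be checked with `kubectl auth can-i`, which prints yes or no. The sketch below wraps it for the user alice from the example; can_i is a hypothetical helper, the KUBECTL variable is an assumption added so the command can be swapped out, and the --as flag only works if the caller is allowed to impersonate users (cluster administrators are).

```shell
#!/usr/bin/env bash
# Command used to reach the cluster; override KUBECTL to use a different binary.
KUBECTL="${KUBECTL:-kubectl}"

# Print "yes" or "no": may <user> perform <verb> on <resource> in <namespace>?
# $1 = user, $2 = verb, $3 = resource, $4 = namespace (default: default)
can_i() {
  local user="$1" verb="$2" resource="$3" ns="${4:-default}"
  # can-i exits non-zero for "no"; ignore the status and rely on the printed answer.
  "$KUBECTL" auth can-i "$verb" "$resource" --as "$user" -n "$ns" 2>/dev/null || true
}
```

With the pod-reader binding in place, `can_i alice list pods` would be expected to print yes and `can_i alice delete pods` to print no.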
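Use a NetworkPolicy to restrict communication between Pods. For example, the following default-deny policy blocks all inbound traffic to every Pod in its namespace; the optional Egress entry blocks outbound traffic as well: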
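apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
spec:
  podSelector: {} # select every Pod in the namespace
  policyTypes:
  - Ingress # deny all inbound traffic
  - Egress # optional: also deny all outbound traffic (this includes DNS lookups)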
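A default-deny policy that includes Egress also blocks DNS queries, so Pods can no longer resolve Service names. A common companion is a policy that re-allows DNS traffic to the cluster DNS Pods. The sketch below prints such a policy; render_allow_dns and the policy name allow-dns-egress are hypothetical, and it assumes the cluster DNS runs in kube-system with the label k8s-app: kube-dns, as it does in kubeadm clusters.

```shell
#!/usr/bin/env bash
# Print a NetworkPolicy that allows every Pod in a namespace to send
# DNS queries (UDP and TCP port 53) to the cluster DNS Pods in kube-system.
# $1 = namespace the policy applies to (default: default)
render_allow_dns() {
  local ns="${1:-default}"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: ${ns}
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
      podSelector:
        matchLabels:
          k8s-app: kube-dns
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
EOF
}
```

Typical usage would be `render_allow_dns default | kubectl apply -f -`. NetworkPolicies are additive, so this allowance combines with the default-deny policy without replacing it.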
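Upgrade the cluster with kubeadm (v1.28.0 is used as the example). The kubeadm package itself must already be at the target version:
sudo kubeadm upgrade plan v1.28.0 # check whether the upgrade is possible and what it will change
sudo kubeadm upgrade apply v1.28.0 # perform the upgrade
After the upgrade, install the matching kubelet package and restart the kubelet service:
sudo systemctl restart kubelet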
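kubeadm does not support skipping minor versions, so a cluster on v1.26 has to pass through v1.27 before reaching v1.28. A small guard like the one sketched below can catch that before kubeadm upgrade plan is run; minor_of and can_upgrade are hypothetical helpers, and the check assumes both versions share the same major number.

```shell
#!/usr/bin/env bash
# Print the minor number of a version string such as v1.28.0 or 1.28.0.
minor_of() {
  printf '%s\n' "${1#v}" | cut -d. -f2
}

# Return 0 if going from version $1 to version $2 stays on the same minor
# or moves up by exactly one minor; return 1 for skips and downgrades.
can_upgrade() {
  local from to
  from=$(minor_of "$1")
  to=$(minor_of "$2")
  [ $((to - from)) -ge 0 ] && [ $((to - from)) -le 1 ]
}
```

Typical usage would be `can_upgrade "$(kubeadm version -o short)" v1.28.0 || echo "upgrade one minor version at a time"`. On each node it is also usual to run `kubectl drain <node> --ignore-daemonsets` before upgrading and `kubectl uncordon <node>` afterwards.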
Back up etcd data with etcdctl:
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot save /opt/etcd-backup.db
Restore the snapshot into a new data directory (etcd must then be reconfigured to use that directory):
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot restore /opt/etcd-backup.db --data-dir=/var/lib/etcd-new
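A single fixed file name means each backup overwrites the previous one. For scheduled backups it may be preferable to write timestamped snapshots and keep only the most recent few. The sketch below shows one way to do that; snapshot_path and prune_snapshots are hypothetical helpers, and the etcd-<timestamp>.db naming scheme is an assumption.

```shell
#!/usr/bin/env bash
# Print a timestamped snapshot path inside backup directory $1,
# for example /opt/etcd-backups/etcd-20240131-235959.db.
snapshot_path() {
  printf '%s/etcd-%s.db\n' "$1" "$(date +%Y%m%d-%H%M%S)"
}

# Keep only the newest $2 snapshots in directory $1 and delete the rest.
# The timestamp in the file name makes lexical order match age order.
prune_snapshots() {
  local dir="$1" keep="$2"
  ls -1 "$dir"/etcd-*.db 2>/dev/null | sort -r | tail -n +"$((keep + 1))" |
    while IFS= read -r f; do rm -f -- "$f"; done
}
```

In a cron job, the value printed by `snapshot_path /opt/etcd-backups` would replace /opt/etcd-backup.db in the snapshot save command, followed by `prune_snapshots /opt/etcd-backups 7` to retain a week of daily backups.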
Deploy a cluster automatically with Kubespray, which drives the installation through Ansible:
git clone https://github.com/kubernetes-sigs/kubespray.git
cd kubespray
pip3 install -r requirements.txt
ansible-playbook -i inventory/mycluster/hosts.yaml cluster.yml
Manage applications with Helm, for example by installing the ingress-nginx controller:
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo update
helm install my-nginx ingress-nginx/ingress-nginx