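Achieving high availability for Kafka on Debian involves cluster deployment, replica configuration, failure recovery, and related measures. The steps are as follows:
Install the JDK
Kafka runs on the JVM; use OpenJDK 11 or later (if your Debian release no longer packages openjdk-11-jdk, a newer JDK such as default-jdk also works):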
sudo apt update && sudo apt install openjdk-11-jdk
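Before moving on, it can help to confirm that the installed JDK really meets the 11+ requirement. The following is a minimal sketch, not part of the original steps; the helper name java_major is made up for illustration, and it only parses the version banner that java -version prints.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extract the major Java version from a `java -version` banner.
# Handles both the legacy "1.8.0_392" scheme and the modern "11.0.22" scheme.
java_major() {
  local banner="$1" version
  # Pull out the quoted version string, e.g. 11.0.22
  version=$(printf '%s\n' "$banner" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  case "$version" in
    1.*) version=${version#1.} ;;   # legacy scheme: 1.8.0_392 -> 8.0_392
  esac
  # Keep only the leading component before the first '.', '_' or '-'
  printf '%s\n' "${version%%[._-]*}"
}
```

A typical call would be java_major "$(java -version 2>&1)", followed by a comparison such as [ "$(java_major "$(java -version 2>&1)")" -ge 11 ].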
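Deploy the ZooKeeper cluster
Install ZooKeeper on every node (on Debian the service scripts are provided by the separate zookeeperd package):
sudo apt install zookeeper zookeeperd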
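Edit /etc/zookeeper/conf/zoo.cfg on every node to define the ensemble (three nodes in this example). Each node also needs a myid file in its dataDir containing its own server number (1, 2, or 3):
server.1=zookeeper1:2888:3888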
server.2=zookeeper2:2888:3888
server.3=zookeeper3:2888:3888
sudo systemctl start zookeeper
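The server.N entries and the per-node myid value have to agree, which is easy to get wrong by hand. As a rough sketch under the assumption that IDs are assigned in the order the hosts are listed, the two hypothetical helpers below (zk_server_lines and zk_myid_for are illustrative names, not ZooKeeper tools) derive both from one host list:

```shell
#!/usr/bin/env bash
# Print the server.N lines of zoo.cfg for a list of ZooKeeper hosts.
# IDs are assigned in argument order, starting at 1 (an assumption of this sketch).
zk_server_lines() {
  local id=1 host
  for host in "$@"; do
    printf 'server.%d=%s:2888:3888\n' "$id" "$host"
    id=$((id + 1))
  done
}

# Print the number that belongs in the myid file of one host,
# i.e. its 1-based position in the same host list.
# $1 = this node's hostname, remaining arguments = full ensemble list.
zk_myid_for() {
  local target="$1" id=1 host
  shift
  for host in "$@"; do
    if [ "$host" = "$target" ]; then
      printf '%d\n' "$id"
      return 0
    fi
    id=$((id + 1))
  done
  echo "host $target is not in the ensemble list" >&2
  return 1
}
```

For example, zk_server_lines zookeeper1 zookeeper2 zookeeper3 reproduces the three lines shown above, and zk_myid_for zookeeper2 zookeeper1 zookeeper2 zookeeper3 yields the value to write into the myid file on zookeeper2.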
Configure the Kafka broker cluster
Download and unpack Kafka on every node:
wget https://downloads.apache.org/kafka/3.7.0/kafka_2.13-3.7.0.tgz
tar -xzf kafka_2.13-3.7.0.tgz
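Edit config/server.properties on each node; broker.id and the advertised listener address must differ per node:
# Unique ID for each node
broker.id=1
listeners=PLAINTEXT://0.0.0.0:9092
# Address that clients and other brokers use to reach this node
advertised.listeners=PLAINTEXT://broker1:9092
zookeeper.connect=zookeeper1:2181,zookeeper2:2181,zookeeper3:2181
# Default partition count; a multiple of the broker count is recommended
num.partitions=3
# Replication factor of at least 2; 3 is used here
default.replication.factor=3
# Minimum number of in-sync replicas required to accept a write with acks=all
min.insync.replicas=2
# Never allow an out-of-sync replica to become leader
unclean.leader.election.enable=false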
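Only broker.id and advertised.listeners change from node to node, so the per-node part can be generated instead of edited by hand. This is a small sketch under that assumption; render_broker_overrides and the file name common.properties are hypothetical and not part of Kafka.

```shell
#!/usr/bin/env bash
# Hypothetical helper: render the per-node part of server.properties for one broker.
# $1 = numeric broker id, $2 = hostname that clients use to reach this broker.
render_broker_overrides() {
  local id="$1" host="$2"
  cat <<EOF
broker.id=${id}
listeners=PLAINTEXT://0.0.0.0:9092
advertised.listeners=PLAINTEXT://${host}:9092
EOF
}
```

On the second node this could be combined with the shared settings, for instance { cat common.properties; render_broker_overrides 2 broker2; } > config/server.properties, where common.properties is assumed to hold the lines that are identical on every broker.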
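Start the broker on every node (this assumes a kafka systemd unit has been created; the tarball does not install one):
sudo systemctl start kafka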
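Create a highly available topic
Use kafka-topics.sh to create the topic, setting the replication factor and the minimum number of in-sync replicas:
bin/kafka-topics.sh --create --topic my-topic --bootstrap-server broker1:9092 --replication-factor 3 --partitions 3 --config min.insync.replicas=2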
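The relationship between the two numbers is worth making explicit: with acks=all, a partition keeps accepting writes as long as at least min.insync.replicas replicas are in sync, so the number of broker failures it can absorb is the replication factor minus min.insync.replicas. A tiny sketch of that arithmetic (tolerated_failures is an illustrative name, not a Kafka tool):

```shell
#!/usr/bin/env bash
# Number of broker failures a partition can absorb while producers using
# acks=all can still write: replication factor minus min.insync.replicas.
# $1 = replication factor, $2 = min.insync.replicas
tolerated_failures() {
  local rf="$1" min_isr="$2"
  if [ "$min_isr" -gt "$rf" ]; then
    echo "min.insync.replicas ($min_isr) exceeds replication factor ($rf)" >&2
    return 1
  fi
  echo $((rf - min_isr))
}
```

With the values used for my-topic, tolerated_failures 3 2 gives 1: one broker can be down without blocking writes. Setting both values to 3 would give 0, meaning any single failure stops acks=all producers.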
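Verify high availability
Stop the broker on one node to simulate a failure:
sudo systemctl stop kafka
Then run kafka-topics.sh --describe to check the partition state and confirm that the ISR (in-sync replica set) still contains enough replicas.
Monitoring and operations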
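The ISR check from the verification step can also be scripted for routine monitoring. The sketch below assumes the per-partition lines of kafka-topics.sh --describe contain "Topic:", "Partition:" and "Isr:" labels each followed by a value, which matches the usual layout but may vary between Kafka versions; partitions_below_min_isr is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Read `kafka-topics.sh --describe` output on stdin and print every partition
# whose ISR has fewer members than the required minimum ($1).
# Exit status is 1 if any such partition is found, 0 otherwise.
partitions_below_min_isr() {
  awk -v min="$1" '
    BEGIN { bad = 0 }
    # Only per-partition lines carry both labels; the topic summary line does not.
    /Partition:/ && /Isr:/ {
      topic = part = isr = ""
      for (i = 1; i < NF; i++) {
        if ($i == "Topic:")     topic = $(i + 1)
        if ($i == "Partition:") part  = $(i + 1)
        if ($i == "Isr:")       isr   = $(i + 1)
      }
      # The ISR is a comma-separated list of broker ids
      n = split(isr, members, ",")
      if (n < min) { print topic "-" part " isr=" isr; bad = 1 }
    }
    END { exit bad }
  '
}
```

A possible invocation is bin/kafka-topics.sh --describe --topic my-topic --bootstrap-server broker1:9092 | partitions_below_min_isr 2. For a quick manual check, kafka-topics.sh --describe also accepts the --under-replicated-partitions and --under-min-isr-partitions filters.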
Use kafka-consumer-groups.sh to monitor consumer lag.
Key mechanisms:
Producer-side retries combined with acks=all ensure reliable message delivery.
Notes: