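Configuring Kafka efficiently on CentOS comes down to the following key steps:
In the ZooKeeper-based deployment used here, Kafka relies on ZooKeeper for cluster coordination and metadata storage. First, make sure the JDK and ZooKeeper are installed.
# Install the JDK (JDK 11 in this example)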
sudo yum install -y java-11-openjdk-devel
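# Download and extract ZooKeeper (if this link no longer resolves, older releases are usually kept on archive.apache.org)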
wget https://downloads.apache.org/zookeeper/zookeeper-3.8.0/apache-zookeeper-3.8.0-bin.tar.gz
tar -xzf apache-zookeeper-3.8.0-bin.tar.gz
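# The archive unpacks to apache-zookeeper-3.8.0-bin
sudo mv apache-zookeeper-3.8.0-bin /usr/local/zookeeper
Create /usr/local/zookeeper/conf/zoo.cfg (the distribution ships only zoo_sample.cfg as a template) with the following settings: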
dataDir=/usr/local/zookeeper/data
dataLogDir=/usr/local/zookeeper/logs
clientPort=2181
maxClientCnxns=0
# tickTime (ms) is the base time unit; initLimit and syncLimit are counted in ticks
tickTime=2000
initLimit=5
syncLimit=2
server.1=192.168.1.1:2801:3801
server.2=192.168.1.2:2802:3802
server.3=192.168.1.3:2803:3803
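In a multi-node ensemble, each ZooKeeper node also needs a myid file in dataDir that contains its own ID, that is, the N from its server.N line. The following is a minimal sketch of how that ID could be looked up from zoo.cfg. The function name zk_myid_for_host is a hypothetical helper and not part of ZooKeeper.

```shell
# Hypothetical helper (not part of ZooKeeper): print the N of the server.N
# entry in a zoo.cfg file whose host matches the given address.
# Usage: zk_myid_for_host <path-to-zoo.cfg> <host-or-ip>
zk_myid_for_host() {
  awk -F'[=:]' -v host="$2" '
    $1 ~ /^server\.[0-9]+$/ && $2 == host {
      sub(/^server\./, "", $1)   # keep only the numeric ID
      print $1
    }
  ' "$1"
}

# Example for the node at 192.168.1.1 (run with sufficient permissions):
#   mkdir -p /usr/local/zookeeper/data
#   zk_myid_for_host /usr/local/zookeeper/conf/zoo.cfg 192.168.1.1 > /usr/local/zookeeper/data/myid
```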
Start ZooKeeper on every node in the ensemble:
cd /usr/local/zookeeper/bin
./zkServer.sh start
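Once every node has been started, zkServer.sh status reports whether the node has joined the quorum. As a rough sketch, the node's role could be extracted from that output as shown here. zk_mode is a hypothetical helper name, and the "Mode:" line format is assumed from typical zkServer.sh output.

```shell
# Hypothetical helper: read `zkServer.sh status` output on stdin and print
# the node's role (for example leader, follower, or standalone).
# Returns a non-zero exit status if no "Mode:" line is present.
zk_mode() {
  awk -F': *' '/^Mode:/ { print $2; found = 1 } END { exit !found }'
}

# Example (run from /usr/local/zookeeper/bin):
#   ./zkServer.sh status 2>/dev/null | zk_mode
```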
Download and extract Kafka:
wget https://downloads.apache.org/kafka/2.8.1/kafka_2.13-2.8.1.tgz
tar -xzf kafka_2.13-2.8.1.tgz
sudo mv kafka_2.13-2.8.1 /usr/local/kafka
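Edit Kafka's config/server.properties:
# Broker ID (must be unique for each broker in the cluster)
broker.id=0
# ZooKeeper connection string
zookeeper.connect=192.168.1.1:2181,192.168.1.2:2181,192.168.1.3:2181
# Address and port the broker listens on (use each broker's own IP)
listeners=PLAINTEXT://192.168.1.1:9092
# Disable automatic topic creation
auto.create.topics.enable=false
# Network settings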
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=102400
socket.receive.buffer.bytes=102400
socket.request.max.bytes=104857600
# Log settings (log.dirs is where partition data is stored, not application logs)
log.dirs=/usr/local/kafka/logs
num.partitions=3
num.recovery.threads.per.data.dir=1
log.retention.hours=168
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
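In a three-broker cluster, broker.id and listeners must differ on each node, while the rest of server.properties can stay identical. The following is a minimal sketch of generating those per-broker lines. broker_overrides is a hypothetical helper, and the ID-to-IP mapping in the example is an assumption.

```shell
# Hypothetical helper: print the server.properties lines that differ per broker.
# Usage: broker_overrides <broker-id> <broker-ip> [port]
broker_overrides() {
  printf 'broker.id=%s\n' "$1"
  printf 'listeners=PLAINTEXT://%s:%s\n' "$2" "${3:-9092}"
}

# Example: settings for a second broker (assumed to be ID 1 on 192.168.1.2)
#   broker_overrides 1 192.168.1.2
```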
Start Kafka:
cd /usr/local/kafka/bin
./kafka-server-start.sh -daemon ../config/server.properties
Create a topic (a replication factor of 3 requires at least three running brokers):
./kafka-topics.sh --create --bootstrap-server 192.168.1.1:9092 --replication-factor 3 --partitions 3 --topic test
List the topics:
./kafka-topics.sh --list --bootstrap-server 192.168.1.1:9092,192.168.1.2:9092,192.168.1.3:9092
Produce and consume messages:
# Producer
./kafka-console-producer.sh --bootstrap-server 192.168.1.1:9092,192.168.1.2:9092,192.168.1.3:9092 --topic test
# Consumer
./kafka-console-consumer.sh --bootstrap-server 192.168.1.1:9092,192.168.1.2:9092,192.168.1.3:9092 --topic test --from-beginning
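The same broker list recurs in several of the commands above. As a convenience, a small helper could build it once and reuse it. bootstrap_servers and the BROKERS variable are hypothetical names introduced for illustration.

```shell
# Hypothetical helper: join hosts into a host:port,host:port list.
# Note: plain POSIX sh has no local variables, so port, list, and host
# are set in the calling shell.
# Usage: bootstrap_servers <port> <host>...
bootstrap_servers() {
  port=$1
  shift
  list=""
  for host in "$@"; do
    # Prepend a comma only when the list already has an entry
    list="${list:+$list,}${host}:${port}"
  done
  printf '%s\n' "$list"
}

# Example:
#   BROKERS=$(bootstrap_servers 9092 192.168.1.1 192.168.1.2 192.168.1.3)
#   ./kafka-console-consumer.sh --bootstrap-server "$BROKERS" --topic test --from-beginning
```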
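Tune the following client-side (mostly consumer) settings to match your workload:
fetch.min.bytes: the minimum amount of data the broker returns for a fetch request. A larger value (for example 1 MB) raises throughput at the cost of latency.
fetch.max.wait.ms: how long the broker waits for fetch.min.bytes to accumulate before responding. Lower it to reduce latency.
max.partition.fetch.bytes: the maximum amount of data returned per partition per fetch. Size it according to your data volume.
max.poll.records: caps the number of records returned by a single poll, which helps avoid out-of-memory errors.
receive.buffer.bytes and send.buffer.bytes: the network socket buffer sizes. Set them to suit your network.
Set up monitoring and log management tools such as Prometheus and Grafana to track the performance and health of the Kafka cluster in real time.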
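The consumer settings listed above could be collected in a properties file and passed to a client. The following is a minimal sketch. The values are illustrative starting points rather than recommendations, write_consumer_props is a hypothetical helper, and the /tmp path in the example is an assumption.

```shell
# Hypothetical helper: write a consumer properties file with example values.
# Usage: write_consumer_props <output-file>
write_consumer_props() {
  cat > "$1" <<'EOF'
# Wait for up to 1 MB per fetch (higher throughput, higher latency)
fetch.min.bytes=1048576
# Cap the wait for that data at 500 ms
fetch.max.wait.ms=500
# Upper bound on data returned per partition per fetch
max.partition.fetch.bytes=1048576
# Upper bound on records returned by a single poll
max.poll.records=500
# Socket buffer sizes in bytes
receive.buffer.bytes=65536
send.buffer.bytes=131072
EOF
}

# Example:
#   write_consumer_props /tmp/consumer.properties
#   ./kafka-console-consumer.sh --bootstrap-server 192.168.1.1:9092 --topic test --consumer.config /tmp/consumer.properties
```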
With these steps in place, Kafka runs on CentOS with a solid baseline configuration that you can tune further as your workload requires.