Configuring the Hadoop Distributed File System (HDFS) on CentOS involves several steps. The following is a basic configuration guide based on CentOS 7 or later, using Hadoop 3.x:
Install Java and set the JAVA_HOME environment variable:
sudo yum install java-1.8.0-openjdk-devel
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk
export PATH=$PATH:$JAVA_HOME/bin
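Exports typed at a prompt are lost on logout. A minimal sketch for making them persistent, appending each line to a profile file only if it is not already there (the target file and helper name are assumptions, not from the original guide):

```shell
# Append a line to a profile file only if it is not already present,
# so repeated runs do not duplicate the exports.
append_once() {
  line="$1"; file="$2"
  grep -qxF "$line" "$file" 2>/dev/null || echo "$line" >> "$file"
}

# Assumed target; a file under /etc/profile.d/ would also work.
profile="${PROFILE_FILE:-$HOME/.bashrc}"
append_once 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk' "$profile"
append_once 'export PATH=$PATH:$JAVA_HOME/bin' "$profile"
```

The `grep -qxF` check matches the whole line literally, which keeps the function idempotent across repeated provisioning runs.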
Install and enable SSH, then set up passwordless login from the NameNode to each worker node:
sudo yum install openssh-server openssh-clients
sudo systemctl start sshd
sudo systemctl enable sshd
ssh-keygen -t rsa
ssh-copy-id root@node2
ssh-copy-id root@node3
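With more than a couple of workers, copying the key by hand gets tedious. A sketch that generates the same commands from a node list (the node names and root user are the ones assumed above; drop the echo to actually run them):

```shell
# Emit one ssh-copy-id command per worker node.
# NODES is an assumption; adjust it to match your cluster.
NODES="node2 node3"
for node in $NODES; do
  echo "ssh-copy-id root@$node"   # remove the echo to execute for real
done
```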
Add every cluster hostname to /etc/hosts and give each node a static IP:
vi /etc/hosts
vi /etc/sysconfig/network-scripts/ifcfg-eth0
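The /etc/hosts entries can be generated rather than typed. A sketch assuming three nodes on a private /24 network (the IP addresses are illustrative placeholders, not from the original guide):

```shell
# Build the host entries for the cluster; review the output first,
# then append it to /etc/hosts. The addresses are placeholders.
hosts_block() {
  printf '%s\n' \
    "192.168.1.101 namenode" \
    "192.168.1.102 node2" \
    "192.168.1.103 node3"
}
hosts_block                             # inspect before applying
# hosts_block | sudo tee -a /etc/hosts  # then append for real
```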
Install and enable NTP so that clocks stay synchronized across the cluster:
sudo yum install ntp
sudo systemctl start ntpd
sudo systemctl enable ntpd
Download and install Hadoop:
wget https://downloads.apache.org/hadoop/core/hadoop-3.3.4/hadoop-3.3.4.tar.gz
sudo tar -xzvf hadoop-3.3.4.tar.gz -C /usr/local/
sudo mv /usr/local/hadoop-3.3.4 /usr/local/hadoop
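Pinning the version in one variable makes a later upgrade a one-line change. A sketch of the same download and install, parameterized (mirror URL as in the text):

```shell
# Derive the download URL and install paths from a single version string.
HADOOP_VERSION=3.3.4
TARBALL="hadoop-${HADOOP_VERSION}.tar.gz"
URL="https://downloads.apache.org/hadoop/core/hadoop-${HADOOP_VERSION}/${TARBALL}"
echo "$URL"   # verify the URL before downloading
# wget "$URL" && sudo tar -xzf "$TARBALL" -C /usr/local/ \
#   && sudo mv "/usr/local/hadoop-${HADOOP_VERSION}" /usr/local/hadoop
```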
Edit the /etc/profile file to add the Hadoop environment variables, then reload it:
vi /etc/profile
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
source /etc/profile
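A quick sanity check that the variables actually took effect in the current shell can save debugging later. A small sketch (the function name is an illustrative assumption):

```shell
# Fail fast if the Hadoop environment is not set up correctly:
# HADOOP_HOME must be set and its bin directory must be on PATH.
check_hadoop_env() {
  [ -n "$HADOOP_HOME" ] || { echo "HADOOP_HOME is not set"; return 1; }
  case ":$PATH:" in
    *":$HADOOP_HOME/bin:"*) echo "ok" ;;
    *) echo "HADOOP_HOME/bin missing from PATH"; return 1 ;;
  esac
}
```

Running `check_hadoop_env` after `source /etc/profile` should print `ok`.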
Configure core-site.xml (in $HADOOP_HOME/etc/hadoop/):
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:9000</value>
  </property>
</configuration>
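Rather than editing by hand, the file can be written with a heredoc, which is easier to script across nodes. A sketch (CONF_DIR defaults to a temporary directory here so it can be tried safely; in production it would be $HADOOP_HOME/etc/hadoop):

```shell
# Write core-site.xml from a template. CONF_DIR falls back to a temp
# directory so this sketch runs without a real Hadoop install.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:9000</value>
  </property>
</configuration>
EOF
```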
Configure hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/usr/local/hadoop/data/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/usr/local/hadoop/data/datanode</value>
  </property>
</configuration>
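The name and data directories referenced in hdfs-site.xml must exist (with sane permissions) before the NameNode is formatted. A sketch (DATA_ROOT defaults to a temporary directory so it can be tried safely; in production it would be /usr/local/hadoop/data):

```shell
# Create the NameNode and DataNode storage directories with
# owner-only permissions, mirroring the paths in hdfs-site.xml.
DATA_ROOT="${DATA_ROOT:-$(mktemp -d)}"   # /usr/local/hadoop/data in production
mkdir -p "$DATA_ROOT/namenode" "$DATA_ROOT/datanode"
chmod 700 "$DATA_ROOT/namenode" "$DATA_ROOT/datanode"
```

In production the directories should also be owned by the user that runs the HDFS daemons.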
Configure mapred-site.xml (if MapReduce is needed):
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Configure yarn-site.xml (if YARN is needed):
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>namenode</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
Format the NameNode and start HDFS:
hdfs namenode -format
sbin/start-dfs.sh
jps
Use the jps command to check whether the HDFS processes started successfully, then open the NameNode web UI (http://namenode:9870 in Hadoop 3.x; port 50070 was the Hadoop 2.x default) to confirm the configuration.
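Checking jps output by eye scales poorly; a small function can assert that every expected daemon is present. A sketch (the sample output in the usage note is illustrative, not captured from a real cluster):

```shell
# Return 0 only if every expected HDFS daemon appears in a jps listing
# passed in as the first argument.
check_daemons() {
  jps_out="$1"
  for daemon in NameNode DataNode SecondaryNameNode; do
    echo "$jps_out" | grep -qw "$daemon" || { echo "missing: $daemon"; return 1; }
  done
  echo "all HDFS daemons running"
}
# Typical use on a live node: check_daemons "$(jps)"
```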
If firewalld is enabled, open the HDFS ports. The commands below use the Hadoop 3.x defaults (9000 NameNode RPC, 9870 NameNode web UI, 9864/9866/9867 DataNode, 9868 SecondaryNameNode); the 50010/50020/50070/50075/50090 set belonged to Hadoop 2.x:
sudo firewall-cmd --permanent --zone=public --add-port=9000/tcp
sudo firewall-cmd --permanent --zone=public --add-port=9864/tcp
sudo firewall-cmd --permanent --zone=public --add-port=9866/tcp
sudo firewall-cmd --permanent --zone=public --add-port=9867/tcp
sudo firewall-cmd --permanent --zone=public --add-port=9868/tcp
sudo firewall-cmd --permanent --zone=public --add-port=9870/tcp
sudo firewall-cmd --reload
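Opening the ports one by one invites typos; the same firewall-cmd invocations can be generated from a list. A sketch using the Hadoop 3.x default ports (note the 500xx ports were Hadoop 2.x defaults; drop the echo to apply for real):

```shell
# Emit one firewall-cmd invocation per HDFS port.
# 9000=NameNode RPC, 9870=NameNode UI, 9864/9866/9867=DataNode,
# 9868=SecondaryNameNode (Hadoop 3.x defaults).
HDFS_PORTS="9000 9864 9866 9867 9868 9870"
for port in $HDFS_PORTS; do
  echo "sudo firewall-cmd --permanent --zone=public --add-port=${port}/tcp"
done
echo "sudo firewall-cmd --reload"
```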
The steps above provide a basic guide; the exact configuration may differ depending on the Hadoop version and your environment. Refer to the official Hadoop documentation for detailed configuration.