Configuring Hadoop on a Linux system involves several steps; the following is a detailed guide.
Install the Java environment. On Debian/Ubuntu:
sudo apt-get update
sudo apt-get install openjdk-8-jdk
On RHEL/CentOS:
sudo yum install java-1.8.0-openjdk-devel
Verify the installation:
java -version
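Hadoop also needs to know where the JDK lives. One way to discover the path (a sketch; the exact directory varies by distribution, and `detected_java_home` is a name chosen here for illustration) is to resolve the `javac` symlink chain:

```shell
#!/bin/sh
# Resolve the real JDK directory from the javac on PATH.
# On a machine without a JDK this prints a notice instead of failing.
if command -v javac >/dev/null 2>&1; then
    # readlink -f follows the /etc/alternatives symlink chain; stripping the
    # trailing /bin/javac leaves the JDK root, e.g. /usr/lib/jvm/java-8-openjdk-amd64
    detected_java_home=$(dirname "$(dirname "$(readlink -f "$(command -v javac)")")")
    echo "JAVA_HOME candidate: $detected_java_home"
else
    detected_java_home=""
    echo "javac not found; install a JDK first"
fi
```

The printed path is what goes into the JAVA_HOME export in the environment-variable step below.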
Download and extract Hadoop:
Create the installation directory, download the release tarball, and extract it (the --strip-components=1 flag drops the top-level hadoop-3.3.5 directory so the files land directly under /opt/hadoop, matching the HADOOP_HOME set below):
sudo mkdir -p /opt/hadoop
wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-3.3.5/hadoop-3.3.5.tar.gz
sudo tar -zxvf hadoop-3.3.5.tar.gz -C /opt/hadoop --strip-components=1
Configure the environment variables. Add the following lines to your ~/.bashrc file:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Then apply the changes:
source ~/.bashrc
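Because ~/.bashrc is re-sourced often, it is worth making the append idempotent so repeated setup runs never duplicate the exports. A minimal sketch, using marker comments and a scratch copy instead of the real ~/.bashrc (the marker names and the `add_hadoop_env` helper are illustrative, not part of Hadoop):

```shell
#!/bin/sh
# Append the Hadoop exports only if a marker line is absent,
# so re-running the setup never duplicates them.
rc=$(mktemp)   # stands in for ~/.bashrc in this sketch

add_hadoop_env() {
    if ! grep -q '# hadoop-env-begin' "$rc"; then
        cat >> "$rc" <<'EOF'
# hadoop-env-begin
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# hadoop-env-end
EOF
    fi
}

add_hadoop_env
add_hadoop_env   # second call is a no-op thanks to the marker check
grep -c 'HADOOP_HOME=/opt/hadoop' "$rc"   # prints 1
```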
Edit $HADOOP_HOME/etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Edit $HADOOP_HOME/etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/path/to/hadoop/data/dfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/path/to/hadoop/data/dfs/datanode</value>
</property>
</configuration>
Edit $HADOOP_HOME/etc/hadoop/mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Edit $HADOOP_HOME/etc/hadoop/yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
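The four site files above can also be written in one scripted pass, which is handy when provisioning several machines. A sketch, assuming the fragments shown above; in a real install CONF_DIR would be $HADOOP_HOME/etc/hadoop, and the `write_site` helper is a name invented here:

```shell
#!/bin/sh
# Generate the four Hadoop site files from the fragments above.
# A temp dir stands in for $HADOOP_HOME/etc/hadoop so the sketch is self-contained.
CONF_DIR=$(mktemp -d)

write_site() {  # write_site <file> <property-name> <property-value>
    cat > "$CONF_DIR/$1" <<EOF
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

write_site core-site.xml   fs.defaultFS                  hdfs://localhost:9000
write_site mapred-site.xml mapreduce.framework.name      yarn
write_site yarn-site.xml   yarn.nodemanager.aux-services mapreduce_shuffle

# hdfs-site.xml carries three properties, so write it directly:
cat > "$CONF_DIR/hdfs-site.xml" <<EOF
<configuration>
  <property><name>dfs.replication</name><value>1</value></property>
  <property><name>dfs.namenode.name.dir</name><value>/path/to/hadoop/data/dfs/namenode</value></property>
  <property><name>dfs.datanode.data.dir</name><value>/path/to/hadoop/data/dfs/datanode</value></property>
</configuration>
EOF

ls "$CONF_DIR"
```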
Format the HDFS filesystem (first run only; reformatting an existing cluster destroys its metadata):
hdfs namenode -format
Start HDFS and YARN:
start-dfs.sh
start-yarn.sh
Verify that Hadoop started successfully (jps should list the NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager processes):
jps
hdfs dfsadmin -report
yarn node -list
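Beyond checking the daemons, a small write/read round trip confirms HDFS actually accepts data. A sketch, guarded with `command -v` so it is a harmless no-op on a machine where Hadoop is not installed (the /tmp/smoketest path is an arbitrary choice for this example):

```shell
#!/bin/sh
# Smoke-test HDFS with a small write/read round trip.
status=skipped
if command -v hdfs >/dev/null 2>&1; then
    echo "hello hadoop" > /tmp/smoke.txt
    hdfs dfs -mkdir -p /tmp/smoketest
    hdfs dfs -put -f /tmp/smoke.txt /tmp/smoketest/   # -f overwrites on re-runs
    hdfs dfs -cat /tmp/smoketest/smoke.txt && status=ok
fi
echo "smoke test: $status"
```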
Set up passwordless SSH to localhost before running the start scripts, since start-dfs.sh and start-yarn.sh connect to each node over SSH. Generate a key pair:
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Append the public key to the authorized_keys file:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Test the passwordless login:
ssh localhost
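The three SSH steps above can be combined into an idempotent script that also sets the permissions sshd requires before it will honor authorized_keys. A sketch; SSH_DIR would normally be ~/.ssh, but a temp dir keeps the example side-effect free:

```shell
#!/bin/sh
# Idempotent key setup: generate a key only when one does not already exist,
# then install it and tighten permissions (sshd rejects group/world-writable files).
SSH_DIR=$(mktemp -d)   # stands in for ~/.ssh in this sketch
if command -v ssh-keygen >/dev/null 2>&1; then
    if [ ! -f "$SSH_DIR/id_rsa" ]; then
        ssh-keygen -t rsa -P '' -f "$SSH_DIR/id_rsa" >/dev/null
    fi
    cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
    chmod 700 "$SSH_DIR" && chmod 600 "$SSH_DIR/authorized_keys"
fi
```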
The steps above cover the basic workflow for configuring Hadoop on a Linux system. Depending on your requirements, further configuration and tuning may be needed.