Setting up Hadoop on Ubuntu involves the following main steps. First, update the system packages and install OpenJDK 8:
sudo apt update && sudo apt upgrade
sudo apt install openjdk-8-jdk
After the installation completes, verify that Java is installed correctly:
java -version
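The same check can be scripted so that a setup script stops early when Java is missing. This is a minimal sketch; the function name check_java is made up for this example.

```shell
# Hypothetical helper: succeed only if a `java` binary is on PATH.
check_java() {
  if command -v java >/dev/null 2>&1; then
    # java prints its version banner on stderr, so merge it into stdout
    java -version 2>&1 | head -n 1
    return 0
  fi
  echo "java not found on PATH; install openjdk-8-jdk first" >&2
  return 1
}

check_java || echo "Java check failed"
```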
Install the SSH server, which Hadoop's cluster start scripts use to log in to each node:
sudo apt install openssh-server
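Download Hadoop 3.3.4, extract it into /opt/, and rename the extracted directory to /opt/hadoop so that it matches the paths used in the rest of this guide. The last command gives your user ownership of the installation:
wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-3.3.4/hadoop-3.3.4.tar.gz
sudo tar -zxvf hadoop-3.3.4.tar.gz -C /opt/
sudo mv /opt/hadoop-3.3.4 /opt/hadoop
sudo chown -R "$USER" /opt/hadoop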
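Before relying on a downloaded release, it can be worth comparing its SHA-512 digest with the one Apache publishes alongside the tarball. The sketch below assumes sha512sum is available (it is part of coreutils on Ubuntu); the function name verify_sha512 and the placeholder digest are illustrative only.

```shell
# Hypothetical helper: compare a file's SHA-512 digest with an expected hex string.
verify_sha512() {
  file=$1
  expected=$2
  actual=$(sha512sum "$file" | awk '{print $1}')
  # Compare case-insensitively, since published digests may be upper-case
  [ "$(printf '%s' "$actual" | tr 'A-F' 'a-f')" = "$(printf '%s' "$expected" | tr 'A-F' 'a-f')" ]
}

# Usage (the digest is a placeholder to be copied from the official download page):
# verify_sha512 hadoop-3.3.4.tar.gz "<digest>" && echo "checksum OK"
```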
Add the following lines to the end of ~/.bashrc:
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Then reload the file so that the variables take effect in the current shell:
source ~/.bashrc
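To confirm that the variables took effect, a short check like the following can help. It is only a sketch: path_contains is a hypothetical helper, and /opt/hadoop is used as a fallback when HADOOP_HOME is unset.

```shell
# Hypothetical helper: report whether directory $1 is listed in PATH.
path_contains() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# HADOOP_HOME falls back to the value used in this guide if it is unset
hadoop_home=${HADOOP_HOME:-/opt/hadoop}
for dir in "$hadoop_home/bin" "$hadoop_home/sbin"; do
  if path_contains "$dir"; then
    echo "ok: $dir is on PATH"
  else
    echo "missing: $dir is not on PATH"
  fi
done
```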
Tell Hadoop where Java lives by editing hadoop-env.sh:
sudo vi /opt/hadoop/etc/hadoop/hadoop-env.sh
Add the following line if it is not already present:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
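If Java lives somewhere else on your machine, the JAVA_HOME value can usually be derived from the resolved location of the java binary. The helper below is a sketch under that assumption; java_home_from_binary is a made-up name, and the /jre handling reflects how JDK 8 packages are typically laid out.

```shell
# Hypothetical helper: turn the resolved path of the java binary into a JAVA_HOME.
java_home_from_binary() {
  home=${1%/bin/java}   # strip the trailing /bin/java
  home=${home%/jre}     # JDK 8 packages keep java under jre/, so strip that too
  printf '%s\n' "$home"
}

# On a machine with Java installed, resolve the real binary behind the symlinks
if command -v java >/dev/null 2>&1; then
  java_home_from_binary "$(readlink -f "$(command -v java)")"
fi
```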
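Edit core-site.xml, which sets the default file system URI and Hadoop's base temporary directory:
sudo vi /opt/hadoop/etc/hadoop/core-site.xml
Add the following content, replacing the empty <configuration> element that ships in the file: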
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/data</value>
  </property>
</configuration>
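For unattended setups, the same file can be written without an editor by using a here-document. The sketch below writes the core-site.xml content shown above into a scratch directory; write_core_site is a hypothetical helper, and writing into /opt/hadoop/etc/hadoop instead would require write access to that directory.

```shell
# Hypothetical helper: write the core-site.xml from this guide into directory $1.
write_core_site() {
  cat > "$1/core-site.xml" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop/data</value>
  </property>
</configuration>
EOF
}

# Demo: write to a scratch directory rather than the live config directory
demo_dir=$(mktemp -d)
write_core_site "$demo_dir"
cat "$demo_dir/core-site.xml"
```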
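Edit hdfs-site.xml, which sets the replication factor (1 is enough on a single node) and the NameNode and DataNode storage directories:
sudo vi /opt/hadoop/etc/hadoop/hdfs-site.xml
Add the following content: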
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/opt/hadoop/data/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/opt/hadoop/data/hdfs/datanode</value>
  </property>
</configuration>
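Hadoop typically creates these storage directories itself when it has permission to, but creating them up front makes permission problems visible before the daemons start. A minimal sketch, with make_hdfs_dirs as a hypothetical helper:

```shell
# Hypothetical helper: create the NameNode and DataNode directories under base directory $1.
make_hdfs_dirs() {
  mkdir -p "$1/hdfs/namenode" "$1/hdfs/datanode"
}

# Usage for the paths in this guide (needs write access to /opt/hadoop):
# make_hdfs_dirs /opt/hadoop/data
```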
Edit mapred-site.xml so that MapReduce jobs run on YARN:
sudo vi /opt/hadoop/etc/hadoop/mapred-site.xml
Add the following content:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
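Edit yarn-site.xml, which sets the ResourceManager host and enables the shuffle service that MapReduce needs:
sudo vi /opt/hadoop/etc/hadoop/yarn-site.xml
Add the following content: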
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
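After editing several XML files by hand, it is easy to leave a typo behind. The sketch below reads a property back out of a config file so its value can be checked. get_property is a hypothetical helper, and it assumes each <name> and <value> sits on its own line, as in the snippets in this guide; it is not a general XML parser.

```shell
# Hypothetical helper: print the <value> that follows <name>$2</name> in file $1.
get_property() {
  awk -v want="$2" '
    /<name>/ { gsub(/.*<name>|<\/name>.*/, ""); name = $0; next }
    /<value>/ && name == want { gsub(/.*<value>|<\/value>.*/, ""); print; exit }
  ' "$1"
}

# Demo: only runs if the config file exists on this machine
conf=${HADOOP_HOME:-/opt/hadoop}/etc/hadoop/yarn-site.xml
if [ -f "$conf" ]; then
  get_property "$conf" yarn.nodemanager.aux-services
fi
```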
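Format the NameNode. Do this only once, before the first start, because formatting erases any existing HDFS metadata. Run it as the user that owns /opt/hadoop:
hdfs namenode -format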
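Because reformatting discards the existing file system metadata, a setup script may want to guard this step. The sketch below assumes that a formatted NameNode directory contains a current/VERSION file; namenode_unformatted is a hypothetical helper, and the directory is the dfs.namenode.name.dir value from hdfs-site.xml.

```shell
# Hypothetical helper: succeed if the NameNode directory $1 has not been formatted yet.
namenode_unformatted() {
  [ ! -f "$1/current/VERSION" ]
}

name_dir=/opt/hadoop/data/hdfs/namenode   # dfs.namenode.name.dir from hdfs-site.xml
if namenode_unformatted "$name_dir"; then
  echo "NameNode not formatted yet; run: hdfs namenode -format"
else
  echo "NameNode already formatted; skipping format"
fi
```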
Start the HDFS and YARN daemons as the same user. In Hadoop 3 this is done with the --daemon option of the hdfs and yarn commands:
hdfs --daemon start namenode
hdfs --daemon start datanode
yarn --daemon start resourcemanager
yarn --daemon start nodemanager
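Once the daemons have been started, the JDK's jps tool lists the running Java processes, which gives a quick way to see whether all four came up. The sketch below filters that listing; missing_daemons is a hypothetical helper that reads jps output on standard input.

```shell
# Hypothetical helper: read `jps` output on stdin and print expected daemons that are absent.
missing_daemons() {
  listing=$(cat)
  for daemon in NameNode DataNode ResourceManager NodeManager; do
    # jps prints "<pid> <name>"; match the name column exactly
    if ! printf '%s\n' "$listing" | awk '{print $2}' | grep -qx "$daemon"; then
      echo "$daemon"
    fi
  done
}

# Usage on a machine with the JDK installed:
# jps | missing_daemons
```

An empty result means all four daemons are running. Any name that is printed points at the daemon whose log under the Hadoop logs directory is worth reading first.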
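Open the NameNode web UI in a browser (Hadoop 3.x serves it on port 9870; port 50070 applies to Hadoop 2.x):
http://localhost:9870
If the page loads, Hadoop was installed successfully.
The YARN ResourceManager web UI is available at:
http://localhost:8088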
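On a headless server with no browser, a plain TCP probe can stand in for opening the page. The sketch below relies on bash's /dev/tcp redirection, so it assumes the script runs under bash; port_open is a hypothetical helper, and the NameNode UI port for your Hadoop version can be passed the same way.

```shell
# Hypothetical helper (bash-specific): succeed if a TCP connection to host $1, port $2 opens.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

if port_open localhost 8088; then
  echo "ResourceManager UI port is accepting connections"
else
  echo "nothing is listening on port 8088 yet"
fi
```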
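The steps above cover the basic process of installing and configuring Hadoop on Ubuntu. Depending on the Hadoop version and your own requirements, some steps may need to be adjusted.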