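Installing Hadoop on Ubuntu breaks down into a few main steps: preparing the environment, installing Java, downloading Hadoop, configuring Hadoop, starting the Hadoop services, and verifying the installation. The detailed steps follow.
Hadoop requires a Java runtime, so install Java first.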
sudo apt update
sudo apt install openjdk-8-jdk
After installation, verify the Java version:
java -version
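Hadoop also needs to know where Java lives (JAVA_HOME). As a minimal sketch, the hypothetical helper below derives a JAVA_HOME candidate from the resolved path of the java binary; the name java_home_from_bin is an illustration, not part of Java or Hadoop, and it assumes the usual bin/java (or, for JDK 8, jre/bin/java) layout.

```shell
# Hypothetical helper: turn the real path of the java binary into a JAVA_HOME candidate.
java_home_from_bin() {
    # Strip a trailing /bin/java, then a trailing /jre if present (JDK 8 layout).
    home=${1%/bin/java}
    home=${home%/jre}
    printf '%s\n' "$home"
}

# Usage on a machine with Java installed:
#   java_home_from_bin "$(readlink -f "$(command -v java)")"
```

The printed path is only a candidate; check that it contains bin/java before relying on it.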
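Download a Hadoop release from the official Apache Hadoop site; this guide uses Hadoop 3.3.5. Older releases are moved to archive.apache.org, so use that host if the URL below is no longer available.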
wget https://downloads.apache.org/hadoop/common/hadoop-3.3.5/hadoop-3.3.5.tar.gz
sudo tar -zxvf hadoop-3.3.5.tar.gz -C /opt
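Before trusting a downloaded archive, it is worth comparing its SHA-512 digest with the one Apache publishes next to the release. The sketch below assumes GNU sha512sum is available and that you have copied the expected digest by hand; verify_sha512 is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if the file's SHA-512 digest matches the expected hex string.
verify_sha512() {
    # $1 = file to check, $2 = expected digest (hex, any letter case)
    actual=$(sha512sum "$1" 2>/dev/null | cut -d' ' -f1)
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    # An empty digest means sha512sum failed (for example, missing file).
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Usage (the digest is a placeholder you paste from the release page):
#   verify_sha512 hadoop-3.3.5.tar.gz "<expected digest>" && echo "checksum OK"
```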
Edit the ~/.bashrc file and add the following:
export HADOOP_HOME=/opt/hadoop-3.3.5
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Then apply the environment variables:
source ~/.bashrc
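If you script this step, appending the same export lines on every run leaves duplicates in ~/.bashrc. One possible approach, sketched here with a hypothetical append_once helper, is to add a line only when the file does not already contain it verbatim.

```shell
# Hypothetical helper: append a line to a file unless an identical line is already there.
append_once() {
    # $1 = exact line to add, $2 = target file
    grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# Usage (single quotes keep $PATH and $HADOOP_HOME literal in the file):
#   append_once 'export HADOOP_HOME=/opt/hadoop-3.3.5' ~/.bashrc
#   append_once 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin' ~/.bashrc
```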
Create a dedicated hadoop user and group, and give them ownership of the installation directory:
sudo groupadd hadoop
sudo useradd -g hadoop hadoop
sudo chown -R hadoop:hadoop /opt/hadoop-3.3.5
sudo chmod 755 /opt/hadoop-3.3.5
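The configuration files live in $HADOOP_HOME/etc/hadoop. In hadoop-env.sh, set JAVA_HOME to your JDK directory (on 64-bit x86 Ubuntu with openjdk-8-jdk this is typically /usr/lib/jvm/java-8-openjdk-amd64). Then set the default file system in core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>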
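In hdfs-site.xml, set the replication factor to 1, since a single-node setup has only one DataNode:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>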
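In mapred-site.xml, run MapReduce on YARN. Depending on the release, jobs may also need mapreduce.application.classpath set here; check the official single-node setup guide if the example job cannot find its classes.
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>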
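In yarn-site.xml, point the NodeManager at the local ResourceManager and enable the shuffle service that MapReduce jobs require:
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>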
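All four files share the same shape: a configuration element holding name/value properties. As a rough sketch for scripted setups, the hypothetical write_site_xml helper below generates a file with a single property; it does no XML escaping and overwrites the target, so treat it as an illustration rather than a configuration tool.

```shell
# Hypothetical helper: write a Hadoop-style XML config file containing one property.
# Values are inserted as-is, so they must not contain &, < or >.
write_site_xml() {
    # $1 = output file, $2 = property name, $3 = property value
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

# Usage:
#   write_site_xml "$HADOOP_HOME/etc/hadoop/hdfs-site.xml" dfs.replication 1
```

Once Hadoop is on the PATH, hdfs getconf -confKey fs.defaultFS prints the value Hadoop actually resolved, which is a quick way to confirm the files were picked up.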
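Format the NameNode (first run only), then start the HDFS and YARN daemons as the hadoop user created earlier. Full paths are used because sudo may not pass your PATH through:
sudo -u hadoop $HADOOP_HOME/bin/hdfs namenode -format
sudo -u hadoop $HADOOP_HOME/bin/hdfs --daemon start namenode
sudo -u hadoop $HADOOP_HOME/bin/hdfs --daemon start datanode
sudo -u hadoop $HADOOP_HOME/bin/yarn --daemon start resourcemanager
sudo -u hadoop $HADOOP_HOME/bin/yarn --daemon start nodemanager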
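The JDK's jps tool lists running Java processes as "<pid> <ClassName>", which makes it easy to see whether all four daemons came up. The sketch below is a hypothetical missing_daemons helper that reads jps-style output on standard input and names whichever of the expected daemons is absent.

```shell
# Hypothetical helper: read jps-style lines on stdin and report missing Hadoop daemons.
# Prints the missing names on one line; returns 0 only when nothing is missing.
missing_daemons() {
    running=$(cat)
    missing=""
    for d in NameNode DataNode ResourceManager NodeManager; do
        # Match the whole class name so SecondaryNameNode does not count as NameNode.
        printf '%s\n' "$running" | grep -qE "^[0-9]+ $d\$" || missing="$missing $d"
    done
    printf '%s\n' "${missing# }"
    [ -z "$missing" ]
}

# Usage (run jps as the user that owns the daemons):
#   sudo -u hadoop jps | missing_daemons || echo "some daemons are not running"
```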
Verify that HDFS is up and reports a live DataNode:
sudo -u hadoop $HADOOP_HOME/bin/hdfs dfsadmin -report
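Run the bundled wordcount example as a final check. The input directory must exist in HDFS and the output directory must not exist yet; the results are written to part-r-* files under /output:
sudo -u hadoop $HADOOP_HOME/bin/hdfs dfs -mkdir -p /input
sudo -u hadoop $HADOOP_HOME/bin/hdfs dfs -put $HADOOP_HOME/etc/hadoop/*.xml /input
sudo -u hadoop $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.5.jar wordcount /input /output
sudo -u hadoop $HADOOP_HOME/bin/hdfs dfs -cat /output/part-r-00000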
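To sanity-check a small wordcount result, you can compute the same counts locally and compare. The sketch below assumes the example splits input on whitespace and writes one "word<TAB>count" line per word in byte order; local_wordcount is a hypothetical helper and only an approximation of the MapReduce job.

```shell
# Hypothetical helper: count whitespace-separated words on stdin, one "word<TAB>count" per line.
local_wordcount() {
    # One word per line, drop blanks, sort in byte order, count, then swap the columns.
    tr -s '[:space:]' '\n' | grep -v '^$' | LC_ALL=C sort | uniq -c |
        awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Usage:
#   cat $HADOOP_HOME/etc/hadoop/*.xml | local_wordcount | head
```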
With these steps complete, Hadoop is installed and configured on Ubuntu, and you can start using it for big-data processing.