1. Environment Preparation

Install a JDK on every node first (Hadoop 3.3.x supports Java 8 and Java 11); the examples below assume OpenJDK 11 installed at /usr/lib/jvm/java-11-openjdk-amd64.
2. Basic Configuration
Configure static IPs in /etc/network/interfaces (e.g. 192.168.1.10 for master, 192.168.1.11 for datanode1) so DHCP cannot change node addresses, and add IP-to-hostname mappings for every node to /etc/hosts so the nodes can reach one another by hostname. Generate an SSH key pair on the master (ssh-keygen -t rsa), copy the public key to all DataNode and ResourceManager nodes (ssh-copy-id datanode1, and so on for each node), and confirm passwordless login works (ssh datanode1). Create a dedicated hadoop user (sudo useradd -m -G sudo hadoop) and run the Hadoop daemons as that user rather than as root, which improves security.

3. Hadoop Installation and Configuration
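For reference, the /etc/hosts additions on every node would look like this (the addresses and hostnames are the sample values used above; extend the list with one line per additional DataNode):

```
192.168.1.10  master
192.168.1.11  datanode1
```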
Extract the Hadoop tarball into /opt (sudo tar -xzf hadoop-3.3.6.tar.gz -C /opt) and create a symlink to shorten paths (sudo ln -sf /opt/hadoop-3.3.6 /opt/hadoop). Set the environment variables in /etc/profile.d/hadoop.sh (system-wide) or ~/.bashrc (current user only):

export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64  # adjust to your actual Java path
Run source /etc/profile or source ~/.bashrc to apply the settings. Then edit $HADOOP_HOME/etc/hadoop/core-site.xml to set the default filesystem and Hadoop's temporary directory:
<configuration>
<property><name>fs.defaultFS</name><value>hdfs://master:9000</value></property>
<property><name>hadoop.tmp.dir</name><value>/opt/hadoop/tmp</value></property>
</configuration>
In hdfs-site.xml, set the block replication factor and the NameNode/DataNode storage directories:
<configuration>
<property><name>dfs.replication</name><value>3</value></property>
<property><name>dfs.namenode.name.dir</name><value>/opt/hadoop/dfs/name</value></property>
<property><name>dfs.datanode.data.dir</name><value>/opt/hadoop/dfs/data</value></property>
</configuration>
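The directories named in hadoop.tmp.dir, dfs.namenode.name.dir, and dfs.datanode.data.dir must exist and be writable by the user that runs the daemons. A minimal sketch, assuming the dedicated hadoop user created earlier:

```shell
#!/bin/sh
# Create the storage directories referenced in core-site.xml and
# hdfs-site.xml and hand them to the hadoop user (user and group
# names are this guide's examples).
for d in /opt/hadoop/tmp /opt/hadoop/dfs/name /opt/hadoop/dfs/data; do
  sudo mkdir -p "$d"
  sudo chown -R hadoop:hadoop "$d"
done
```

Run this on every node; the NameNode only uses the name directory and DataNodes only use the data directory, but creating all three everywhere is harmless.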
In yarn-site.xml, point the NodeManagers at the ResourceManager and enable the MapReduce shuffle service:
<configuration>
<property><name>yarn.resourcemanager.hostname</name><value>master</value></property>
<property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
<property><name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name><value>org.apache.hadoop.mapred.ShuffleHandler</value></property>
</configuration>
In mapred-site.xml (Hadoop 3.x ships this file directly; in Hadoop 2.x it had to be copied from mapred-site.xml.template), set the MapReduce framework to YARN:
<configuration>
<property><name>mapreduce.framework.name</name><value>yarn</value></property>
</configuration>
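Hadoop does not propagate configuration edits to other nodes: every change to the files above must be copied to each worker, and the worker hostnames must be listed in $HADOOP_HOME/etc/hadoop/workers so start-dfs.sh can start their daemons. A small loop can generate the sync commands; the hostnames here are this guide's sample names (datanode2 is an assumed second worker):

```shell
#!/bin/sh
# Print the rsync commands that push the edited configuration
# directory to each worker node; pipe the output to sh to run them.
# Worker hostnames are this guide's sample names (assumption).
CONF_DIR=/opt/hadoop/etc/hadoop
for node in datanode1 datanode2; do
  echo "rsync -az ${CONF_DIR}/ ${node}:${CONF_DIR}/"
done
```

Once the hostnames match your cluster, pipe the output to sh (or drop the echo) to perform the actual copy.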
Finally, format the NameNode before the first start (hdfs namenode -format, run on the master as the hadoop user). Do this only once: reformatting destroys the existing HDFS metadata.

4. Startup and Verification
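Because reformatting wipes HDFS metadata, a guard around the format command is cheap insurance. A sketch, assuming dfs.namenode.name.dir is /opt/hadoop/dfs/name as configured above:

```shell
#!/bin/sh
# Format the NameNode only if its metadata directory is still empty.
# NAME_DIR matches the dfs.namenode.name.dir value used in this guide.
NAME_DIR=/opt/hadoop/dfs/name
if [ -z "$(ls -A "$NAME_DIR" 2>/dev/null)" ]; then
  hdfs namenode -format -nonInteractive
else
  echo "metadata already present in $NAME_DIR; skipping format" >&2
fi
```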
Run start-dfs.sh to start the NameNode and DataNode daemons, then check the Java processes with jps (the NameNode host should list NameNode; each DataNode host should list DataNode). Run start-yarn.sh to start the ResourceManager and NodeManager daemons, and confirm the NodeManagers have registered with yarn node -list. Open the NameNode web UI (http://master:9870) and verify that all DataNodes are connected; then submit a test job, e.g. hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.6.jar pi 10 100, to confirm MapReduce works end to end.

5. Optional Optimizations
Tune NodeManager container memory to match each node's RAM (yarn.nodemanager.resource.memory-mb in yarn-site.xml), and raise the NameNode heap if the namespace is large (e.g. set HADOOP_NAMENODE_OPTS="-Xmx4g" in hadoop-env.sh).
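For example, to cap NodeManager container memory at 6 GB on an 8 GB worker, add the following to yarn-site.xml (6144 is an illustrative assumption, not a recommendation; leave headroom for the OS and the NodeManager itself):

```
<property><name>yarn.nodemanager.resource.memory-mb</name><value>6144</value></property>
```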