
How to Install HDFS on Debian

小樊 · 2025-09-21 · Intelligent O&M

Prerequisites
Before installing HDFS on Debian, ensure your system is up-to-date and install essential tools:

sudo apt update && sudo apt upgrade -y
sudo apt install wget ssh vim -y

These commands update the package lists, upgrade installed packages, and install wget (to download Hadoop), ssh (the OpenSSH client and server, needed for the passwordless login configured later), and vim (to edit configuration files).

1. Install Java Environment
Hadoop needs a Java runtime; Hadoop 3.3.x supports Java 8 and Java 11. Install OpenJDK 11 (recommended for compatibility):

sudo apt install openjdk-11-jdk -y

Verify the installation:

java -version

You should see output indicating OpenJDK 11 is installed.

2. Create a Dedicated Hadoop User
For security and isolation, create a non-root user (e.g., hadoop) and add it to the sudo group:

sudo adduser hadoop
sudo usermod -aG sudo hadoop

Switch to the new user:

su - hadoop

This user will manage all Hadoop operations.

3. Download and Extract Hadoop
Download a stable Hadoop release (this guide uses 3.3.6) from the Apache website:

wget https://downloads.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz
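
If you want to verify the download, Apache publishes a SHA-512 checksum alongside the tarball (the .sha512 URL below is assumed to mirror the tarball path):

wget https://downloads.apache.org/hadoop/common/hadoop-3.3.6/hadoop-3.3.6.tar.gz.sha512
sha512sum -c hadoop-3.3.6.tar.gz.sha512  # should report: hadoop-3.3.6.tar.gz: OK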

Extract the archive to /usr/local/ and rename the directory for simplicity:

sudo tar -xzvf hadoop-3.3.6.tar.gz -C /usr/local/
sudo mv /usr/local/hadoop-3.3.6 /usr/local/hadoop

Change ownership of the Hadoop directory to the hadoop user:

sudo chown -R hadoop:hadoop /usr/local/hadoop

4. Configure Environment Variables
Set up Hadoop-specific environment variables in /etc/profile (system-wide) or ~/.bashrc (user-specific). Open the file with vim:

vim ~/.bashrc

Add the following lines at the end:

export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64  # Adjust if using a different Java version

Load the changes into the current session:

source ~/.bashrc

Verify the variables are set:

echo $HADOOP_HOME  # Should output /usr/local/hadoop
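
If you are unsure which JAVA_HOME path to use, you can derive it from the installed java binary; on Debian with OpenJDK 11 this typically prints /usr/lib/jvm/java-11-openjdk-amd64:

readlink -f /usr/bin/java | sed "s:/bin/java::"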

5. Configure SSH Passwordless Login
Hadoop requires passwordless SSH between the NameNode and DataNodes. Generate an SSH key pair:

ssh-keygen -t rsa -b 4096 -C "hadoop@debian"

Press Enter to accept the default file locations and skip the passphrase. For a single-node cluster, append the public key to the local authorized_keys file (for multi-node clusters, copy it to each node with ssh-copy-id instead):

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

Test passwordless login:

ssh localhost

You should log in without entering a password.
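
If the connection is refused instead, make sure the OpenSSH server is running (assuming a systemd-based Debian install):

sudo systemctl enable --now ssh
sudo systemctl status ssh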

6. Configure Hadoop Core Files
Navigate to the Hadoop configuration directory:

cd $HADOOP_HOME/etc/hadoop

Edit core-site.xml (the default filesystem URI) and hdfs-site.xml (replication factor and storage directories) to define HDFS behavior; minimal single-node examples are sketched below.
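
The snippets below are a minimal sketch for a single-node cluster; the hostname (localhost), port (9000), and storage paths are assumptions that you should adjust to your environment, and the paths must match the directories created in the next step.

In core-site.xml, set the default filesystem URI:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

In hdfs-site.xml, set the replication factor and the NameNode/DataNode storage directories:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///opt/hadoop/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///opt/hadoop/hdfs/datanode</value>
  </property>
</configuration>

Also set JAVA_HOME in hadoop-env.sh so the daemons can find Java even when the shell variable is not inherited:

export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64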

7. Create HDFS Data Directories
Create the directories specified in hdfs-site.xml for NameNode and DataNode storage:

sudo mkdir -p /opt/hadoop/hdfs/namenode
sudo mkdir -p /opt/hadoop/hdfs/datanode
sudo chown -R hadoop:hadoop /opt/hadoop  # Change ownership to the hadoop user

8. Format the NameNode
The NameNode must be formatted once before starting HDFS. Run this command carefully (it will erase existing HDFS data):

hdfs namenode -format

Near the end of the output you should see a line similar to "Storage directory /opt/hadoop/hdfs/namenode has been successfully formatted."

9. Start HDFS Services
Start the HDFS daemons (NameNode and DataNode) using the start-dfs.sh script:

$HADOOP_HOME/sbin/start-dfs.sh

Check the status of HDFS processes with jps:

jps

You should see NameNode, DataNode, and SecondaryNameNode listed (along with the Jps process itself).
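
You can also check the NameNode web UI, which in Hadoop 3.x listens on port 9870 by default. Open http://localhost:9870 in a browser, or query it from the shell:

curl -s http://localhost:9870 | head -n 5  # should print the start of the NameNode UI HTML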

10. Verify HDFS Installation
Finally, use HDFS commands to confirm the cluster is operational.
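
A minimal end-to-end check is to create a home directory in HDFS, upload a small file, and read it back (the /user/hadoop path and hello.txt file are just examples):

echo "Hello, HDFS!" > hello.txt
hdfs dfs -mkdir -p /user/hadoop
hdfs dfs -put hello.txt /user/hadoop/
hdfs dfs -cat /user/hadoop/hello.txt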

You should see the output Hello, HDFS!, which confirms that files can be written to and read from the cluster.

Troubleshooting Tips
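If a daemon fails to start, check its log file under $HADOOP_HOME/logs first; the exception near the end of the log usually names the misconfigured setting.

If start-dfs.sh complains that JAVA_HOME is not set, define it explicitly in $HADOOP_HOME/etc/hadoop/hadoop-env.sh as shown above.

If the DataNode is missing from jps after the NameNode has been reformatted, a clusterID mismatch between the NameNode and DataNode storage directories is a common cause; on a fresh single-node install you can clear /opt/hadoop/hdfs/datanode and restart HDFS (this erases any data stored on that DataNode).

If ssh localhost still prompts for a password, check that ~/.ssh has mode 700 and ~/.ssh/authorized_keys has mode 600 for the hadoop user.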
