
How to Implement HDFS High Availability on Debian


Prerequisites for HDFS High Availability (HA) on Debian
Before configuring HDFS HA, ensure the following prerequisites are met:

  - Two hosts for the NameNodes (referred to as nn1 and nn2 below) and at least three hosts for JournalNodes.
  - The same JDK and Hadoop version (3.x is assumed here) installed on every node, with HADOOP_HOME and PATH set.
  - Passwordless SSH from the NameNode hosts to all other cluster nodes.
  - Consistent hostname resolution on every node (via DNS or /etc/hosts).
  - A running ZooKeeper ensemble (typically three nodes) if automatic failover is wanted.
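A quick way to sanity-check these on each node (the hostnames and the ZooKeeper setup below are placeholders for your own environment):

java -version                # same JDK everywhere
hadoop version               # Hadoop installed and on PATH
ssh nn2 hostname             # passwordless SSH between NameNode hosts
getent hosts nn1 nn2         # hostname resolution works
zkServer.sh status           # ZooKeeper up (needed for automatic failover)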

Step 1: Configure JournalNode Nodes
JournalNodes store edit logs (transaction records for HDFS metadata) and ensure consistency between Active and Standby NameNodes.
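As a sketch, the JournalNode side needs only a local directory for the edit logs in hdfs-site.xml (the path below is an example, not a required location):

<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/var/lib/hadoop/journal</value>  <!-- local disk path for shared edit logs -->
</property>

Then start the daemon on each of the (typically three) JournalNode hosts:

hdfs --daemon start journalnode  # hadoop-daemon.sh start journalnode on Hadoop 2.x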

Step 2: Configure NameNode High Availability
This step enables two NameNodes (Active/Standby) to share metadata via JournalNodes.
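A minimal hdfs-site.xml sketch for this, assuming the nameservice is called mycluster and the hosts nn1, nn2, and jn1-jn3 from above (all of these names are examples):

<property><name>dfs.nameservices</name><value>mycluster</value></property>
<property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>nn1:8020</value></property>
<property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>nn2:8020</value></property>
<property><name>dfs.namenode.http-address.mycluster.nn1</name><value>nn1:9870</value></property>
<property><name>dfs.namenode.http-address.mycluster.nn2</name><value>nn2:9870</value></property>
<!-- where both NameNodes read/write shared edits via the JournalNodes -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster</value>
</property>
<!-- lets clients find whichever NameNode is currently active -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- fencing keeps a failed Active from writing after failover -->
<property><name>dfs.ha.fencing.methods</name><value>sshfence</value></property>
<property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/home/hadoop/.ssh/id_rsa</value></property>
<property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>

And in core-site.xml, point clients at the nameservice and give the failover controllers a ZooKeeper quorum (hosts again are placeholders):

<property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property>
<property><name>ha.zookeeper.quorum</name><value>zk1:2181,zk2:2181,zk3:2181</value></property>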

Step 3: Start HDFS Services
Start all HDFS components in the correct order:

start-dfs.sh  # Starts JournalNodes, NameNodes, and DataNodes
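Note that start-dfs.sh only works here once the cluster has been initialized. On a first-time setup, the bootstrap sequence is roughly as follows (Hadoop 3.x commands; host roles as assumed above):

hdfs --daemon start journalnode   # on every JournalNode host first
hdfs namenode -format             # on nn1 only, once
hdfs --daemon start namenode      # on nn1
hdfs namenode -bootstrapStandby   # on nn2, copies nn1's formatted metadata
hdfs zkfc -formatZK               # on nn1, initializes the failover state in ZooKeeper
start-dfs.sh                      # now starts NameNodes, DataNodes, JournalNodes, ZKFCs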

Check cluster status with:

hdfs dfsadmin -report  # Lists DataNodes and their health
hdfs haadmin -getAllServiceStates  # Shows NameNode states (active/standby)

Access NameNode Web UIs (e.g., http://namenode1:9870, http://namenode2:9870) to confirm HA status.

Step 4: Test Automatic Failover
Simulate a failure to verify automatic failover works:

  1. Kill the Active NameNode Process:
    On nn1, find the NameNode PID (jps | grep NameNode) and kill it:
    kill -9 <NameNode_PID>
    
  2. Verify Standby Takes Over:
    On nn2, check its state:
    hdfs haadmin -getServiceState nn2  # Should return "active"
    
  3. Restore the Original Active NameNode:
    Restart the NameNode on nn1 and verify it becomes standby:
    hdfs --daemon start namenode  # hadoop-daemon.sh start namenode on Hadoop 2.x
    hdfs haadmin -getServiceState nn1  # Should return "standby"
    
  4. Check Data Availability:
    Write a test file to HDFS before triggering the failover, then read it back afterwards to confirm the data survived:
    hdfs dfs -put /local/file.txt /test/   # before killing the Active NameNode
    hdfs dfs -get /test/file.txt /local/   # after failover; should succeed
    

Step 5: Monitor and Maintain
Set up monitoring to detect issues early. At a minimum, watch:

  - NameNode states (hdfs haadmin -getAllServiceStates) and the web UIs from Step 3.
  - JournalNode and ZKFC logs under $HADOOP_HOME/logs for sync or fencing errors.
  - Disk space in the JournalNode edits directory and the NameNode metadata directories.
  - ZooKeeper ensemble health, since automatic failover depends on it.
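As a minimal sketch, a cron-driven shell check could flag the cluster when no NameNode reports active (this logs via syslog; adapt the alerting to your environment):

#!/bin/bash
# Minimal HA health check: warn if no NameNode reports "active".
# Assumes the hdfs CLI is on PATH for the user running this (e.g., via cron).
states=$(hdfs haadmin -getAllServiceStates 2>&1)
if ! grep -q "active" <<< "$states"; then
    logger -t hdfs-ha-check "No active NameNode detected: $states"
fi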
