hadoop配置名称节点HA原理

发布时间:2020-10-04 18:37:02 作者:AricaCui
来源:网络 阅读:715

Architecture


In a typical HA clusiter, two separate machines are configured as NameNodes. At any point in time, exactly one of the NameNodes is in an Active state, and the other is in a Standby state. The Active NameNode is responsible for all client operations in the cluster, while the Standby is simply acting as a slave, maintaining enough state to provide a fast failover if necessary.

In order for the Standby node to keep its state synchronized with the Active node, both nodes communicate with a group of separate daemons called “JournalNodes” (JNs). When any namespace modification is performed by the Active node, it durably logs a record of the modification to a majority of these JNs. The Standby node is capable of reading the edits from the JNs, and is constantly watching them for changes to the edit log. As the Standby Node sees the edits, it applies them to its own namespace. In the event of a failover, the Standby will ensure that it has read all of the edits from the JounalNodes before promoting itself to the Active state. This ensures that the namespace state is fully synchronized before a failover occurs.

In order to provide a fast failover, it is also necessary that the Standby node have up-to-date information regarding the location of blocks in the cluster. In order to achieve this, the DataNodes are configured with the location of both NameNodes, and send block location information and heartbeats to both.

It is vital for the correct operation of an HA cluster that only one of the NameNodes be Active at a time. Otherwise, the namespace state would quickly diverge between the two, risking data loss or other incorrect results. In order to ensure this property and prevent the so-called “split-brain scenario,” the JournalNodes will only ever allow a single NameNode to be a writer at a time. During a failover, the NameNode which is to become active will simply take over the role of writing to the JournalNodes, which will effectively prevent the other NameNode from continuing in the Active state, allowing the new Active to safely proceed with failover.

HardWare Resources


In order to deploy an HA cluster, you should prepare the following:

Note that, in an HA cluster, the Standby NameNode also performs checkpoints of the namespace state, and thus it is not necessary to run a Secondary NameNode, CheckpointNode, or BackupNode in an HA cluster. In fact, to do so would be an error. This also allows one who is reconfiguring a non-HA-enabled HDFS cluster to be HA-enabled to reuse the hardware which they had previously dedicated to the Secondary NameNode.

Deployment


hdfs-site.xml

<property>

    <name>dfs.nameservices</name>

    <value>mycluster</value>

</property>

<property>

    <name>dfs.ha.namenodes.mycluster</name>

    <value>nn1,nn2</value>

</property>

<property>

<name>dfs.namenode.rpc-address.mycluster.nn1</name>

<value>192.168.153.201:8020</value>

</property>

<property>

<name>dfs.namenode.rpc-address.mycluster.nn2</name>

<value>192.168.153.205:8020</value>

</property>

<property>

    <name>dfs.namenode.http-address.mycluster.nn1</name>

    <value>192.168.153.201:50070</value>

</property>

<property>

    <name>dfs.namenode.http-address.mycluster.nn2</name>

    <value>192.168.153.205:50070</value>

</property>

<property>

    <name>dfs.namenode.shared.edits.dir</name>

    <value>qjournal://192.168.153.202:8485;192.168.153.203:8485;192.168.153.204:8485/mycluster</value>

</property>

<property>

    <name>dfs.client.failover.proxy.provider.mycluster</name>

    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

</property>

<property>

    <name>dfs.ha.fencing.methods</name>

    <value>sshfence</value>

</property>

<property>

    <name>dfs.ha.fencing.ssh.private-key-files</name>

    <value>/home/centos/.ssh/id_rsa</value>

</property>

<property>

    <name>dfs.journalnode.edits.dir</name>

    <value>/home/centos/hadoop/hdfs/journal</value>

</property>

core-site.xml

<property>

    <name>fs.defaultFS</name>

    <value>hdfs://mycluster</value>

</property>

Notes


1、Currently, only a maximum of two NameNodes may be configured per nameservice.


推荐阅读:
  1. A/A模式HA
  2. hadoop配置名称节点HA基本流程

免责声明:本站发布的内容(图片、视频和文字)以原创、转载和分享为主,文章观点不代表本网站立场,如果涉及侵权请联系站长邮箱:is@yisu.com进行举报,并提供相关证据,一经查实,将立刻删除涉嫌侵权内容。

ha hadoop doop

上一篇:Linux查看分区文件系统类型的方法总结

下一篇:layui的表单验证支持ajax判断用户名是否重复的实例

相关阅读

您好,登录后才能下订单哦!

密码登录
登录注册
其他方式登录
点击 登录注册 即表示同意《亿速云用户服务条款》