Step-by-Step Guide to Building a High-Availability Hadoop HDFS and YARN Cluster
This article is a comprehensive, step-by-step tutorial for building a High-Availability (HA) Hadoop cluster (HDFS + YARN), a deployment mode available since Hadoop 2.x. It covers user creation, JDK installation, host configuration, password-less SSH, firewall and SELinux adjustments, Zookeeper deployment, the HDFS and YARN HA configuration files, and failover testing.
Download links
http://www.trieuvan.com/apache/zookeeper/stable/zookeeper-3.4.6.tar.gz
http://www.trieuvan.com/apache/hadoop/common/stable/hadoop-2.6.0.tar.gz
http://download.oracle.com/otn-pub/java/jdk/7u75-b13/jdk-7u75-linux-x64.tar.gz
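Assuming the mirrors above are still live (the target directory /home/hadoop is the one used throughout this article), fetching and unpacking on each node might look like:

```shell
wget http://www.trieuvan.com/apache/zookeeper/stable/zookeeper-3.4.6.tar.gz
wget http://www.trieuvan.com/apache/hadoop/common/stable/hadoop-2.6.0.tar.gz
tar -zxvf zookeeper-3.4.6.tar.gz -C /home/hadoop/
tar -zxvf hadoop-2.6.0.tar.gz -C /home/hadoop/
```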
2.1 Create Hadoop user
useradd hadoop
passwd hadoop
chmod +w /etc/sudoers
# append to /etc/sudoers: hadoop ALL=(root) NOPASSWD:ALL
chmod -w /etc/sudoers

2.2 Install JDK
sudo vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7
export PATH=$PATH:$JAVA_HOME/bin
source /etc/profile
java -version

2.3 Configure hosts
10.211.55.12 nna   # NameNode Active
10.211.55.13 nns   # NameNode Standby
10.211.55.14 dn1   # DataNode1
10.211.55.15 dn2   # DataNode2
10.211.55.16 dn3   # DataNode3
scp /etc/hosts hadoop@nns:/etc/

2.4 Install SSH and configure password-less login
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys hadoop@nns:~/.ssh/
ssh nns

2.5 Disable firewall (for testing)
chkconfig iptables off

2.6 Set timezone to Shanghai
cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
vi /etc/sysconfig/clock
ZONE="Asia/Shanghai"
UTC=false

2.7 Install, start and verify Zookeeper
tar -zxvf zookeeper-3.4.6.tar.gz
# edit conf/zoo.cfg, e.g. (values illustrative):
#   tickTime=2000
#   initLimit=10
#   syncLimit=5
#   dataDir=/home/hadoop/data/zookeeper
#   clientPort=2181
#   server.1=dn1:2888:3888
#   server.2=dn2:2888:3888
#   server.3=dn3:2888:3888
# on each node, write that node's id (1, 2 or 3) into $dataDir/myid
bin/zkServer.sh start
jps                      # should show QuorumPeerMain
bin/zkServer.sh status

2.8 HDFS HA architecture diagram (image omitted)
2.9 Role assignment (image omitted)
2.10 Environment variable configuration
export JAVA_HOME=/usr/java/jdk1.7
export HADOOP_HOME=/home/hadoop/hadoop-2.6.0
export ZK_HOME=/home/hadoop/zookeeper-3.4.6
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

2.11 Core configuration files
mkdir -p /home/hadoop/tmp
mkdir -p /home/hadoop/data/tmp/journal
mkdir -p /home/hadoop/data/dfs/name
mkdir -p /home/hadoop/data/dfs/data
mkdir -p /home/hadoop/data/yarn/local
mkdir -p /home/hadoop/log/yarn

core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://cluster1</value>
</property>
...
<property>
<name>ha.zookeeper.quorum</name>
<value>dn1:2181,dn2:2181,dn3:2181</value>
</property>
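<!-- Illustrative addition (the original elides several properties here):
     point hadoop.tmp.dir at the directory created in 2.11. -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/tmp</value>
</property>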
</configuration>

hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>cluster1</value>
</property>
<property>
<name>dfs.ha.namenodes.cluster1</name>
<value>nna,nns</value>
</property>
...
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://dn1:8485;dn2:8485;dn3:8485/cluster1</value>
</property>
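<!-- Illustrative additions for automatic failover (the original elides these;
     the property names are standard Hadoop 2.x, the values are examples): -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hadoop/data/tmp/journal</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.cluster1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>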
</configuration>

mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>nna:10020</value>
</property>
</configuration>

yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
...
<property>
<name>yarn.resourcemanager.address.rm1</name>
<value>nna:8132</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm2</name>
<value>nns:8132</value>
</property>
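<!-- Illustrative additions (the original elides several RM HA properties;
     the property names are standard, the values are examples): -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>cluster1-yarn</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>dn1:2181,dn2:2181,dn3:2181</value>
</property>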
</configuration>

hadoop-env.sh / yarn-env.sh
# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.7

2.12 Edit the slaves file
dn1
dn2
dn3

2.13 Start services (order matters)
On each DataNode: bin/zkServer.sh start and verify with jps (QuorumPeerMain).
Start the JournalNodes (on dn1-dn3, per dfs.namenode.shared.edits.dir): hadoop-daemons.sh start journalnode from a NameNode, or hadoop-daemon.sh start journalnode on each node.
Format HDFS (first time only, on nna): hdfs namenode -format (the older hadoop namenode -format still works but is deprecated).
Format Zookeeper for HA: hdfs zkfc -formatZK.
Start HDFS and YARN: start-dfs.sh and start-yarn.sh. Verify on the active NN with jps (DFSZKFailoverController, NameNode, ResourceManager).
Bootstrap the standby metadata (first time only): hdfs namenode -bootstrapStandby.
On the standby NN (nns), manually start hadoop-daemon.sh start namenode and yarn-daemon.sh start resourcemanager.
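Condensed into a single transcript (run as the hadoop user, on the hosts from section 2.3; a sketch, not a definitive runbook), the sequence above is:

```shell
# 1. On dn1, dn2, dn3: start Zookeeper
bin/zkServer.sh start
# 2. On each JournalNode host (dn1-dn3): start the JournalNode
hadoop-daemon.sh start journalnode
# 3. First time only, on nna: format HDFS and the ZK failover state
hdfs namenode -format
hdfs zkfc -formatZK
# 4. On nna: bring up HDFS and YARN
start-dfs.sh
start-yarn.sh
# 5. On nns: copy the NameNode metadata (first time only), then start
hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode
yarn-daemon.sh start resourcemanager
```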
2.14 HA failover
If automatic failover is configured, the standby becomes active when the active node fails. For manual failover, run:
hdfs haadmin -failover --forcefence --forceactive nna nns
# verify which NameNode is now active:
hdfs haadmin -getServiceState nna
hdfs haadmin -getServiceState nns

2.15 Result screenshots (images omitted).
Source: http://www.cnblogs.com/smartloli/p/4298430.html