
Step-by-Step Guide to Building a High-Availability Hadoop HDFS and YARN Cluster

This article provides a comprehensive, step-by-step tutorial for setting up a high‑availability Hadoop cluster, covering user creation, JDK installation, host configuration, SSH setup, firewall and SELinux adjustments, Zookeeper deployment, HDFS and YARN HA configuration, essential XML files, and failover testing.


The article explains how to build a High-Availability (HA) Hadoop cluster (HDFS + YARN), a capability available since Hadoop 2.x, detailing each required step.

Download links

http://www.trieuvan.com/apache/zookeeper/stable/zookeeper-3.4.6.tar.gz

http://www.trieuvan.com/apache/hadoop/common/stable/hadoop-2.6.0.tar.gz

http://download.oracle.com/otn-pub/java/jdk/7u75-b13/jdk-7u75-linux-x64.tar.gz

2.1 Create Hadoop user

useradd hadoop
passwd hadoop
# Temporarily make /etc/sudoers writable, append the line below, then restore
# (visudo is the safer way to edit this file):
chmod +w /etc/sudoers
hadoop ALL=(root) NOPASSWD:ALL
chmod -w /etc/sudoers

2.2 Install JDK

# Extract the JDK tarball to /usr/java/jdk1.7 first, then register it:
sudo vi /etc/profile
export JAVA_HOME=/usr/java/jdk1.7
export PATH=$PATH:$JAVA_HOME/bin
source /etc/profile
java -version   # should report 1.7.0_75

2.3 Configure hosts

10.211.55.12    nna   # NameNode Active
10.211.55.13    nns   # NameNode Standby
10.211.55.14    dn1   # DataNode1
10.211.55.15    dn2   # DataNode2
10.211.55.16    dn3   # DataNode3
scp /etc/hosts hadoop@nns:/etc/   # repeat for dn1, dn2 and dn3; writing to /etc remotely needs root or sudo rights
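Since all five nodes need an identical hosts block, it can help to generate it in one place rather than typing it five times; a small sketch using the IPs and hostnames from the table above:

```shell
# Emit the /etc/hosts entries for the five nodes from the article's IP plan
# (10.211.55.12-16 mapped to nna, nns, dn1, dn2, dn3).
hosts_block=$(
  i=12
  for name in nna nns dn1 dn2 dn3; do
    printf '10.211.55.%s    %s\n' "$i" "$name"
    i=$((i+1))
  done
)
echo "$hosts_block"
```

The generated block can then be appended to /etc/hosts and copied out with scp as shown above.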

2.4 Install SSH and configure password‑less login

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys hadoop@nns:~/.ssh/   # repeat for dn1, dn2 and dn3
ssh nns   # should log in without a password prompt
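When scripting this setup across all five nodes, the key pair can be generated non-interactively instead of answering ssh-keygen's prompts; a sketch (the /tmp path is illustrative only, the real key belongs in ~/.ssh):

```shell
# Generate a 2048-bit RSA key pair with an empty passphrase, no prompts.
keyfile=/tmp/demo_id_rsa
rm -f "$keyfile" "$keyfile.pub"
ssh-keygen -t rsa -b 2048 -N '' -f "$keyfile" -q
chmod 600 "$keyfile"
```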

2.5 Disable firewall and SELinux (for testing only)

service iptables stop
chkconfig iptables off
# Also set SELINUX=disabled in /etc/selinux/config (or run setenforce 0 for the current session)

2.6 Set timezone to Shanghai

# cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
# vi /etc/sysconfig/clock
ZONE="Asia/Shanghai"
UTC=false

2.7 Install, start and verify Zookeeper

tar -zxvf zookeeper-3.4.6.tar.gz
# edit conf/zoo.cfg (tickTime, initLimit, syncLimit, dataDir, clientPort, server.X entries)
bin/zkServer.sh start
jps   # should show QuorumPeerMain
bin/zkServer.sh status
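The zoo.cfg edit mentioned above can be sketched as follows; the values are ZooKeeper defaults and the dataDir path is an assumption chosen to match this article's directory layout:

```properties
# conf/zoo.cfg — minimal three-node quorum (hosts from this article's role plan)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/hadoop/data/zookeeper
clientPort=2181
server.1=dn1:2888:3888
server.2=dn2:2888:3888
server.3=dn3:2888:3888
```

Each node also needs a myid file under dataDir containing its own server number, e.g. echo 1 > /home/hadoop/data/zookeeper/myid on dn1, 2 on dn2, 3 on dn3.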

2.8 HDFS HA architecture diagram (image omitted)

2.9 Role assignment (image omitted)

2.10 Environment variable configuration

export JAVA_HOME=/usr/java/jdk1.7
export HADOOP_HOME=/home/hadoop/hadoop-2.6.0
export ZK_HOME=/home/hadoop/zookeeper-3.4.6
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZK_HOME/bin

2.11 Core configuration files

mkdir -p /home/hadoop/tmp
mkdir -p /home/hadoop/data/tmp/journal
mkdir -p /home/hadoop/data/dfs/name
mkdir -p /home/hadoop/data/dfs/data
mkdir -p /home/hadoop/data/yarn/local
mkdir -p /home/hadoop/log/yarn
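The six mkdir calls above must be repeated identically on every node; a loop form is less error-prone. The sketch below uses /tmp/hadoop-demo as the base path so it is safe to try anywhere, whereas the article uses /home/hadoop:

```shell
# Create the article's local directory layout under a single base path.
base=/tmp/hadoop-demo
for d in tmp data/tmp/journal data/dfs/name data/dfs/data data/yarn/local log/yarn; do
  mkdir -p "$base/$d"
done
find "$base" -type d | sort
```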

core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cluster1</value>
  </property>
  ...
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>dn1:2181,dn2:2181,dn3:2181</value>
  </property>
</configuration>

hdfs-site.xml

<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>cluster1</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.cluster1</name>
    <value>nna,nns</value>
  </property>
  ...
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://dn1:8485;dn2:8485;dn3:8485/cluster1</value>
  </property>
</configuration>
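The "..." above elides several HA-related properties. A hedged sketch of the ones usually required for automatic failover: the property names are standard Hadoop 2.x, but the private-key path is an assumption matching this article's hadoop user:

```xml
<!-- Let the ZKFC promote the standby automatically -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<!-- Client-side failover proxy for nameservice cluster1 -->
<property>
  <name>dfs.client.failover.proxy.provider.cluster1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fence the old active NameNode over SSH during failover -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/hadoop/.ssh/id_rsa</value>
</property>
```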

mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>nna:10020</value>
  </property>
</configuration>

yarn-site.xml

<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  ...
  <property>
    <name>yarn.resourcemanager.address.rm1</name>
    <value>nna:8132</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address.rm2</name>
    <value>nns:8132</value>
  </property>
</configuration>
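The "..." above likewise elides the RM HA bookkeeping properties; a hedged sketch, using standard Hadoop 2.x property names (the cluster-id value is an assumption, the ZK quorum matches core-site.xml above):

```xml
<!-- Logical id for this RM pair; value chosen for illustration -->
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>cluster1-yarn</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>nna</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>nns</value>
</property>
<!-- ZooKeeper quorum used for RM leader election and state -->
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>dn1:2181,dn2:2181,dn3:2181</value>
</property>
```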

hadoop-env.sh / yarn-env.sh

# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.7

2.12 Edit the slaves file

dn1
dn2
dn3

2.13 Start services (order matters)

On each DataNode: bin/zkServer.sh start and verify with jps (QuorumPeerMain).

On the active NameNode: hadoop-daemons.sh start journalnode (this starts a JournalNode on every host listed in the slaves file; use hadoop-daemon.sh start journalnode to start a single local one). Verify with jps (JournalNode).

Format HDFS (first time only): hdfs namenode -format (the older hadoop namenode -format form is deprecated in 2.x).

Format Zookeeper for HA: hdfs zkfc -formatZK .

Start HDFS and YARN: start-dfs.sh and start-yarn.sh . Verify on active NN with jps (DFSZKFailoverController, NameNode, ResourceManager).

On the standby NameNode (nns), first copy the active NameNode's metadata: hdfs namenode -bootstrapStandby.

Then manually start the standby daemons: hadoop-daemon.sh start namenode and yarn-daemon.sh start resourcemanager.

2.14 HA failover

If automatic failover is configured (a ZKFC running beside each NameNode), the standby is promoted automatically when the active node fails. A manual failover from nna to nns uses single-dash options:

hdfs haadmin -failover -forcefence -forceactive nna nns

Verify which node is active with hdfs haadmin -getServiceState nna.

2.15 Result screenshots (images omitted).

Source: http://www.cnblogs.com/smartloli/p/4298430.html

Tags: Big Data, High Availability, Zookeeper, YARN, HDFS, Hadoop, Cluster Setup
Written by

Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
