
How to Plan, Configure, and Launch a Hadoop 3.3.5 Cluster on Three Nodes

This guide walks through planning a three‑node Hadoop 3.3.5 cluster, explains default and custom configuration files, details core‑site, hdfs‑site, yarn‑site, and mapred‑site settings, shows how to distribute configs, start HDFS and YARN, and perform basic file‑system tests.

1. Cluster Planning

Deploy Hadoop version 3.3.5 on three hardware nodes. Hadoop configuration files are divided into default files and custom files.

Key Knowledge Points

HDFS is the open‑source implementation of Google File System (GFS); MapReduce implements Google’s MapReduce; HBase is the open‑source version of BigTable.

Hadoop 2.0+ consists of four major components: HDFS, MapReduce, YARN (Yet Another Resource Negotiator), and Hadoop Common.

The two most prominent features of Hadoop are its distributed architecture and fault‑tolerance mechanisms.

Hadoop follows a master‑slave structure for both computation and storage.

YARN scheduling: the ResourceManager receives a job request from a client and allocates containers on worker nodes; each NodeManager launches and monitors the containers on its own node, while the ResourceManager arbitrates resources across the cluster.
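Once the cluster is up (section 6), this flow can be observed by submitting the example job that ships with Hadoop 3.3.5; the JAR path below assumes the install layout used throughout this guide:

```shell
# Submit the bundled pi-estimation MapReduce job to YARN.
# The ResourceManager accepts it and schedules containers on NodeManagers.
yarn jar "$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.3.5.jar" pi 2 10

# List applications to watch the state transitions (ACCEPTED -> RUNNING -> FINISHED).
yarn application -list -appStates ALL
```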

HDFS storage: the NameNode stores only metadata; DataNodes store the actual data blocks. Clients first query the NameNode for block locations, then retrieve data from the appropriate DataNode.
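This split between metadata and data can be inspected directly: hdfs fsck asks the NameNode for a file's block IDs and the DataNodes holding each replica. A sketch, using the test file uploaded in section 7:

```shell
# Ask the NameNode which blocks make up the file and where the replicas
# live; this reads only NameNode metadata, not the DataNodes' block data.
hdfs fsck /test/kk.txt -files -blocks -locations
```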

2. Default Configuration Files

After extracting the Hadoop package, the JARs that carry the default configuration files (core-default.xml, hdfs-default.xml, yarn-default.xml, and mapred-default.xml) are located under hadoop-3.3.5/share/hadoop.
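The defaults themselves can be pulled out of those JARs for reference; for example, core-default.xml sits inside the common JAR (a sketch, assuming the install path used later in this guide and a JDK on the PATH):

```shell
# Extract the read-only default settings from the common JAR.
cd /opt/module/hadoop-3.3.5/share/hadoop/common
jar xf hadoop-common-3.3.5.jar core-default.xml

# See the stock default for a property before overriding it in core-site.xml.
grep -A 2 'fs.defaultFS' core-default.xml
```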

3. Custom Configuration Files

The four main custom XML files are core-site.xml, hdfs-site.xml, yarn-site.xml, and mapred-site.xml, stored in $HADOOP_HOME/etc/hadoop. Modify them according to project requirements.

4. Cluster Configuration

4.1 Core Configuration (core-site.xml)

<code>[antares@hadoop102 ~]$ cd $HADOOP_HOME/etc/hadoop
[antares@hadoop102 hadoop]$ vim core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Specify NameNode address -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop102:8020</value>
  </property>
  <!-- Specify Hadoop temporary directory -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/module/hadoop-3.3.5/data</value>
  </property>
  <!-- Set static user for HDFS web UI -->
  <property>
    <name>hadoop.http.staticuser.user</name>
    <value>antares</value>
  </property>
</configuration>
</code>

4.2 HDFS Configuration (hdfs-site.xml)

<code>[antares@hadoop102 hadoop]$ vim hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.namenode.http-address</name>
    <value>hadoop102:9870</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop104:9868</value>
  </property>
</configuration>
</code>

4.3 YARN Configuration (yarn-site.xml)

<code>[antares@hadoop102 hadoop]$ vim yarn-site.xml
<configuration>
  <!-- Enable MapReduce shuffle -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <!-- ResourceManager hostname -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop103</value>
  </property>
  <!-- Inherit environment variables -->
  <property>
    <name>yarn.nodemanager.env-whitelist</name>
    <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
  </property>
</configuration>
</code>

4.4 MapReduce Configuration (mapred-site.xml)

<code>[antares@hadoop102 hadoop]$ vim mapred-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Run MapReduce on YARN -->
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
</code>

4.5 Distribute Configuration Files

Copy the edited XML files to all nodes (e.g., using scp) and ensure any stale copies are removed or renamed to avoid conflicts.
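A minimal sketch of that distribution step, assuming password-less SSH from hadoop102 and the paths used in this guide:

```shell
# Push the four edited XML files from hadoop102 to the other nodes.
CONF=/opt/module/hadoop-3.3.5/etc/hadoop
for host in hadoop103 hadoop104; do
  scp "$CONF"/core-site.xml "$CONF"/hdfs-site.xml \
      "$CONF"/yarn-site.xml "$CONF"/mapred-site.xml \
      "$host:$CONF/"
done
```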

5. Configure Workers

<code>[antares@hadoop102 hadoop]$ vim /opt/module/hadoop-3.3.5/etc/hadoop/workers
hadoop102
hadoop103
hadoop104
</code>

6. Start the Cluster

6.1 First‑time NameNode Formatting

On the NameNode (hadoop102), format the NameNode. Note that formatting generates a new cluster ID; if you need to re-format later, stop all NameNode and DataNode processes, delete the data and logs directories on every machine, then format again.
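The re-format procedure described above can be sketched as follows (the cleanup step runs on every node and is destructive, so double-check the paths first):

```shell
# 1. Stop HDFS cluster-wide (run once, on hadoop102).
stop-dfs.sh

# 2. On EVERY node, wipe the old cluster ID along with data and logs.
rm -rf /opt/module/hadoop-3.3.5/data /opt/module/hadoop-3.3.5/logs

# 3. Format again, on the NameNode (hadoop102) only.
hdfs namenode -format
```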

<code>[antares@hadoop102 hadoop-3.3.5]$ pwd
/opt/module/hadoop-3.3.5
[antares@hadoop102 hadoop-3.3.5]$ hdfs namenode -format
</code>

6.2 Start HDFS

<code>[antares@hadoop102 hadoop-3.3.5]$ sbin/start-dfs.sh
</code>

6.3 Start YARN

Run on the ResourceManager node (hadoop103):

<code>[antares@hadoop103 hadoop-3.3.5]$ sbin/start-yarn.sh
</code>

6.4 Verify via Web UI

HDFS NameNode UI: http://hadoop102:9870

YARN ResourceManager UI: http://hadoop103:8088

7. Basic Cluster Tests

Upload Small File

<code>[antares@hadoop102 ~]$ hadoop fs -mkdir /test
[antares@hadoop102 ~]$ hadoop fs -put $HADOOP_HOME/testinput/kk.txt /test
</code>

If the file is missing, locate it under $HADOOP_HOME and retry.

Upload Large File

<code>[antares@hadoop102 ~]$ hadoop fs -put /opt/software/jdk-8u391-linux-x64.tar.gz /test
</code>

The large file is replicated three times by default.
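The replication factor can be confirmed from the NameNode's metadata; the %r specifier of hadoop fs -stat prints the replication recorded for a file:

```shell
# Print replication factor, block size (bytes), and name for the uploaded file.
hadoop fs -stat 'replication=%r blocksize=%o name=%n' /test/jdk-8u391-linux-x64.tar.gz
```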

Check Storage Path

<code>[antares@hadoop102 subdir0]$ pwd
/opt/module/hadoop-3.3.5/data/dfs/data/current/BP-1445008223-192.168.193.161-1706011370209/current/finalized/subdir0/subdir0
</code>

Inspect block files (e.g., blk_1073741826) to see how Hadoop splits large files.
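Because a DataNode's blk_* files are just consecutive byte ranges of the original file, concatenating them in order reconstructs it. The idea can be demonstrated locally with a scaled-down sketch (split stands in for HDFS's block cutting):

```shell
# Simulate HDFS block splitting: chop a file into fixed-size "blocks",
# then reassemble them with cat -- the same trick works on real blk_* files.
printf 'abcdefghij' > original.bin
split -b 4 original.bin blk_            # yields blk_aa, blk_ab, blk_ac
cat blk_aa blk_ab blk_ac > rebuilt.bin
cmp -s original.bin rebuilt.bin && echo "files match"
```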

After confirming replication on nodes hadoop103 and hadoop104, the cluster setup is complete.

Tags: Big Data, Configuration, YARN, HDFS, Hadoop, Cluster Setup
Written by

Efficient Ops

Efficient Ops is maintained by Xiaotianguo and friends and regularly publishes original technical articles. We focus on the evolving operations field and aim to grow alongside you throughout your operations career.
