
Master MongoDB Replication Sets and Sharding: Step‑by‑Step Guide

This guide walks through MongoDB replica set fundamentals: member roles, architecture, and configuration commands. It then covers sharding concepts, cluster components, chunk management, shard key selection, deployment steps, balancing operations, and practical troubleshooting tips, illustrated with code examples and diagrams.

Raymond Ops

1.1 MongoDB Replica Set Overview

A MongoDB replica set is a group of mongod processes that maintain the same data set, providing redundancy and high availability for production deployments.

1.1.1 Purpose of a Replica Set

Replica sets ensure data redundancy and reliability by storing copies on different machines, protecting against single‑point failures and improving read capacity through distributed read servers.

1.1.2 Simple Introduction

Each replica set contains one primary that receives all write operations and one or more secondary members that replicate data from the primary. If the primary fails, a secondary is elected as the new primary. An arbiter participates in elections without storing data.

1.2 Basic Architecture of Replication

A typical three‑member replica set consists of two data‑bearing members and either a third data member or an arbiter.

1.2.1 Three Data‑Bearing Members

One primary; two secondaries that can each become primary if the current primary goes down.
Replica set diagram

1.2.2 Arbiter Node

An arbiter does not store data; it only votes during elections, helping maintain a majority in an even‑sized set.

Arbiter node diagram

1.2.3 Primary Election

After initializing a replica set with replSetInitiate (or rs.initiate()), members exchange heartbeats and elect a primary that receives a majority of votes. "Majority" is defined as floor(N/2) + 1, where N is the number of voting members.

| Voting members | Majority | Tolerable failures |
|----------------|----------|--------------------|
| 1              | 1        | 0                  |
| 2              | 2        | 0                  |
| 3              | 2        | 1                  |
| 4              | 3        | 1                  |
| 5              | 3        | 2                  |
| 6              | 4        | 2                  |
| 7              | 4        | 3                  |
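The Majority column is just floor(N/2) + 1. A quick illustrative calculation (plain JavaScript, runnable in the mongo shell or Node):

```javascript
// Election majority for N voting members, and how many voting-member
// failures the set can lose while still being able to elect a primary.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

function tolerableFailures(votingMembers) {
  return votingMembers - majority(votingMembers);
}

for (let n = 1; n <= 7; n++) {
  console.log(n + " voting members -> majority " + majority(n) +
              ", tolerates " + tolerableFailures(n) + " failure(s)");
}
```

Note that an even member count tolerates no more failures than the next-lower odd count, which is why arbiters are used to keep the number of voters odd.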

1.3 Replica Set Member Types

Secondary: Replicates from the primary, can serve read traffic, and may become primary during elections.

Arbiter: Votes only, never becomes primary, and does not store data.

Priority 0: Never elected as primary; useful for members in remote data centers.

Votes 0: Does not participate in elections; needed because a replica set allows at most 7 voting members.

Hidden: Not visible to drivers and cannot become primary; ideal for backups or reporting.

Delayed: A hidden member that lags behind the primary by a configurable amount (e.g., 1 hour) for point‑in‑time recovery.
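These roles map to fields in the replica set configuration document. A hypothetical rs.reconfig()-style document illustrating them (hosts reused from the example below; slaveDelay is the MongoDB 3.x field name):

```javascript
// Hypothetical replica set config showing member roles.
// In the mongo shell you would pass this object to rs.reconfig(cfg).
const cfg = {
  _id: "my_repl",
  members: [
    { _id: 0, host: "10.0.0.152:28017", priority: 2 },           // preferred primary
    { _id: 1, host: "10.0.0.152:28018", priority: 1 },           // normal secondary
    { _id: 2, host: "10.0.0.152:28019", priority: 0, votes: 0 }, // never primary, non-voting
    { _id: 3, host: "10.0.0.152:28020", priority: 0,
      hidden: true, slaveDelay: 3600 }                           // hidden, delayed by 1 hour
  ]
};
```

A hidden or delayed member must also have priority 0, as shown above.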

1.4 Configuring a MongoDB Replica Set

1.4.1 Environment Preparation

<code># Create mongod user
useradd -u 800 mongod
echo 123456 | passwd --stdin mongod
# Install MongoDB
mkdir -p /mongodb/bin
cd /mongodb
wget http://downloads.mongodb.org/linux/mongodb-linux-x86_64-rhel62-3.2.8.tgz
tar xf mongodb-linux-x86_64-rhel62-3.2.8.tgz
cd mongodb-linux-x86_64-rhel62-3.2.8/bin && cp * /mongodb/bin
chown -R mongod:mongod /mongodb
# Switch to the mongod user
su - mongod</code>

1.4.2 Create Directories for Instances

<code>for i in 28017 28018 28019 28020; do
  mkdir -p /mongodb/$i/conf
  mkdir -p /mongodb/$i/data
  mkdir -p /mongodb/$i/log
done</code>

1.4.3 Configure Multiple Instances

<code>cat >/mongodb/28017/conf/mongod.conf <<'EOF'
systemLog:
  destination: file
  path: /mongodb/28017/log/mongodb.log
  logAppend: true
storage:
  journal:
    enabled: true
  dbPath: /mongodb/28017/data
  directoryPerDB: true
net:
  port: 28017
replication:
  oplogSizeMB: 2048
  replSetName: my_repl
EOF
# Copy config to other ports and replace port numbers
for i in 28018 28019 28020; do
  cp /mongodb/28017/conf/mongod.conf /mongodb/$i/conf/mongod.conf
  sed -i "s#28017#$i#g" /mongodb/$i/conf/mongod.conf
done</code>

1.4.4 Start Services

<code>for i in 28017 28018 28019 28020; do
  mongod -f /mongodb/$i/conf/mongod.conf
done</code>

1.4.5 Initialize the Replica Set

<code>mongo --port 28017
config = { _id: 'my_repl', members: [
  { _id: 0, host: '10.0.0.152:28017' },
  { _id: 1, host: '10.0.0.152:28018' },
  { _id: 2, host: '10.0.0.152:28019' }
]}
rs.initiate(config)</code>

1.4.6 Test Replication

<code># Insert data on primary
my_repl:PRIMARY> db.movies.insert([
  { "title": "Jaws", "year": 1975, "imdb_rating": 8.1 },
  { "title": "Batman", "year": 1989, "imdb_rating": 7.6 }
])
# Query on primary
my_repl:PRIMARY> db.movies.find().pretty()
# Read from secondary (enable reads)
my_repl:SECONDARY> rs.slaveOk()
my_repl:SECONDARY> db.movies.find().pretty()</code>

1.4.7 Replica Set Management Commands

rs.status() – view overall replica set status.

rs.isMaster() – check whether the current node is the primary.

rs.add("ip:port") – add a new secondary.

rs.addArb("ip:port") – add an arbiter.

rs.remove("ip:port") – remove a member.

rs.stepDown() – force the primary to step down.

rs.freeze(300) – prevent a secondary from becoming primary for 300 seconds.

2 MongoDB Sharding Overview

Sharding distributes large collections across multiple servers, providing horizontal scalability. Unlike MySQL partitioning, MongoDB handles most sharding tasks automatically once a shard key is defined.

2.1 Sharding Purpose

High‑volume workloads can overwhelm a single server’s CPU, storage, and I/O. Sharding (horizontal scaling) spreads data and traffic across many machines.

2.2 Sharding Architecture

Config Server: Stores metadata about shards and chunk distribution (usually a three‑node replica set).

Mongos: Routing process that forwards client requests to the appropriate shard(s).

Mongod (Shard): Stores actual application data; multiple shard servers form the data layer.

Sharding components

2.3 Chunk Management

Data within a sharded collection is divided into chunks. When a chunk exceeds the configured chunkSize (default 64 MB), MongoDB automatically splits it (splitting) and may migrate chunks between shards to balance load (balancing).

Chunk split illustration

Balancing migrates chunks from overloaded shards to under‑utilized ones.

Balancing process
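The split-and-balance cycle described above can be sketched as a toy simulation (illustrative only, not MongoDB's actual migration algorithm; CHUNK_SIZE here is a document count standing in for the 64 MB default):

```javascript
// Toy model of chunk splitting and balancing -- NOT MongoDB's real
// algorithm, just an illustration of the two operations.
const CHUNK_SIZE = 4; // pretend the "64 MB" limit is 4 documents

// Split: when a chunk holds too many documents, cut it in half.
function splitChunk(chunk) {
  if (chunk.docs.length <= CHUNK_SIZE) return [chunk];
  const mid = Math.floor(chunk.docs.length / 2);
  return [
    { shard: chunk.shard, docs: chunk.docs.slice(0, mid) },
    { shard: chunk.shard, docs: chunk.docs.slice(mid) },
  ];
}

// Balance: move chunks from the most-loaded shard to the least-loaded
// until their chunk counts differ by at most one.
function balance(chunksPerShard) {
  const shards = Object.keys(chunksPerShard);
  for (;;) {
    shards.sort((a, b) => chunksPerShard[b] - chunksPerShard[a]);
    const most = shards[0], least = shards[shards.length - 1];
    if (chunksPerShard[most] - chunksPerShard[least] <= 1) return chunksPerShard;
    chunksPerShard[most]--;
    chunksPerShard[least]++;
  }
}

console.log(splitChunk({ shard: "sh1", docs: [1, 2, 3, 4, 5, 6] }).length); // 2
```

The real balancer also honors the active window and per-collection settings shown in section 2.6.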

2.4 Shard Key Selection

A shard key must be indexed, immutable, and no larger than 512 bytes. Common choices:

Increasing key: Simple, but concentrates all new writes in one chunk, creating a write hotspot.

Random (hashed) key: Distributes writes evenly but may cause scattered reads.

Compound key: Combines the benefits of both.
2.5 Deploying a Sharded Cluster

2.5.1 Shard Servers

<code># Example shard server config
cat >/mongodb/28021/conf/mongod.conf <<'EOF'
systemLog:
  destination: file
  path: /mongodb/28021/log/mongodb.log
  logAppend: true
storage:
  journal:
    enabled: true
  dbPath: /mongodb/28021/data
  directoryPerDB: true
net:
  bindIp: 10.0.0.152
  port: 28021
replication:
  replSetName: sh1
sharding:
  clusterRole: shardsvr
processManagement:
  fork: true
EOF
# Copy the config, adjust ports for the remaining shard members (28022-28026),
# and start each with mongod -f /mongodb/$i/conf/mongod.conf</code>

2.5.2 Config Server Replica Set

<code># Config server config
cat >/mongodb/28018/conf/mongod.conf <<'EOF'
systemLog:
  destination: file
  path: /mongodb/28018/log/mongodb.log
  logAppend: true
storage:
  journal:
    enabled: true
  dbPath: /mongodb/28018/data
  directoryPerDB: true
net:
  bindIp: 10.0.0.152
  port: 28018
replication:
  replSetName: configReplSet
sharding:
  clusterRole: configsvr
processManagement:
  fork: true
EOF
# Initialize the config server replica set with rs.initiate(...)</code>

2.5.3 Mongos Router

<code># mongos.conf
cat >/mongodb/28017/conf/mongos.conf <<'EOF'
systemLog:
  destination: file
  path: /mongodb/28017/log/mongos.log
  logAppend: true
net:
  bindIp: 10.0.0.152
  port: 28017
sharding:
  configDB: configReplSet/10.0.0.152:28018,10.0.0.152:28019,10.0.0.152:28020
processManagement:
  fork: true
EOF
mongos -f /mongodb/28017/conf/mongos.conf</code>

2.5.4 Adding Shards

<code>mongo --host 10.0.0.152 --port 28017 -u admin -p password
sh.addShard("sh1/10.0.0.152:28021,10.0.0.152:28022,10.0.0.152:28023")
sh.addShard("sh2/10.0.0.152:28024,10.0.0.152:28025,10.0.0.152:28026")
sh.status()</code>

2.5.5 Enabling Sharding for a Database and Collection

<code># Enable sharding on database "test"
sh.enableSharding("test")
# Create an index on the shard key
use test
db.vast.createIndex({ id: 1 })
# Shard the collection
sh.shardCollection("test.vast", { id: 1 })
# Verify
sh.status()</code>
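Once test.vast is sharded on { id: 1 }, mongos routes each operation by matching the shard-key value against chunk ranges cached from the config servers. A simplified sketch of that routing step, with a hypothetical chunk table:

```javascript
// Simplified mongos-style routing: find the shard owning the chunk
// whose [min, max) range contains the document's shard-key value.
// The chunk table below is hypothetical.
const chunks = [
  { min: -Infinity, max: 500000,   shard: "sh1" },
  { min: 500000,    max: 1000000,  shard: "sh2" },
  { min: 1000000,   max: Infinity, shard: "sh1" },
];

function routeByShardKey(id) {
  const chunk = chunks.find(c => id >= c.min && id < c.max);
  return chunk.shard;
}

console.log(routeByShardKey(42));     // "sh1"
console.log(routeByShardKey(750000)); // "sh2"
```

Queries that include the shard key are routed to a single shard like this; queries without it are broadcast to all shards.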

2.6 Balancer Operations

Check balancer state:

<code>sh.getBalancerState()   // true or false</code>

Start or stop the balancer:

<code>sh.startBalancer()   // if no activeWindow is set
sh.stopBalancer()
sh.setBalancerState(true)   // enable
sh.setBalancerState(false)  // disable</code>

Configure a balancer active window (times are in 24‑hour HH:MM format):

<code>use config
db.settings.update(
  { _id: "balancer" },
  { $set: { activeWindow: { start: "00:00", stop: "05:00" } } },
  { upsert: true }
)</code>

Remove the active window to allow the balancer to run anytime:

<code>use config
db.settings.update({ _id: "balancer" }, { $unset: { activeWindow: "" } })</code>

Enable or disable balancing for a specific collection:

<code>sh.disableBalancing("students.grades")   // stop balancing this collection
sh.enableBalancing("students.grades")    // re‑enable
// Verify
db.getSiblingDB("config").collections.findOne({ _id: "students.grades" }).noBalance</code>

2.7 Common Troubleshooting

To temporarily stop automatic balancing (e.g., during heavy write periods):

<code>use config
db.settings.update({ _id: "balancer" }, { $set: { stopped: true } }, true)</code>

To set a custom balancing window:

<code>use config
db.settings.update({ _id: "balancer" }, { $set: { activeWindow: { start: "21:00", stop: "09:00" } } }, true)</code>

After adjustments, monitor balancer activity with sh.isBalancerRunning() and sh.status().

Written by Raymond Ops

Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.