How to Build, Manage, and Recover an etcd Cluster with TLS on CentOS
This guide walks you through setting up a three‑node etcd cluster on CentOS 7 using static configuration and self‑signed TLS certificates, covering member addition and removal, data backup via snapshots, and full cluster restoration from those snapshots.
Preface
I was not familiar with etcd before, so I set out to learn it and recorded the process here in the hope that it helps others.
Environment
Three virtual machines running CentOS 7.7 with etcd version 3.4.16 and self‑signed certificates:
IP: 10.211.55.50, Hostname: etcd1
IP: 10.211.55.51, Hostname: etcd2
IP: 10.211.55.52, Hostname: etcd3
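Later steps address the other nodes by hostname (for example in the scp commands), so each node needs to resolve etcd1, etcd2, and etcd3. A minimal sketch, assuming the names are not already resolvable through DNS; the helper name hosts_entries is illustrative:

```shell
# Print /etc/hosts entries for the three nodes (IPs and hostnames from the list above).
hosts_entries() {
  cat <<'EOF'
10.211.55.50 etcd1
10.211.55.51 etcd2
10.211.55.52 etcd3
EOF
}

hosts_entries
# To apply on a node (as root): hosts_entries >> /etc/hosts
```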
1. Cluster Setup
Deploy etcd on the VMs using static configuration, in which the peer URL of every member is known in advance and listed in --initial-cluster on each node.
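With static configuration every node is started with the same member list. As a rough sketch, that list can be assembled from name and IP pairs instead of being typed by hand; the helper name build_initial_cluster is hypothetical, and it assumes https with peer port 2380 as used throughout this guide:

```shell
# Build the --initial-cluster value from "name=ip" arguments.
# Assumes every peer uses https and port 2380, as in this guide.
build_initial_cluster() (
  out=""
  for pair in "$@"; do
    name="${pair%%=*}"   # text before the first '='
    ip="${pair#*=}"      # text after the first '='
    out="${out}${out:+,}${name}=https://${ip}:2380"
  done
  printf '%s\n' "$out"
)

build_initial_cluster etcd1=10.211.55.50 etcd2=10.211.55.51 etcd3=10.211.55.52
```

The printed value is what the service file passes to --initial-cluster on all three nodes.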
Download and extract the etcd release, move the binaries to /usr/local/bin/, and make them executable.
<code># Download address: https://github.com/etcd-io/etcd/releases
wget https://github.com/etcd-io/etcd/releases/download/v3.4.16/etcd-v3.4.16-linux-amd64.tar.gz
tar -zxvf etcd-v3.4.16-linux-amd64.tar.gz
cd etcd*
mv etcdctl etcd /usr/local/bin
chmod +x /usr/local/bin/etcd*</code>
Prepare the certificate configuration files (ca-config.json, etcd-ca-csr.json, etcd-csr.json).
<code># ca-config.json etcd-ca-csr.json etcd-csr.json
cat ca-config.json
{
"signing": {
"default": {"expiry": "876000h"},
"profiles": {"kubernetes": {"usages": ["signing","key encipherment","server auth","client auth"],"expiry": "876000h"}}
}
}
cat etcd-ca-csr.json
{
"CN": "etcd",
"key": {"algo": "rsa","size": 2048},
"names": [{"C": "CN","ST": "Shenzhen","L": "Shenzhen","O": "etcd","OU": "Etcd Security"}]
}
cat etcd-csr.json
{
"CN": "etcd",
"hosts": ["127.0.0.1","10.211.55.50","10.211.55.51","10.211.55.52"],
"key": {"algo": "rsa","size": 2048},
"names": [{"C": "CN","ST": "Shenzhen","L": "Shenzhen","O": "etcd","OU": "Etcd Security"}]
}</code>
Generate the etcd CA certificate. This step and the next require the cfssl and cfssljson tools to be installed.
<code>cfssl gencert -initca etcd-ca-csr.json | cfssljson -bare etcd-ca
# List generated files
ls -al</code>
Generate the etcd server certificate.
<code>cfssl gencert -ca=etcd-ca.pem -ca-key=etcd-ca-key.pem -config=ca-config.json -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
# List generated files
ls -al</code>
Copy the certificates to the etcd SSL directory and distribute them to the other nodes.
<code>mkdir -pv /etc/etcd/ssl && cp etcd*.pem /etc/etcd/ssl
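# Copy into /etc/ so the result is /etc/etcd/ssl even if /etc/etcd already exists on the target
scp -r /etc/etcd root@etcd2:/etc/
scp -r /etc/etcd root@etcd3:/etc/</code>
Create a systemd service file for etcd on each node (for example /etc/systemd/system/etcd.service), changing --name and the IP addresses per node.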
<code>[Unit]
Description=Etcd Server
After=network.target network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos
[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
EnvironmentFile=-/etc/etcd/etcd.conf
ExecStart=/usr/local/bin/etcd \
--name=etcd1 \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
--peer-cert-file=/etc/etcd/ssl/etcd.pem \
--peer-key-file=/etc/etcd/ssl/etcd-key.pem \
--trusted-ca-file=/etc/etcd/ssl/etcd-ca.pem \
--peer-trusted-ca-file=/etc/etcd/ssl/etcd-ca.pem \
--initial-advertise-peer-urls=https://10.211.55.50:2380 \
--listen-peer-urls=https://10.211.55.50:2380 \
--listen-client-urls=https://10.211.55.50:2379,http://127.0.0.1:2379 \
--advertise-client-urls=https://10.211.55.50:2379 \
--initial-cluster-token=etcd-cluster-0 \
--initial-cluster=etcd1=https://10.211.55.50:2380,etcd2=https://10.211.55.51:2380,etcd3=https://10.211.55.52:2380 \
--initial-cluster-state=new \
--data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536
[Install]
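WantedBy=multi-user.target</code>
Start etcd on all three nodes at about the same time. Because the unit uses Type=notify, systemctl start on the first node may block until enough members are up to form a quorum.
<code># WorkingDirectory must exist before systemd can start the unit
mkdir -p /var/lib/etcd && chmod 700 /var/lib/etcd
systemctl daemon-reload && systemctl enable etcd && systemctl start etcd</code>
Check cluster status: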
<code># List members
etcdctl --cacert /etc/etcd/ssl/etcd-ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem --endpoints=https://10.211.55.52:2379 member list -w table
# Write a key and read it back (endpoint status -w table shows which member is the leader)
etcdctl --cacert /etc/etcd/ssl/etcd-ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem --endpoints=https://10.211.55.50:2379,https://10.211.55.51:2379,https://10.211.55.52:2379 put foo4 bar4
etcdctl --cacert /etc/etcd/ssl/etcd-ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem --endpoints=https://10.211.55.50:2379,https://10.211.55.51:2379,https://10.211.55.52:2379 get foo4</code>
2. Member Changes
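Member changes alter how many failures the cluster can absorb, because etcd commits a write only when a majority (quorum) of members agree. The sketch below works out the arithmetic; the function names quorum and tolerance are illustrative helpers, not etcdctl commands:

```shell
# quorum N     -> members that must agree: floor(N/2) + 1
# tolerance N  -> members that can fail while the cluster stays writable
quorum()    { echo $(( $1 / 2 + 1 )); }
tolerance() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerance=$(tolerance "$n")"
done
```

For this guide the relevant rows are three members (quorum 2, one failure tolerated) and two members (quorum 2, no failure tolerated), which is the state the cluster is in between removing a member and adding its replacement.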
Remove a Member
Because resources are limited to the three existing VMs, we first remove a member and then add it back as a new one.
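The removal command takes the member's hexadecimal ID rather than its name. As a small sketch, the ID can be looked up by name from the default comma-separated output of member list (that is, without -w table); the helper name member_id_by_name is hypothetical:

```shell
# member_id_by_name NAME
# Reads `etcdctl member list` output in the default simple format on stdin,
# where the fields are: ID, status, name, peer URLs, client URLs, learner flag.
member_id_by_name() (
  awk -F', ' -v name="$1" '$3 == name { print $1 }'
)

# Example: pipe the member list command used in this guide into the helper:
#   etcdctl <TLS flags and --endpoints as in this guide> member list | member_id_by_name etcd3
```

The printed ID is the value passed to member remove.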
<code># Get member IDs
etcdctl --cacert /etc/etcd/ssl/etcd-ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem --endpoints=https://10.211.55.50:2379,https://10.211.55.51:2379,https://10.211.55.52:2379 endpoint status -w table
# Remove etcd3 (ID ca2cb14b2acc776)
etcdctl --cacert /etc/etcd/ssl/etcd-ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem --endpoints=https://10.211.55.50:2379,https://10.211.55.51:2379 member remove ca2cb14b2acc776
# Verify remaining members
etcdctl --cacert /etc/etcd/ssl/etcd-ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem --endpoints=https://10.211.55.50:2379,https://10.211.55.51:2379 endpoint status -w table</code>
Add a Member
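Add a new node in two steps: first register it with etcdctl member add, then update the service file on that node so that it joins with --initial-cluster-state=existing.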
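Only a handful of flags in the service file differ from node to node. A rough sketch that prints them for a given node, assuming https, peer port 2380, and client port 2379 as elsewhere in this guide; the helper name node_flags is hypothetical:

```shell
# node_flags NAME IP STATE: print the service-file flags that vary per node.
# STATE is "new" when bootstrapping and "existing" when joining a running cluster.
node_flags() (
  name=$1 ip=$2 state=$3
  printf '%s\n' \
    "--name=${name}" \
    "--initial-advertise-peer-urls=https://${ip}:2380" \
    "--listen-peer-urls=https://${ip}:2380" \
    "--listen-client-urls=https://${ip}:2379,http://127.0.0.1:2379" \
    "--advertise-client-urls=https://${ip}:2379" \
    "--initial-cluster-state=${state}"
)

node_flags etcd3 10.211.55.52 existing
```

The certificate paths, --initial-cluster, --initial-cluster-token, and --data-dir stay the same on every node.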
<code># Add member etcd3
etcdctl --cacert /etc/etcd/ssl/etcd-ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem --endpoints=https://10.211.55.50:2379,https://10.211.55.51:2379 member add etcd3 --peer-urls="https://10.211.55.52:2380"
# Update etcd.service on the new node
[Unit]
Description=Etcd Server
...
ExecStart=/usr/local/bin/etcd \
--name=etcd3 \
--cert-file=/etc/etcd/ssl/etcd.pem \
--key-file=/etc/etcd/ssl/etcd-key.pem \
--peer-cert-file=/etc/etcd/ssl/etcd.pem \
--peer-key-file=/etc/etcd/ssl/etcd-key.pem \
--trusted-ca-file=/etc/etcd/ssl/etcd-ca.pem \
--peer-trusted-ca-file=/etc/etcd/ssl/etcd-ca.pem \
--initial-advertise-peer-urls=https://10.211.55.52:2380 \
--listen-peer-urls=https://10.211.55.52:2380 \
--listen-client-urls=https://10.211.55.52:2379,http://127.0.0.1:2379 \
--advertise-client-urls=https://10.211.55.52:2379 \
--initial-cluster-token=etcd-cluster-0 \
--initial-cluster=etcd1=https://10.211.55.50:2380,etcd2=https://10.211.55.51:2380,etcd3=https://10.211.55.52:2380 \
--initial-cluster-state=existing \
--data-dir=/var/lib/etcd
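# On etcd3: stop the old instance and clear the data left over from the removed member
systemctl stop etcd && rm -rf /var/lib/etcd/member
# Reload the updated unit and start the new member
systemctl daemon-reload && systemctl start etcd && systemctl status etcd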
# Verify cluster status
etcdctl --cacert /etc/etcd/ssl/etcd-ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem --endpoints=https://10.211.55.50:2379,https://10.211.55.51:2379,https://10.211.55.52:2379 endpoint status -w table</code>
3. Data Backup
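For backups taken on a schedule rather than once, timestamped file names and a retention limit keep the snapshot directory manageable. The following is a minimal sketch under stated assumptions: the helper names and the etcd-snapshot- file prefix are hypothetical, and the backup directory is whatever you pass in.

```shell
# snapshot_path DIR -> DIR/etcd-snapshot-YYYYmmdd-HHMMSS.db
snapshot_path() (
  printf '%s/etcd-snapshot-%s.db\n' "$1" "$(date +%Y%m%d-%H%M%S)"
)

# prune_snapshots DIR KEEP: delete all but the KEEP newest snapshots in DIR.
# Relies on the timestamped names sorting in chronological order.
prune_snapshots() (
  dir=$1 keep=$2
  ls -1 "$dir"/etcd-snapshot-*.db 2>/dev/null | sort -r | tail -n +"$((keep + 1))" |
  while IFS= read -r f; do
    rm -f -- "$f"
  done
)
```

A scheduled job could pass the result of snapshot_path as the file argument of snapshot save and then call prune_snapshots on the same directory.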
Perform a manual snapshot for testing purposes.
<code># Write a test key
etcdctl --cacert /etc/etcd/ssl/etcd-ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem --endpoints=https://10.211.55.50:2379,https://10.211.55.51:2379,https://10.211.55.52:2379 put xiyangxixi boys
# Create snapshot
# Save to /root so the path matches the restore step
etcdctl --cacert /etc/etcd/ssl/etcd-ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem --endpoints=https://10.211.55.50:2379 snapshot save /root/snapshot-xiyangxixi.db</code>
4. Cluster Data Recovery
Restore the cluster by restoring the same snapshot file on every node. Copy the snapshot to each node, and stop etcd on all three nodes before restoring.
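Only the member name and its peer URL change between nodes in the restore command. As a sketch, the per-node command lines can be printed from one template; the helper name restore_cmd is hypothetical, and --data-dir=/var/lib/etcd is an assumption chosen to match the --data-dir in the service file:

```shell
# Member list, snapshot path, and cluster token are copied from this guide.
CLUSTER="etcd1=https://10.211.55.50:2380,etcd2=https://10.211.55.51:2380,etcd3=https://10.211.55.52:2380"

# restore_cmd NAME IP: print (not run) the restore command for one node.
restore_cmd() (
  printf 'etcdctl snapshot restore /root/snapshot-xiyangxixi.db --name %s --initial-advertise-peer-urls=https://%s:2380 --initial-cluster-token=etcd-cluster-1 --initial-cluster=%s --data-dir=/var/lib/etcd\n' "$1" "$2" "$CLUSTER"
)

restore_cmd etcd1 10.211.55.50
restore_cmd etcd2 10.211.55.51
restore_cmd etcd3 10.211.55.52
```

Printing the commands first makes it easy to review them before running each one on its node.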
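<code># On etcd1: stop etcd and move the old data aside (snapshot restore refuses to write into an existing data directory)
systemctl stop etcd && mv /var/lib/etcd /var/lib/etcd.bak
etcdctl snapshot restore /root/snapshot-xiyangxixi.db \
--data-dir=/var/lib/etcd \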
--cacert /etc/etcd/ssl/etcd-ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem \
--name etcd1 \
--initial-advertise-peer-urls=https://10.211.55.50:2380 \
--initial-cluster-token=etcd-cluster-1 \
--initial-cluster=etcd1=https://10.211.55.50:2380,etcd2=https://10.211.55.51:2380,etcd3=https://10.211.55.52:2380
# Repeat on etcd2 and etcd3 with appropriate --name and --initial-advertise-peer-urls
# Reload systemd and start etcd on each node
systemctl daemon-reload && systemctl start etcd
# Verify restored data
etcdctl --cacert /etc/etcd/ssl/etcd-ca.pem --cert /etc/etcd/ssl/etcd.pem --key /etc/etcd/ssl/etcd-key.pem --endpoints=https://10.211.55.50:2379,https://10.211.55.51:2379,https://10.211.55.52:2379 get xiyangxixi</code>
References
https://etcd.io/docs/v3.4/op-guide/hardware/
https://etcd.io/docs/v3.4/op-guide/recovery/
https://etcd.io/docs/v3.4/op-guide/clustering/
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.