How to Manually Deploy Prometheus Federation on Kubernetes – Step‑by‑Step Guide
This guide walks through manually deploying a Prometheus federation on Kubernetes, covering environment setup with sealos, creating storage classes, persistent volumes, ConfigMaps, StatefulSets, services, applying manifests, and verifying the federation to aggregate metrics across multiple clusters.
Manually deploy Prometheus federation in Kubernetes.
When you have multiple Kubernetes clusters, you often need to aggregate metrics. This example assumes an external Prometheus federation that scrapes the kube-system and default namespaces from each cluster.
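As a rough illustration of what federation does at the HTTP level, the sketch below builds the URL that a federating Prometheus would request from a cluster-local Prometheus. The `/federate` path and the repeated `match[]` query parameter are standard Prometheus features. The function name `federate_url` and the two namespace selectors are assumptions chosen for illustration and are not taken from the configuration used later in this guide.

```python
from urllib.parse import urlencode


def federate_url(target, selectors):
    """Build the /federate URL for one downstream Prometheus.

    target    -- "host:port" of the downstream Prometheus
    selectors -- list of series selectors, each sent as a repeated match[] parameter
    """
    query = urlencode([("match[]", selector) for selector in selectors])
    return f"http://{target}/federate?{query}"


if __name__ == "__main__":
    # Hypothetical selectors for the two namespaces mentioned above.
    selectors = ['{namespace="kube-system"}', '{namespace="default"}']
    print(federate_url("prometheus-0.prometheus:9090", selectors))
```

Each selector is URL-encoded separately, so braces and quotes in the selector do not need manual escaping.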
Environment
The local environment uses sealos for one-click deployment, mainly for testing.
OS | Kubernetes | HostName | IP | Service
Ubuntu 18.04 | 1.17.7 | sealos-k8s-m1 | 192.168.1.151 | node-exporter, prometheus-federate-0
Ubuntu 18.04 | 1.17.7 | sealos-k8s-m2 | 192.168.1.152 | node-exporter, grafana, alertmanager-0
Ubuntu 18.04 | 1.17.7 | sealos-k8s-m3 | 192.168.1.150 | node-exporter, alertmanager-1
Ubuntu 18.04 | 1.17.7 | sealos-k8s-node1 | 192.168.1.153 | node-exporter, prometheus-0, kube-state-metrics
Ubuntu 18.04 | 1.17.7 | sealos-k8s-node2 | 192.168.1.154 | node-exporter, prometheus-1
Ubuntu 18.04 | 1.17.7 | sealos-k8s-node2 | 192.168.1.155 | node-exporter, prometheus-2
Deploy Prometheus Federation Cluster
Create the data directory for prometheus-federate:
<code># On m1
# -p creates missing parent directories and does not fail if the directory already exists
mkdir -p /data/prometheus-federate/
chown -R 65534:65534 /data/prometheus-federate/</code>
Create the StorageClass configuration:
<code>cd /data/manual-deploy/prometheus/
cat prometheus-federate-storageclass.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: prometheus-federate-lpv
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer</code>
Create the PersistentVolume configuration:
<code>cat prometheus-federate-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-federate-lpv-0
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: prometheus-federate-lpv
  local:
    path: /data/prometheus-federate
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - sealos-k8s-m1</code>
Create the ConfigMap configuration:
<code>cat prometheus-federate-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-federate-config
  namespace: kube-system
data:
  alertmanager_rules.yaml: |
    groups:
    - name: example
      rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} down"
          description: "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 1 minute."
      - alert: NodeMemoryUsage
        expr: (node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes)) / node_memory_MemTotal_bytes * 100 > 80
        for: 1m
        labels:
          team: ops
        annotations:
          summary: "cluster:{{ $labels.cluster }} {{ $labels.instance }}: High Memory usage detected"
          description: "{{ $labels.instance }}: Memory usage is above 80% (current value is: {{ $value }})"
  prometheus.yml: |
    global:
      scrape_interval: 30s
      evaluation_interval: 30s
    alerting:
      alertmanagers:
      - static_configs:
        - targets:
          - alertmanager-0.alertmanager-operated:9093
          - alertmanager-1.alertmanager-operated:9093
    rule_files:
    - "/etc/prometheus/alertmanager_rules.yaml"
    scrape_configs:
    - job_name: 'federate'
      scrape_interval: 30s
      honor_labels: true
      metrics_path: '/federate'
      params:
        'match[]':
        - '{job=~"kubernetes.*"}'
        - '{job="prometheus"}'
      static_configs:
      - targets:
        - 'prometheus-0.prometheus:9090'
        - 'prometheus-1.prometheus:9090'
        - 'prometheus-2.prometheus:9090'</code>
Create the StatefulSet configuration:
<code>cat prometheus-federate-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus-federate
  namespace: kube-system
  labels:
    k8s-app: prometheus-federate
spec:
  serviceName: "prometheus-federate"
  podManagementPolicy: "Parallel"
  replicas: 1
  selector:
    matchLabels:
      k8s-app: prometheus-federate
  template:
    metadata:
      labels:
        k8s-app: prometheus-federate
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: k8s-app
                operator: In
                values:
                - prometheus-federate
            topologyKey: "kubernetes.io/hostname"
      priorityClassName: system-cluster-critical
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: prometheus-federate-configmap-reload
        image: "jimmidyson/configmap-reload:v0.4.0"
        args:
        - --volume-dir=/etc/config
        - --webhook-url=http://localhost:9091/-/reload
        volumeMounts:
        - name: config-volume
          mountPath: /etc/config
          readOnly: true
        resources:
          limits:
            cpu: 10m
            memory: 10Mi
          requests:
            cpu: 10m
            memory: 10Mi
        securityContext:
          runAsUser: 0
          privileged: true
      - name: prometheus
        image: prom/prometheus:v2.20.0
        args:
        - "--web.listen-address=0.0.0.0:9091"
        - "--config.file=/etc/prometheus/prometheus.yml"
        - "--storage.tsdb.path=/prometheus"
        - "--storage.tsdb.retention.time=24h"
        - "--web.console.libraries=/etc/prometheus/console_libraries"
        - "--web.console.templates=/etc/prometheus/consoles"
        - "--web.enable-lifecycle"
        ports:
        - containerPort: 9091
          protocol: TCP
        volumeMounts:
        - name: prometheus-federate-data
          mountPath: "/prometheus"
        - name: config-volume
          mountPath: "/etc/prometheus"
        readinessProbe:
          httpGet:
            path: /-/ready
            port: 9091
          initialDelaySeconds: 30
          timeoutSeconds: 30
        livenessProbe:
          httpGet:
            path: /-/healthy
            port: 9091
          initialDelaySeconds: 30
          timeoutSeconds: 30
        resources:
          requests:
            cpu: 100m
            memory: 100Mi
          limits:
            cpu: 1000m
            memory: 2500Mi
        securityContext:
          runAsUser: 0
          privileged: true
      serviceAccountName: prometheus
      volumes:
      - name: config-volume
        configMap:
          name: prometheus-federate-config
  volumeClaimTemplates:
  - metadata:
      name: prometheus-federate-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "prometheus-federate-lpv"
      resources:
        requests:
          storage: 5Gi</code>
Create the Service configuration:
<code>cat prometheus-federate-service-statefulset.yaml
apiVersion: v1
kind: Service
metadata:
  name: prometheus-federate
  namespace: kube-system
spec:
  ports:
  - name: prometheus-federate
    port: 9091
    targetPort: 9091
  selector:
    k8s-app: prometheus-federate
  clusterIP: None</code>
Deploy the resources:
<code>cd /data/manual-deploy/prometheus/
# Apply manifests in order
kubectl apply -f prometheus-federate-storageclass.yaml
kubectl apply -f prometheus-federate-pv.yaml
kubectl apply -f prometheus-federate-configmap.yaml
kubectl apply -f prometheus-federate-statefulset.yaml
kubectl apply -f prometheus-federate-service-statefulset.yaml</code>
Verify the deployment:
<code># Check PVC
kubectl -n kube-system get pvc | grep federate
# Check Pods
kubectl -n kube-system get pod | grep federate</code>
After these steps, the federation is up. Open http://192.168.1.151:9091 in a browser to view the targets and rules, then trigger an alert to confirm that the Alertmanager cluster is working.
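The same target check can be scripted instead of done in the browser. The sketch below is one possible approach: it reads the standard Prometheus HTTP API endpoint `/api/v1/targets` and lists every active target whose health is not `up`. The function names `unhealthy_targets` and `fetch_targets` are hypothetical, and the base URL is passed on the command line so that nothing is contacted unless you supply an address.

```python
import json
import sys
from urllib.request import urlopen


def unhealthy_targets(payload):
    """Return (job, scrapeUrl, health) for every active target that is not up."""
    targets = payload.get("data", {}).get("activeTargets", [])
    return [
        (
            target.get("labels", {}).get("job", ""),
            target.get("scrapeUrl", ""),
            target.get("health", "unknown"),
        )
        for target in targets
        if target.get("health") != "up"
    ]


def fetch_targets(base_url, timeout=10):
    """Fetch /api/v1/targets from a Prometheus server and decode the JSON body."""
    with urlopen(f"{base_url}/api/v1/targets", timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example argument, using the address from this guide: http://192.168.1.151:9091
    bad = unhealthy_targets(fetch_targets(sys.argv[1].rstrip("/")))
    for job, url, health in bad:
        print(f"{job}\t{url}\t{health}")
    print("all targets up" if not bad else f"{len(bad)} target(s) not up")
```

Saved under a name of your choice, the script would be run with the federation address as its only argument. An empty list from `unhealthy_targets` means that all three downstream Prometheus instances are being scraped successfully.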
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.