What One RBAC Mistake Taught Me the Hard Way: Kubernetes Production Security Lessons
A late‑night production outage caused by a mis‑configured RBAC role sparked a deep dive into Kubernetes security, covering the principle of least privilege, proper ServiceAccount usage, network policies, audit scripts, and a practical checklist to harden clusters and avoid costly incidents.
Background
At 3 am an alarm sounded: an unauthorized access alert on a live Kubernetes cluster. An intern had mistakenly granted excessive RBAC permissions, almost deleting core services. The incident highlighted that Kubernetes security is not optional; it must be treated as a mandatory requirement.
1. RBAC – Did You Configure It Correctly?
1.1 The Art of Least‑Privilege
Many teams give the cluster-admin role to developers for convenience, which is equivalent to handing every key in a house to a temporary worker.
Practical configuration example:
# Development environment – developer role
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev
  name: developer-role
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log", "pods/exec"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "update", "patch"]
---
# Production environment – ops manager (tiered)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: ops-manager
rules:
- apiGroups: [""]
  resources: ["nodes", "namespaces"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
  resources: ["deployments", "daemonsets", "statefulsets"]
  verbs: ["get", "list", "watch", "update", "patch"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create"]  # only for emergencies
1.2 Proper ServiceAccount Usage
Using the default ServiceAccount for every pod is dangerous. Create a dedicated ServiceAccount per application and bind only the required permissions.
# Create a dedicated ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-reader
  namespace: production
---
# Bind minimal permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: production
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-reader  # this Role must already exist in the production namespace
subjects:
- kind: ServiceAccount
  name: app-reader
  namespace: production
1.3 Auditing Permissions
After configuring RBAC, verify that permissions are correct. The following commands are useful:
# Check if a user can delete pods
kubectl auth can-i delete pods --as=developer -n production
# List all permissions for a user
kubectl auth can-i --list --as=developer -n dev
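The individual can-i checks above can be wrapped into a small sweep that flags destructive verbs. This is a minimal sketch, not a definitive audit: the function name audit_destructive_access and the default verb and resource lists are illustrative assumptions, and it relies only on kubectl auth can-i exiting with status 0 when the action is allowed.

```shell
#!/usr/bin/env bash
# Sketch: report destructive verbs that a subject is allowed to perform.
# The function name and the default verb/resource lists are illustrative.

audit_destructive_access() {
  local subject="$1" namespace="$2"
  local verbs="${3:-delete deletecollection}"
  local resources="${4:-pods configmaps secrets deployments}"
  local verb resource findings=0

  for verb in $verbs; do
    for resource in $resources; do
      # kubectl auth can-i exits 0 when the action is allowed.
      if kubectl auth can-i "$verb" "$resource" \
           --as="$subject" -n "$namespace" >/dev/null 2>&1; then
        echo "ALLOWED: $subject can $verb $resource in $namespace"
        findings=$((findings + 1))
      fi
    done
  done

  # Return non-zero when anything destructive is allowed, so CI can fail.
  [ "$findings" -eq 0 ]
}
```

For example, audit_destructive_access developer production prints one line per allowed action and returns a non-zero status if any were found.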
#!/bin/bash
# Simple script to find high-privilege bindings
echo "=== Checking high-privilege role bindings ==="
# .subjects[]? skips bindings that have no subjects instead of failing
kubectl get clusterrolebindings -o json | jq -r '.items[] | select(.roleRef.name=="cluster-admin") | .metadata.name + ": " + (.subjects[]? | .name)'
2. Network Policies – Building a Zero-Trust Network
2.1 Default‑Deny All Traffic
Start by denying all inbound and outbound traffic, then open only what is required.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  - Egress
2.2 Fine-Grained Traffic Control
Example: front‑end pods can only talk to back‑end API, back‑end can only talk to the database, and the database only accepts traffic from back‑end.
# Front-end → Back-end API
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: frontend-to-backend
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
---
# Back-end → Database
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-to-database
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: database
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: backend
      namespaceSelector:
        matchLabels:
          environment: production
    ports:
    - protocol: TCP
      port: 3306
2.3 Cross-Namespace Communication Control
Production workloads should never be reachable from development namespaces.
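The namespaceSelector rules in this article match on an environment label, which Kubernetes does not set on namespaces automatically. A rough sketch of how that label could be applied and checked follows; the function names are hypothetical, and the label key simply mirrors the one used in the policies here.

```shell
#!/usr/bin/env bash
# Sketch: manage the "environment" namespace label that the policies match on.
# Function names are illustrative, not part of any tool.

label_namespace_environment() {
  local namespace="$1" environment="$2"
  kubectl label namespace "$namespace" "environment=${environment}" --overwrite
}

list_unlabeled_namespaces() {
  # The selector '!environment' matches namespaces that lack the label key.
  kubectl get namespaces -l '!environment' -o name | sed 's|^namespace/||'
}
```

Any namespace printed by list_unlabeled_namespaces will not match an environment-based namespaceSelector, so it is worth labeling namespaces before the policies are rolled out.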
# Allow only specific namespaces to access a service
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: cross-namespace-policy
  namespace: shared-services
spec:
  podSelector:
    matchLabels:
      app: logging-service
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          environment: production
    - namespaceSelector:
        matchLabels:
          environment: staging
    ports:
    - protocol: TCP
      port: 9200
3. Real-World Pitfalls
3.1 RBAC Misconfiguration Causing Outage
Scenario: A developer needed only read-only access to view logs, but ops granted cluster-admin rights instead. The developer then accidentally deleted a critical ConfigMap.
Remediation:
Use read‑only roles for log inspection.
Enforce all production changes through CI/CD pipelines, not direct kubectl commands.
Regularly audit and revoke temporary permissions.
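One way to act on the first and last points is to grant log access through the built-in view ClusterRole and remove the binding once it is no longer needed. The sketch below is an assumption-laden illustration: the function names and the readonly-<user> naming scheme are hypothetical, and it assumes the user name is also valid as part of a RoleBinding name.

```shell
#!/usr/bin/env bash
# Sketch: grant and revoke namespace-scoped read-only access.
# Uses the built-in "view" ClusterRole; binding names are illustrative.

grant_readonly_access() {
  local user="$1" namespace="$2"
  kubectl create rolebinding "readonly-${user}" \
    --clusterrole=view --user="$user" -n "$namespace"
}

revoke_readonly_access() {
  local user="$1" namespace="$2"
  kubectl delete rolebinding "readonly-${user}" -n "$namespace"
}
```

Binding a ClusterRole through a RoleBinding keeps the grant limited to the one namespace, which is the narrower scope wanted for log inspection.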
3.2 NetworkPolicy Mistake Leading to Service Disruption
Scenario: A NetworkPolicy omitted DNS (port 53), causing pods to fail name resolution.
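A quick way to confirm this failure mode is to run a throwaway pod and attempt a lookup. The following is a sketch only: the function name, the pod name dns-probe, and the busybox:1.28 image are assumptions, and it relies on kubectl run passing through the container's exit status when attached with --restart=Never.

```shell
#!/usr/bin/env bash
# Sketch: probe in-cluster DNS from a temporary pod.
# Pod name, image, and lookup target are illustrative defaults.

check_pod_dns() {
  local namespace="$1"
  local target="${2:-kubernetes.default}"
  local image="${3:-busybox:1.28}"

  if kubectl run dns-probe --rm -i --restart=Never --quiet \
       --image="$image" -n "$namespace" \
       -- nslookup "$target" >/dev/null 2>&1; then
    echo "DNS OK in $namespace"
  else
    echo "DNS FAILED in $namespace (check egress to port 53)"
    return 1
  fi
}
```

Running check_pod_dns production before and after applying a policy change shows whether name resolution still works from inside that namespace.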
Correct configuration:
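# Allow DNS resolution
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns
  namespace: production
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          # Label set automatically on every namespace (Kubernetes 1.21+)
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    # DNS falls back to TCP for large responses
    - protocol: TCP
      port: 53
3.3 Monitoring and Alerting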
After hardening, set up continuous monitoring for suspicious activity.
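# Watch for unauthorized API calls
# (the pod name kube-apiserver-master depends on the control-plane node name)
kubectl logs -n kube-system kube-apiserver-master | grep "Unauthorized"
# Deploy Falco for runtime security
helm repo add falcosecurity https://falcosecurity.github.io/charts
helm repo update
helm install falco falcosecurity/falco \
  --set falco.grpc.enabled=true \
  --set falco.grpcOutput.enabled=true
4. Security Hardening Checklist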
RBAC Checklist
Remove unnecessary cluster-admin bindings.
Create a dedicated ServiceAccount for each application.
Apply the principle of least privilege.
Audit permission assignments regularly.
Disable anonymous access.
Enable audit logging.
Network Policy Checklist
Implement a default‑deny policy.
Restrict cross‑namespace communication.
Protect system components in kube-system.
Allow necessary DNS resolution.
Restrict egress to known services.
Test policy effectiveness regularly.
Additional Hardening Measures
Enable Pod Security Standards.
Use admission controllers such as OPA/Gatekeeper.
Keep Kubernetes versions up to date.
Scan container images for vulnerabilities.
Encrypt etcd data.
Use TLS/mTLS for network encryption.
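Pod Security Standards from the list above are enabled per namespace through a label read by the built-in Pod Security admission controller. A minimal sketch, assuming a hypothetical helper name and the enforce mode only:

```shell
#!/usr/bin/env bash
# Sketch: enforce a Pod Security Standards level on a namespace.
# The helper name is illustrative; the label key is the standard one.

enforce_pod_security() {
  local namespace="$1" level="$2"
  case "$level" in
    privileged|baseline|restricted) ;;
    *) echo "unknown Pod Security level: $level" >&2; return 2 ;;
  esac
  kubectl label namespace "$namespace" \
    "pod-security.kubernetes.io/enforce=${level}" --overwrite
}
```

For example, enforce_pod_security production restricted rejects new pods in that namespace that violate the restricted profile; validating the level first avoids applying a mistyped label value.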
5. Automated Security Compliance Checks
Below is a Bash script that automates common security audits.
#!/bin/bash
# K8s security audit script
echo "======= K8s Security Audit ======="
# Check for high-privilege accounts
echo "[*] Checking cluster-admin bindings..."
kubectl get clusterrolebindings -o json | \
  jq -r '.items[] | select(.roleRef.name=="cluster-admin") | .metadata.name'
# Check default ServiceAccount usage
echo "[*] Checking default SA usage..."
kubectl get pods --all-namespaces -o json | \
  jq -r '.items[] | select(.spec.serviceAccountName=="default") | .metadata.namespace + "/" + .metadata.name'
# Find namespaces without NetworkPolicy
echo "[*] Namespaces without NetworkPolicy..."
for ns in $(kubectl get ns -o name | cut -d/ -f2); do
  # --no-headers so that only actual policies are counted
  policies=$(kubectl get networkpolicy -n "$ns" --no-headers 2>/dev/null | wc -l)
  if [ "$policies" -eq 0 ]; then
    echo "  - $ns: No NetworkPolicy found!"
  fi
done
# Check for privileged containers
echo "[*] Checking privileged containers..."
# any(...) reports each pod once, even with several privileged containers
kubectl get pods --all-namespaces -o json | \
  jq -r '.items[] | select(any(.spec.containers[]; .securityContext.privileged==true)) | .metadata.namespace + "/" + .metadata.name'
6. Recommended Tools
Kubescape – scans manifests against the NSA hardening guide.
Polaris – checks for Kubernetes best-practice configuration.
Kube-bench – runs CIS benchmark tests on the cluster.
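The three tools can be chained in one wrapper that skips whichever ones are not installed. This is a sketch under assumptions: the function names and the ./manifests default path are hypothetical, the invocations shown are the basic documented forms of each CLI, and kube-bench is normally run on a cluster node or as a Job rather than from a workstation.

```shell
#!/usr/bin/env bash
# Sketch: run each scanner only if its binary is on PATH.
# Function names and the default manifest path are illustrative.

run_if_installed() {
  local tool="$1"
  if command -v "$tool" >/dev/null 2>&1; then
    "$@"
  else
    echo "skipping $tool (not installed)"
  fi
}

run_security_scans() {
  local manifests="${1:-./manifests}"
  run_if_installed kubescape scan framework nsa
  run_if_installed polaris audit --audit-path "$manifests"
  run_if_installed kube-bench
}
```

Calling run_security_scans ./deploy from a pipeline step gives one entry point for all three scanners without failing on machines where only some of them are present.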
Conclusion – Security Is a Marathon
Hardening a Kubernetes cluster is an ongoing effort. Start with RBAC, then incrementally apply network policies, automate audits, and regularly rehearse incident response. The cost of a security breach far exceeds the investment in preventive measures.
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
