Boost Spring Boot Service Availability to 99.9% with Smart K8s Probe Configurations
The article walks through common Kubernetes health‑probe pitfalls for Spring Boot services and presents a concrete set of liveness, readiness, graceful‑shutdown, autoscaling, and configuration‑separation techniques that together raise production availability to 99.9%, backed by real‑world incidents and code snippets.
Health Checks
Problem: A liveness probe that only checks the home page keeps returning 200 OK even when the database connection pool has crashed, so Kubernetes treats the pod as healthy, keeps routing traffic to it, and payment requests fail.
Solution:
Liveness probe targeting core dependencies
livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 50000
  failureThreshold: 3
Use a separate management port to avoid traffic overload on the health endpoint.
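The probe path above only responds if Spring Boot Actuator exposes its probe health groups, and port 50000 only answers if the management server is bound to it. A minimal sketch of the matching application.yaml follows, assuming Spring Boot 2.3 or later with spring-boot-starter-actuator on the classpath. The source does not show this file, so the values are assumptions chosen to line up with the probe.

```yaml
# application.yaml (sketch; not shown in the source)
management:
  server:
    port: 50000          # assumed to match the port used by the probe
  endpoint:
    health:
      probes:
        enabled: true    # exposes /actuator/health/liveness and /actuator/health/readiness
```

Spring Boot enables these groups automatically when it detects a Kubernetes environment, so the explicit flag mainly helps when testing locally. One trade-off to weigh: a probe served from a separate management port can keep passing while the main connector is unable to accept requests.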
Readiness probe to avoid half‑ready pods
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 50000   # httpGet requires a port; the management port from the liveness probe is assumed
  initialDelaySeconds: 20
  periodSeconds: 5
  timeoutSeconds: 1
Omitting initialDelaySeconds caused restart loops during startup.
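By default, Spring Boot's liveness and readiness groups report only the application's internal state, and the Spring Boot documentation cautions against tying liveness to external systems, because restarting a pod rarely fixes a broken database. One way to catch the database-pool failure described in the problem statement is therefore to add the datasource check to the readiness group, so that an affected pod is taken out of rotation rather than restarted. A minimal sketch, assuming a JDBC DataSource is configured so that the built-in `db` health indicator exists:

```yaml
# application.yaml (sketch)
management:
  endpoint:
    health:
      group:
        readiness:
          # readinessState is the built-in internal state;
          # db is the DataSource health indicator (assumes a JDBC DataSource bean)
          include: readinessState,db
```

With this in place, /actuator/health/readiness reports DOWN while the database check fails, and Kubernetes stops sending traffic to the pod until it recovers.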
Graceful Shutdown
Problem: Killing a pod while an order‑processing request is in flight caused payment failures and reconciliation chaos.
Solution:
Enable graceful shutdown in Spring Boot:
# application.yaml
server:
  shutdown: graceful
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s
Add a preStop hook with a sleep buffer:
spec:
  template:
    spec:
      containers:
        - name: app
          lifecycle:
            preStop:
              exec:
                command: ["sh","-c","sleep 30; kill -SIGTERM 1"]
Without the sleep, the old pod was killed instantly, leaving a service gap.
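One detail worth checking with this hook: time spent in preStop counts against the pod's terminationGracePeriodSeconds, which defaults to 30 seconds. A 30-second sleep followed by a 30-second Spring shutdown phase does not fit in that window, so the container could receive SIGKILL before in-flight requests drain. The sketch below raises the grace period. The value 75 is an assumption (sleep + shutdown phase + margin), not a figure from the source. It also drops the explicit kill, since the kubelet sends SIGTERM to the container once the hook returns.

```yaml
spec:
  template:
    spec:
      # assumed budget: 30s preStop sleep + 30s Spring shutdown phase + 15s margin
      terminationGracePeriodSeconds: 75
      containers:
        - name: app
          lifecycle:
            preStop:
              exec:
                # wait for endpoint removal to propagate; SIGTERM follows automatically
                command: ["sh", "-c", "sleep 30"]
```

If the sleep is shortened, the grace period can shrink with it. The point is only that the grace period should cover the sleep and the shutdown phase together.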
Dynamic Scaling
Observation: Scaling on CPU > 80% alone can be misleading. One social app hit 90% CPU and the HPA added pods, yet latency increased because the real bottleneck was a saturated thread pool.
Three‑pronged scaling strategy:
Scale on readiness ratio:
metrics:
  - type: Object
    object:
      metric:
        name: ready_pods_ratio
      target:
        type: Value
        value: 0.8
Scale on QPS per pod:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 100
Protect JVM memory with limits:
resources:
  limits:
    memory: "1Gi"
  requests:
    memory: "512Mi"
Missing memory limits once let a leaking pod bring down the whole node.
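The two metric fragments in this section are entries in the metrics list of a HorizontalPodAutoscaler. A sketch of how they might sit in a complete autoscaling/v2 manifest is shown below. The HPA name, the target Deployment name, and the replica bounds are hypothetical, and an Object metric additionally needs a describedObject, which the fragment in this section does not show. Both ready_pods_ratio and http_requests_per_second are custom metrics, so this assumes a metrics adapter (for example Prometheus Adapter) is serving them through the custom metrics API.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa                 # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app                   # hypothetical Deployment name
  minReplicas: 2                # assumed bounds, tune per workload
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"   # target requests per second per pod
    - type: Object
      object:
        describedObject:        # assumed: the metric is attached to the Deployment
          apiVersion: apps/v1
          kind: Deployment
          name: app
        metric:
          name: ready_pods_ratio
        target:
          type: Value
          value: "800m"         # Kubernetes quantity notation for 0.8
```

When several metrics are listed, the HPA computes a desired replica count for each and uses the largest, so the per-pod QPS target and the readiness ratio act as independent triggers.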
Configuration Separation
Incident: Storing the database password in application.yaml and pushing the file to a public GitHub repository led to a database breach.
Recommended practice:
Create a ConfigMap for non‑secret configuration:
kubectl create cm app-config --from-file=application-prod.yaml
Store secrets in a Kubernetes Secret:
env:
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: db-secret
        key: password
Build a lean container image without configuration files:
FROM openjdk:17-alpine
COPY target/*.jar /app.jar
CMD ["java","-jar","/app.jar"]
Including test‑environment config in the image once caused production to connect to the wrong database.
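Since the image carries no configuration, the pod has to receive it at runtime. The sketch below shows one way the three pieces could fit together in a Deployment's pod spec: the app-config ConfigMap mounted as a directory, Spring pointed at that directory, and the password injected from db-secret. The mount path /config, the image reference, and the use of the prod profile (inferred from the file name application-prod.yaml) are assumptions, not details from the source.

```yaml
spec:
  template:
    spec:
      containers:
        - name: app
          image: registry.example.com/app:1.0.0   # hypothetical image reference
          env:
            - name: SPRING_PROFILES_ACTIVE
              value: prod                         # assumed, so application-prod.yaml is picked up
            - name: SPRING_CONFIG_ADDITIONAL_LOCATION
              value: file:/config/                # directory locations must end with a slash
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: password
          volumeMounts:
            - name: config
              mountPath: /config                  # hypothetical mount path
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: app-config
```

The configuration file can then reference the injected value with a placeholder such as `${DB_PASSWORD}`, which keeps the secret out of both the image and the ConfigMap.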
This solution has been validated in production; adjust parameters to fit your own workload.
After applying these probes in a logistics system, the deployment failure rate dropped from 15% to 0.3% and midnight alerts ceased.
Java Architect Essentials
