How One Engineer Runs a Full SaaS on Kubernetes with Minimal Effort
This article details how a solo engineer built and operated a SaaS platform on AWS using Kubernetes, covering infrastructure overview, automatic DNS, TLS, load balancing, CI/CD rollouts, autoscaling, caching, secret management, monitoring, logging, error tracking, and cost‑effective operations.
Kubernetes is an open-source system for deploying, scaling, and managing containerized applications. In this setup it runs on a cluster of Amazon EC2 instances.
The story shares how a Costa-Rican engineer used Kubernetes in a startup to handle load balancing, cron-job monitoring, and alerting, keeping a one-person company running smoothly.
AWS simplifies running Kubernetes with the managed Amazon Elastic Kubernetes Service (EKS), providing scalable, highly‑available virtual machines and community‑supported integrations.
Overall Architecture Overview
The infrastructure can serve multiple projects; the author uses Panelbear as a concrete example. The SaaS handles a large volume of requests from around the world, stores the data efficiently for real-time queries, and is still early in its business lifecycle.
After several iterations, the stack consists of a Django monolith, Postgres for the app database, ClickHouse for analytics, Redis for caching, and Celery for background tasks, all running on a managed EKS cluster.
Automatic DNS, SSL, Load Balancing
Traffic enters the private cluster via an ingress-nginx controller, which routes requests to services and applies rate-limiting and other shaping rules. The example uses a Django app served by Uvicorn.
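<code># networking.k8s.io/v1beta1 was removed in Kubernetes 1.22; newer clusters use networking.k8s.io/v1
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  namespace: example
  name: example-api
  annotations:
    kubernetes.io/ingress.class: "nginx"
    # Requests accepted per minute from a single client IP
    nginx.ingress.kubernetes.io/limit-rpm: "5000"
    # cert-manager issues and renews the TLS certificate
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    # external-dns creates the DNS record, proxied through Cloudflare
    external-dns.alpha.kubernetes.io/cloudflare-proxied: "true"
spec:
  tls:
    - hosts:
        - api.example.com
      secretName: example-api-tls
  rules:
    - host: api.example.com
      http:
        paths:
          - path: "/"
            backend:
              serviceName: example-api
              servicePort: http</code>
Automatic Rollout and Rollback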
When a new Docker image is pushed, the flux component syncs the cluster to the latest image and triggers an incremental rollout.
<code>panelbear/panelbear-webserver:6a54bb3</code>
Horizontal Autoscaling
The app scales based on CPU and memory usage: the Panelbear API pods scale from 2 up to 8 replicas, while Kubernetes packs workloads onto nodes and adds or removes nodes as needed.
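The replica count follows the rule documented for the Kubernetes Horizontal Pod Autoscaler: desired = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. The following is a minimal Python sketch of that arithmetic only. It assumes the 2 to 8 replica bounds mentioned above and a hypothetical 60% CPU target (the article does not state the target); the function name is illustrative, and the real controller also applies a tolerance band and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=2, max_replicas=8):
    """Return the replica count the HPA formula would ask for, clamped to bounds.

    Utilization values are percentages of the requested CPU (or memory).
    """
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    # Core HPA rule: scale proportionally to how far we are from the target.
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    # Never go below the minimum or above the maximum replica count.
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical 60% CPU target: 2 pods running at 90% CPU ask for a third pod.
print(desired_replicas(2, 90, 60))
```

With the same target, a heavy spike is capped at 8 replicas and a quiet period never drops below 2.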
CDN Static Asset Caching
Ingress rules include the cloudflare-proxied: "true" annotation to route traffic through Cloudflare. Application responses set standard HTTP cache headers, e.g.:
<code># Cache this response for 5 minutes
response["Cache-Control"] = "public, max-age=300"</code>
Static files are served directly from the container using Whitenoise, avoiding separate uploads to Nginx/CloudFront/S3.
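For readers who have not used Whitenoise, a Django settings fragment along these lines is one common way to enable it. This is a sketch based on the WhiteNoise documentation, not the author's actual settings: the middleware list is shortened, the `STATIC_ROOT` path is hypothetical, and the `STORAGES` setting assumes Django 4.2 or newer (older versions use `STATICFILES_STORAGE` instead).

```python
# settings.py (fragment): a sketch, not the author's real configuration.

MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    # WhiteNoise is placed directly after SecurityMiddleware so it can
    # answer static file requests before the rest of the stack runs.
    "whitenoise.middleware.WhiteNoiseMiddleware",
    "django.middleware.common.CommonMiddleware",
]

STATIC_URL = "/static/"
STATIC_ROOT = "/app/staticfiles"  # hypothetical path inside the container

# Django 4.2+ style. The manifest storage adds a content hash to file names,
# which lets a CDN in front of the app cache those files for a long time.
STORAGES = {
    "default": {"BACKEND": "django.core.files.storage.FileSystemStorage"},
    "staticfiles": {
        "BACKEND": "whitenoise.storage.CompressedManifestStaticFilesStorage",
    },
}
```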
Application Data Caching
Results of heavy computations, Django models, and rate-limit counters are cached with a decorator that takes a time-to-live; the author's example caches a result for 15 minutes.
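The article does not show how the decorator is implemented. As a rough illustration of the idea, the sketch below keeps results in an in-process dictionary and expires them after `ttl` seconds. Everything here is an assumption for illustration: the real `cache` presumably stores values in Redis so that all pods share them, and the `clock` parameter exists only to make the behavior easy to test.

```python
import functools
import time


def cache(ttl, clock=time.monotonic):
    """Cache a function's results per argument set for `ttl` seconds (sketch)."""
    def decorator(func):
        store = {}  # cache key -> (expires_at, value)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Arguments must be hashable to be used as a cache key.
            key = (args, tuple(sorted(kwargs.items())))
            now = clock()
            hit = store.get(key)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args, **kwargs)
            store[key] = (now + ttl, value)
            return value

        return wrapper
    return decorator


@cache(ttl=60 * 15)
def expensive_lookup(site_id):
    # Stand-in for a slow query or computation.
    return site_id * 2
```

The article's own usage of the decorator follows.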
<code>@cache(ttl=60 * 15)
def has_enough_capacity(site: Site) -> bool:
    """Return True if a Site has enough capacity to accept incoming events,
    or False if it already went over the plan limits and the grace period is over."""
</code>
Per-Endpoint Rate Limiting
Django Ratelimit, backed by Redis, enforces limits per endpoint; a client that exceeds its limit receives HTTP 429 (Too Many Requests).
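To make the mechanism concrete, here is a small fixed-window limiter in plain Python. It is a sketch of the general technique, not of Django Ratelimit's internals: the class and function names are hypothetical, counters live in a local dictionary rather than Redis, and old windows are never purged (a Redis-backed version would typically rely on key expiry for that).

```python
import time
from collections import defaultdict


class FixedWindowRateLimiter:
    """Allow at most `limit` hits per key within each fixed time window."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.counters = defaultdict(int)  # (key, window index) -> hit count

    def allow(self, key):
        # All requests in the same window share one counter.
        window = int(self.clock() // self.window_seconds)
        self.counters[(key, window)] += 1
        return self.counters[(key, window)] <= self.limit


def status_for(limiter, client_ip, endpoint):
    """Return the HTTP status a view would send: 200 if allowed, else 429."""
    # Keying on endpoint plus client IP gives each endpoint its own budget.
    return 200 if limiter.allow(f"{endpoint}:{client_ip}") else 429
```

Keying the counter on both the endpoint and the client means a burst against one endpoint does not consume the budget of another.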
Application Management
Django’s built‑in admin panel assists with customer support. Access is restricted to staff and protected with 2FA. Security emails are sent on new logins.
Scheduled Jobs
Cron-style jobs run on Celery workers scheduled by Celery beat, with Redis as the task queue. A monitoring service such as Healthchecks.io, Cronitor, or CronHub tracks each job and sends alerts to Slack when a job does not report on time.
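Monitoring services of this kind generally work as a dead man's switch: the job pings after each run, and an alert fires if no ping arrives within the expected interval plus a grace period. The check below is a sketch of that rule under one plausible reading of the `expected_schedule` and `grace_period` parameters used in the article's example; the function name is hypothetical and the exact rule each service applies may differ.

```python
from datetime import datetime, timedelta, timezone


def is_overdue(last_ping, expected_schedule, grace_period, now):
    """Return True if a job should have pinged again by `now` but has not."""
    # The next ping is due one schedule interval after the last one;
    # the grace period is extra slack before anyone gets alerted.
    deadline = last_ping + expected_schedule + grace_period
    return now > deadline


last_ping = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
three_hours_later = last_ping + timedelta(hours=3, minutes=1)
print(is_overdue(last_ping, timedelta(hours=1), timedelta(hours=2), three_hours_later))
```

The job side, shown in the article's own example next, only has to send the ping.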
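<code>from datetime import timedelta

# TaskMonitor is not defined in the article; it appears to be an application-specific helper.
def some_hourly_job():
    # Task logic
    ...
    # Ping monitoring service once task completes
    TaskMonitor(
        name="send_quota_depleted_email",
        expected_schedule=timedelta(hours=1),
        grace_period=timedelta(hours=2),
    ).ping()</code>
App Configuration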
All settings are driven by environment variables, e.g.:
<code>INVITE_ONLY = env.bool("INVITE_ONLY", default=False)</code>
ConfigMaps in Kubernetes inject these variables into containers:
<code>apiVersion: v1
kind: ConfigMap
metadata:
  namespace: panelbear
  name: panelbear-webserver-config
data:
  INVITE_ONLY: "True"
  DEFAULT_FROM_EMAIL: "The Panelbear Team <[email protected]>"
  SESSION_COOKIE_SECURE: "True"
  SECURE_HSTS_PRELOAD: "True"
  SECURE_SSL_REDIRECT: "True"</code>
Encryption
Secrets are sealed with kubeseal and decrypted only inside the cluster.
<code>DATABASE_CONN_URL='postgres://user:pass@my-rds-db:5432/db'
SESSION_COOKIE_SECRET='this-is-supposed-to-be-very-secret'</code>
DNS-Based Service Discovery
Kubernetes automatically creates DNS records for services, so containers can reach each other through URLs like redis://redis.weekend-project.svc.cluster.local:6379 (with the default cluster.local cluster domain).
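Kubernetes names each Service `<service>.<namespace>.svc.<cluster-domain>`, where the cluster domain defaults to `cluster.local`. The helpers below are a sketch of how an application might build such URLs from its configuration; the function names are hypothetical, and a cluster configured with a different domain would need to pass it explicitly.

```python
def service_dns_name(service, namespace, cluster_domain="cluster.local"):
    """Return the in-cluster DNS name of a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def service_url(scheme, service, namespace, port, cluster_domain="cluster.local"):
    """Return a connection URL for a Service, e.g. for a Redis client."""
    host = service_dns_name(service, namespace, cluster_domain)
    return f"{scheme}://{host}:{port}"


print(service_url("redis", "redis", "weekend-project", 6379))
```

Pods in the same namespace can also use the short name (here just `redis`), because the pod's DNS search path fills in the rest.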
Version‑Controlled Infrastructure
All infrastructure lives in a monorepo with Docker, Terraform, and Kubernetes manifests, enabling one‑command creation or destruction of the entire stack.
<code># Cloud resource Terraform example
resource "aws_s3_bucket" "panelbear_app" {
  bucket = "panelbear-app"
  acl    = "private"
  tags = {
    Name        = "panelbear-app"
    Environment = "production"
  }
  lifecycle_rule {
    id      = "backups"
    enabled = true
    prefix  = "backups/"
    expiration { days = 30 }
  }
  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default { sse_algorithm = "AES256" }
    }
  }
}</code>
Logging
Logs are streamed to stdout and collected by Kubernetes; tools like stern help tail logs across pods.
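On the application side, writing to stdout needs nothing more than the standard `logging` module. The sketch below shows one way to set that up; the function name, log format, and logger name are assumptions for illustration, and a Django project would more likely express the same thing through its `LOGGING` setting.

```python
import logging
import sys


def configure_logging(level=logging.INFO, stream=None):
    """Send all log records to stdout (or `stream`) in a single-line format."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    root = logging.getLogger()
    # Replace any existing handlers so each record is emitted exactly once.
    for existing in list(root.handlers):
        root.removeHandler(existing)
    root.addHandler(handler)
    root.setLevel(level)
    return root


configure_logging()
logging.getLogger("panelbear.example").info("worker started")
```

Because the container only writes to stdout, log files and rotation inside the container are unnecessary.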
Monitoring and Alerting
The author initially used self-hosted Prometheus and Grafana, then migrated to New Relic, forwarding metrics via the django-prometheus library.
<code># Example metric registration
# Custom metrics are defined with prometheus_client, which django-prometheus builds on
from prometheus_client import Counter
my_counter = Counter('my_counter', 'Description')</code>
Error Tracking
Sentry aggregates exceptions from the Django app, providing context for each error. Alerts are routed to a Slack #alerts channel.