
How One Engineer Runs a Full SaaS on Kubernetes with Minimal Effort

This article details how a solo engineer built and operated a SaaS platform on AWS using Kubernetes, covering infrastructure overview, automatic DNS, TLS, load balancing, CI/CD rollouts, autoscaling, caching, secret management, monitoring, logging, error tracking, and cost‑effective operations.


Kubernetes is an open‑source system for deploying and managing containerized applications at scale; on AWS it typically runs across clusters of Amazon EC2 instances, handling deployment, maintenance, and scaling.

The story describes how a Costa‑Rican engineer used Kubernetes at a startup to handle load balancing, cron‑job monitoring, and alerting, keeping a one‑person company running smoothly.

AWS simplifies running Kubernetes with the managed Amazon Elastic Kubernetes Service (EKS), providing scalable, highly‑available virtual machines and community‑supported integrations.

Overall Architecture Overview

The infrastructure can serve multiple projects; the author uses Panelbear as a concrete example. The SaaS ingests a large volume of requests from around the world and stores the data efficiently for real‑time queries, while the business itself is still early in its lifecycle.

After several iterations, the stack consists of a Django monolith, Postgres for the app database, ClickHouse for analytics, Redis for caching, Celery for background tasks, all running on a managed EKS cluster.

Automatic DNS, SSL, Load Balancing

Traffic enters the private cluster via an ingress-nginx controller, which routes requests to services and applies rate‑limiting and other traffic‑shaping rules. The example uses a Django app served by Uvicorn.

<code>apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  namespace: example
  name: example-api
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/limit-rpm: "5000"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    external-dns.alpha.kubernetes.io/cloudflare-proxied: "true"
spec:
  tls:
  - hosts:
    - api.example.com
    secretName: example-api-tls
  rules:
  - host: api.example.com
    http:
      paths:
      - path: "/"
        backend:
          serviceName: example-api
          servicePort: http</code>

Automatic Rollout and Rollback

When a new Docker image is pushed, the flux component syncs the cluster to the latest image and triggers an incremental rollout, for example when a new tag like this appears in the registry:

<code>panelbear/panelbear-webserver:6a54bb3</code>

Horizontal Autoscaling

The app scales based on CPU/memory usage; Kubernetes packs workloads onto nodes and adds or removes nodes as needed, scaling the Panelbear API pods from 2 up to 8 replicas.
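The Horizontal Pod Autoscaler's documented core rule is desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the configured bounds. A small sketch of that rule, using the 2–8 replica range mentioned above (the helper name is ours, not Kubernetes code):

```python
import math

def desired_replicas(current: int, current_metric: float, target_metric: float,
                     min_replicas: int = 2, max_replicas: int = 8) -> int:
    """Approximate the HPA scaling rule:
    desired = ceil(current * currentMetric / targetMetric), clamped to bounds."""
    desired = math.ceil(current * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, desired))

# e.g. 2 replicas at 90% CPU against a 60% target scales up to 3
```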

CDN Static Asset Caching

Ingress rules include the cloudflare-proxied: "true" annotation to route traffic through Cloudflare. Application responses set standard HTTP cache headers, e.g.:

<code># Cache this response for 5 minutes
response["Cache-Control"] = "public, max-age=300"</code>

Static files are served directly from the container using Whitenoise, avoiding separate uploads to Nginx/CloudFront/S3.
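Per the WhiteNoise documentation, enabling it in Django amounts to a middleware entry and a storage backend in settings.py; a trimmed sketch (the rest of the middleware stack is elided):

```python
# Excerpt of a Django settings.py wired for WhiteNoise (abbreviated)
MIDDLEWARE = [
    "django.middleware.security.SecurityMiddleware",
    # WhiteNoise should sit directly after SecurityMiddleware
    "whitenoise.middleware.WhiteNoiseMiddleware",
    # ... remaining middleware ...
]

# Serve compressed, cache-busted static files straight from the container
STATICFILES_STORAGE = "whitenoise.storage.CompressedManifestStaticFilesStorage"
```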

Application Data Caching

Results of expensive computations, Django model lookups, and rate‑limit counters are cached, for example for 15 minutes via a decorator:

<code>@cache(ttl=60 * 15)
def has_enough_capacity(site: Site) -> bool:
    """
    Returns True if the Site has enough capacity to accept
    incoming events, or False if it has already gone over its
    plan limits and the grace period is over.
    """</code>
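The article doesn't show the @cache decorator's internals; a minimal sketch of how such a TTL cache could work, using an in-process dict where the real implementation presumably uses Redis:

```python
import functools
import time

def cache(ttl: int):
    """Memoize a function's results for `ttl` seconds.
    In-process sketch; a production version would store entries in Redis."""
    def decorator(fn):
        store = {}  # (function name, args) -> (value, expiry timestamp)

        @functools.wraps(fn)
        def wrapper(*args):
            key = (fn.__name__, args)
            hit = store.get(key)
            if hit is not None:
                value, expires_at = hit
                if time.monotonic() < expires_at:
                    return value  # still fresh, skip recomputation
            value = fn(*args)
            store[key] = (value, time.monotonic() + ttl)
            return value
        return wrapper
    return decorator

@cache(ttl=60 * 15)
def expensive_lookup(x: int) -> int:
    return x * 2  # stand-in for a heavy computation
```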

Per‑Endpoint Rate Limiting

Django Ratelimit, backed by Redis, enforces per‑endpoint limits; requests over the limit receive HTTP 429.
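django-ratelimit handles this declaratively, but the idea behind it is a counter per client per time window. A fixed-window sketch (class and method names are illustrative, not the library's API; the library keeps its counters in a shared cache such as Redis):

```python
import time
from collections import defaultdict
from typing import Optional

class FixedWindowLimiter:
    """Allow at most `limit` requests per `window` seconds per key."""

    def __init__(self, limit: int, window: float = 60.0):
        self.limit = limit
        self.window = window
        self.counters = defaultdict(int)  # (key, window index) -> request count

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        """Return True if this request still fits in the current window."""
        now = time.monotonic() if now is None else now
        bucket = (key, int(now // self.window))
        self.counters[bucket] += 1
        return self.counters[bucket] <= self.limit
```

A view using something like this would return HTTP 429 whenever allow() is False.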

Application Management

Django’s built‑in admin panel assists with customer support. Access is restricted to staff and protected with 2FA. Security emails are sent on new logins.

Scheduled Jobs

Cron‑style jobs run via Celery workers and Celery beat, using Redis as the task queue. Monitoring uses Healthchecks.io, Cronitor, or CronHub, with alerts sent to Slack.

<code>def some_hourly_job():
    # Task logic
    ...
    # Ping monitoring service once task completes
    TaskMonitor(
        name="send_quota_depleted_email",
        expected_schedule=timedelta(hours=1),
        grace_period=timedelta(hours=2),
    ).ping()</code>
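Registering such a job with Celery beat comes down to a schedule entry like the following (the task path and entry name are illustrative; in a real project the dict is assigned to app.conf.beat_schedule on the Celery app):

```python
# Hypothetical Celery beat schedule entry for the hourly job above
beat_schedule = {
    "some-hourly-job": {
        "task": "panelbear.tasks.some_hourly_job",  # dotted task path (assumed)
        "schedule": 60 * 60,  # run every hour (seconds)
    },
}
```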

App Configuration

All settings are driven by environment variables, e.g.:

<code>INVITE_ONLY = env.bool("INVITE_ONLY", default=False)</code>
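The env helper above comes from a settings library (environs, by the look of it); its boolean parsing can be approximated in plain Python (the helper name is ours):

```python
import os

def env_bool(name: str, default: bool = False) -> bool:
    """Parse a boolean setting from an environment variable; common
    truthy spellings ("1", "true", "yes", "on") map to True."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")

INVITE_ONLY = env_bool("INVITE_ONLY", default=False)
```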

ConfigMaps in Kubernetes inject these variables into containers.

<code>apiVersion: v1
kind: ConfigMap
metadata:
  namespace: panelbear
  name: panelbear-webserver-config
data:
  INVITE_ONLY: "True"
  DEFAULT_FROM_EMAIL: "The Panelbear Team <[email protected]>"
  SESSION_COOKIE_SECURE: "True"
  SECURE_HSTS_PRELOAD: "True"
  SECURE_SSL_REDIRECT: "True"</code>

Encryption

Secrets are sealed with kubeseal and decrypted only inside the cluster. Plaintext values like these are encrypted before ever being committed to the repo:

<code>DATABASE_CONN_URL='postgres://user:pass@my-rds-db:5432/db'
SESSION_COOKIE_SECRET='this-is-supposed-to-be-very-secret'</code>

DNS‑Based Service Discovery

Kubernetes automatically creates DNS records for services, enabling containers to reach each other via URLs like redis://redis.weekend-project.svc.cluster.local:6379.

Version‑Controlled Infrastructure

All infrastructure lives in a monorepo with Docker, Terraform, and Kubernetes manifests, enabling one‑command creation or destruction of the entire stack.

<code># Cloud resource Terraform example
resource "aws_s3_bucket" "panelbear_app" {
  bucket = "panelbear-app"
  acl    = "private"
  tags = {
    Name        = "panelbear-app"
    Environment = "production"
  }
  lifecycle_rule {
    id      = "backups"
    enabled = true
    prefix  = "backups/"
    expiration { days = 30 }
  }
  server_side_encryption_configuration {
    rule {
      apply_server_side_encryption_by_default { sse_algorithm = "AES256" }
    }
  }
}</code>

Logging

Logs are streamed to stdout and collected by Kubernetes; tools like stern help tail logs across pods.

Monitoring and Alerting

Initially using self‑hosted Prometheus and Grafana, the author later migrated to New Relic, exporting application metrics via the django‑prometheus library.

<code># django-prometheus exposes app-level metrics via the underlying
# Prometheus client library
from prometheus_client import Counter

my_counter = Counter("my_counter_total", "Description of the counter")
my_counter.inc()  # increment on some application event</code>

Error Tracking

Sentry aggregates exceptions from the Django app, providing context for each error. Alerts are routed to a Slack #alerts channel.
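Wiring Sentry into a Django app is a few lines in settings.py; a sketch of the documented sentry_sdk setup, not the author's exact configuration (the DSN is a placeholder and the sample rate is our choice):

```python
import sentry_sdk
from sentry_sdk.integrations.django import DjangoIntegration

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # placeholder DSN
    integrations=[DjangoIntegration()],
    traces_sample_rate=0.1,   # sample 10% of transactions (illustrative)
    send_default_pii=False,   # avoid sending user PII by default
)
```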

Sentry error aggregation
Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends and regularly publishes widely read original technical articles. We focus on operations transformation, accompanying you through your operations career as we grow together.
