Production Considerations for Deploying Linkerd: HA, Helm Charts, Prometheus, and Multi‑Cluster
This article explains how to prepare Linkerd for production use by covering high‑availability deployment, Helm chart installation, Prometheus metric handling, external Prometheus integration, multi‑cluster communication, and additional operational best practices such as resource tuning and security considerations.
High Availability (HA)
HA mode eliminates a single point of failure in the Linkerd control plane by deploying multiple replicas of core components (Controller, Destination, Identity, Proxy Injector, Service Profile Validator). The linkerd install --ha flag also configures resource requests and pod anti‑affinity to spread replicas across nodes. Non‑core components such as Grafana and Prometheus are not replicated; if they fail, however, the data plane continues to function.
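As a sketch, an HA control plane can be installed and verified with the CLI (assumes a working cluster context and a matching CLI version):

```shell
# Install the control plane with HA defaults: multiple replicas per
# core component, resource requests, and pod anti-affinity rules.
linkerd install --ha | kubectl apply -f -

# Verify that each control-plane deployment runs multiple replicas.
kubectl -n linkerd get deploy

# Run the built-in health checks.
linkerd check
```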
In HA mode the proxy injector's admission webhook is configured to fail closed, so pod creation can fail while the injector is unavailable. Adding the label config.linkerd.io/admission-webhooks: disabled to the kube-system namespace exempts it from the webhook, allowing critical pods to be created even when the injector is down.
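For example, the exemption label can be applied with kubectl:

```shell
# Exempt kube-system from the proxy injector's admission webhook so
# cluster-critical pods can start even if the injector is unavailable.
kubectl label namespace kube-system config.linkerd.io/admission-webhooks=disabled
```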
Helm Chart Installation
For production, Helm is preferred over the Linkerd CLI. The Helm chart provides a values‑ha.yaml template that can be used to enable HA. Helm does not generate certificates automatically, so you must supply your own trust anchor (root) and issuer certificates, with a sufficiently long validity period (e.g., 10 years).
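A minimal sketch of generating the certificates with the step CLI and installing via Helm (chart name and value keys follow the Linkerd 2.x chart and may differ in your version):

```shell
# Trust anchor (root) certificate, valid 10 years (87600h).
step certificate create root.linkerd.cluster.local ca.crt ca.key \
  --profile root-ca --not-after 87600h --no-password --insecure

# Issuer certificate signed by the trust anchor.
step certificate create identity.linkerd.cluster.local issuer.crt issuer.key \
  --profile intermediate-ca --not-after 87600h \
  --ca ca.crt --ca-key ca.key --no-password --insecure

# Install the control plane in HA mode with the supplied certificates.
helm install linkerd2 linkerd/linkerd2 \
  --set-file identityTrustAnchorsPEM=ca.crt \
  --set-file identity.issuer.tls.crtPEM=issuer.crt \
  --set-file identity.issuer.tls.keyPEM=issuer.key \
  -f linkerd2/values-ha.yaml
```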
Prometheus Metrics
Linkerd ships with an embedded Prometheus instance that stores only the last six hours of data. In production you should export metrics to a long‑term storage solution such as Cortex, Thanos, or VictoriaMetrics (the latter is recommended). If an external Prometheus already exists in your cluster, you can instead configure it to scrape Linkerd's metrics directly.
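For example, Prometheus can forward its samples to a long‑term store via remote_write (the VictoriaMetrics service name and namespace below are assumptions; adjust them to your deployment):

```yaml
# prometheus.yml (fragment): ship samples to long-term storage.
remote_write:
  - url: http://victoria-metrics.monitoring.svc.cluster.local:8428/api/v1/write
```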
Configure External Prometheus
Add the following scrape configuration to the external Prometheus server:
- job_name: "grafana"
  kubernetes_sd_configs:
    - role: pod
      namespaces:
        names: ["linkerd-viz"]
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_container_name]
      action: keep
      regex: ^grafana$

- job_name: "linkerd-controller"
  kubernetes_sd_configs:
    - role: pod
      namespaces:
        names: ["linkerd", "linkerd-viz"]
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_container_port_name]
      action: keep
      regex: admin-http
    - source_labels: [__meta_kubernetes_pod_container_name]
      action: replace
      target_label: component

- job_name: "linkerd-service-mirror"
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels:
        - __meta_kubernetes_pod_label_linkerd_io_control_plane_component
        - __meta_kubernetes_pod_container_port_name
      action: keep
      regex: linkerd-service-mirror;admin-http$
    - source_labels: [__meta_kubernetes_pod_container_name]
      action: replace
      target_label: component

- job_name: "linkerd-proxy"
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels:
        - __meta_kubernetes_pod_container_name
        - __meta_kubernetes_pod_container_port_name
        - __meta_kubernetes_pod_label_linkerd_io_control_plane_ns
      action: keep
      regex: ^linkerd-proxy;linkerd-admin;linkerd$
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: namespace
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: pod
    - source_labels: [__meta_kubernetes_pod_label_linkerd_io_proxy_job]
      action: replace
      target_label: k8s_job
    - action: labeldrop
      regex: __meta_kubernetes_pod_label_linkerd_io_proxy_job
    - action: labelmap
      regex: __meta_kubernetes_pod_label_linkerd_io_proxy_(.+)
    - action: labeldrop
      regex: __meta_kubernetes_pod_label_linkerd_io_proxy_(.+)
    - action: labelmap
      regex: __meta_kubernetes_pod_label_linkerd_io_(.+)
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
      replacement: __tmp_pod_label_$1
    - action: labeldrop
      regex: __tmp_pod_label_linkerd_io_(.+)

For reference, the equivalent configuration used by the built‑in instance can be inspected with kubectl get cm -n linkerd-viz prometheus-config -o yaml. After updating the configuration, confirm that the external Prometheus is scraping the Linkerd targets.
If you want to disable the built‑in Prometheus when using an external one, set the following in the Helm values:

prometheus:
  enabled: false

Example command to install Linkerd Viz with an external Prometheus URL while disabling the internal instance:
$ linkerd viz install --set prometheusUrl=http://prometheus.kube-mon.svc.cluster.local:9090,prometheus.enabled=false | kubectl apply -f -

Multi‑Cluster Communication
Linkerd’s multi‑cluster feature allows services in different Kubernetes clusters to communicate transparently, preserving mTLS, metrics, and reliability. Install the components with linkerd multicluster install , which creates the linkerd-multicluster namespace containing the linkerd-gateway ; a service-mirror controller is added once clusters are linked. All participating clusters must run the Linkerd control plane.
All clusters must share the same trust root for mTLS; therefore the same root certificate must be used during installation.
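Once both clusters run a control plane installed from the same trust root, they can be linked and a service exported. A sketch, where the cluster names ("west", "east"), kubectl contexts, and the frontend service are assumptions:

```shell
# From the "west" cluster, generate a Link resource and credentials,
# then apply them on the "east" cluster.
linkerd multicluster link --cluster-name west | kubectl --context=east apply -f -

# Export a service from west so it is mirrored into east.
kubectl --context=west label svc/frontend mirror.linkerd.io/exported=true

# Verify gateway connectivity as seen from east.
linkerd --context=east multicluster gateways
```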
Other Operational Tips
Resource configuration: HA mode sets default CPU/memory requests for control‑plane components; you may need to adjust them and use the config.linkerd.io/proxy-cpu-limit annotation for high‑traffic proxies.
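As an illustration, proxy resources can be tuned per workload through annotations on the pod template (the values below are examples, not recommendations):

```yaml
# Deployment pod template (fragment): bound the sidecar proxy's CPU.
spec:
  template:
    metadata:
      annotations:
        config.linkerd.io/proxy-cpu-request: "250m"
        config.linkerd.io/proxy-cpu-limit: "1"
```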
Clock skew: Ensure node clocks are synchronized (e.g., via NTP) because large drift can break mTLS verification.
NET_ADMIN capability: The proxy‑init container requires NET_ADMIN to set iptables rules; if you prefer not to grant this, use the Linkerd CNI plugin.
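A sketch of the CNI alternative (flag names follow the Linkerd 2.x CLI; verify against your version):

```shell
# Install the Linkerd CNI plugin first, then a control plane that
# relies on it instead of the privileged proxy-init container.
linkerd install-cni | kubectl apply -f -
linkerd install --linkerd-cni-enabled | kubectl apply -f -
```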
linkerd viz tap permissions: Use Kubernetes RBAC to restrict access to the potentially sensitive tap command output.
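For example, tap access can be granted to a specific user by binding the viz extension's tap‑admin ClusterRole (the role name below follows the Linkerd viz defaults but may vary by version; the user name is a placeholder):

```yaml
# Grant one user access to `linkerd viz tap` output via RBAC.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: alice-tap-admin
subjects:
  - kind: User
    name: alice@example.com
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: linkerd-linkerd-viz-tap-admin
  apiGroup: rbac.authorization.k8s.io
```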
Ingress integration: Linkerd does not provide its own Ingress controller; combine it with an existing Ingress controller and inject Linkerd sidecars to obtain mTLS and observability from the edge.
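With the NGINX Ingress Controller, for instance, the controller pods can be injected like any other workload, and the annotation below routes traffic to the service address so the Linkerd proxy performs load balancing (a sketch; other controllers need different settings):

```yaml
# Ingress resource (fragment) for an injected NGINX Ingress Controller.
metadata:
  annotations:
    nginx.ingress.kubernetes.io/service-upstream: "true"
```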
Understanding these considerations and reviewing the official Linkerd documentation are essential before deploying Linkerd in a production environment.
DevOps Cloud Academy
Exploring industry DevOps practices and technical expertise.