Production Considerations for Deploying Linkerd: HA, Helm Charts, Prometheus, and Multi‑Cluster
This article explains how to prepare Linkerd for production use by covering high‑availability deployment, Helm chart installation, Prometheus metric handling, external Prometheus integration, multi‑cluster communication, and additional operational best practices such as resource tuning and security considerations.
High Availability (HA)
HA mode eliminates a single point of failure in the Linkerd control plane by deploying multiple replicas of core components (Controller, Destination, Identity, Proxy Injector, Service Profile Validator). The linkerd install --ha flag also configures resource requests and pod anti‑affinity to spread replicas across nodes. Non‑core components such as Grafana and Prometheus are not replicated; if they fail, however, the data plane continues to function.
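As a sketch, an HA control plane can be installed and verified with the CLI (assumes a working cluster context and a matching CLI version):

```shell
# Install the control plane with HA defaults: multiple replicas per
# core component, resource requests, and pod anti-affinity rules.
linkerd install --ha | kubectl apply -f -

# Verify that each control-plane deployment runs multiple replicas.
kubectl -n linkerd get deploy

# Run the built-in health checks.
linkerd check
```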
In HA mode the proxy injector's admission webhook is configured to fail closed, so pod creation can fail while the injector is unavailable. Adding the label config.linkerd.io/admission-webhooks: disabled to the kube-system namespace exempts it from the webhook, allowing critical pods to be created even when the injector is down.
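For example, the exemption label can be applied with kubectl:

```shell
# Exempt kube-system from the proxy injector's admission webhook so
# cluster-critical pods can start even if the injector is unavailable.
kubectl label namespace kube-system config.linkerd.io/admission-webhooks=disabled
```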
Helm Chart Installation
For production, Helm is preferred over the Linkerd CLI. The Helm chart provides a values‑ha.yaml template that can be used to enable HA. Helm does not generate certificates automatically, so you must supply your own trust anchor (root) and issuer certificates, with a sufficiently long validity period (e.g., 10 years).
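A minimal sketch of generating the certificates with the step CLI and installing via Helm (chart name and value keys follow the Linkerd 2.x chart and may differ in your version):

```shell
# Trust anchor (root) certificate, valid 10 years (87600h).
step certificate create root.linkerd.cluster.local ca.crt ca.key \
  --profile root-ca --not-after 87600h --no-password --insecure

# Issuer certificate signed by the trust anchor.
step certificate create identity.linkerd.cluster.local issuer.crt issuer.key \
  --profile intermediate-ca --not-after 87600h \
  --ca ca.crt --ca-key ca.key --no-password --insecure

# Install the control plane in HA mode with the supplied certificates.
helm install linkerd2 linkerd/linkerd2 \
  --set-file identityTrustAnchorsPEM=ca.crt \
  --set-file identity.issuer.tls.crtPEM=issuer.crt \
  --set-file identity.issuer.tls.keyPEM=issuer.key \
  -f linkerd2/values-ha.yaml
```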
Prometheus Metrics
Linkerd ships with an embedded Prometheus instance that stores only the last six hours of data. In production you should export metrics to a long‑term storage solution such as Cortex, Thanos, or VictoriaMetrics (the latter is recommended). If an external Prometheus already exists in your cluster, you can instead configure it to scrape Linkerd's metrics directly.
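For example, Prometheus can forward its samples to a long‑term store via remote_write (the VictoriaMetrics service name and namespace below are assumptions; adjust them to your deployment):

```yaml
# prometheus.yml (fragment): ship samples to long-term storage.
remote_write:
  - url: http://victoria-metrics.monitoring.svc.cluster.local:8428/api/v1/write
```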
Configure External Prometheus
Add the following scrape configuration to the external Prometheus server:
- job_name: "grafana"
  kubernetes_sd_configs:
    - role: pod
      namespaces:
        names: ["linkerd-viz"]
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_container_name]
      action: keep
      regex: ^grafana$

- job_name: "linkerd-controller"
  kubernetes_sd_configs:
    - role: pod
      namespaces:
        names: ["linkerd", "linkerd-viz"]
  relabel_configs:
    - source_labels: [__meta_kubernetes_pod_container_port_name]
      action: keep
      regex: admin-http
    - source_labels: [__meta_kubernetes_pod_container_name]
      action: replace
      target_label: component

- job_name: "linkerd-service-mirror"
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels:
        - __meta_kubernetes_pod_label_linkerd_io_control_plane_component
        - __meta_kubernetes_pod_container_port_name
      action: keep
      regex: linkerd-service-mirror;admin-http$
    - source_labels: [__meta_kubernetes_pod_container_name]
      action: replace
      target_label: component

- job_name: "linkerd-proxy"
  kubernetes_sd_configs:
    - role: pod
  relabel_configs:
    - source_labels:
        - __meta_kubernetes_pod_container_name
        - __meta_kubernetes_pod_container_port_name
        - __meta_kubernetes_pod_label_linkerd_io_control_plane_ns
      action: keep
      regex: ^linkerd-proxy;linkerd-admin;linkerd$
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: namespace
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: pod
    - source_labels: [__meta_kubernetes_pod_label_linkerd_io_proxy_job]
      action: replace
      target_label: k8s_job
    - action: labeldrop
      regex: __meta_kubernetes_pod_label_linkerd_io_proxy_job
    - action: labelmap
      regex: __meta_kubernetes_pod_label_linkerd_io_proxy_(.+)
    - action: labeldrop
      regex: __meta_kubernetes_pod_label_linkerd_io_proxy_(.+)
    - action: labelmap
      regex: __meta_kubernetes_pod_label_linkerd_io_(.+)
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
      replacement: __tmp_pod_label_$1
    - action: labeldrop
      regex: __tmp_pod_label_linkerd_io_(.+)

For reference, the equivalent configuration used by the built‑in instance can be inspected with kubectl get cm -n linkerd-viz prometheus-config -o yaml. After updating the configuration, confirm that the external Prometheus is scraping the Linkerd targets.
If you want to disable the built‑in Prometheus when using an external one, set the following in the Helm values:

prometheus:
  enabled: false

Example command to install Linkerd Viz with an external Prometheus URL while disabling the internal instance:
$ linkerd viz install --set prometheusUrl=http://prometheus.kube-mon.svc.cluster.local:9090,prometheus.enabled=false | kubectl apply -f -

Multi‑Cluster Communication
Linkerd’s multi‑cluster feature allows services in different Kubernetes clusters to communicate transparently, preserving mTLS, metrics, and reliability. Install the components with linkerd multicluster install , which creates the linkerd-multicluster namespace containing the linkerd-gateway ; a service-mirror controller is added once clusters are linked. All participating clusters must run the Linkerd control plane.
All clusters must share the same trust root for mTLS; therefore the same root certificate must be used during installation.
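Once both clusters run a control plane installed from the same trust root, they can be linked and a service exported. A sketch, where the cluster names ("west", "east"), kubectl contexts, and the frontend service are assumptions:

```shell
# From the "west" cluster, generate a Link resource and credentials,
# then apply them on the "east" cluster.
linkerd multicluster link --cluster-name west | kubectl --context=east apply -f -

# Export a service from west so it is mirrored into east.
kubectl --context=west label svc/frontend mirror.linkerd.io/exported=true

# Verify gateway connectivity as seen from east.
linkerd --context=east multicluster gateways
```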
Other Operational Tips
Resource configuration: HA mode sets default CPU/memory requests for control‑plane components; you may need to adjust them and use the config.linkerd.io/proxy-cpu-limit annotation for high‑traffic proxies.
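As an illustration, proxy resources can be tuned per workload through annotations on the pod template (the values below are examples, not recommendations):

```yaml
# Deployment pod template (fragment): bound the sidecar proxy's CPU.
spec:
  template:
    metadata:
      annotations:
        config.linkerd.io/proxy-cpu-request: "250m"
        config.linkerd.io/proxy-cpu-limit: "1"
```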
Clock skew: Ensure node clocks are synchronized (e.g., via NTP) because large drift can break mTLS verification.
NET_ADMIN capability: The proxy‑init container requires NET_ADMIN to set iptables rules; if you prefer not to grant this, use the Linkerd CNI plugin.
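A sketch of the CNI alternative (flag names follow the Linkerd 2.x CLI; verify against your version):

```shell
# Install the Linkerd CNI plugin first, then a control plane that
# relies on it instead of the privileged proxy-init container.
linkerd install-cni | kubectl apply -f -
linkerd install --linkerd-cni-enabled | kubectl apply -f -
```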
linkerd viz tap permissions: Use Kubernetes RBAC to restrict access to the potentially sensitive tap command output.
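For example, tap access can be granted to a specific user by binding the viz extension's tap‑admin ClusterRole (the role name below follows the Linkerd viz defaults but may vary by version; the user name is a placeholder):

```yaml
# Grant one user access to `linkerd viz tap` output via RBAC.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: alice-tap-admin
subjects:
  - kind: User
    name: alice@example.com
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: linkerd-linkerd-viz-tap-admin
  apiGroup: rbac.authorization.k8s.io
```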
Ingress integration: Linkerd does not provide its own Ingress controller; combine it with an existing Ingress controller and inject Linkerd sidecars to obtain mTLS and observability from the edge.
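With the NGINX Ingress Controller, for instance, the controller pods can be injected like any other workload, and the annotation below routes traffic to the service address so the Linkerd proxy performs load balancing (a sketch; other controllers need different settings):

```yaml
# Ingress resource (fragment) for an injected NGINX Ingress Controller.
metadata:
  annotations:
    nginx.ingress.kubernetes.io/service-upstream: "true"
```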
Understanding these considerations and reviewing the official Linkerd documentation are essential before deploying Linkerd in a production environment.
DevOps Cloud Academy
Exploring industry DevOps practices and technical expertise.