Envoy Hot Restart and Dynamic Configuration Overview
This article explains Envoy's hot-restart capability, its range of configuration options (fully static, SDS/EDS, CDS, RDS, and LDS), and the initialization and drain processes that enable seamless updates and graceful connection handling in cloud-native deployments.
Hot Restart
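Ease of operation is a primary goal of Envoy. In addition to robust statistics and a local administration interface, Envoy can "hot" or "live" restart itself: it can fully reload both its code (the binary) and its configuration without dropping any connections. The general architecture is as follows. Statistics and some locks are kept in a shared memory region, so counters stay consistent across both processes while a restart is in progress. The two processes communicate over a Unix domain socket. The new process fully initializes itself before it takes over the listening sockets from the old process, and it then tells the old process to start draining. During the drain phase the old process gracefully closes existing connections, and afterwards the new process tells the old one to shut down. The drain period and the delay before the old process is shut down are set with command-line options (--drain-time-s and --parent-shutdown-time-s). The source distribution includes an example parent process written in Python.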
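The role of the parent process can be sketched in a few lines of Python. This is a minimal illustration, not the restarter shipped with Envoy: the HotRestarter class, its method names, and the config path are hypothetical, and the timing values are only examples. The flags themselves (-c, --restart-epoch, --drain-time-s, --parent-shutdown-time-s) are Envoy command-line options; the sketch assumes each new generation is started with the next restart epoch.

```python
class HotRestarter:
    """Hypothetical helper: builds the argv for each successive Envoy generation."""

    def __init__(self, config_path, drain_time_s=600, parent_shutdown_time_s=900):
        # Sanity check for this sketch: the old process should finish
        # draining before the new process asks it to shut down.
        if drain_time_s >= parent_shutdown_time_s:
            raise ValueError("drain time should be shorter than parent shutdown time")
        self.config_path = config_path
        self.drain_time_s = drain_time_s
        self.parent_shutdown_time_s = parent_shutdown_time_s
        self.next_epoch = 0  # the first process runs with epoch 0

    def next_command(self):
        """Return the argv for the next generation and advance the epoch."""
        argv = [
            "envoy",
            "-c", self.config_path,
            "--restart-epoch", str(self.next_epoch),
            "--drain-time-s", str(self.drain_time_s),
            "--parent-shutdown-time-s", str(self.parent_shutdown_time_s),
        ]
        self.next_epoch += 1
        return argv


if __name__ == "__main__":
    restarter = HotRestarter("/etc/envoy/envoy.yaml")  # assumed path
    print(restarter.next_command())  # initial start
    print(restarter.next_command())  # first hot restart
```

A real parent process would also spawn each command (for example with subprocess.Popen), forward signals, and reap the old process once it exits; those parts are omitted here.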
Dynamic Configuration
Envoy's architecture supports several approaches to configuration management, ranging from fully static setups to progressively more dynamic ones that rely on external REST-based configuration provider APIs. This article outlines the options currently available.
Related material in the Envoy documentation: the top-level configuration reference, the reference configurations, and the Envoy v2 API overview.
Fully Static
In a fully static configuration, the implementer supplies a set of listeners (and filter chains), clusters, and optional HTTP routing configuration. Dynamic host discovery is limited to DNS‑based service discovery, and configuration reloads must use the built‑in hot‑restart mechanism. Although simple, static configuration combined with graceful hot restarts can support fairly complex deployments.
SDS / EDS Only
The Service Discovery Service (SDS) API, renamed Endpoint Discovery Service (EDS) in the v2 API, allows Envoy to discover upstream cluster members. Building on static configuration, SDS lets deployments avoid DNS limitations and consume richer load‑balancing and routing information such as canary status and region.
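To illustrate why richer endpoint data matters, the sketch below contrasts what DNS can supply (addresses only) with endpoints that also carry attributes such as canary status and zone. The Endpoint class, its field names, and the select helper are hypothetical and do not mirror the actual SDS/EDS schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical endpoint record; the fields are illustrative only."""
    address: str
    port: int
    zone: str = ""        # unknown when the host came from DNS
    canary: bool = False  # unknown when the host came from DNS


def from_dns(addresses, port):
    """DNS yields bare addresses, so every extra attribute keeps its default."""
    return [Endpoint(address, port) for address in addresses]


def select(endpoints, zone=None, include_canary=True):
    """Filter endpoints by zone and canary status."""
    chosen = []
    for ep in endpoints:
        if zone is not None and ep.zone != zone:
            continue
        if ep.canary and not include_canary:
            continue
        chosen.append(ep)
    return chosen


if __name__ == "__main__":
    discovered = [
        Endpoint("10.0.0.1", 8080, zone="us-east-1a"),
        Endpoint("10.0.0.2", 8080, zone="us-east-1b"),
        Endpoint("10.0.0.3", 8080, zone="us-east-1a", canary=True),
    ]
    # Same-zone, non-canary hosts: only possible with the richer data.
    print(select(discovered, zone="us-east-1a", include_canary=False))
    # DNS-derived hosts carry no zone, so the same query matches nothing.
    print(select(from_dns(["10.0.0.1", "10.0.0.2"], 8080), zone="us-east-1a"))
```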
SDS / EDS and CDS
The Cluster Discovery Service (CDS) API lets Envoy discover the upstream clusters used during routing. Envoy gracefully adds, updates, and removes clusters as specified by the API, which enables topologies where the initial configuration does not need to know about every upstream cluster. When CDS is used with HTTP routing (but without the Route Discovery Service), the router can typically forward requests to a cluster named in an HTTP request header.
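Although CDS can be used without SDS/EDS by specifying fully static clusters, we still recommend using SDS/EDS for clusters delivered via CDS. A cluster definition update is applied gracefully, but all of that cluster's existing connection pools are drained and reconnected. SDS/EDS does not have this limitation: when hosts are added or removed via SDS/EDS, the existing hosts in the cluster are unaffected.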
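The difference between the two kinds of update can be modeled with a toy cluster manager. The model assumes that changing a cluster definition drains and rebuilds that cluster's connection pools, while changing only its host set through SDS/EDS leaves the pools alone. None of this is Envoy's implementation: the class names, method names, and the pool_generation counter are hypothetical.

```python
class Cluster:
    """Toy cluster: settings come from CDS, hosts from SDS/EDS."""

    def __init__(self, name, settings):
        self.name = name
        self.settings = dict(settings)
        self.hosts = set()
        # Incremented each time connection pools are drained and rebuilt.
        self.pool_generation = 0


class ClusterManager:
    def __init__(self):
        self.clusters = {}

    def apply_cds(self, desired):
        """desired maps cluster name -> settings dict (the full wanted set)."""
        for name in list(self.clusters):
            if name not in desired:
                del self.clusters[name]  # cluster removed by the API
        for name, settings in desired.items():
            current = self.clusters.get(name)
            if current is None:
                self.clusters[name] = Cluster(name, settings)  # new cluster
            elif current.settings != settings:
                current.settings = dict(settings)
                current.pool_generation += 1  # pools drained and reconnected

    def apply_eds(self, name, hosts):
        """Host membership changes do not touch existing connection pools."""
        self.clusters[name].hosts = set(hosts)


if __name__ == "__main__":
    cm = ClusterManager()
    cm.apply_cds({"backend": {"connect_timeout_ms": 250}})
    cm.apply_eds("backend", {"10.0.0.1:8080", "10.0.0.2:8080"})
    print(cm.clusters["backend"].pool_generation)  # unchanged by the host update
    cm.apply_cds({"backend": {"connect_timeout_ms": 500}})
    print(cm.clusters["backend"].pool_generation)  # bumped by the definition change
```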
SDS / EDS, CDS and RDS
The Route Discovery Service (RDS) API allows Envoy to discover the entire route configuration of an HTTP connection manager filter at runtime. When combined with SDS/EDS and CDS, it enables complex routing topologies (traffic shifting, blue/green deployments) without restarting Envoy.
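Traffic shifting amounts to changing the weights attached to a route's target clusters and delivering the new route configuration at runtime. The sketch below shows one way a weighted choice could work. It is an illustration under stated assumptions, not Envoy's router: pick_cluster, the cluster names, and the weights are hypothetical, and the random roll is passed in explicitly so the function stays deterministic.

```python
import bisect
import itertools


def pick_cluster(weighted_clusters, roll):
    """Pick a cluster name from [(name, weight), ...] for 0 <= roll < total weight."""
    names = [name for name, _ in weighted_clusters]
    cumulative = list(itertools.accumulate(weight for _, weight in weighted_clusters))
    if not cumulative or not 0 <= roll < cumulative[-1]:
        raise ValueError("roll must fall within [0, total weight)")
    # bisect_right finds the first cumulative bound strictly greater than roll.
    return names[bisect.bisect_right(cumulative, roll)]


if __name__ == "__main__":
    import random

    # A route update could move this from 90/10 to 50/50 and finally 0/100.
    route = [("service_blue", 90), ("service_green", 10)]
    total = sum(weight for _, weight in route)
    print(pick_cluster(route, random.randrange(total)))
```

With the weights above, rolls 0 through 89 select service_blue and rolls 90 through 99 select service_green, so roughly one request in ten reaches the new version.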
SDS / EDS, CDS, RDS and LDS
The Listener Discovery Service (LDS) adds a layer where Envoy can discover entire listeners at runtime, including all filter stacks and embedded RDS references. Adding LDS makes almost every aspect of Envoy dynamically configurable; only rare changes (admin, tracing drivers, binary updates) still require a hot restart.
Initialization
Envoy initializes in several phases at startup. The cluster manager first initializes static and DNS clusters, then predefined SDS clusters, and then CDS-provided clusters if CDS is configured. If clusters use active health checking, Envoy also performs a single active health-check round. Once the cluster manager has finished, RDS and LDS initialize (if applicable). The server does not accept connections until at least one response (or failure) has been received for the LDS and RDS requests; waiting for a listener's configuration in this way is known as listener warming. Only after all of these steps do the listeners begin accepting new connections, which ensures that during a hot restart the new process is fully able to handle traffic before the old one starts draining.
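The ordering described above can be written down as a small function. It is only a sketch of the sequence, with hypothetical parameter names and phase labels; it does not model the waiting or failure handling inside each phase.

```python
def startup_phases(use_cds=False, active_health_check=False,
                   use_rds=False, use_lds=False):
    """Return the startup phases in order for a given feature set."""
    phases = ["static/DNS clusters", "predefined SDS clusters"]
    if use_cds:
        phases.append("CDS clusters")
    if active_health_check:
        phases.append("active health check round")
    # Cluster manager initialization ends here; RDS and LDS come next.
    if use_rds:
        phases.append("RDS initial response")
    if use_lds:
        phases.append("LDS initial response")
    # Listeners accept traffic only after everything above has completed.
    phases.append("listeners accept connections")
    return phases


if __name__ == "__main__":
    for step, phase in enumerate(startup_phases(use_cds=True, use_lds=True), start=1):
        print(step, phase)
```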
Drain
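Draining is the process by which Envoy gracefully closes connections in response to certain events: the server being manually marked as failing health checks, a hot restart, or a listener being modified or removed via LDS. Each listener has a drain_type setting. The default mode drains in response to all three events, while modify_only drains only when the listener is modified or removed. The latter is useful for setups with separate inbound and outbound listeners, where outbound listeners can be left to drain only on modification while inbound listener draining handles a controlled shutdown.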
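The interaction between drain events and the per-listener drain_type can be expressed as a small decision function. The sketch assumes that modify_only reacts only to listener modification or removal, as the Envoy listener documentation describes; the enum and function names are hypothetical.

```python
from enum import Enum


class DrainEvent(Enum):
    HEALTH_CHECK_FAIL = "manual health check failure"
    HOT_RESTART = "hot restart"
    LISTENER_MODIFIED = "listener modified or removed via LDS"


class DrainType(Enum):
    DEFAULT = "default"
    MODIFY_ONLY = "modify_only"


def should_drain(drain_type, event):
    """Decide whether a listener with this drain_type drains on this event."""
    if drain_type is DrainType.DEFAULT:
        return True  # the default mode reacts to all three events
    # Assumption: modify_only reacts to listener modification/removal only.
    return event is DrainEvent.LISTENER_MODIFIED


if __name__ == "__main__":
    for drain_type in DrainType:
        for event in DrainEvent:
            print(drain_type.value, "|", event.value, "|", should_drain(drain_type, event))
```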
Draining must be supported at the network filter level; currently only the HTTP connection manager, Redis, and Mongo filters support graceful draining.
Scripts
Envoy has experimental support for Lua scripting through a dedicated HTTP filter.