Graceful Shutdown in Kubernetes with Spring Boot and Nacos: Concepts, Cases, and Optimizations

This article explains the concept of graceful shutdown, demonstrates it with Kubernetes‑SpringBoot‑Nacos case studies, analyzes common issues, and provides optimization strategies such as adjusting terminationGracePeriodSeconds, using PreStop hooks, handling MQ and scheduled tasks, and leveraging actuator shutdown for reliable service termination.

Architect's Guide

Graceful shutdown refers to a controlled termination process that ensures data safety, prevents errors, and minimizes disruption to users. The article outlines typical steps: backing up data, stopping new requests, processing in‑flight requests, notifying dependent components, and finally shutting down once all elements have exited safely.
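The "stop new requests, then finish in-flight requests" steps can be sketched with plain JDK classes. The `RequestGate` class below is a hypothetical helper written for illustration; it is not part of Spring Boot, Nacos, or Kubernetes, and a real service would usually rely on the web server's own graceful shutdown support instead.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: counts in-flight requests and refuses new ones once draining starts. */
public class RequestGate {
    private final AtomicBoolean draining = new AtomicBoolean(false);
    private final AtomicInteger inFlight = new AtomicInteger(0);

    /** Returns false once shutdown has begun; otherwise counts the request as in flight. */
    public boolean tryEnter() {
        // Increment first so drain() can never miss a request that slipped past the flag check.
        inFlight.incrementAndGet();
        if (draining.get()) {
            inFlight.decrementAndGet();
            return false;
        }
        return true;
    }

    /** Must be called exactly once for every successful tryEnter(). */
    public void exit() {
        inFlight.decrementAndGet();
    }

    /** Stops admitting requests, then waits up to the timeout for in-flight ones to finish. */
    public boolean drain(long timeout, TimeUnit unit) {
        draining.set(true);
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (inFlight.get() > 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out with requests still running
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public int inFlight() {
        return inFlight.get();
    }
}
```

A request handler would call `tryEnter()` on entry and `exit()` in a `finally` block, while the shutdown path calls `drain(...)` before releasing other resources.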

To illustrate these ideas, a detailed case study is presented that combines Kubernetes, Spring Boot, and Nacos. The workflow includes the Kubernetes pod deletion sequence, network rule updates, container cleanup, and the addition of a PreStop hook that performs Nacos deregistration and a 35‑second sleep. The terminationGracePeriodSeconds is set to 35 seconds to allow the hook to complete.

The article identifies several problems: Spring Boot may be given only 2 seconds to terminate, which is not enough to complete asynchronous tasks, MQ consumption, or scheduled jobs; the wide gap between Nacos deregistration and the Ribbon cache refresh can leave requests still being routed to the stopping instance; and the default termination grace period may be too short.

Optimization suggestions include:

Reducing the post‑deregistration sleep time by enabling UDP discovery or listening for Nacos change events to refresh Ribbon caches promptly.

Setting terminationGracePeriodSeconds to a value slightly larger than the sum of PreStop hook duration and Spring Boot shutdown time (e.g., 10 seconds + 30 seconds).

Enabling Spring Boot’s graceful shutdown and implementing custom shutdown logic to handle MQ messages, scheduled tasks, and thread‑pool jobs.
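The sizing rule for terminationGracePeriodSeconds is simple arithmetic, sketched below. It relies on the fact that Kubernetes counts the PreStop hook's run time against the grace period, so the hook and the application shutdown share one budget. The class name, method names, and the 5-second margin are assumptions chosen for illustration.

```java
import java.time.Duration;

/** Hypothetical helper for sizing terminationGracePeriodSeconds. */
public final class GracePeriodBudget {
    private GracePeriodBudget() {}

    /** The grace period has to cover the PreStop hook, the application shutdown, and a safety margin. */
    public static long terminationGracePeriodSeconds(Duration preStop, Duration appShutdown, Duration margin) {
        return preStop.plus(appShutdown).plus(margin).getSeconds();
    }

    /** True when the configured grace period expires before the hook and the shutdown can both finish. */
    public static boolean wouldBeKilledEarly(long configuredSeconds, Duration preStop, Duration appShutdown) {
        return configuredSeconds < preStop.plus(appShutdown).getSeconds();
    }
}
```

With a 10-second hook, a 30-second Spring Boot shutdown, and a 5-second margin, `terminationGracePeriodSeconds` returns 45. In the case study's configuration (a 35-second hook under a 35-second grace period), `wouldBeKilledEarly` returns true even for a 2-second application shutdown.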

Example listener for Nacos instance change events:

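import com.alibaba.nacos.client.naming.event.InstancesChangeEvent;
import com.alibaba.nacos.common.notify.Event;
import com.alibaba.nacos.common.notify.NotifyCenter;
import com.alibaba.nacos.common.notify.listener.Subscriber;
import com.netflix.loadbalancer.ILoadBalancer;
import com.netflix.loadbalancer.ZoneAwareLoadBalancer;
import lombok.extern.slf4j.Slf4j;
import org.springframework.cloud.netflix.ribbon.SpringClientFactory;
import org.springframework.stereotype.Component;

import javax.annotation.PostConstruct;
import javax.annotation.Resource;

/**
 * Subscribes to Nacos instance-change notifications and
 * manually refreshes the Ribbon server-list cache.
 * Nacos client 1.4.6 (note: 1.4.1 has a serious defect).
 */
@Component
@Slf4j
public class NacosInstancesChangeEventListener extends Subscriber<InstancesChangeEvent> {

    @Resource
    private SpringClientFactory springClientFactory;

    @PostConstruct
    public void registerToNotifyCenter() {
        NotifyCenter.registerSubscriber(this);
    }

    @Override
    public void onEvent(InstancesChangeEvent event) {
        String service = event.getServiceName();
        // service: DEFAULT_GROUP@@demo         ribbonService: demo
        int separator = service.indexOf("@@");
        // Fall back to the full name if the group prefix is absent.
        String ribbonService = separator >= 0 ? service.substring(separator + 2) : service;
        log.info("#### Received Nacos instance change event: {} ribbonServiceName: {}", event.getServiceName(), ribbonService);
        ILoadBalancer loadBalancer = springClientFactory.getLoadBalancer(ribbonService);
        // instanceof also covers the null case and avoids a ClassCastException for other balancer types.
        if (loadBalancer instanceof ZoneAwareLoadBalancer) {
            ((ZoneAwareLoadBalancer<?>) loadBalancer).updateListOfServers();
            log.info("Refreshed Ribbon server list cache for service: {}", ribbonService);
        }
    }

    @Override
    public Class<? extends Event> subscribeType() {
        return InstancesChangeEvent.class;
    }

    /**
     * Nacos 1.4.4 to 1.4.6 require this method to be implemented; versions after 2.1.2 fixed the issue.
     * With multiple registries, change events are not isolated, so this method decides
     * whether a given event should be handled.
     * See ISSUE #8428 - Nacos InstancesChange Event Scope
     */
    @Override
    public boolean scopeMatches(InstancesChangeEvent event) {
        return true;
    }
}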

When the application is stopped through Spring Boot's actuator shutdown endpoint or a SIGTERM (kill -15), tasks still running in a thread pool may be cut off abruptly unless the executor is configured to wait for them to finish. The following snippet shows the required configuration:

// Without these settings, threads may be terminated abruptly on kill -15
threadPoolTaskExecutor.setWaitForTasksToCompleteOnShutdown(true);
threadPoolTaskExecutor.setAwaitTerminationSeconds(30);
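The effect of these two settings can be illustrated with a plain JDK executor. This is a sketch of the underlying behavior and not Spring's own code: as far as I understand it, waiting for tasks corresponds to `shutdown()` followed by `awaitTermination(...)`, while the default corresponds to `shutdownNow()`. The class name, pool size, and task duration are assumptions chosen for the demo.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ExecutorShutdownDemo {

    /** Submits short jobs, shuts the pool down, and returns how many jobs ran to completion. */
    public static int run(int tasks, boolean waitForTasks) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                try {
                    Thread.sleep(100); // stands in for real work
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // interrupted by shutdownNow()
                }
            });
        }
        if (waitForTasks) {
            pool.shutdown();    // accept no new tasks, but finish queued and running ones
        } else {
            pool.shutdownNow(); // interrupt running tasks and discard queued ones
        }
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```

Calling `ExecutorShutdownDemo.run(4, true)` lets all four jobs finish, whereas `run(4, false)` interrupts or discards them, so fewer (typically none) complete.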

Further optimizations discuss handling MQ listeners and scheduled tasks by reacting to Nacos deregistration events, and controlling traffic through gateways when Kubernetes pod traffic control is not employed.
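One way to make scheduled tasks react to deregistration is a shared switch that each job checks before it runs. The `BackgroundWorkSwitch` class below is a hypothetical sketch with invented names. It is meant for scheduled jobs only: an MQ listener should stop its consumer or listener container instead, because silently skipping an already delivered message would lose it.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical switch that is flipped once this instance has been deregistered. */
public class BackgroundWorkSwitch {
    private final AtomicBoolean accepting = new AtomicBoolean(true);
    private final AtomicInteger executed = new AtomicInteger();

    /** Call from the deregistration path, for example the handler invoked by the PreStop hook. */
    public void onDeregistered() {
        accepting.set(false);
    }

    /** Wraps a scheduled job so that new runs become no-ops after deregistration. */
    public Runnable guard(Runnable job) {
        return () -> {
            if (!accepting.get()) {
                return; // instance is leaving; do not start new work
            }
            job.run();
            executed.incrementAndGet();
        };
    }

    public int executedCount() {
        return executed.get();
    }
}
```

A job that is already running when the switch flips still finishes; only later triggers are skipped, which keeps the shutdown window bounded by the longest single run.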

The conclusion summarizes that while the procedural steps for graceful shutdown are well‑documented, the real challenge lies in business‑level logic: long‑running tasks, data persistence on shutdown, and ensuring idempotent APIs.
