Optimizing Network Latency in TencentOS: Kernel SoftIRQ and ksoftirqd Tuning
This article details a systematic, multi‑stage investigation of network latency in the TencentOS kernel. It uncovers soft‑interrupt bottlenecks in the receive path, describes patches to net_rx_action(), RPS handling, and ksoftirqd scheduling that were submitted upstream, and reports a reduction in handshake failures of over 80%, along with practical insights for Linux kernel developers.
When Tencent's core services migrated to the cloud, the TencentOS kernel experienced a sharp rise in network connection failures, with peak failure counts reaching 59 per monitoring interval. The business required sub‑50 ms handshake latency, but many connections exceeded this threshold, prompting a deep dive into the kernel networking stack.
The investigation identified several potential delay sources: delayed ACK (≈40 ms), RTO retransmission (≈200 ms), and SYN retransmission back‑off (up to several seconds). The team focused on the packet reception path from the host machine to the guest VM, examining queueing, interrupt handling, and soft‑IRQ processing.
Initial root‑cause analysis revealed that the soft‑IRQ entry point net_rx_action() was exiting early when its packet budget or time limit was exhausted, causing the remaining packets to be deferred to the next soft‑IRQ cycle. The relevant code (simplified):
net_rx_action()
{
    // First limiting factor: time
    time_limit = jiffies + usecs_to_jiffies(netdev_budget_usecs);
    // Second limiting factor: packet budget
    budget = netdev_budget;
    budget -= napi_poll(n, &repoll);
    …
    if (budget <= 0 || time_after_eq(jiffies, time_limit)) {
        break;
    }
    …
}

Further profiling showed that the guest VM's network driver often failed to drain packets, leaving the host unable to re‑enable interrupts. Consequently, packets waited for the kernel thread ksoftirqd to process them, adding latency.
Upper‑layer mitigation involved unbinding the business processes from specific CPUs, which immediately reduced handshake failure rates to below five per interval.
Subsequent kernel‑level attempts included:
First kernel attempt: adjusting /proc/sys/net/core/netdev_budget and netdev_budget_usecs, and tuning virtio_net.napi_weight. These changes yielded modest improvements but did not match the effect of unbinding.
Second attempt (RPS optimization): modifying the RPS scheduling code to avoid unnecessary soft‑IRQ wake‑ups. Relevant snippet:
static int napi_schedule_rps(struct softnet_data *sd)
{
    struct softnet_data *mysd = this_cpu_ptr(&softnet_data);

#ifdef CONFIG_RPS
    if (sd != mysd) {
        sd->rps_ipi_next = mysd->rps_ipi_list;
        mysd->rps_ipi_list = sd;
        __raise_softirq_irqoff(NET_RX_SOFTIRQ);
        return 1;
    }
#endif
    __napi_schedule_irqoff(&mysd->backlog);
    return 0;
}

This patch reduced NET_RX_SOFTIRQ invocations by ~10% but still left latency issues under load.
Third attempt (eliminating ksoftirqd delegation): introducing a patch that prevents the kernel from fully delegating network soft‑IRQ handling to ksoftirqd when the local CPU is not overloaded. Key code:
static inline void invoke_softirq(void)
{
    if (ksoftirqd_running(local_softirq_pending()))
        return;

    if (!force_irqthreads() || !__this_cpu_read(ksoftirqd)) {
        __do_softirq();
    }
}

After this change was upstreamed, the handshake failure count dropped by about 87.5%, as shown by before/after graphs.
All three patches were submitted to the Linux kernel community, with two being accepted upstream. The author also shared observation tools, links to mailing list discussions, and detailed performance graphs.
In conclusion, the article documents a comprehensive, multi‑stage optimization process—from business‑level CPU binding adjustments to kernel‑level soft‑IRQ redesign—demonstrating how deep kernel knowledge and upstream collaboration can dramatically improve cloud network latency.
At the end of the article, the TencentOS team posted a recruitment notice for senior Linux kernel engineers, inviting interested candidates to contact them via email.
Tencent Architect
We share insights on storage, computing, networking and explore leading industry technologies together.