Why Switching Linux Pages from 4KB to 2MB Can Destroy Performance

Changing the default Linux page size from 4KB to 2MB can dramatically increase TLB hit rates but, for typical microservice workloads with many small allocations, it leads to massive internal fragmentation, higher cache‑coherency overhead, and severe latency spikes, ultimately causing overall performance to collapse.

Deepin Linux

Linux uses 4KB pages by default. Every memory access needs a virtual-to-physical address translation, and recent translations are cached in the small Translation Lookaside Buffer (TLB). With 4KB pages, mapping 1GB of memory takes 262,144 page-table entries, far more than the TLB can hold, so a workload that sweeps across that much memory suffers frequent TLB misses, each costing dozens of CPU cycles for the page-table walk.

What 2MB HugePages change

Switching to 2MB HugePages reduces the number of page-table entries for the same 1GB region from 262,144 to 512, so the TLB can hold almost all of the mappings and the hit rate approaches 100%. Address-translation overhead drops sharply, which can boost throughput by 10-30% for workloads that continuously stream large, contiguous memory blocks (e.g., databases, DPDK, big-data jobs).

Microservice case study

The author replaced all 4KB pages with 2MB HugePages in a typical high-QPS microservice cluster. After deployment, CPU soft-interrupt load spiked, memory usage jumped by more than 30%, and latency became erratic until the service eventually crashed. Detailed investigation revealed three root causes:

Each 1KB request‑level allocation forced the kernel to reserve an entire 2MB page, causing severe internal fragmentation and rapid exhaustion of the pre‑allocated huge‑page pool.

Large pages increased cache‑line sharing across threads, amplifying MESI‑based cache‑coherency traffic and raising lock contention.

While overall TLB misses fell, the cost of a single 2MB page fault is far higher than a 4KB fault, inflating tail‑latency (P99/P999) in the microservice workload.

Code snippets illustrate the contrast. The first benchmark walks 1GB of memory using 4KB pages versus 2MB pages, showing the raw traversal time difference. The second snippet mimics a microservice request that allocates only 1KB per call; under 2MB pages this allocation repeatedly consumes a full huge page, reproducing the memory‑blow‑up observed in production.

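#define _GNU_SOURCE            // exposes MAP_ANONYMOUS and MAP_HUGETLB
#include <stdio.h>
#include <sys/mman.h>
#include <sys/time.h>

#define SIZE    (1UL << 30)    // 1GB
#define PAGE_4K 4096UL
#define PAGE_2M (2UL << 20)

// Wall-clock time in microseconds.
long long get_time_us(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000000LL + tv.tv_usec;
}

// Writes one byte per page, so the time is dominated by first-touch page faults.
long long touch_pages(char *mem, unsigned long stride) {
    long long start = get_time_us();
    for (unsigned long i = 0; i < SIZE; i += stride) mem[i] = 1;
    return get_time_us() - start;
}

int main(void) {
    char *normal_mem = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (normal_mem == MAP_FAILED) { perror("mmap (4KB pages)"); return 1; }

    // Needs 512 free 2MB pages in the pool (vm.nr_hugepages >= 512),
    // assuming the default huge page size is 2MB.
    char *huge_mem = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (huge_mem == MAP_FAILED) { perror("mmap (MAP_HUGETLB)"); return 1; }

    long long time_normal = touch_pages(normal_mem, PAGE_4K);
    long long time_huge   = touch_pages(huge_mem, PAGE_2M);
    printf("4KB page traversal: %lld µs\n", time_normal);
    printf("2MB hugepage traversal: %lld µs\n", time_huge);

    munmap(normal_mem, SIZE);
    munmap(huge_mem, SIZE);
    return 0;
}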

After pinpointing the failure, the author rolled back to the default 4KB configuration: setting vm.nr_hugepages = 0 in /etc/sysctl.conf, commenting out the hugepagesz setting (on most systems hugepagesz is a kernel boot parameter rather than a sysctl key, so it may live in the bootloader configuration instead), reloading the sysctl settings, unmounting /dev/hugepages, and rebooting. Post-rollback tests showed response times returning to ~550 ms and throughput recovering to >900 requests per second.

Guidelines

HugePages are beneficial only for workloads that allocate large, long‑lived memory regions (databases, caches, HPC, DPDK). For typical microservices with many tiny, short‑lived allocations, the default 4KB pages provide finer granularity, better memory utilization, lower cache‑coherency overhead, and more predictable latency. Performance tuning must always consider the specific access pattern; a metric improvement (e.g., higher TLB hit rate) does not guarantee better end‑user experience.


