
Experimental Study of Sequential vs Random Memory Access Latency in C

This article presents a series of C‑based experiments that measure how sequential and random memory accesses behave across different array sizes and strides, revealing that random accesses can be up to four times slower than sequential ones due to cache and memory I/O effects.


Building on the earlier article "Memory Random Access Is Slower Than Sequential Access, Deep Dive into Memory I/O Process," this write‑up uses C code to empirically measure memory access latency under various scenarios.

Sequential Access Tests

The core functions initialize a double array and perform accesses with a configurable stride:

#define MAXELEMS (64 * 1024 * 1024 / sizeof(double))  /* up to 64 MB of doubles */

double data[MAXELEMS];  /* global test array shared by the access functions */

void init_data(double *data, int n){
    int i;
    for (i = 0; i < n; i++) {
        data[i] = i;
    }
}

void seque_access(int elems, int stride) {
    int i;
    double result = 0.0;
    volatile double sink;  /* keeps the compiler from optimizing the loop away */
    for (i = 0; i < elems; i += stride) {
        result += data[i];
    }
    sink = result;
    (void)sink;
}

The experiments vary two parameters: the total array size (which determines the cache hit rate) and the stride (which determines access locality). Additional overheads, such as the addition operation and the timing system calls, are deliberately ignored.

Scenario 1: Fixed array size of 2 KB, varying stride. With the array fitting entirely in L1 cache, observed latency is around 1 ns, reflecting only virtual‑address translation and L1 access.

Scenario 2: Fixed stride of 8, array size from 32 KB to 64 MB. As the array grows beyond cache capacities, cache misses increase, causing latency to rise.

Scenario 3: Fixed stride of 32, array size from 32 KB to 64 MB. Larger stride reduces spatial locality, further increasing cache‑line misses; latency stabilises around 9 ns when accesses remain mostly sequential despite occasional memory I/O.

Noticeable latency jumps occur as the array size crosses 32 KB, 256 KB, and 8 MB, corresponding to the L1 (32 KB), L2 (256 KB), and L3 (≈12 MB) cache sizes of the test machine.

Random Access Tests

To eliminate sequential patterns, a pre‑shuffled index array is used to access the data in random order:

void random_access(int *random_index_arr, int count) {
    int i;
    double result = 0.0;
    volatile double sink;  /* keeps the compiler from optimizing the loop away */
    for (i = 0; i < count; i++) {
        result += data[random_index_arr[i]];  /* visit data[] in shuffled order */
    }
    sink = result;
    (void)sink;
}

The random‑index array itself is small and assumed to stay in cache, so its access cost is ignored.

When the data set is small, caches still absorb most accesses, but for larger sizes (16 MB, 64 MB) the random pattern forces many cache‑line evictions, leading to a dramatic latency increase; at 64 MB the measured latency reaches about 38.4 ns, matching the earlier theoretical estimate.

Conclusion

The experimental results confirm the claim that random memory accesses are significantly slower than sequential accesses—approximately a 4:1 latency ratio in this environment—highlighting the importance of data locality and cache‑friendly designs in performance‑critical software.

Tags: cache, latency, C programming, random access, sequential access
Written by

Refining Core Development Skills

Fei has over 10 years of development experience at Tencent and Sogou. Through this account, he shares his deep insights on performance.
