Why Kafka Is So Fast: Sequential Writes, Memory‑Mapped Files, and Zero‑Copy
This article explains how Kafka achieves high throughput by using sequential disk writes, memory‑mapped files, batch compression, and zero‑copy sendfile for reads, while also covering data retention policies and the role of offsets in consumer processing.
Kafka stores messages on disk; although disk I/O is slower than memory, Kafka achieves high throughput.
Even on ordinary servers, Kafka can handle millions of writes per second, surpassing most middleware, making it popular for log processing and massive data scenarios.
Benchmark reference: Apache Kafka benchmark – 2 million writes per second on three cheap machines.
The article analyzes why Kafka is fast, covering both data write and read paths.
Write Path
Kafka writes all received messages to disk, guaranteeing durability. It optimizes write speed using two techniques: sequential writes and memory‑mapped files (MMFile).
Sequential Write
Disk performance depends heavily on the access pattern: random I/O is slow because of mechanical seek time, while sequential I/O, helped by operating-system optimizations, can approach memory speed.
Note: details are omitted; see http://searene.me/2017/07/09/Why-is-Kafka-so-fast/
Hard disks dislike random I/O and favor sequential I/O, and Linux adds read-ahead, write-behind, and page-cache optimizations on top. Sequential disk I/O can even outperform random memory access. Writing through the OS page cache rather than buffering on the JVM heap also sidesteps garbage-collection overhead, and the page cache stays warm across a process restart, unlike an in-heap cache that must be rebuilt from scratch.
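To make the write pattern concrete, here is a minimal Python sketch (illustrative only; `append_records` and the file name are invented for this example, and Kafka itself is written in Java): every record lands at the end of the file, so the kernel's write-behind cache can coalesce the writes into large sequential disk operations.

```python
import os
import tempfile

def append_records(path, records):
    # Open in append mode: every write goes to the current end of file,
    # which is exactly the sequential pattern disks and the page cache like.
    with open(path, "ab") as f:
        for rec in records:
            f.write(rec + b"\n")

log_path = os.path.join(tempfile.mkdtemp(), "partition-0.log")
append_records(log_path, [b"msg-1", b"msg-2"])
append_records(log_path, [b"msg-3"])  # lands strictly after the first batch

with open(log_path, "rb") as f:
    print(f.read().splitlines())  # [b'msg-1', b'msg-2', b'msg-3']
```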
Each partition is stored as an append-only file on disk; new messages are simply appended at the end. Kafka does not delete a message once it has been consumed; instead, each consumer tracks its own position with an offset, which in the Kafka versions this article describes is stored in ZooKeeper.
Data retention is managed by two policies: time‑based and size‑based, configurable in Kafka's settings.
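The offset mechanism and size-based retention can be sketched as follows (a toy model; `PartitionLog` is a hypothetical class invented here, and real Kafka drops whole old segment files rather than single messages):

```python
class PartitionLog:
    """Toy append-only partition: consuming never deletes data;
    each consumer just advances its own offset."""

    def __init__(self, max_messages=1000):
        self.messages = []                # the log itself
        self.offsets = {}                 # consumer id -> next offset to read
        self.max_messages = max_messages  # size-based retention limit

    def append(self, msg):
        self.messages.append(msg)
        if len(self.messages) > self.max_messages:
            self.messages.pop(0)          # size-based retention (simplified)

    def poll(self, consumer_id):
        off = self.offsets.get(consumer_id, 0)
        batch = self.messages[off:]
        self.offsets[consumer_id] = off + len(batch)
        return batch

log = PartitionLog()
log.append("a"); log.append("b")
print(log.poll("c1"))  # ['a', 'b']
log.append("c")
print(log.poll("c1"))  # ['c'] -- c1 resumes from its own offset
print(log.poll("c2"))  # ['a', 'b', 'c'] -- c2 rereads everything
```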
Memory Mapped Files
Even with sequential writes, disk is slower than memory. Kafka uses mmap to map files into virtual memory, allowing the OS to flush data to disk asynchronously.
Memory‑mapped files let a process read/write as if it were memory, avoiding copies between user and kernel space and providing significant I/O gains.
However, data written to mmap is not guaranteed to be on disk until the OS flushes it.
Kafka's producer.type parameter controls this: in synchronous mode, Kafka flushes to disk immediately after writing to the memory-mapped region; in asynchronous mode, it leaves flushing to the OS, trading durability for throughput.
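A minimal demonstration of the idea using Python's mmap module (an OS-level sketch, not Kafka's Java code): writes go into the page cache through the mapping, and flush() is the explicit step that forces them to disk, analogous to the synchronous case above.

```python
import mmap
import os
import tempfile

seg_path = os.path.join(tempfile.mkdtemp(), "segment.log")

# mmap needs a non-empty file, so pre-size it first.
with open(seg_path, "wb") as f:
    f.truncate(4096)

with open(seg_path, "r+b") as f:
    mm = mmap.mmap(f.fileno(), 4096)
    mm[0:5] = b"hello"  # write as if it were plain memory: no syscall per write
    mm.flush()          # force dirty pages to disk (the "synchronous" case);
                        # skip it and the OS flushes lazily (the "asynchronous" case)
    mm.close()

with open(seg_path, "rb") as f:
    print(f.read(5))    # b'hello'
```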
Read Path
Kafka uses sendfile for zero‑copy transmission, eliminating multiple copies between kernel and user buffers.
Zero‑Copy with sendfile
Traditional read/write involves four copies: disk → kernel buffer → user buffer → socket buffer → protocol engine.
The sendfile system call copies data directly from the kernel file cache to the socket buffer, reducing copies and context switches.
Its simplified prototype is:

sendfile(socket, file, len);

Introduced in kernel 2.1, sendfile reduced the number of copies; later kernels further simplified the path.
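The effect is easy to observe from Python on a Unix-like system, where os.sendfile wraps the same system call (a sketch using a local socket pair; a real broker sends to TCP consumer connections):

```python
import os
import socket
import tempfile

payload = b"kafka-log-segment-bytes" * 10

# Stage a "log segment" on disk.
src = tempfile.NamedTemporaryFile(delete=False)
src.write(payload)
src.close()

# sendfile() pushes the file's bytes into the socket inside the kernel:
# no read() into a user-space buffer and no write() back out of one.
left, right = socket.socketpair()
with open(src.name, "rb") as f:
    sent = 0
    while sent < len(payload):
        sent += os.sendfile(left.fileno(), f.fileno(), sent, len(payload) - sent)
left.close()

received = b""
while len(received) < len(payload):
    received += right.recv(4096)
right.close()
print(received == payload)  # True
```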
Web servers such as Apache, Nginx, and Lighttpd use sendfile to boost file‑transfer performance.
Kafka combines mmap and sendfile to deliver messages efficiently to consumers.
Batch Compression
Network I/O is often the bottleneck, so Kafka compresses whole batches of messages rather than individual ones, supporting codecs such as Gzip and Snappy.
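A quick illustration of why batching matters for compression (using the standard-library gzip as a stand-in; Kafka's actual codec framing differs): compressing messages one by one pays per-message header and dictionary costs, while a single batch shares them.

```python
import gzip
import json

# 100 log-like messages with heavily repeated structure.
messages = [json.dumps({"user": i % 10, "event": "click", "page": "/home"}).encode()
            for i in range(100)]

# One gzip stream per message: the header and dictionary cost is paid 100 times.
individual = sum(len(gzip.compress(m)) for m in messages)

# One gzip stream for the whole batch: repeated field names compress away.
batch = len(gzip.compress(b"\n".join(messages)))

print(batch < individual)  # True: the batch is far smaller
```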
Conclusion
Kafka’s speed stems from treating all messages as a single append‑only log, using sequential writes, memory‑mapped files, batch compression, and zero‑copy sendfile for reads, while retaining data and managing offsets via Zookeeper.
Author: Binyue Original article: cnblogs.com/binyue/p/10308754.html
Architecture Digest