Introduction to Apache Kafka: Core Concepts, Message Delivery, Partition Storage, and Consumption
This article introduces Apache Kafka as a distributed streaming platform, explaining its three core capabilities; key concepts such as producers, topics, brokers, partitions, and consumers; and how messages are delivered, stored in partitions, and consumed by consumer groups.
Kafka, as a distributed streaming platform, is increasingly used in the big‑data field; this article provides an overview of Kafka’s essential features.
Key Capabilities
Kafka offers three major abilities: publishing and subscribing to message streams (similar to a message queue), fault‑tolerant storage of streams, and real‑time processing of streams.
It is typically applied to build real‑time data pipelines for reliable data transfer between systems, and to develop applications that transform or react to streaming data.
Related Concepts
Producer: publishes messages to Kafka.
Topic: logical classification of messages; every message belongs to a specific topic.
Broker: a Kafka server; a cluster consists of multiple brokers.
Partition: each topic is divided into one or more ordered, immutable partitions; each partition can have multiple replicas for fault tolerance.
Consumer: pulls messages from Kafka and belongs to a consumer group.
Message Delivery
Each message consists of a key, value, and timestamp.
Messages are stored in partitions; placement follows three rules, checked in order: (1) if the producer explicitly specifies a partition, that partition is used; (2) otherwise, if the message has a key, the key is hashed to select the partition, so all messages with the same key land in the same partition; (3) if there is neither an explicit partition nor a key, messages are distributed round-robin across the topic's partitions.
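The three placement rules can be sketched as a small partitioner. This is a simplified illustration, not Kafka's real partitioner (which uses a murmur2 hash of the key bytes, and sticky batching for keyless messages in recent versions); the class name and MD5-based hash here are illustrative choices.

```python
import hashlib
import itertools

class SimplePartitioner:
    """Illustrative sketch of the three partition-placement rules."""

    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._round_robin = itertools.count()  # counter for keyless messages

    def partition(self, key=None, explicit_partition=None):
        # Rule 1: an explicitly specified partition always wins.
        if explicit_partition is not None:
            return explicit_partition
        # Rule 2: a key is hashed, so equal keys map to the same partition.
        if key is not None:
            digest = hashlib.md5(key.encode()).digest()
            return int.from_bytes(digest[:4], "big") % self.num_partitions
        # Rule 3: no partition and no key -> round-robin distribution.
        return next(self._round_robin) % self.num_partitions
```

Because the key fully determines the partition, per-key ordering is preserved: all messages for, say, one user ID arrive in one partition in send order.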
The acknowledgment setting (acks) controls producer‑side reliability: acks=0 (no wait), acks=1 (wait for leader), acks=-1 or all (wait for all in‑sync replicas).
Leader and follower are per-partition roles played by brokers: each partition has one leader replica that handles all reads and writes, while follower replicas copy the leader's data. Replicas that stay caught up with the leader form the partition's ISR (In-Sync Replicas) set.
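The interaction between the acks setting and the ISR can be captured in a few lines. This is a conceptual sketch of when the broker considers a produce request complete, not actual broker code; the function name and parameters are invented for illustration.

```python
def is_acknowledged(acks, leader_written, isr_acks, isr_size):
    """Sketch of produce-request completion under each acks setting.

    acks         : 0, 1, or "all" (equivalent to -1)
    leader_written : True once the leader has persisted the record
    isr_acks     : in-sync replicas (leader included) that have the record
    isr_size     : current size of the ISR set
    """
    if acks == 0:
        return True            # fire-and-forget: no confirmation awaited
    if acks == 1:
        return leader_written  # only the leader must persist the record
    # acks="all" / -1: every replica currently in the ISR must have it
    return leader_written and isr_acks >= isr_size
```

Note that acks="all" waits only for the current ISR, not for every configured replica: a lagging follower that has fallen out of the ISR does not block acknowledgment.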
Partition Storage
Topics are logical; partitions are the actual storage units. Each partition is an ordered, immutable sequence of records, appended sequentially, and each record has a unique offset.
Within a single partition, order is guaranteed, but overall topic order is not.
Messages persist until a configured retention period expires, independent of consumption.
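The storage model above, an append-only sequence with monotonically increasing offsets and time-based retention independent of consumption, can be modeled with a toy class. This is a deliberately simplified sketch (real Kafka stores partitions as segment files on disk); the class and method names are illustrative.

```python
import time

class PartitionLog:
    """Toy append-only partition log with offsets and time-based retention."""

    def __init__(self, retention_seconds):
        self.retention = retention_seconds
        self.records = []       # list of (offset, timestamp, value)
        self.next_offset = 0    # offsets only ever increase

    def append(self, value, now=None):
        ts = time.time() if now is None else now
        offset = self.next_offset
        self.records.append((offset, ts, value))
        self.next_offset += 1
        return offset

    def read_from(self, offset):
        # Reading never deletes: any consumer can re-read from any offset.
        return [v for (o, _, v) in self.records if o >= offset]

    def expire(self, now=None):
        # Retention removes records by age, regardless of consumption.
        ts = time.time() if now is None else now
        self.records = [r for r in self.records if ts - r[1] < self.retention]
```

The key property: `expire` is driven purely by record age, so a record can outlive being read many times, or be deleted without ever having been read.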
Message Consumption
Consumers belong to a consumer group; each partition is consumed by only one consumer in the group, though a consumer may handle multiple partitions.
If there are more consumers than partitions, the extra consumers remain idle.
Consumers actively pull messages, allowing them to control offset and replay historical data.
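The one-partition-per-consumer-within-a-group rule can be sketched as a round-robin assignment. Real Kafka supports several assignment strategies (range, round-robin, sticky) negotiated during rebalancing; this simplified function only illustrates the invariant that each partition gets exactly one consumer and surplus consumers sit idle.

```python
def assign_partitions(partitions, consumers):
    """Round-robin sketch of consumer-group partition assignment."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(sorted(partitions)):
        # Each partition goes to exactly one consumer in the group.
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment
```

With 3 partitions and 2 consumers, one consumer handles two partitions; with 2 partitions and 3 consumers, the third consumer receives nothing and stays idle until a rebalance gives it work.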
Older Kafka versions offered low‑level and high‑level consumer APIs; the high‑level API handled partition assignment and rebalancing automatically. Newer versions unify the API while still permitting custom or automatic partition assignment.
Conclusion
Kafka provides high throughput, low latency, scalability, persistence, fault tolerance, and high concurrency, making it a powerful foundation for modern data streaming architectures.
System Architect Go