Understanding Distributed Caching: Principles and Implementation of Memcached
This article explains the fundamentals of caching, the role of distributed caches like memcached in high‑concurrency environments, and details the algorithms—including remainder hashing and consistent hashing with virtual nodes—that enable memcached to distribute data across multiple servers efficiently.
Author Introduction
Lu Chen (Float Lu) is a Java engineer at Meituan-Dianping, an open‑source enthusiast who enjoys exploring foundational technologies and middleware in his spare time. Blog: http://my.oschina.net/andylucc.
Article
In high‑concurrency scenarios, massive read/write requests overwhelm databases, making disk I/O a bottleneck and causing high response latency; caching therefore becomes essential. Whether single‑node or distributed, caches have distinct use cases and trade‑offs, with Redis and Memcached being the most common distributed cache products. This article focuses on the distributed implementation principles of the cache service Memcached.
1. The Essence of Cache
Computer Architecture Cache
Based on the von Neumann model, a computer consists of an arithmetic/logic unit, a control unit, memory, and input/output devices. Modern CPUs add registers and multiple cache levels (L1, L2, L3) between the core and main memory. For example, a typical PC may have a 356 GB disk, 4 GB of RAM, 3 MB of L3 cache, and 256 KB of L2 cache. Data is sought first in the nearest (fastest) layer, which is also the most expensive per byte.
Storage pyramid
All storage levels above the slowest (disk) are considered caches, accelerating data access.
Cache in Application Systems
Application data access follows the path cache → database: the application queries the fast cache first; if the data is present and valid, it is returned immediately. Otherwise the application falls back to the slower database, returns the result, and writes it into the cache so subsequent reads are served from the fast tier.
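This cache-aside read path can be sketched as follows. The `cache` and `database` maps here are hypothetical in-memory stand-ins for a real cache client and database, used only to show the lookup order:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal cache-aside sketch: check the fast cache first, fall back to the
// slow database on a miss, then populate the cache for subsequent reads.
class CacheAside {
    private final Map<String, String> cache = new HashMap<>();    // fast tier
    private final Map<String, String> database = new HashMap<>(); // slow tier

    CacheAside() {
        database.put("user:42", "Alice"); // sample record for illustration
    }

    public String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;                 // cache hit: return immediately
        }
        String fromDb = database.get(key); // cache miss: query the database
        if (fromDb != null) {
            cache.put(key, fromDb);        // warm the cache for next time
        }
        return fromDb;
    }
}
```

The first read of a key pays the database cost; every later read of the same key is served from memory.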
Typical cache‑augmented storage access model
2. Introduction to Memcached
What is Memcached
Memcached, originally developed by Brad Fitzpatrick at Danga Interactive (LiveJournal), is a high‑performance distributed memory cache server used by many large services (e.g., Facebook, Mixi). It reduces database load, improves response speed, and enhances scalability.
Key Features of Memcached
Simple protocol
Event‑driven architecture based on libevent
In‑memory storage
Servers operate independently without inter‑communication
3. Distributed Principles of Memcached
Because Memcached servers do not communicate with each other, distribution is handled entirely by the client library.
Memcached distribution diagram
When a client stores data, it uses the key to select a Memcached server via a hashing algorithm; retrieval applies the same algorithm to the same key, so data is always fetched from the node it was stored on.
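A minimal sketch of that client-side selection, assuming a fixed server list (the addresses below are made up for illustration):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Client-side node selection: hash the key and map it deterministically to
// one of the configured memcached servers. Because the mapping depends only
// on the key and the server list, set and get always reach the same node.
class NodeSelector {
    private final String[] servers;

    NodeSelector(String[] servers) {
        this.servers = servers;
    }

    public String serverFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        int index = (int) (crc.getValue() % servers.length); // CRC value is non-negative
        return servers[index];
    }
}
```

Repeated calls with the same key return the same server, which is the property the whole scheme rests on.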
Remainder (Modulo) Hashing
The standard method is:
server_index = CRC(key) % N, where N is the number of servers. This is simple and spreads keys evenly, but it is fragile: when a server fails or the pool is resized, most keys remap to different servers (roughly N/(N+1) of them when one server is added), causing a flood of cache misses until the cache rewarms.
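The reshuffling cost is easy to measure: count how many keys map to a different server index when the pool size changes. The key names below are synthetic, chosen only to sample the hash space:

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Estimate the reshuffling cost of remainder hashing: how many of a set of
// keys land on a different server when the pool grows from `before` to
// `after` nodes. With 3 -> 4 servers, roughly three quarters of keys move.
class RemainderReshuffle {
    static long crc(String key) {
        CRC32 c = new CRC32();
        c.update(key.getBytes(StandardCharsets.UTF_8));
        return c.getValue();
    }

    static int movedKeys(int totalKeys, int before, int after) {
        int moved = 0;
        for (int i = 0; i < totalKeys; i++) {
            long h = crc("key-" + i);
            if (h % before != h % after) {
                moved++; // this key would now be looked up on a different server
            }
        }
        return moved;
    }
}
```

Running this for 10,000 keys and a 3-to-4 resize shows around 75% of keys moving, which is why remainder hashing scales poorly for caches.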
Consistent Hashing
Each server is hashed to a point on a ring spanning 0 to 2³²−1; keys are hashed onto the same ring. A key is stored on the first server encountered moving clockwise from the key's position. When a server is added or removed, only the keys on the affected arc of the ring remap; the rest stay where they are.
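A sorted map makes the clockwise lookup straightforward. This is an illustrative sketch, not a production client; server addresses are invented:

```java
import java.nio.charset.StandardCharsets;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Minimal consistent-hashing ring: servers and keys share one hash space,
// and a key is owned by the first server at or clockwise from its position.
class HashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();

    static long hash(String s) {
        CRC32 c = new CRC32();
        c.update(s.getBytes(StandardCharsets.UTF_8));
        return c.getValue(); // value in [0, 2^32 - 1], the ring's range
    }

    public void addServer(String server) {
        ring.put(hash(server), server);
    }

    public void removeServer(String server) {
        ring.remove(hash(server));
    }

    public String serverFor(String key) {
        // First ring position >= the key's hash; wrap to the start if none.
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        Long point = tail.isEmpty() ? ring.firstKey() : tail.firstKey();
        return ring.get(point);
    }
}
```

Note the locality property: removing a server that a key does not live on cannot change that key's owner, since the owner is by definition the first node clockwise from the key.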
Consistent hashing principle
Optimized Consistent Hashing with Virtual Nodes
To avoid uneven key distribution, each physical server is represented by multiple virtual nodes, each placed at a different point on the ring. Keys map to virtual nodes, which then map to the underlying physical server, achieving a more balanced load even with few physical nodes.
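The virtual-node variant only changes how servers are placed on the ring: each physical server is hashed at many points. The replica count and the `server#i` naming scheme below are illustrative choices, not a fixed standard:

```java
import java.nio.charset.StandardCharsets;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.zip.CRC32;

// Consistent hashing with virtual nodes: each physical server appears on the
// ring at REPLICAS points ("server#0", "server#1", ...), which evens out the
// key distribution even when there are only a few physical servers.
class VirtualNodeRing {
    private static final int REPLICAS = 160; // a common order of magnitude in real clients
    private final TreeMap<Long, String> ring = new TreeMap<>();

    static long hash(String s) {
        CRC32 c = new CRC32();
        c.update(s.getBytes(StandardCharsets.UTF_8));
        return c.getValue();
    }

    public void addServer(String server) {
        for (int i = 0; i < REPLICAS; i++) {
            ring.put(hash(server + "#" + i), server); // virtual node -> physical server
        }
    }

    public String serverFor(String key) {
        SortedMap<Long, String> tail = ring.tailMap(hash(key));
        Long point = tail.isEmpty() ? ring.firstKey() : tail.firstKey();
        return ring.get(point); // key -> virtual node -> physical server
    }
}
```

With 160 virtual nodes per server, a few thousand keys spread across three servers in roughly equal thirds, where a single-point-per-server ring could easily leave one server with most of the keyspace.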
4. Summary
After covering basic cache concepts, this article introduced Memcached’s distributed algorithms, showing that distribution is achieved by client‑side hashing techniques such as remainder hashing, consistent hashing, and its virtual‑node‑enhanced variant.
References: *Large-Scale Distributed Web Architecture Design and Practice*; *Comprehensive Analysis of Memcached*.
Qunar Tech Salon
Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.