
Design and Implementation of a High‑Performance Short‑Link Platform

This article details the architecture, core algorithms, security measures, and performance optimizations of a high‑throughput short‑link service, covering hash‑based ID generation, Base62 encoding, distributed ID schemes, caching, database indexing, sharding, and monitoring to ensure efficient and secure URL shortening.

Architect

1 Background Introduction

Zhuanzhuan is a leading Chinese second‑hand trading platform; links are essential for user interaction and information exchange on the platform.

Traditional long URLs contain many characters and special symbols, making them hard to remember and share, and they are inconvenient for SMS, QR codes, and social media due to length limitations.

2 Working Principle

2.1 Short Link Generation and Storage

When the short‑link service receives a long URL, it first computes the URL's MD5 hash and checks whether a mapping already exists for it. If none exists, it obtains a unique ID from a segment‑mode allocator, converts the ID to a short link with the selected generation algorithm (Base62), and persists the mapping for later lookup.
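The flow above can be sketched in a few lines of Java. This is a minimal in-memory illustration, not the platform's actual code: the class name `ShortLinkCreator`, the two maps standing in for the database, and the `AtomicLong` standing in for the segment‑mode allocator are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

final class ShortLinkCreator {
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    // Hypothetical stand-ins for the database: MD5 -> short code, short code -> long URL.
    private final Map<String, String> codeByMd5 = new ConcurrentHashMap<>();
    private final Map<String, String> urlByCode = new ConcurrentHashMap<>();
    // Stand-in for the segment-mode allocator (assumption for this sketch).
    private final AtomicLong nextId = new AtomicLong(1);

    /** Returns the existing short code for a known URL, or creates a new one. */
    String shorten(String longUrl) {
        String md5 = md5Hex(longUrl);
        return codeByMd5.computeIfAbsent(md5, key -> {
            String code = encode(nextId.getAndIncrement());
            urlByCode.put(code, longUrl); // persist the mapping for later lookup
            return code;
        });
    }

    /** Returns the long URL for a short code, or null if the code is unknown. */
    String resolve(String code) {
        return urlByCode.get(code);
    }

    static String encode(long id) {
        StringBuilder sb = new StringBuilder();
        do {
            sb.append(ALPHABET.charAt((int) (id % 62)));
            id /= 62;
        } while (id != 0);
        return sb.reverse().toString();
    }

    static String md5Hex(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is required on every Java platform", e);
        }
    }
}
```

Calling `shorten` twice with the same URL returns the same code, which is the idempotent behaviour the MD5 check is meant to provide.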

2.2 Short Link Return and Distribution

After successful generation, the short link is returned to the business side, which can embed it in webpages, send via SMS, or share on social media for users to access the resource.

2.3 User Click and Redirection

When a user clicks a short link, the browser requests the short‑link service, which looks up the mapping and redirects the user to the original long URL using an efficient retrieval mechanism.

HTTP status codes 301 (permanent) and 302 (temporary) both redirect the browser to the long URL. A 301 may be cached by the browser, so later clicks can bypass the service and go missing from click statistics; a 302 sends every click through the service, which keeps statistics accurate at the cost of higher service load.
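The trade‑off between the two status codes can be made explicit in the lookup path. The sketch below is only an illustration: `RedirectPlanner`, its `Response` holder, and the `countEveryClick` flag are hypothetical names, and a plain `Map` stands in for the real mapping store.

```java
import java.util.Map;

final class RedirectPlanner {
    /** Status code plus the Location header value (null when nothing is found). */
    static final class Response {
        final int status;
        final String location;

        Response(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    /**
     * Chooses 302 when every click must reach the service (accurate statistics),
     * 301 when browser caching is acceptable, and 404 for an unknown code.
     */
    static Response plan(Map<String, String> mappings, String code, boolean countEveryClick) {
        String longUrl = mappings.get(code);
        if (longUrl == null) {
            return new Response(404, null);
        }
        return new Response(countEveryClick ? 302 : 301, longUrl);
    }
}
```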

3 Core Algorithms

The conversion from long to short URL is the core function and requires an efficient, unique algorithm.

3.1 Hash Algorithms

3.1.1 MD5

MD5 produces a 128‑bit hash value and can be used as a basic hash for short‑link generation.

3.1.2 SHA‑256

SHA‑256 generates a 256‑bit hash, offering higher security but resulting in longer strings, which is less suitable for short links.
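To make the length difference concrete, the following sketch computes both digests with the JDK's `MessageDigest` and renders them as hex. The class name `DigestDemo` is an assumption for the example; the algorithm names "MD5" and "SHA-256" are the standard JDK identifiers.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DigestDemo {
    /** Hashes the input with the named algorithm and returns lower-case hex. */
    static String hex(String algorithm, String input) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b & 0xff));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Unsupported digest: " + algorithm, e);
        }
    }
}
```

An MD5 digest is 32 hex characters and a SHA‑256 digest is 64, so either one is far longer than a typical short code if used directly.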

3.2 Distributed ID

Raw hash output is too long for a short link, and truncating it to fit raises the risk of collisions; therefore, unique numeric identifiers are generated instead and then encoded.

3.2.1 Global Increment

Auto‑increment IDs (e.g., MySQL auto‑increment primary key or Redis INCR) are simple, efficient, and widely used.

3.2.2 Segment Mode

Each node receives a range of IDs (a segment); the node generates IDs locally until the segment is exhausted, then requests a new segment, guaranteeing global uniqueness.
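A segment allocator along these lines might look as follows. This is a sketch under stated assumptions: `SegmentStore` is a hypothetical interface for whatever central store hands out ranges (a database row, for instance), and `inMemoryStore` exists only so the example runs on its own.

```java
import java.util.concurrent.atomic.AtomicLong;

final class SegmentIdAllocator {
    /** Hypothetical central store: reserves [start, start + size) and returns start. */
    interface SegmentStore {
        long reserve(int size);
    }

    private final SegmentStore store;
    private final int segmentSize;
    private long next; // next ID to hand out
    private long end;  // exclusive upper bound of the current segment

    SegmentIdAllocator(SegmentStore store, int segmentSize) {
        this.store = store;
        this.segmentSize = segmentSize;
    }

    /** Hands out IDs locally and fetches a new segment only when the current one is used up. */
    synchronized long nextId() {
        if (next >= end) {
            next = store.reserve(segmentSize);
            end = next + segmentSize;
        }
        return next++;
    }

    /** In-memory stand-in for the central store, for demonstration only. */
    static SegmentStore inMemoryStore(long firstId) {
        AtomicLong cursor = new AtomicLong(firstId);
        return size -> cursor.getAndAdd(size);
    }
}
```

Two allocators sharing one store never overlap, because each range is reserved exactly once; the IDs are unique across nodes but not globally ordered.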

3.2.3 SnowFlake

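SnowFlake splits a 64‑bit integer into a timestamp, a data‑center ID, a machine ID, and a sequence number, producing unique, roughly time‑ordered IDs without a central coordinator. It depends on the system clock, however: if a node's clock is rolled back, the node can issue duplicate or out‑of‑order IDs unless the generator detects the rollback.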
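A compact sketch of such a generator is shown below. The bit widths (41‑bit timestamp, 5‑bit data‑center ID, 5‑bit machine ID, 12‑bit sequence) follow the commonly cited layout and are an assumption here, as are the class name `SnowflakeSketch`, the custom epoch, and the injectable clock. On clock rollback this version simply refuses to issue an ID.

```java
import java.util.function.LongSupplier;

final class SnowflakeSketch {
    private static final long EPOCH = 1577836800000L; // 2020-01-01T00:00:00Z, arbitrary choice
    private static final int SEQUENCE_BITS = 12;
    private static final int WORKER_BITS = 5;
    private static final int DATACENTER_BITS = 5;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final long datacenterId;
    private final long workerId;
    private final LongSupplier clock; // milliseconds since the Unix epoch
    private long lastMillis = -1;
    private long sequence = 0;

    SnowflakeSketch(long datacenterId, long workerId, LongSupplier clock) {
        if (datacenterId < 0 || datacenterId >= (1L << DATACENTER_BITS)
                || workerId < 0 || workerId >= (1L << WORKER_BITS)) {
            throw new IllegalArgumentException("datacenterId and workerId must be in 0..31");
        }
        this.datacenterId = datacenterId;
        this.workerId = workerId;
        this.clock = clock;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastMillis) {
            // Clock rollback: issuing an ID now could duplicate an earlier one.
            throw new IllegalStateException("Clock moved backwards; refusing to generate ID");
        }
        if (now == lastMillis) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while ((now = clock.getAsLong()) <= lastMillis) {
                    Thread.onSpinWait();
                }
            }
        } else {
            sequence = 0;
        }
        lastMillis = now;
        return ((now - EPOCH) << (SEQUENCE_BITS + WORKER_BITS + DATACENTER_BITS))
                | (datacenterId << (SEQUENCE_BITS + WORKER_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

In production the clock would typically be `System::currentTimeMillis`; injecting it here keeps the sketch deterministic and easy to test.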

3.3 Base62 Encoding

Base62 encodes a number using 62 characters (0‑9, A‑Z, a‑z), all of which are URL‑safe, so the resulting short strings are readable, stable, and need no escaping.

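public class Base62Encoder {
    // Digits first, then upper case, then lower case: index 0 is '0', index 61 is 'z'.
    private static final String BASE62_CHARACTERS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    /** Encodes a non-negative ID as a Base62 string. */
    public static String encode(long num) {
        if (num < 0) {
            throw new IllegalArgumentException("num must be non-negative");
        }
        StringBuilder sb = new StringBuilder();
        do {
            int remainder = (int) (num % 62);
            sb.append(BASE62_CHARACTERS.charAt(remainder));
            num /= 62;
        } while (num != 0);
        // Digits were produced least-significant first, so reverse them.
        return sb.reverse().toString();
    }
}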

A 6‑character Base62 code can represent about 56.8 billion (62⁶ = 56,800,235,584) distinct values.
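Decoding is the inverse operation and is what lets the service turn a short code back into its numeric ID. The sketch below pairs an encoder with a decoder; the class name `Base62Codec` and the `decode` method are additions for illustration and assume the same alphabet order as the encoder shown earlier.

```java
final class Base62Codec {
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    static String encode(long num) {
        if (num < 0) {
            throw new IllegalArgumentException("num must be non-negative");
        }
        StringBuilder sb = new StringBuilder();
        do {
            sb.append(ALPHABET.charAt((int) (num % 62)));
            num /= 62;
        } while (num != 0);
        return sb.reverse().toString();
    }

    /** Converts a Base62 string back to its numeric value (no overflow check in this sketch). */
    static long decode(String code) {
        long value = 0;
        for (char c : code.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException("Not a Base62 character: " + c);
            }
            value = value * 62 + digit;
        }
        return value;
    }
}
```

The largest 6‑character code, `zzzzzz`, decodes to 62⁶ − 1 = 56,800,235,583, and the next ID needs a seventh character.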

4 Security and Protection

Ensuring data security and platform stability is paramount; a series of safeguards is applied.

4.1 Long‑Link Legitimacy Check

Before shortening, the original URL is validated for domain legitimacy and trusted target resources, including checks on query‑parameter domains.
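One way to express such a check is an allow‑list of trusted domains applied to the URL itself and to any URL carried in its query parameters. The sketch below is an assumption about how this could look, not the platform's rule set: `LongLinkValidator`, the suffix‑matching policy, and the restriction to http/https are all illustrative choices.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Set;

final class LongLinkValidator {
    private final Set<String> trustedDomains; // e.g. "example.com" (hypothetical)

    LongLinkValidator(Set<String> trustedDomains) {
        this.trustedDomains = trustedDomains;
    }

    boolean isAllowed(String url) {
        URI uri;
        try {
            uri = new URI(url);
        } catch (URISyntaxException e) {
            return false;
        }
        String scheme = uri.getScheme();
        String host = uri.getHost();
        if (scheme == null || host == null) {
            return false;
        }
        if (!scheme.equalsIgnoreCase("http") && !scheme.equalsIgnoreCase("https")) {
            return false;
        }
        if (!isTrustedHost(host)) {
            return false;
        }
        return queryTargetsAreTrusted(uri.getRawQuery());
    }

    // Any query value that is itself an http(s) URL must also pass the check.
    private boolean queryTargetsAreTrusted(String rawQuery) {
        if (rawQuery == null) {
            return true;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String value;
            try {
                value = URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            } catch (IllegalArgumentException e) {
                return false; // malformed percent-encoding
            }
            String lower = value.toLowerCase(Locale.ROOT);
            if ((lower.startsWith("http://") || lower.startsWith("https://")) && !isAllowed(value)) {
                return false;
            }
        }
        return true;
    }

    private boolean isTrustedHost(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        for (String domain : trustedDomains) {
            if (h.equals(domain) || h.endsWith("." + domain)) {
                return true;
            }
        }
        return false;
    }
}
```

Matching on `"." + domain` rather than a bare suffix keeps look‑alike hosts such as `notexample.com` from passing as `example.com`.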

4.2 Duplicate Short‑Link Protection

Idempotent design based on the MD5 of the long URL prevents creation of duplicate short links for identical requests.

4.3 Short‑Link Validity Verification

When a short link is accessed, the service quickly checks its existence in the database; if the mapping is absent, an error is returned.

5 System Performance Optimization

Optimizations ensure high efficiency and stability under heavy load.

5.1 Database Indexing

The unique ID serves as the primary key; the MD5 hash of the long URL is indexed to accelerate validity checks and redirection.

5.2 Cache Usage

Redis is used as a distributed cache; short‑link mappings are asynchronously stored in cache to reduce database pressure during high concurrency.
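The read path implied here resembles a cache‑aside lookup with an asynchronous cache write. The sketch below uses in‑memory maps in place of Redis and the database so that it runs on its own; `CachedLinkLookup`, the injected `Executor`, and the read counter are assumptions for illustration only.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;

final class CachedLinkLookup {
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Map<String, String> database;                          // stand-in for the tables
    private final Executor cacheWriter;                                  // runs cache writes off the request path
    private final AtomicInteger databaseReads = new AtomicInteger();

    CachedLinkLookup(Map<String, String> database, Executor cacheWriter) {
        this.database = database;
        this.cacheWriter = cacheWriter;
    }

    /** Looks in the cache first; on a miss, reads the database and fills the cache asynchronously. */
    Optional<String> find(String code) {
        String cached = cache.get(code);
        if (cached != null) {
            return Optional.of(cached);
        }
        databaseReads.incrementAndGet();
        String longUrl = database.get(code);
        if (longUrl != null) {
            cacheWriter.execute(() -> cache.put(code, longUrl));
        }
        return Optional.ofNullable(longUrl);
    }

    int databaseReads() {
        return databaseReads.get();
    }
}
```

Codes that do not exist are not cached in this sketch, so repeated requests for unknown codes still reach the database; whether to cache such misses is a separate design decision.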

5.3 Segment Mode Optimization

An independent monitoring thread checks segment usage; when usage exceeds a threshold, a new segment is pre‑allocated to avoid bottlenecks during spikes.
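The pre‑allocation idea can be sketched as an allocator that keeps one spare segment. Everything named here is hypothetical: `PrefetchingSegmentAllocator`, the `SegmentStore` interface, and the percentage threshold are illustrative, and the sketch assumes a single monitoring thread calls `prefetchIfNeeded()` periodically.

```java
final class PrefetchingSegmentAllocator {
    /** Hypothetical central store: reserves [start, start + size) and returns start. */
    interface SegmentStore {
        long reserve(int size);
    }

    private final SegmentStore store;
    private final int segmentSize;
    private final int thresholdPercent;
    private long next;            // next ID to hand out
    private long end;             // exclusive upper bound of the current segment
    private long spareStart = -1; // start of the pre-allocated segment, -1 if none

    PrefetchingSegmentAllocator(SegmentStore store, int segmentSize, int thresholdPercent) {
        this.store = store;
        this.segmentSize = segmentSize;
        this.thresholdPercent = thresholdPercent;
    }

    synchronized long nextId() {
        if (next >= end) {
            if (spareStart < 0) {
                // No spare ready: fall back to fetching inline, which is the slow path.
                spareStart = store.reserve(segmentSize);
            }
            next = spareStart;
            end = next + segmentSize;
            spareStart = -1;
        }
        return next++;
    }

    /** Meant to be called by the monitoring thread; returns true if a segment was fetched. */
    boolean prefetchIfNeeded() {
        synchronized (this) {
            long used = segmentSize - (end - next);
            boolean overThreshold = used * 100 >= (long) segmentSize * thresholdPercent;
            if (spareStart >= 0 || !overThreshold) {
                return false;
            }
        }
        // The slow call to the store happens outside the lock so nextId() is not blocked.
        long start = store.reserve(segmentSize);
        synchronized (this) {
            spareStart = start;
        }
        return true;
    }
}
```

With a spare segment in place, the switch to a new range is a pair of field assignments rather than a round trip to the store.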

5.4 Table Sharding Strategy

Data is sharded into 64 tables based on ID % 64, alleviating single‑table pressure and improving scalability.
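Routing an ID to its table is a single modulo operation. In the sketch below, the class name `ShardRouter` and the table prefix `short_link_` are hypothetical; only the shard count of 64 and the `ID % 64` rule come from the text.

```java
import java.util.Locale;

final class ShardRouter {
    static final int SHARD_COUNT = 64;

    /** Returns the shard index in the range 0..63 for the given ID. */
    static int shardOf(long id) {
        return (int) Math.floorMod(id, (long) SHARD_COUNT);
    }

    /** Builds a table name such as short_link_02 (naming scheme is an assumption). */
    static String tableFor(long id) {
        return String.format(Locale.ROOT, "short_link_%02d", shardOf(id));
    }
}
```

Because a short code decodes back to its ID, the service can work out the target table from the code alone, without a separate routing lookup.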

5.5 Business Monitoring

Prometheus collects metrics such as request rates for short‑link generation, long‑link retrieval, and security checks, providing real‑time insight for operational decisions.

6 Conclusion

Through extensive research and practice, Zhuanzhuan's short‑link platform delivers efficient and secure link services, and will continue to innovate to meet evolving user needs.
