Design and Implementation of a High‑Performance URL Shortening Platform
This article details the architecture, core algorithms, security measures, and performance optimizations of a URL shortener platform, covering hash functions, distributed ID generation, Base62 encoding, caching, database indexing, sharding, and monitoring to achieve efficient and secure link redirection.
1 Background
Zhuanzhuan is a leading second‑hand trading platform in China, where links are essential for user interaction and information exchange.
2 Working Principle
2.1 Short‑Link Generation and Storage
When a long URL is received, the platform first checks for an existing mapping using an MD5 hash; if none exists, it generates a unique ID via a segment‑allocation mode, encodes it with Base62, and persists the mapping for later lookup.
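The generation flow above can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the platform's actual code: the class name `ShortLinkService`, the two maps (standing in for database tables), and the `AtomicLong` (standing in for the segment allocator) are all hypothetical.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: in-memory maps stand in for the database.
final class ShortLinkService {
    private static final String ALPHABET =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    private final Map<String, String> codeByMd5 = new ConcurrentHashMap<>();
    private final Map<String, String> urlByCode = new ConcurrentHashMap<>();
    // Assumption: a plain counter replaces the segment-based ID allocator.
    private final AtomicLong nextId = new AtomicLong(1_000_000);

    /** Returns the existing short code for this URL, or creates one. */
    String shorten(String longUrl) {
        String fingerprint = md5Hex(longUrl);
        // Only the first request for a given fingerprint allocates an ID.
        return codeByMd5.computeIfAbsent(fingerprint, key -> {
            String code = base62(nextId.getAndIncrement());
            urlByCode.put(code, longUrl);
            return code;
        });
    }

    /** Returns the long URL for a short code, or null if unknown. */
    String resolve(String code) {
        return urlByCode.get(code);
    }

    static String md5Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                .digest(text.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }

    static String base62(long id) {
        StringBuilder sb = new StringBuilder();
        do {
            sb.append(ALPHABET.charAt((int) (id % 62)));
            id /= 62;
        } while (id != 0);
        return sb.reverse().toString();
    }
}
```

The sketch treats equal MD5 values as equal URLs. A production version would presumably also compare the stored long URL, since two different URLs can in principle share a fingerprint.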
2.2 Short‑Link Return and Distribution
The generated short link is returned to the business side, which can embed it in webpages, SMS, or social media for user access.
2.3 User Click and Redirection
Upon a user click, the platform looks up the short link, retrieves the original long URL, and redirects the user, requiring fast data retrieval and redirection mechanisms.
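The choice of status code matters. HTTP 301 (permanent) redirects may be cached by browsers, so repeat clicks can bypass the service and make click statistics inaccurate; 302 (temporary) redirects are not cached by default, so every click reaches the short‑link service, which keeps statistics accurate at the cost of higher load.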
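The lookup step can be illustrated with a small resolver that decodes the short code back to its numeric ID and answers with a 302. This is a sketch under assumptions: `RedirectResolver`, its `Response` type, and the in-memory map are hypothetical, and the alphabet order matches the encoder in section 3.3.

```java
import java.util.Map;

// Hypothetical sketch of the click path: code -> ID -> long URL -> 302.
final class RedirectResolver {
    private static final String ALPHABET =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    static final class Response {
        final int status;
        final String location; // null when the short link is unknown

        Response(int status, String location) {
            this.status = status;
            this.location = location;
        }
    }

    private final Map<Long, String> urlById; // stand-in for cache + database

    RedirectResolver(Map<Long, String> urlById) {
        this.urlById = urlById;
    }

    /** Decodes a Base62 code; at most 10 characters always fit in a long. */
    static long decode(String code) {
        if (code.isEmpty() || code.length() > 10) {
            throw new IllegalArgumentException("bad code length: " + code);
        }
        long id = 0;
        for (char c : code.toCharArray()) {
            int digit = ALPHABET.indexOf(c);
            if (digit < 0) {
                throw new IllegalArgumentException("not Base62: " + code);
            }
            id = id * 62 + digit;
        }
        return id;
    }

    /** Returns 302 with the long URL, or 404 for malformed or unknown codes. */
    Response resolve(String code) {
        String url;
        try {
            url = urlById.get(decode(code));
        } catch (IllegalArgumentException e) {
            url = null;
        }
        return url == null ? new Response(404, null) : new Response(302, url);
    }
}
```

Rejecting malformed codes before any storage lookup keeps obviously invalid requests away from the cache and database.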
3 Core Algorithms
3.1 Hash Algorithms
3.1.1 MD5
MD5 produces a 128‑bit hash used as a basic fingerprint for long URLs.
3.1.2 SHA‑256
SHA‑256 offers stronger security but yields longer hashes, affecting short‑link length.
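The length difference between the two digests is easy to see with the JDK's `MessageDigest`. The helper below is a sketch; the class name `UrlFingerprint` is an assumption made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: hex fingerprint of a URL with a chosen algorithm.
final class UrlFingerprint {
    static String hex(String algorithm, String input) {
        try {
            byte[] digest = MessageDigest.getInstance(algorithm)
                .digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b)); // two hex characters per byte
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(algorithm + " unavailable", e);
        }
    }
}
```

`UrlFingerprint.hex("MD5", url)` yields 32 hex characters (128 bits), while `UrlFingerprint.hex("SHA-256", url)` yields 64 (256 bits). Either is far longer than a practical short code, which is why the hash serves as a lookup fingerprint rather than as the short link itself.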
3.2 Distributed ID
Deriving the short code directly from a hash risks collisions and produces long codes, so the platform instead generates a unique numeric ID for each link and encodes that.
3.2.1 Global Auto‑Increment
Auto‑increment IDs (e.g., MySQL primary key or Redis INCR) provide a simple, efficient way to generate unique IDs.
3.2.2 Segment Mode
Each node receives a range of IDs; the node increments locally until the segment is exhausted, then requests a new segment, ensuring global uniqueness.
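A minimal sketch of segment mode follows. The names `SegmentIdGenerator` and `SegmentStore` are hypothetical, and an `AtomicLong` stands in for the central store (in practice typically a database row updated atomically).

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of segment-mode ID generation.
final class SegmentIdGenerator {

    // Stand-in for the central store that hands out disjoint ranges.
    static final class SegmentStore {
        private final AtomicLong maxAllocated = new AtomicLong(0);

        /** Reserves a fresh range [start, start + step) and returns start. */
        long allocate(int step) {
            return maxAllocated.getAndAdd(step) + 1;
        }
    }

    private final SegmentStore store;
    private final int step;
    private long next; // next ID to hand out
    private long end;  // exclusive upper bound of the current segment

    SegmentIdGenerator(SegmentStore store, int step) {
        this.store = store;
        this.step = step;
        // next == end marks the segment as exhausted, so the first call fetches one.
    }

    synchronized long nextId() {
        if (next == end) {
            next = store.allocate(step);
            end = next + step;
        }
        return next++;
    }
}
```

Each node touches the central store once per `step` IDs, so a larger step lowers store traffic at the cost of larger gaps when a node restarts with an unused remainder.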
3.2.3 SnowFlake
SnowFlake splits a 64‑bit integer into timestamp, data‑center ID, machine ID, and sequence‑number fields, producing unique, roughly time‑ordered IDs without central coordination; because it depends on the system clock, it is vulnerable to clock rollback.
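The bit layout can be sketched as below. The field widths (41‑bit timestamp, 5‑bit data‑center ID, 5‑bit machine ID, 12‑bit sequence) follow the commonly cited layout and are an assumption here, as are the class name and the custom epoch; the article does not state which widths the platform would use.

```java
// Hypothetical SnowFlake-style generator with an assumed 41/5/5/12 bit layout.
final class SnowflakeIdGenerator {
    private static final long EPOCH = 1_577_836_800_000L; // 2020-01-01T00:00:00Z, arbitrary
    private static final int DATACENTER_BITS = 5;
    private static final int WORKER_BITS = 5;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1;

    private final long datacenterId;
    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeIdGenerator(long datacenterId, long workerId) {
        if (datacenterId < 0 || datacenterId >= (1L << DATACENTER_BITS)
                || workerId < 0 || workerId >= (1L << WORKER_BITS)) {
            throw new IllegalArgumentException("datacenterId/workerId out of range");
        }
        this.datacenterId = datacenterId;
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // Clock rollback: refuse to issue IDs rather than risk duplicates.
            throw new IllegalStateException(
                "clock moved backwards by " + (lastTimestamp - now) + " ms");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // Sequence exhausted for this millisecond: wait for the next one.
                while (now <= lastTimestamp) {
                    now = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << (DATACENTER_BITS + WORKER_BITS + SEQUENCE_BITS))
            | (datacenterId << (WORKER_BITS + SEQUENCE_BITS))
            | (workerId << SEQUENCE_BITS)
            | sequence;
    }
}
```

Throwing on rollback is only one possible policy; waiting until the clock catches up is another. Either way, the generator must not reuse a timestamp it has already issued IDs for.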
3.3 Base62 Encoding
Base62 uses 62 characters (0‑9, A‑Z, a‑z) to produce compact, readable strings; a 6‑character Base62 string can represent 62^6 = 56,800,235,584 values, about 56.8 billion.
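public class Base62Encoder {
    // Digits, then uppercase, then lowercase: the character at index i encodes the value i.
    private static final String BASE62_CHARACTERS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    /** Encodes a non-negative ID as a Base62 string, most significant digit first. */
    public static String encode(long num) {
        if (num < 0) {
            throw new IllegalArgumentException("num must be non-negative: " + num);
        }
        StringBuilder sb = new StringBuilder();
        do {
            int remainder = (int) (num % 62);
            sb.append(BASE62_CHARACTERS.charAt(remainder)); // least significant digit first
            num /= 62;
        } while (num != 0);
        return sb.reverse().toString();
    }
}
4 Security and Protection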
4.1 Long‑Link Legitimacy Validation
Before shortening, the platform validates the original URL’s domain against a whitelist and checks query‑parameter domains to prevent malicious links.
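A whitelist check of this kind might look like the sketch below. The class `LongUrlValidator`, the exact-match host policy, and the rule that any query value starting with `http://` or `https://` is treated as a nested URL are all assumptions for illustration; the article does not describe the platform's actual rules.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Set;

// Hypothetical whitelist validator for long URLs.
final class LongUrlValidator {
    private final Set<String> allowedHosts; // lowercase host names

    LongUrlValidator(Set<String> allowedHosts) {
        this.allowedHosts = allowedHosts;
    }

    boolean isAllowed(String url) {
        URI uri = parse(url);
        if (uri == null || !hostAllowed(uri)) {
            return false;
        }
        String query = uri.getRawQuery();
        if (query == null) {
            return true;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String value;
            try {
                value = URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            } catch (IllegalArgumentException e) {
                return false; // malformed percent-encoding
            }
            String lower = value.toLowerCase(Locale.ROOT);
            if (lower.startsWith("http://") || lower.startsWith("https://")) {
                // A query parameter carrying a URL must also point at an allowed host.
                URI nested = parse(value);
                if (nested == null || !hostAllowed(nested)) {
                    return false;
                }
            }
        }
        return true;
    }

    private static URI parse(String text) {
        try {
            return new URI(text);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    private boolean hostAllowed(URI uri) {
        String scheme = uri.getScheme();
        String host = uri.getHost();
        if (scheme == null || host == null) {
            return false;
        }
        boolean web = scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https");
        return web && allowedHosts.contains(host.toLowerCase(Locale.ROOT));
    }
}
```

The sketch does not cover every redirect trick (for example protocol‑relative values such as `//host/path`, or doubly encoded parameters), so a real validator would need a broader rule set.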
4.2 Duplicate Short‑Link Prevention
Generation is idempotent: because the MD5 of the long URL is checked first, repeated requests for the same URL return the same short link instead of consuming a new ID and creating a duplicate mapping.
4.3 Short‑Link Validity Verification
The service quickly checks the database to confirm whether a short link exists; if not, it returns an error response.
5 System Performance Optimization
5.1 Database Indexing
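The unique ID serves as the primary key, so decoding a short link to its ID turns validity checks and redirection into primary‑key lookups; the MD5 hash of the long URL is indexed to accelerate the duplicate check performed at generation time.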
5.2 Cache Utilization
Redis is used as a distributed cache to store short‑link mappings, reducing database load and improving response time under high concurrency.
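The read path described here is commonly implemented as a cache‑aside lookup, sketched below. To avoid guessing at a particular Redis client API, a `ConcurrentHashMap` stands in for Redis and a `Function` stands in for the database query; both, along with the class name `CachedLinkLookup`, are assumptions.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache-aside lookup: try the cache, fall back to the database.
final class CachedLinkLookup {
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Function<String, Optional<String>> database;          // stand-in for the DB query

    CachedLinkLookup(Function<String, Optional<String>> database) {
        this.database = database;
    }

    Optional<String> find(String shortCode) {
        String cached = cache.get(shortCode);
        if (cached != null) {
            return Optional.of(cached); // cache hit: no database access
        }
        Optional<String> fromDb = database.apply(shortCode);
        fromDb.ifPresent(url -> cache.put(shortCode, url)); // populate on miss
        return fromDb;
    }
}
```

In this sketch misses are not cached, so repeated requests for a nonexistent code reach the database every time. A real deployment would likely also set expiry times on cached entries.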
5.3 Segment‑Mode Optimization
A monitoring thread pre‑allocates new ID segments when usage crosses a threshold, preventing bottlenecks during peak traffic.
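One way to approximate this pre‑allocation is sketched below. It simplifies the description above: instead of a dedicated monitoring thread, the caller that crosses the threshold starts an asynchronous fetch, and the next segment is swapped in when the current one runs out. The class name, the `Supplier` standing in for the segment store, and the threshold parameter are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

// Hypothetical sketch: prefetch the next ID segment before the current one runs out.
final class PrefetchingSegmentGenerator {
    private final Supplier<Long> fetchSegmentStart; // returns the first ID of a fresh segment
    private final int step;                          // IDs per segment
    private final double threshold;                  // e.g. 0.8 = prefetch at 80% usage
    private long next;
    private long end;                                // exclusive
    private CompletableFuture<Long> pending;         // start of the prefetched segment, if any

    PrefetchingSegmentGenerator(Supplier<Long> fetchSegmentStart, int step, double threshold) {
        this.fetchSegmentStart = fetchSegmentStart;
        this.step = step;
        this.threshold = threshold;
        this.next = fetchSegmentStart.get();
        this.end = next + step;
    }

    synchronized long nextId() {
        if (next == end) {
            // Current segment exhausted: use the prefetched one, waiting only if it is still loading.
            long start = (pending != null) ? pending.join() : fetchSegmentStart.get();
            pending = null;
            next = start;
            end = start + step;
        }
        long id = next++;
        long used = step - (end - next);
        if (pending == null && used >= step * threshold) {
            // Usage crossed the threshold: load the next segment in the background.
            pending = CompletableFuture.supplyAsync(fetchSegmentStart);
        }
        return id;
    }
}
```

With the fetch started early, the caller that exhausts a segment usually finds the next one already loaded, so the round trip to the segment store stays off the request path during peak traffic.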
5.4 Table Sharding
Link records are sharded into 64 tables based on ID modulo 64, distributing load and enhancing scalability.
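Routing a record to its table is a one‑line computation. In the sketch below the class name `ShardRouter` and the table‑name prefix `short_link_` are hypothetical; only the modulo‑64 rule comes from the text.

```java
// Hypothetical shard router: ID modulo 64 selects the table.
final class ShardRouter {
    static final int TABLE_COUNT = 64;

    /** Returns the shard index in [0, 63]; floorMod keeps it non-negative. */
    static int shardOf(long id) {
        return (int) Math.floorMod(id, (long) TABLE_COUNT);
    }

    /** Returns a table name such as short_link_02 (naming scheme is assumed). */
    static String tableFor(long id) {
        return String.format("short_link_%02d", shardOf(id));
    }
}
```

Because IDs are issued sequentially within a segment, consecutive links land in consecutive tables, which spreads writes evenly across the 64 shards.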
5.5 Business Monitoring
Prometheus collects metrics such as request rates for short‑link generation and retrieval, as well as security‑check statistics, providing real‑time insight for operations.
6 Conclusion
Through extensive research and practice, Zhuanzhuan’s short‑link platform delivers efficient and secure link services, and will continue to innovate to meet evolving user needs.
Zhuanzhuan Tech
A platform for Zhuanzhuan R&D and industry peers to learn and exchange technology, regularly sharing frontline experience and cutting‑edge topics. We welcome practical discussions and sharing; contact waterystone with any questions.