Evolution and Optimization of Bilibili Membership Ticketing System for High‑Concurrency Scenarios
Bilibili’s ticketing platform evolved from a single-transaction, synchronous design to an asynchronous batch system and finally to a Redis-cached inventory layer, along the way adding DB isolation changes, sharding, Bloom-filter protection, and adaptive rate limiting. Together these changes enabled the system to absorb up to 930 k requests per second and deliver stable high-concurrency ticket sales.
Background
Bilibili Membership Ticketing started from zero in 2017 and now covers most 2D/2.5D exhibitions and performances nationwide. Core services include flash ticket sales (抢票), movie tickets, seat selection, check-in tools, and fast settlement. Together these services form a complete end-to-end ticketing flow.
Challenges
Rapid growth of exhibition and movie traffic creates high-concurrency flash-sale (抢购) scenarios in which inventory is far lower than demand. The original synchronous architecture suffered from DB bottlenecks, large transactions, and lock contention, leading to performance degradation and even cascading service failures.
Evolution Goals
1. Reduce traffic peaks on the order-creation path.
2. Reduce database pressure.
3. Shorten user-perceived waiting time.
Evolution Process
The order‑creation pipeline was iterated in three stages:
3.1 Initial Version – Synchronous Transaction Processing
All requests performed real-time inventory deduction and order insertion within a single DB transaction. This sufficed for normal traffic but failed under flash sales (抢购): large transactions blocked the DB connection pool and created row-level lock hot spots.
Problems identified:
High‑concurrency large‑transaction blocking – DB connection pool exhaustion.
Row-level lock contention – InnoDB deadlocks and a deduction failure rate above 30%.
No elastic scaling – risk of service avalanche.
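The hot-spot behavior can be illustrated with a small simulation (a hypothetical class, not Bilibili's code): a single mutex stands in for the InnoDB row lock on the one hot inventory row, so every request queues behind it. Correctness holds, but throughput collapses to what one lock can serve.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simulation of the synchronous design: one mutex stands in
// for the InnoDB row lock on the single hot inventory row.
public class HotRowContention {
    private final Object rowLock = new Object(); // the "row lock" every request fights over
    private int stock;
    private final AtomicInteger sold = new AtomicInteger();

    public HotRowContention(int stock) { this.stock = stock; }

    // Every deduction serializes here, just as every order serialized on the hot row.
    public boolean deduct() {
        synchronized (rowLock) {
            if (stock <= 0) return false; // sold out
            stock--;
            sold.incrementAndGet();
            return true;
        }
    }

    public int sold() { return sold.get(); }

    public int remaining() { synchronized (rowLock) { return stock; } }

    // Fire `requests` concurrent deductions against `stock` units of inventory.
    public static HotRowContention run(int stock, int requests) {
        HotRowContention inv = new HotRowContention(stock);
        ExecutorService pool = Executors.newFixedThreadPool(32);
        for (int i = 0; i < requests; i++) pool.execute(() -> inv.deduct());
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return inv;
    }
}
```

No oversell occurs, but with 32 worker threads the effective parallelism on the deduction path is still 1, which is the failure mode the next iterations address.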
3.2 Asynchronous Order – Peak‑Shaving
To decouple request receipt from transaction execution, an asynchronous batch‑processing layer was introduced.
Solution Overview
The front end obtains an order token and polls for the result (5–8 s on average).
The back end batches inventory freezes, validates coupons in parallel, and performs bulk SQL inserts.
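A minimal sketch of the batching idea (hypothetical OrderBatcher, names not from the article): requests accumulate in a buffer, and each full buffer becomes one bulk write, so N single-row transactions collapse into roughly N / batchSize bulk inserts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical peak-shaving batcher: buffered order tokens are flushed
// as one bulk write once the batch-size threshold is reached.
public class OrderBatcher {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> flushedBatches = new ArrayList<>(); // stands in for bulk INSERTs

    public OrderBatcher(int batchSize) { this.batchSize = batchSize; }

    // Enqueue one order token; flush automatically when the buffer fills up.
    public synchronized void submit(String orderToken) {
        buffer.add(orderToken);
        if (buffer.size() >= batchSize) flush();
    }

    // One flush corresponds to one bulk SQL INSERT instead of many single-row transactions.
    public synchronized void flush() {
        if (buffer.isEmpty()) return;
        flushedBatches.add(new ArrayList<>(buffer));
        buffer.clear();
    }

    public synchronized int flushCount() { return flushedBatches.size(); }
}
```

A production batcher would also flush on a timer, so a partially filled buffer never waits indefinitely; that timer is what bounds the polling latency users observe.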
A key snippet showing why SELECT ... FOR UPDATE alone cannot enforce a per-user order limit:

-- Transaction A
BEGIN;
SELECT * FROM orders
WHERE user_id=1 AND product_id=100
FOR UPDATE; -- locks matching rows; if the user has no prior order, nothing is returned and no row is locked
-- The lock therefore takes no effect, and other transactions can still insert an order matching the same condition.

With batch processing in place, the user experience suffers from polling latency, but DB pressure is dramatically reduced.
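The race above (check-then-insert on a row that does not exist yet) needs an atomic "insert if absent", which is exactly the guarantee a unique index provides at the DB layer. A hypothetical in-process analogy using ConcurrentHashMap.putIfAbsent:

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical analogy: putIfAbsent is the atomic "insert if absent" that
// SELECT ... FOR UPDATE cannot provide when no row exists yet to lock.
public class FirstOrderGuard {
    private final ConcurrentHashMap<String, String> orders = new ConcurrentHashMap<>();

    // Returns true only for the single caller that creates the first order
    // for this (user, product) pair -- the same guarantee a unique index gives.
    public boolean tryCreate(long userId, long productId, String orderId) {
        String key = userId + ":" + productId;
        return orders.putIfAbsent(key, orderId) == null;
    }
}
```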
3.3 Redis‑Cache Inventory Deduction
Even asynchronous processing could not handle extreme spikes (e.g., 10× forecasted traffic). The new design adds a two‑layer inventory system:
90 % of stock pre‑deducted in Redis via Lua scripts (atomic).
Remaining 10 % falls back to MySQL with a circuit‑breaker and periodic stock reconciliation.
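The split can be sketched as follows (hypothetical TwoLayerInventory; atomic counters stand in for the Redis pool and the MySQL fallback, and the real system additionally wraps the fallback path in a circuit breaker and reconciles stock periodically):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical two-layer inventory: ~90% of stock lives in the cache layer,
// the remaining ~10% in the database layer used as a fallback.
public class TwoLayerInventory {
    private final AtomicInteger cacheStock; // stands in for the Redis pre-deducted pool
    private final AtomicInteger dbStock;    // stands in for the MySQL fallback pool

    public TwoLayerInventory(int total) {
        int cachePart = total * 9 / 10;
        this.cacheStock = new AtomicInteger(cachePart);
        this.dbStock = new AtomicInteger(total - cachePart);
    }

    // Try the cache layer first; fall back to the DB layer when it is exhausted.
    public boolean deduct() {
        return tryDeduct(cacheStock) || tryDeduct(dbStock);
    }

    // Lock-free compare-and-set loop, mirroring the atomicity of the Lua script.
    private static boolean tryDeduct(AtomicInteger pool) {
        int cur;
        do {
            cur = pool.get();
            if (cur <= 0) return false;
        } while (!pool.compareAndSet(cur, cur - 1));
        return true;
    }

    public int remaining() { return cacheStock.get() + dbStock.get(); }
}
```

Oversell stays impossible even under concurrency, because each layer only ever moves its counter down via a compare-and-set.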
Sample Lua script:
-- Lua脚本实现原子化操作(示例)
local key = "limit:user_1:product_100"
local limit = 2
local current = redis.call('GET', key) or 0
if tonumber(current) < limit then
redis.call('INCR', key)
return "OK"
else
return "EXCEED_LIMIT"
endAdditional improvements:
Unique index for strict per-user-product limits:
ALTER TABLE orders ADD UNIQUE INDEX idx_user_product (user_id, product_id);
Bloom filter + multi-level cache to block cache-penetration attacks:
// Initialize the Bloom filter
BloomFilter<CharSequence> bloomFilter = BloomFilter.create(
    Funnels.stringFunnel(Charset.forName("UTF-8")), 1000000, 0.01);
if (!bloomFilter.mightContain(key)) {
    return null; // reject the request outright
}
Hash-based sharding for large keys (Math.floorMod avoids a negative shard when hashCode() is Integer.MIN_VALUE):
int shard = Math.floorMod(key.hashCode(), 1024);
String shardKey = "user:" + shard + ":" + userId;
Range-based sharding for order IDs:
String shardKey = "order:" + (orderId >> 16) + ":" + orderId;
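To show the penetration-blocking idea without the Guava dependency, here is a toy Bloom filter (a stand-in, not the production code): a false answer means the key was definitely never loaded, so the request can be rejected before it touches Redis or MySQL.

```java
import java.util.BitSet;

// Toy Bloom filter (stand-in for Guava's): k bit positions are derived from
// two base hashes (Kirsch-Mitzenmacher double hashing).
public class TinyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    public TinyBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // forced odd so the probe sequence never degenerates
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String key) {
        for (int i = 0; i < hashes; i++) bits.set(index(key, i));
    }

    // false = definitely absent (safe to reject); true = possibly present.
    public boolean mightContain(String key) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(key, i))) return false;
        }
        return true;
    }
}
```

Bloom filters never report a false negative, which is what makes the early reject safe; false positives merely let a small fraction of requests through to the next cache level.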
Database Isolation Tuning
The team switched from REPEATABLE READ (RR) to READ COMMITTED (RC) to reduce lock granularity. RC eliminates gap and next-key locks, increasing concurrency at the cost of accepting semi-consistent reads.
Connection‑Pool & Rate Limiting
Redis Lettuce pool tuned for peak QPS:
spring.redis.lettuce.pool:
  max-active: 20   # sized from QPS: (average request latency in ms * peak QPS) / 1000
  max-idle: 10
  min-idle: 5
  max-wait: 50ms   # waits beyond this threshold trigger scale-out
  test-while-idle: true
  time-between-eviction-runs: 60s

Guava RateLimiter pre-heat (warm-up) implementation:
// Warm-up limiter: the permit rate ramps up to permitsPerSecond over warmupPeriod
RateLimiter limiter = RateLimiter.create(permitsPerSecond, warmupPeriod, timeUnit);

Dynamic throttling policies are defined as:
Request volume        | Policy
----------------------|--------------------
< 50% of threshold    | Process normally
50%–80% of threshold  | Delay the response
80%–100% of threshold | Serve cached data
> 100% of threshold   | Reject outright

Results
After the optimizations, the system handled a CDN peak of 930 k req/s and a service peak of 300 k req/s during the 2024 BW event, achieving stable ticket sales and meeting business targets.
Conclusion
Iterative backend refactoring (synchronous → asynchronous → Redis cache), combined with DB isolation tuning, sharding, Bloom filters, and adaptive rate limiting, significantly improved order throughput and stability, while still leaving room for further enhancements.
Bilibili Tech
Provides introductions and tutorials on Bilibili-related technologies.