Evolution and Optimization of Bilibili Membership Ticketing System for High‑Concurrency Scenarios
Bilibili’s ticketing platform evolved from a single-transaction, synchronous design to an asynchronous batch system and finally to a Redis-cached inventory layer, along the way adding DB isolation changes, sharding, Bloom-filter protection, and adaptive rate limiting. Together these changes enabled the system to absorb up to 930 k requests per second and deliver stable high-concurrency ticket sales.
Background
Bilibili Membership Ticketing started from zero in 2017 and now covers most 2D/2.5D exhibitions and performances nationwide. Core services include flash ticket sales (抢票), movie tickets, seat selection, check-in tools, and fast settlement. Together these services form a complete end-to-end ticketing flow.
Challenges
Rapid growth of exhibition and movie traffic creates high-concurrency flash-sale (抢购) scenarios in which inventory is far lower than demand. The original synchronous architecture suffered from DB bottlenecks, large transactions, and lock contention, leading to performance degradation and even cascading service failures.
Evolution Goals
1. Reduce traffic peaks on the order-creation path.
2. Reduce database pressure.
3. Shorten user-perceived waiting time.
Evolution Process
The order‑creation pipeline was iterated in three stages:
3.1 Initial Version – Synchronous Transaction Processing
All requests performed real-time inventory deduction and order insertion within a single DB transaction. This sufficed for normal traffic but failed under flash sales (抢购): large transactions blocked the DB connection pool and created row-level lock hot spots.
Problems identified:
High‑concurrency large‑transaction blocking – DB connection pool exhaustion.
Row-level lock contention – InnoDB deadlocks and a deduction failure rate above 30%.
No elastic scaling – risk of service avalanche.
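The hot-spot behavior can be illustrated with a small simulation (a hypothetical class, not Bilibili's code): a single mutex stands in for the InnoDB row lock on the one hot inventory row, so every request queues behind it. Correctness holds, but throughput collapses to what one lock can serve.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simulation of the synchronous design: one mutex stands in
// for the InnoDB row lock on the single hot inventory row.
public class HotRowContention {
    private final Object rowLock = new Object(); // the "row lock" every request fights over
    private int stock;
    private final AtomicInteger sold = new AtomicInteger();

    public HotRowContention(int stock) { this.stock = stock; }

    // Every deduction serializes here, just as every order serialized on the hot row.
    public boolean deduct() {
        synchronized (rowLock) {
            if (stock <= 0) return false; // sold out
            stock--;
            sold.incrementAndGet();
            return true;
        }
    }

    public int sold() { return sold.get(); }

    public int remaining() { synchronized (rowLock) { return stock; } }

    // Fire `requests` concurrent deductions against `stock` units of inventory.
    public static HotRowContention run(int stock, int requests) {
        HotRowContention inv = new HotRowContention(stock);
        ExecutorService pool = Executors.newFixedThreadPool(32);
        for (int i = 0; i < requests; i++) pool.execute(() -> inv.deduct());
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return inv;
    }
}
```

No oversell occurs, but with 32 worker threads the effective parallelism on the deduction path is still 1, which is the failure mode the next iterations address.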
3.2 Asynchronous Order – Peak‑Shaving
To decouple request receipt from transaction execution, an asynchronous batch‑processing layer was introduced.
Solution Overview
The front end obtains an order token and polls for the result (5–8 s on average).
The back end batches inventory freezes, validates coupons in parallel, and performs bulk SQL inserts.
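A minimal sketch of the batching idea (hypothetical OrderBatcher, names not from the article): requests accumulate in a buffer, and each full buffer becomes one bulk write, so N single-row transactions collapse into roughly N / batchSize bulk inserts.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical peak-shaving batcher: buffered order tokens are flushed
// as one bulk write once the batch-size threshold is reached.
public class OrderBatcher {
    private final int batchSize;
    private final List<String> buffer = new ArrayList<>();
    private final List<List<String>> flushedBatches = new ArrayList<>(); // stands in for bulk INSERTs

    public OrderBatcher(int batchSize) { this.batchSize = batchSize; }

    // Enqueue one order token; flush automatically when the buffer fills up.
    public synchronized void submit(String orderToken) {
        buffer.add(orderToken);
        if (buffer.size() >= batchSize) flush();
    }

    // One flush corresponds to one bulk SQL INSERT instead of many single-row transactions.
    public synchronized void flush() {
        if (buffer.isEmpty()) return;
        flushedBatches.add(new ArrayList<>(buffer));
        buffer.clear();
    }

    public synchronized int flushCount() { return flushedBatches.size(); }
}
```

A production batcher would also flush on a timer, so a partially filled buffer never waits indefinitely; that timer is what bounds the polling latency users observe.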
A key snippet showing why SELECT ... FOR UPDATE alone cannot enforce a per-user order limit:

-- Transaction A
BEGIN;
SELECT * FROM orders
WHERE user_id=1 AND product_id=100
FOR UPDATE; -- locks matching rows; if the user has no prior order, nothing is returned and no row is locked
-- The lock therefore takes no effect, and other transactions can still insert an order matching the same condition.

With batch processing in place, the user experience suffers from polling latency, but DB pressure is dramatically reduced.
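The race above (check-then-insert on a row that does not exist yet) needs an atomic "insert if absent", which is exactly the guarantee a unique index provides at the DB layer. A hypothetical in-process analogy using ConcurrentHashMap.putIfAbsent:

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical analogy: putIfAbsent is the atomic "insert if absent" that
// SELECT ... FOR UPDATE cannot provide when no row exists yet to lock.
public class FirstOrderGuard {
    private final ConcurrentHashMap<String, String> orders = new ConcurrentHashMap<>();

    // Returns true only for the single caller that creates the first order
    // for this (user, product) pair -- the same guarantee a unique index gives.
    public boolean tryCreate(long userId, long productId, String orderId) {
        String key = userId + ":" + productId;
        return orders.putIfAbsent(key, orderId) == null;
    }
}
```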
3.3 Redis‑Cache Inventory Deduction
Even asynchronous processing could not handle extreme spikes (e.g., 10× forecasted traffic). The new design adds a two‑layer inventory system:
90 % of stock pre‑deducted in Redis via Lua scripts (atomic).
Remaining 10 % falls back to MySQL with a circuit‑breaker and periodic stock reconciliation.
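The split can be sketched as follows (hypothetical TwoLayerInventory; atomic counters stand in for the Redis pool and the MySQL fallback, and the real system additionally wraps the fallback path in a circuit breaker and reconciles stock periodically):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical two-layer inventory: ~90% of stock lives in the cache layer,
// the remaining ~10% in the database layer used as a fallback.
public class TwoLayerInventory {
    private final AtomicInteger cacheStock; // stands in for the Redis pre-deducted pool
    private final AtomicInteger dbStock;    // stands in for the MySQL fallback pool

    public TwoLayerInventory(int total) {
        int cachePart = total * 9 / 10;
        this.cacheStock = new AtomicInteger(cachePart);
        this.dbStock = new AtomicInteger(total - cachePart);
    }

    // Try the cache layer first; fall back to the DB layer when it is exhausted.
    public boolean deduct() {
        return tryDeduct(cacheStock) || tryDeduct(dbStock);
    }

    // Lock-free compare-and-set loop, mirroring the atomicity of the Lua script.
    private static boolean tryDeduct(AtomicInteger pool) {
        int cur;
        do {
            cur = pool.get();
            if (cur <= 0) return false;
        } while (!pool.compareAndSet(cur, cur - 1));
        return true;
    }

    public int remaining() { return cacheStock.get() + dbStock.get(); }
}
```

Oversell stays impossible even under concurrency, because each layer only ever moves its counter down via a compare-and-set.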
Sample Lua script:
-- Lua脚本实现原子化操作(示例)
local key = "limit:user_1:product_100"
local limit = 2
local current = redis.call('GET', key) or 0
if tonumber(current) < limit then
redis.call('INCR', key)
return "OK"
else
return "EXCEED_LIMIT"
endAdditional improvements:
Unique index for strict per-user-product limits:
ALTER TABLE orders ADD UNIQUE INDEX idx_user_product (user_id, product_id);
Bloom filter + multi-level cache to block cache-penetration attacks:
// Initialize the Bloom filter
BloomFilter<CharSequence> bloomFilter = BloomFilter.create(
    Funnels.stringFunnel(Charset.forName("UTF-8")), 1000000, 0.01);
if (!bloomFilter.mightContain(key)) {
    return null; // reject the request outright
}
Hash-based sharding for large keys (Math.floorMod avoids a negative shard when hashCode() is Integer.MIN_VALUE):
int shard = Math.floorMod(key.hashCode(), 1024);
String shardKey = "user:" + shard + ":" + userId;
Range-based sharding for order IDs:
String shardKey = "order:" + (orderId >> 16) + ":" + orderId;
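To show the penetration-blocking idea without the Guava dependency, here is a toy Bloom filter (a stand-in, not the production code): a false answer means the key was definitely never loaded, so the request can be rejected before it touches Redis or MySQL.

```java
import java.util.BitSet;

// Toy Bloom filter (stand-in for Guava's): k bit positions are derived from
// two base hashes (Kirsch-Mitzenmacher double hashing).
public class TinyBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    public TinyBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // forced odd so the probe sequence never degenerates
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String key) {
        for (int i = 0; i < hashes; i++) bits.set(index(key, i));
    }

    // false = definitely absent (safe to reject); true = possibly present.
    public boolean mightContain(String key) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(key, i))) return false;
        }
        return true;
    }
}
```

Bloom filters never report a false negative, which is what makes the early reject safe; false positives merely let a small fraction of requests through to the next cache level.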
Database Isolation Tuning
The team switched from REPEATABLE READ (RR) to READ COMMITTED (RC) to reduce lock granularity. RC eliminates gap and next-key locks, increasing concurrency at the cost of accepting semi-consistent reads.
Connection‑Pool & Rate Limiting
Redis Lettuce pool tuned for peak QPS:
spring.redis.lettuce.pool:
  max-active: 20   # sized from QPS: (average request latency in ms * peak QPS) / 1000
  max-idle: 10
  min-idle: 5
  max-wait: 50ms   # waits beyond this threshold trigger scale-out
  test-while-idle: true
  time-between-eviction-runs: 60s

Guava RateLimiter pre-heat (warm-up) implementation:
// Warm-up limiter: the permit rate ramps up to permitsPerSecond over warmupPeriod
RateLimiter limiter = RateLimiter.create(permitsPerSecond, warmupPeriod, timeUnit);

Dynamic throttling policies are defined as:
Request volume        | Policy
----------------------|--------------------
< 50% of threshold    | Process normally
50%–80% of threshold  | Delay the response
80%–100% of threshold | Serve cached data
> 100% of threshold   | Reject outright

Results
After the optimizations, the system handled a CDN peak of 930 k req/s and a service peak of 300 k req/s during the 2024 BW event, achieving stable ticket sales and meeting business targets.
Conclusion
Iterative backend refactoring (synchronous → asynchronous → Redis cache), combined with DB isolation tuning, sharding, Bloom filters, and adaptive rate limiting, significantly improved order throughput and stability, while still leaving room for further enhancements.
Bilibili Tech
Provides introductions and tutorials on Bilibili-related technologies.