Design and Implementation of a High‑Performance, High‑Availability OTT Advertising System
This article presents the research and implementation of a high‑performance, highly available OTT advertising platform, detailing its system architecture, key technologies such as multi‑domain routing, containerized modules, two‑level caching, double‑layer Boolean retrieval, dynamic CDN scheduling, graceful degradation, and comprehensive disaster‑recovery strategies.
Source: Comet Plan
Abstract
In China’s online video industry, television screens carry more authority than PC or mobile screens; users trust TV ads more, giving TV enormous monetization potential. A high‑performance, high‑availability OTT advertising system can therefore generate substantial commercial value. This article reports research outcomes from the “Micro‑Whale Advertising Platform” developed in‑house at the author’s company.
Introduction
The era of household internet is arriving, with smart TVs becoming the centerpiece of the living‑room economy. By 2016, Chinese internet TV penetration was projected to exceed 130 million devices, with 57.7% of OTT users belonging to three‑person families, meaning a single ad exposure can reach multiple viewers. Users are concentrated mainly in East and South China, are increasingly young (66% aged 20‑40), well‑educated, and have strong purchasing power, representing a high‑end consumer segment. Internet TV breaks the linear broadcast model, allowing users to actively select content, and offers short ad slots that enhance ad effectiveness. A 2016 white paper shows that consumer trust in TV ads reaches 54%, the highest among all channels. However, OTT TV advertising is still in an exploratory phase, and a robust system is crucial for market development.
“Family‑Centric” Strategy and Customized Advertising
Future OTT will adopt a “family‑centric” approach, leveraging big‑data analysis to deliver personalized experiences. Based on user and content models, the system can perform precise ad targeting, turning ads into information.
Figure 2‑1 User Model and Content Model
Figure 2‑2 Big‑Data Architecture of the Advertising System
Personalized ad delivery is attractive to advertisers but imposes high performance demands. Traditional systems serve identical content to all users and rely on CDN for non‑real‑time assets. OTT personalized advertising must meet real‑time and differentiated content requirements while maintaining high performance.
OTT Advertising System Structure
Advertising System Architecture Diagram
The architecture (Figure 3‑1) consists of routing, retrieval, ranking, DMP, billing, delivery, long‑connection, transcoding, distribution, monitoring, high‑speed data channels, and terminal modules. The following sections detail the key modules and technologies.
Figure 3‑1 Advertising System Architecture
Advertising System Deployment Diagram
Figure 3‑2 Advertising System Deployment
Big‑System Small‑Do
The design follows a “big‑system small‑do” principle: complex functionality is broken into independent small services or modules, reducing coupling and simplifying maintenance. Each sub‑system can be developed and deployed by a small agile team, enabling rapid response to requirement changes and higher development efficiency.
Key Technology Description
Multi‑Domain Issuance
To ensure stability and high availability, the system uses a multi‑domain strategy to split requests, reducing the impact of domain hijacking or CDN failures. As shown in Figure 4‑1, the TV requests a domain policy from the terminal system; the terminal returns Domain A and Domain B, and the TV tests their availability before proceeding.
Figure 4‑1 Multi‑Domain Issuance Diagram
Each domain is backed by multiple CDN and service clusters to further mitigate failures.
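As a minimal sketch of this client‑side failover, assuming hypothetical domain names and a plain TCP reachability probe (the article does not specify how the TV tests availability):

```python
import socket

# Hypothetical domain policy returned by the terminal system; the real
# endpoints are not named in the article.
DOMAIN_POLICY = ["ad-a.example.com", "ad-b.example.com"]

def is_reachable(host: str, port: int = 443, timeout: float = 1.0) -> bool:
    """Probe a domain with a short TCP connect, as the TV might do."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pick_domain(policy, probe=is_reachable):
    """Return the first domain that answers; fall back to the last entry
    rather than failing hard, so the TV can still attempt a request."""
    for host in policy:
        if probe(host):
            return host
    return policy[-1]
```

Injecting the probe keeps the policy logic testable without a network; a production client would also cache the last known‑good domain.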
Container Model: Efficient Advertising Service Operation
The retrieval side adopts a container model to isolate failures, prevent single‑point loss, and absorb traffic spikes. SET units (containers) are grouped by consistent hashing; a failure in one SET affects only its own users, while the others continue serving ads. Automatic data migration (“drift”) and removal of the failed SET keep losses to a minimum.
Fault Isolation and Minimal Loss
Figure 4‑2 SET Failure Transfer Diagram
Dynamic Scaling (Expand/Reduce)
When traffic grows, a new SET can be cloned and data migrated onto it, increasing capacity by roughly 50%. Conversely, surplus SETs can be removed to save costs.
Figure 4‑3 SET Dynamic Expansion Diagram
Gray Release and A/B Testing
Each SET’s independence enables staged upgrades and A/B experiments by updating only a subset of SETs and observing metrics before full rollout.
Graceful Degradation (Flexible Availability)
Under traffic spikes, the system performs graceful degradation, dropping non‑essential services first. Two mechanisms are used:
Service Tier Management
Figure 4‑4 Service Level Diagram
Random Service Refusal
If the system cannot handle load, the retrieval side may refuse to serve ads, allowing the TV to play local ads or continue viewing without interruption.
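The two mechanisms can be sketched together. The tier table and thresholds below are illustrative assumptions; the article names the mechanisms but not the concrete services or cutoffs:

```python
import random

# Hypothetical tier table: lower tier = more essential.
SERVICE_TIERS = {"ad-serving": 0, "frequency-capping": 1,
                 "tracking": 2, "recommendation": 3}

def allowed_services(load, tiers=SERVICE_TIERS):
    """Service tier management: shed one tier of non-essential services
    per 25% of overload, never dropping tier 0 (ad serving itself)."""
    shed = max(0, int((load - 1.0) / 0.25))
    max_tier = max(0, max(tiers.values()) - shed)
    return {s for s, t in tiers.items() if t <= max_tier}

def should_serve(load, rng=random.random):
    """Random service refusal: past full load, admit requests with
    probability capacity/load; refused TVs fall back to local ads."""
    return load <= 1.0 or rng() < 1.0 / load
```

Because refusal is probabilistic rather than per‑user, no single household is consistently denied ads during an overload window.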
Two‑Level Memory Cache
To boost retrieval speed, the system uses an in‑process memory cache and a Redis cache (Figure 4‑5). The ad index tree resides in instance memory, achieving millisecond‑level lookup.
Figure 4‑5 Two‑Level Memory Cache Diagram
Frequently written data is buffered in Redis first and asynchronously persisted to the database.
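A minimal sketch of the read path through the two tiers. The in‑memory stand‑in replaces Redis so the example runs anywhere; a real `redis.Redis` client exposes the same `get`/`set` calls:

```python
class InMemoryStore:
    """Stand-in for the shared Redis tier."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        return self._data.get(key)
    def set(self, key, value):
        self._data[key] = value

class TwoLevelCache:
    def __init__(self, l2, loader):
        self._l1 = {}          # in-process memory: the ad index lives here
        self._l2 = l2          # shared Redis tier
        self._loader = loader  # database fallback, e.g. a SQL read

    def get(self, key):
        if key in self._l1:                 # L1 hit: no network round trip
            return self._l1[key]
        value = self._l2.get(key)
        if value is None:
            value = self._loader(key)       # only now touch the database
            self._l2.set(key, value)        # repopulate the shared cache
        self._l1[key] = value               # promote into process memory
        return value
```

Keeping the index in instance memory is what makes millisecond‑level lookup possible; Redis absorbs misses so the database sees only cold keys.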
Data Heterogeneity Redundancy and Asynchronous Invocation
Data Heterogeneity Redundancy
To reduce coupling, data from terminal, member, content, and feature systems are replicated into the ad retrieval system (Figure 4‑6), ensuring retrieval continues even if upstream services fail.
Figure 4‑6 Data Heterogeneity Redundancy Diagram
Asynchronous Invocation
When the retrieval side receives an ad‑count request, it sends a message to a high‑speed data channel and immediately returns, deferring index updates to the billing system (Figure 4‑7).
Figure 4‑7 Asynchronous Call Diagram
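The decoupling can be sketched with a queue standing in for the high‑speed data channel, assuming a single background consumer on the billing side (the article does not describe the channel’s internals):

```python
import queue
import threading

billing_queue = queue.Queue()  # stand-in for the high-speed data channel

def handle_ad_request(ad_id, index):
    """Retrieval path: publish the count event and return immediately,
    serving straight from the in-memory index without waiting."""
    billing_queue.put(ad_id)
    return index.get(ad_id)

def billing_worker(counts, stop):
    """Billing side: drain the channel and apply count updates later."""
    while not stop.is_set() or not billing_queue.empty():
        try:
            ad_id = billing_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        counts[ad_id] = counts.get(ad_id, 0) + 1
        billing_queue.task_done()
```

The retrieval latency is now independent of billing throughput; a slow billing system only grows the queue, it never blocks ad serving.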
Double‑Layer Boolean Retrieval: Ultra‑Fast Ad Search
Boolean Retrieval
Targeting conditions form Boolean expressions that cannot be efficiently handled by traditional inverted indexes. The expressions are converted to Disjunctive Normal Form (DNF):
a1 = (age∈{20,30} ∩ geo∈{Guangdong} ∩ gender∈{male}) ∪ (age∈{40,50} ∩ geo∈{Guangdong} ∩ gender∈{male})
a2 = (age∈{25,45} ∩ geo∈{Guangdong})
j1 = (age∈{20,30} ∩ geo∈{Guangdong} ∩ gender∈{male})
j2 = (age∈{40,50} ∩ geo∈{Guangdong} ∩ gender∈{male})
j3 = (age∈{25,45} ∩ geo∈{Guangdong})
Thus a1 = j1 ∪ j2 and a2 = j3. The double‑layer approach first prunes conjunctions whose term count exceeds the number of attributes in the request (such conjunctions can never be fully satisfied), then traverses only the remaining branches, dramatically reducing match operations (Figure 4‑9).
Figure 4‑9 Double‑Layer Boolean Retrieval Index
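The two layers can be shown directly on the conjunctions above (with 广东 rendered as Guangdong and 男 as male); the request format is an assumption for illustration:

```python
# The conjunctions j1..j3 from the worked example.
CONJUNCTIONS = {
    "j1": {"age": {20, 30}, "geo": {"Guangdong"}, "gender": {"male"}},
    "j2": {"age": {40, 50}, "geo": {"Guangdong"}, "gender": {"male"}},
    "j3": {"age": {25, 45}, "geo": {"Guangdong"}},
}
ADS = {"a1": ["j1", "j2"], "a2": ["j3"]}  # a1 = j1 ∪ j2, a2 = j3

def match_ads(request):
    """Layer 1: skip conjunctions with more terms than the request has
    attributes -- they can never be fully satisfied.
    Layer 2: test each surviving conjunction term by term."""
    hits = set()
    for cid, terms in CONJUNCTIONS.items():
        if len(terms) > len(request):        # size-based pruning
            continue
        if all(request.get(k) in vals for k, vals in terms.items()):
            hits.add(cid)
    return {ad for ad, cs in ADS.items() if any(c in hits for c in cs)}
```

For a request carrying only age and geo, j1 and j2 are eliminated in the first layer without any term comparisons; only j3 reaches the second layer.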
Fast Ad Playback Experience: Dynamic CDN Scheduling + Bitrate Adaptation + Intelligent Pre‑Cache
Dynamic CDN Scheduling
The system distributes ad assets to multiple CDNs with priority settings. If a CDN fails, a server‑side switch module and long‑connection messages trigger a rapid CDN switch (Figure 4‑10). The TV also monitors CDN speed and switches client‑side when needed.
Figure 4‑10 Dynamic CDN Scheduling Diagram
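A minimal sketch of the priority‑driven switch, assuming hypothetical CDN names and that the long‑connection message simply marks a CDN up or down (the article does not specify the message schema):

```python
# Hypothetical CDN priority table; real vendors are not named.
CDN_PRIORITY = ["cdn-primary", "cdn-backup-1", "cdn-backup-2"]

class CdnSelector:
    """Tracks which CDNs the server has marked down (via long-connection
    messages) and always serves from the highest-priority healthy one."""

    def __init__(self, priority):
        self._priority = list(priority)
        self._down = set()

    def mark_down(self, cdn):      # server-side switch message received
        self._down.add(cdn)

    def mark_up(self, cdn):        # CDN recovered
        self._down.discard(cdn)

    def current(self):
        for cdn in self._priority:
            if cdn not in self._down:
                return cdn
        return self._priority[0]   # all down: keep retrying the primary
```

Client‑side speed monitoring can feed the same `mark_down` path, so server‑pushed and locally detected failures share one switch mechanism.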
Bitrate Adaptation
Based on regional bandwidth variations (Figure 4‑11), the server transcodes ads into multiple bitrates. The client selects the appropriate bitrate according to real‑time network conditions, ensuring smooth playback.
Figure 4‑11 24‑Hour Average Bandwidth Trend
Intelligent Pre‑Cache
For poor network conditions, the system pre‑caches high‑quality ads during good connectivity (Figure 4‑12). Startup ads are cached locally in 4K; pre‑roll and pause ads are predicted and pushed to the TV for local playback, minimizing revenue loss.
Figure 4‑12 Intelligent Pre‑Cache Diagram
Long‑Connection System: High‑Speed Highway for Ad Information
Broadcast Mode
All online TVs receive messages simultaneously, ensuring minimal latency. Offline TVs receive pending messages once they reconnect.
Progressive Mode
To avoid “cliff‑shaped” request spikes (Figure 4‑13), the system batches message delivery, smoothing backend load (Figure 4‑14, Figure 4‑15).
Figure 4‑13 Cliff‑Shaped Spike Diagram
Figure 4‑14 Progressive Mode Diagram
Figure 4‑15 Progressive Mode Request Trend
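The batching can be sketched as a schedule of send offsets; batch size and interval are tuning parameters the article does not quantify:

```python
def progressive_schedule(devices, batch_size, interval_s):
    """Split the device list into batches and assign each batch a send
    offset, so resulting requests arrive as a ramp instead of a cliff."""
    plan = []
    for i in range(0, len(devices), batch_size):
        offset = (i // batch_size) * interval_s
        plan.append((offset, devices[i:i + batch_size]))
    return plan
```

For example, 10 devices with `batch_size=4` and a 30‑second interval yield three waves at t=0, 30, and 60, capping the instantaneous backend request rate at 4 devices per wave.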
Security Protection
HTTPS mutual authentication and an attack‑monitoring module protect the system from malicious traffic (Figure 4‑16).
Figure 4‑16 Security Protection Diagram
Physical Disaster Recovery
Multi‑IDC Deployment
Core services are deployed in Beijing, Hangzhou, and Shenzhen to achieve multi‑active availability (Figure 4‑17).
Figure 4‑17 Multi‑IDC Deployment Diagram
Transmission Channel Disaster Recovery
Dedicated lines are used for ad material transfer; if they fail, traffic switches to public networks or P2P channels (Figure 4‑18).
Figure 4‑18 Transmission Channel Disaster Recovery Diagram
Module Hot‑Backup Disaster Recovery
Each SET’s modules (retrieval, partition, messaging, Redis) have hot‑backup instances to avoid single‑point failures (Figure 4‑19).
Figure 4‑19 Module Hot‑Backup Diagram
Conclusion
This paper analyzes the massive commercial value of OTT advertising, outlines the immature state of the OTT ad ecosystem, and proposes a high‑performance, high‑availability OTT advertising system design. It details key aspects such as big‑system small‑do, multi‑domain issuance, container model, physical disaster recovery, graceful degradation, long‑connection architecture, double‑layer Boolean retrieval, two‑level caching, data heterogeneity, asynchronous calls, fast playback experience, and security protection. Future work will explore precise ad targeting under the “family‑centric” strategy and predictive ad placement.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.