
Design and Implementation of a High‑Performance, High‑Availability OTT Advertising System

This article presents the research and implementation of a high‑performance, highly available OTT advertising platform, detailing its system architecture, key technologies such as multi‑domain routing, containerized modules, two‑level caching, double‑layer Boolean retrieval, dynamic CDN scheduling, graceful degradation, and comprehensive disaster‑recovery strategies.

Architecture Digest
Source: Comet Plan

Abstract

In China's online video industry, the television screen carries more authority than PC or mobile screens; users trust TV ads more, giving TV enormous monetization potential. A high‑performance, highly available OTT advertising system can therefore generate substantial commercial value. This article reports research from the author's company's self‑developed "Micro‑Whale Advertising Platform".

Introduction

The era of the household internet is arriving, with smart TVs becoming the centerpiece of the living‑room economy. By 2016, internet TV penetration in China was projected to exceed 130 million devices, and 57.7% of OTT users belonged to three‑person families, meaning a single ad exposure can reach multiple viewers. Users are concentrated in East and South China, are increasingly young (66% aged 20–40), well educated, and have strong purchasing power, representing a high‑end consumer segment. Internet TV breaks the linear broadcast model, letting users actively select content, and its short ad slots improve ad effectiveness. A 2016 white paper shows that consumer trust in TV ads reaches 54%, the highest among all channels. However, OTT TV advertising is still in an exploratory phase, and a robust system is crucial for market development.

“Family‑Centric” Strategy and Customized Advertising

Future OTT will adopt a “family‑centric” approach, leveraging big‑data analysis to deliver personalized experiences. Based on user and content models, the system can perform precise ad targeting, turning ads into information.

Figure 2‑1 User Model and Content Model

Figure 2‑2 Big‑Data Architecture of the Advertising System

Personalized ad delivery is attractive to advertisers but imposes high performance demands. Traditional systems serve identical content to all users and rely on CDN for non‑real‑time assets. OTT personalized advertising must meet real‑time and differentiated content requirements while maintaining high performance.

OTT Advertising System Structure

Advertising System Architecture Diagram

The architecture (Figure 3‑1) consists of routing, retrieval, ranking, DMP, billing, delivery, long‑connection, transcoding, distribution, monitoring, high‑speed data channels, and terminal modules. The following sections detail the key modules and technologies.

Figure 3‑1 Advertising System Architecture

Advertising System Deployment Diagram

Figure 3‑2 Advertising System Deployment

Big System, Small Implementations

The design follows a "big system, small implementations" principle: complex functionality is decomposed into independent small services and modules, reducing coupling and simplifying maintenance. Each sub‑system can be developed and deployed by a small agile team, enabling rapid response to changing requirements and higher development efficiency.

Key Technology Description

Multi‑Domain Issuance

To ensure stability and high availability, the system uses a multi‑domain strategy to split requests, reducing the impact of domain hijacking or CDN failures. As shown in Figure 4‑1, the TV requests a domain policy from the terminal system; the terminal returns Domain A and Domain B, and the TV tests their availability before proceeding.

Figure 4‑1 Multi‑Domain Issuance Diagram

Each domain is backed by multiple CDN and service clusters to further mitigate failures.
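As a minimal sketch of the client‑side check, the TV could probe each returned domain and use the first one that responds. The helper below is illustrative only; the probe mechanism, port, and function names are assumptions, not the platform's actual API:

```python
import socket

def pick_domain(domains, timeout=1.0, probe=None):
    """Return the first domain that answers a lightweight availability probe.

    `probe` defaults to a TCP connect on port 443; both the helper and the
    default probe are hypothetical stand-ins for the TV's real check.
    """
    def tcp_probe(host):
        try:
            with socket.create_connection((host, 443), timeout=timeout):
                return True
        except OSError:
            return False

    probe = probe or tcp_probe
    for host in domains:
        if probe(host):
            return host          # first reachable domain wins
    return None                  # caller falls back (e.g. local ads)
```

In tests, a fake probe can be injected so no network access is needed.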

Container Model: Efficient Advertising Service Operation

The retrieval side adopts a container model to isolate failures, prevent single‑point loss, and handle traffic spikes. SET units (containers) are grouped by consistent hashing; a failure in one SET only affects its users, while others continue serving ads. Automatic data drift and SET removal enable minimal loss.
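The grouping of devices onto SET units by consistent hashing can be sketched as follows. The `SetRing` class, the virtual‑node count, and the MD5‑based hash are illustrative assumptions, not the platform's implementation:

```python
import bisect
import hashlib

class SetRing:
    """Hypothetical consistent-hash ring mapping device IDs to SET units."""

    def __init__(self, sets, vnodes=100):
        # Each SET gets `vnodes` points on the ring for smoother balance.
        self.ring = sorted(
            (self._hash(f"{s}#{i}"), s) for s in sets for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def locate(self, device_id):
        """Walk clockwise from the device's hash to the next SET point."""
        if not self.ring:
            return None
        idx = bisect.bisect(self.keys, self._hash(device_id)) % len(self.ring)
        return self.ring[idx][1]
```

Because placement depends only on hashes, removing a failed SET moves only that SET's devices, which is exactly the isolation property described above.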

Fault Isolation and Minimal Loss

Figure 4‑2 SET Failure Transfer Diagram

Dynamic Scaling (Expand/Reduce)

When traffic grows, a new SET can be cloned and data drifted onto it; in the case described, this raises capacity by roughly 50%. Conversely, surplus SETs can be removed to save costs.

Figure 4‑3 SET Dynamic Expansion Diagram

Gray Release and A/B Testing

Each SET’s independence enables staged upgrades and A/B experiments by updating only a subset of SETs and observing metrics before full rollout.

Graceful Degradation (Flexible Availability)

Under traffic spikes, the system performs graceful degradation, dropping non‑essential services first. Two mechanisms are used:

Service Tier Management

Figure 4‑4 Service Level Diagram

Random Service Refusal

If the system cannot handle load, the retrieval side may refuse to serve ads, allowing the TV to play local ads or continue viewing without interruption.
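A minimal sketch of how the two mechanisms might combine is shown below; the tier names, load thresholds, and function signature are hypothetical:

```python
import random

# Hypothetical service tiers: higher number = less essential.
TIERS = {"ad_serving": 1, "frequency_capping": 2, "reporting": 3}

def allowed(service, load, refuse_prob=0.0, rng=random.random):
    """Decide whether to serve a request under the current load.

    `load` is a 0..1 utilisation estimate. Non-essential tiers are shed
    first as load rises; as a last resort, ad serving itself is refused
    at random so the TV falls back to locally cached ads.
    """
    max_tier = 3 if load < 0.7 else (2 if load < 0.9 else 1)
    if TIERS[service] > max_tier:
        return False                       # tier shed under load
    if service == "ad_serving" and rng() < refuse_prob:
        return False                       # random refusal; TV plays local ad
    return True
```

Injecting `rng` keeps the random refusal deterministic in tests.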

Two‑Level Memory Cache

To boost retrieval speed, the system uses an in‑process memory cache and a Redis cache (Figure 4‑5). The ad index tree resides in instance memory, achieving millisecond‑level lookup.

Figure 4‑5 Two‑Level Memory Cache Diagram

Write‑heavy data is buffered in Redis and later persisted to the database.

Data Heterogeneity Redundancy and Asynchronous Invocation

Data Heterogeneity Redundancy

To reduce coupling, data from terminal, member, content, and feature systems are replicated into the ad retrieval system (Figure 4‑6), ensuring retrieval continues even if upstream services fail.

Figure 4‑6 Data Heterogeneity Redundancy Diagram

Asynchronous Invocation

When the retrieval side receives an ad‑count request, it sends a message to a high‑speed data channel and immediately returns, deferring index updates to the billing system (Figure 4‑7).

Figure 4‑7 Asynchronous Call Diagram
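The pattern can be sketched with an in‑memory queue standing in for the high‑speed data channel; the function names and the sentinel used to end the demo loop are hypothetical:

```python
import queue

events = queue.Queue()   # stand-in for the high-speed data channel

def record_impression(ad_id):
    """Retrieval side: enqueue the event and return immediately,
    without waiting for the billing system."""
    events.put(ad_id)
    return {"status": "ok"}

def billing_worker():
    """Billing side: drain the channel and update counts asynchronously.
    A `None` sentinel stops the loop for this demo."""
    counts = {}
    while True:
        ad_id = events.get()
        if ad_id is None:
            return counts
        counts[ad_id] = counts.get(ad_id, 0) + 1
```

In production the channel would be a durable message queue, and the worker would update the ad index rather than an in‑memory dict.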

Double‑Layer Boolean Retrieval: Ultra‑Fast Ad Search

Boolean Retrieval

Targeting conditions form Boolean expressions that cannot be efficiently handled by traditional inverted indexes. The expressions are converted to Disjunctive Normal Form (DNF):

a1 = (age ∈ {20, 30} ∩ geo ∈ {Guangdong} ∩ gender ∈ {male}) ∪ (age ∈ {40, 50} ∩ geo ∈ {Guangdong} ∩ gender ∈ {male})

a2 = (age ∈ {25, 45} ∩ geo ∈ {Guangdong})

The conjunctions are then:

j1 = (age ∈ {20, 30} ∩ geo ∈ {Guangdong} ∩ gender ∈ {male})

j2 = (age ∈ {40, 50} ∩ geo ∈ {Guangdong} ∩ gender ∈ {male})

j3 = (age ∈ {25, 45} ∩ geo ∈ {Guangdong})

Thus a1 = j1 ∪ j2 and a2 = j3. The double‑layer approach first prunes conjunctions whose size (number of targeting terms) exceeds the number of attributes in the request, since such conjunctions can never be fully satisfied; it then evaluates only the surviving conjunctions term by term, dramatically reducing match operations (Figure 4‑9).

Figure 4‑9 Double‑Layer Boolean Retrieval Index
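Using the example above, the two layers can be sketched as follows; the data layout and function name are illustrative, not the system's index format:

```python
# Each ad maps to a list of conjunctions; each conjunction maps a
# targeting key to its allowed value set (the j1/j2/j3 of the example).
ADS = {
    "a1": [{"age": {20, 30}, "geo": {"Guangdong"}, "gender": {"male"}},
           {"age": {40, 50}, "geo": {"Guangdong"}, "gender": {"male"}}],
    "a2": [{"age": {25, 45}, "geo": {"Guangdong"}}],
}

def match(request):
    """Layer 1 prunes conjunctions with more terms than the request has
    attributes (they can never be satisfied); layer 2 checks survivors
    term by term."""
    hits = []
    for ad, conjunctions in ADS.items():
        for conj in conjunctions:
            if len(conj) > len(request):              # layer 1: size pruning
                continue
            if all(k in request and request[k] in vals  # layer 2: term match
                   for k, vals in conj.items()):
                hits.append(ad)
                break                                  # one hit per ad suffices
    return hits
```

For a request carrying only {age, geo}, both three‑term conjunctions of a1 are pruned without any term comparison.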

Fast Ad Playback Experience: Dynamic CDN Scheduling + Bitrate Adaptation + Intelligent Pre‑Cache

Dynamic CDN Scheduling

The system distributes ad assets to multiple CDNs with priority settings. If a CDN fails, a server‑side switch module and long‑connection messages trigger a rapid CDN switch (Figure 4‑10). The TV also monitors CDN speed and switches client‑side when needed.

Figure 4‑10 Dynamic CDN Scheduling Diagram
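Client‑side selection can be sketched as a priority walk with a speed floor; the threshold, tuple layout, and function name are assumptions for illustration:

```python
def choose_cdn(cdns, min_kbps=800):
    """Pick a CDN from (name, priority, measured_kbps) tuples.

    Walk CDNs in priority order (lower number = preferred) and skip any
    below a speed floor; if all are slow, fall back to the top priority.
    """
    ranked = sorted(cdns, key=lambda c: c[1])
    for name, _, kbps in ranked:
        if kbps >= min_kbps:
            return name
    return ranked[0][0] if ranked else None
```

Server‑side switching via long‑connection messages would simply push a new priority list to the TV, after which the same walk applies.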

Bitrate Adaptation

Based on regional bandwidth variations (Figure 4‑11), the server transcodes ads into multiple bitrates. The client selects the appropriate bitrate according to real‑time network conditions, ensuring smooth playback.

Figure 4‑11 24‑Hour Average Bandwidth Trend
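Client‑side selection can be sketched as picking the highest rendition that fits within a safety margin of the measured bandwidth; the bitrate ladder and headroom factor below are illustrative, not values from the article:

```python
# Illustrative rendition ladder (kbps); the real ladder is produced by
# the transcoding module and is not specified in the article.
LADDER = [500, 1200, 2500, 4500]

def select_bitrate(bandwidth_kbps, ladder=LADDER, headroom=0.8):
    """Choose the highest rendition within `headroom` of measured bandwidth,
    falling back to the lowest rendition when even that does not fit."""
    budget = bandwidth_kbps * headroom
    fitting = [b for b in ladder if b <= budget]
    return max(fitting) if fitting else min(ladder)
```

Re‑running the selection on each bandwidth sample gives the adaptive behavior described above.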

Intelligent Pre‑Cache

For poor network conditions, the system pre‑caches high‑quality ads during good connectivity (Figure 4‑12). Startup ads are cached locally in 4K; pre‑roll and pause ads are predicted and pushed to the TV for local playback, minimizing revenue loss.

Figure 4‑12 Intelligent Pre‑Cache Diagram

Long‑Connection System: High‑Speed Highway for Ad Information

Broadcast Mode

All online TVs receive messages simultaneously, ensuring minimal latency. Offline TVs receive pending messages once they reconnect.

Progressive Mode

To avoid “cliff‑shaped” request spikes (Figure 4‑13), the system batches message delivery, smoothing backend load (Figure 4‑14, Figure 4‑15).

Figure 4‑13 Cliff‑Shaped Spike Diagram

Figure 4‑14 Progressive Mode Diagram

Figure 4‑15 Progressive Mode Request Trend
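The batching itself is straightforward; a minimal sketch follows, where the batch size and push interval are operator choices not specified in the article:

```python
def progressive_batches(devices, batch_size):
    """Split the online device list into batches that are pushed one
    interval apart, turning a cliff-shaped request spike into a ramp."""
    return [devices[i:i + batch_size]
            for i in range(0, len(devices), batch_size)]
```

Each batch of TVs then fetches the new ad policy in its own window, smoothing backend load.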

Security Protection

HTTPS mutual authentication and an attack‑monitoring module protect the system from malicious traffic (Figure 4‑16).

Figure 4‑16 Security Protection Diagram

Physical Disaster Recovery

Multi‑IDC Deployment

Core services are deployed in Beijing, Hangzhou, and Shenzhen to achieve multi‑active availability (Figure 4‑17).

Figure 4‑17 Multi‑IDC Deployment Diagram

Transmission Channel Disaster Recovery

Dedicated lines are used for ad material transfer; if they fail, traffic switches to public networks or P2P channels (Figure 4‑18).

Figure 4‑18 Transmission Channel Disaster Recovery Diagram

Module Hot‑Backup Disaster Recovery

Each SET’s modules (retrieval, partition, messaging, Redis) have hot‑backup instances to avoid single‑point failures (Figure 4‑19).

Figure 4‑19 Module Hot‑Backup Diagram

Conclusion

This paper analyzes the large commercial value of OTT advertising, outlines the still‑immature state of the OTT ad ecosystem, and proposes a high‑performance, high‑availability OTT advertising system design. It details key techniques including the "big system, small implementations" principle, multi‑domain issuance, the container model, physical disaster recovery, graceful degradation, the long‑connection architecture, double‑layer Boolean retrieval, two‑level caching, data heterogeneity, asynchronous invocation, fast playback experience, and security protection. Future work will explore precise ad targeting under the "family‑centric" strategy and predictive ad placement.

© Content sourced from the internet; rights belong to the original author. If any infringement is identified, please notify us for removal.

Tags: system architecture, advertising, big data, high availability, OTT
Written by Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
