Real-Time Super-Resolution Algorithm for League of Legends S12 Live Streaming
A real‑time super‑resolution network specially designed for the League of Legends S12 live broadcast upscales 1080p streams to 4K at 75 fps by compressing parameters, employing pixel‑unshuffle/shuffle, structural re‑parameterization, and a multi‑loss (L1, perceptual, Sobel, GAN) training pipeline, delivering markedly sharper textures and lower latency for live game streaming.
Preface: The 2022 League of Legends S12 World Championship began on September 29, attracting a massive audience. To improve the viewing experience, a real‑time video super‑resolution algorithm tailored for S12 live streaming was developed. The algorithm enhances detail and texture, up‑scaling video from 1080p to 4K while maintaining a processing speed of 75 fps.
01 Lightweight Game Live Streaming – Real‑Time Super‑Resolution Model Design

Image super‑resolution is a well‑studied problem, and methods fall into two broad categories: non‑real‑time and real‑time. Non‑real‑time methods achieve high quality but are too computationally heavy for live streaming, so game‑streaming scenarios require a real‑time super‑resolution network.
When dealing with 1080p@60 fps game streams, both super‑resolution quality and low latency are essential. Offline ESR (Efficient Super‑Resolution) models are mature but cannot meet live‑stream latency requirements. The proposed model restructures a typical ESR network for real‑time inference. The overall architecture is shown in Figure 1.
Figure 1. Real‑time Super‑Resolution Network Architecture for Game Live Streaming
To achieve real‑time performance, the model's parameters are heavily compressed, which reduces its fitting capacity. Most existing super‑resolution networks avoid down‑sampling, but at this parameter budget such designs compromise quality. Our model therefore first applies a pixel‑unshuffle operation to shrink the feature maps, cutting computation while keeping the parameter count constant; at the final stage, pixel‑shuffle reconstructs the high‑resolution image. The pixel‑unshuffle/shuffle structures are illustrated in Figure 2; both are lossless, mutually inverse rearrangements that add negligible computation.
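To make the rearrangement concrete, here is a minimal pure‑Python sketch of pixel‑unshuffle and its inverse for a single‑channel map with scale factor r=2 (the function names and shapes are illustrative; production code would use a framework's built‑in ops on batched tensors):

```python
def pixel_unshuffle(img, r=2):
    """Split an H x W map into r*r maps of size (H/r) x (W/r).

    Every input pixel lands in exactly one output channel, so the
    rearrangement is lossless and fully reversible.
    """
    h, w = len(img), len(img[0])
    return [[[img[i * r + di][j * r + dj] for j in range(w // r)]
             for i in range(h // r)]
            for di in range(r) for dj in range(r)]


def pixel_shuffle(chans, r=2):
    """Inverse of pixel_unshuffle: interleave r*r small maps back."""
    hh, ww = len(chans[0]), len(chans[0][0])
    out = [[0] * (ww * r) for _ in range(hh * r)]
    for k, ch in enumerate(chans):
        di, dj = divmod(k, r)
        for i in range(hh):
            for j in range(ww):
                out[i * r + di][j * r + dj] = ch[i][j]
    return out
```

The point of the trade is that after unshuffling, the convolutional backbone runs on quarter‑size feature maps, so per‑frame computation drops roughly 4× while no pixel information is discarded.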
Figure 2. Pixel‑Unshuffle / Pixel‑Shuffle Structure
We also adopt structural re‑parameterization: during training, a multi‑branch network (including 3×3 conv, 1×1 conv residual, and identity residual branches) is used to improve convergence and fitting ability. During inference, an Op‑fusion strategy merges these branches into a single 3×3 convolution, as shown in Figure 3, speeding up deployment.
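The fusion rests on the linearity of convolution: a 1×1 branch and an identity branch can each be embedded into the centre tap of a 3×3 kernel. A single‑channel sketch of the identity follows (weights are made up for illustration; bias terms and batch‑norm folding are omitted):

```python
def conv3x3(img, k):
    """Zero-padded 3x3 convolution on a single-channel 2D list."""
    h, w = len(img), len(img[0])
    px = lambda i, j: img[i][j] if 0 <= i < h and 0 <= j < w else 0.0
    return [[sum(k[di][dj] * px(i + di - 1, j + dj - 1)
                 for di in range(3) for dj in range(3))
             for j in range(w)] for i in range(h)]

# Training-time branches (weights are illustrative):
k3 = [[0.1, 0.2, 0.1], [0.0, 0.5, 0.0], [-0.1, 0.2, -0.1]]  # 3x3 branch
w1 = 0.3                                                     # 1x1 branch
# The identity branch passes the input through unchanged.

# Inference-time fusion: add the 1x1 weight and the identity
# (a delta kernel) onto the centre tap of the 3x3 kernel.
fused = [row[:] for row in k3]
fused[1][1] += w1 + 1.0

img = [[1.0, 2.0], [3.0, 4.0]]
y3 = conv3x3(img, k3)
multi = [[y3[i][j] + w1 * img[i][j] + img[i][j] for j in range(2)]
         for i in range(2)]           # sum of the three training branches
single = conv3x3(img, fused)          # one fused conv at inference time
```

Because the fused kernel reproduces the branch sum exactly, the deployed model keeps the multi‑branch training quality at the cost of a single 3×3 convolution.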
Figure 3. Op‑Fusion Strategy
02 Loss Function Design for Texture Detail Preservation
Training employs a multi‑loss strategy: L1 loss, perceptual loss, Sobel‑based texture loss, and GAN loss. Using only L1 loss yields overly smooth outputs, losing high‑frequency details (illustrated in Figure 4). The “Blurred Average” phenomenon occurs because the network treats all pixels equally, biasing toward low‑frequency reconstruction.
To better preserve texture, we add perceptual loss (using a pre‑trained network to compare high‑level features) and Sobel‑based texture loss (weighting L1 loss by edge gradients). Figure 5 shows an ablation comparison, where the addition of these losses markedly improves texture detail.
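The article only states that edge gradients weight the L1 loss, so the following is one plausible form of the Sobel‑based texture loss: per‑pixel L1 scaled by 1 plus the target's Sobel gradient magnitude (the weighting formula and the `lam` factor are assumptions):

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

def conv3(img, k):
    """3x3 convolution with edge-replicate padding."""
    h, w = len(img), len(img[0])
    px = lambda i, j: img[min(max(i, 0), h - 1)][min(max(j, 0), w - 1)]
    return [[sum(k[di][dj] * px(i + di - 1, j + dj - 1)
                 for di in range(3) for dj in range(3))
             for j in range(w)] for i in range(h)]

def sobel_weighted_l1(pred, target, lam=1.0):
    """L1 loss where each pixel is weighted by the target's edge strength,
    so errors on textured regions cost more than errors on flat regions."""
    gx, gy = conv3(target, SOBEL_X), conv3(target, SOBEL_Y)
    h, w = len(target), len(target[0])
    total = sum((1.0 + lam * (gx[i][j] ** 2 + gy[i][j] ** 2) ** 0.5)
                * abs(pred[i][j] - target[i][j])
                for i in range(h) for j in range(w))
    return total / (h * w)
```

On a flat target the weights all equal 1 and the loss reduces to plain L1; near edges the weights grow, pushing the network to spend capacity on exactly the high‑frequency detail that plain L1 averages away.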
Figure 4. Low‑Quality / Super‑Resolved / High‑Quality / Residual Images
Figure 5. Texture Detail Comparison
GAN loss is also incorporated to fine‑tune the generator, making its output distribution closer to real data.
03 Multi‑Stage Warm‑Start Training Strategy
Two datasets are constructed: Dataset I (high‑quality game frames paired with synthetically degraded low‑quality versions) and Dataset II (online low‑resolution streams paired with high‑quality outputs from a strong non‑real‑time super‑resolution model). Training proceeds in three stages: model warm‑up, fine‑tuning, and artifact adjustment. During warm‑up, only L1 loss is used on Dataset I. After ~50 epochs, perceptual and Sobel‑based losses are enabled and fine‑tuning continues on Dataset II with a lower learning rate. Finally, GAN‑based training on Dataset I reduces artifacts and enhances visual quality. The full training configuration is listed in Table 1.
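The three stages above can be sketched as a simple schedule function. Only the ~50 warm‑up epochs and the stage/dataset/loss pairing come from the text; the fine‑tune length and the learning rates are placeholder assumptions:

```python
def stage_config(epoch, warmup=50, finetune=100):
    """Return the dataset, active losses, and learning rate for an epoch.

    Epoch counts beyond the ~50 warm-up epochs and all learning-rate
    values are illustrative assumptions, not the paper's settings.
    """
    if epoch < warmup:                       # stage 1: warm-up
        return {"dataset": "I", "losses": ["l1"], "lr": 1e-3}
    if epoch < warmup + finetune:            # stage 2: fine-tune
        return {"dataset": "II",
                "losses": ["l1", "perceptual", "sobel"], "lr": 1e-4}
    return {"dataset": "I",                  # stage 3: artifact adjustment
            "losses": ["l1", "perceptual", "sobel", "gan"], "lr": 1e-5}
```

Warm‑starting each stage from the previous one avoids the instability of training with perceptual and GAN losses from random initialization.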
Table 1. Training Configuration for Game Live‑Streaming Super‑Resolution Network
04 Result Demonstration
The algorithm is deployed on a V100 GPU. With an input resolution of 1920×1080 and an output of 3840×2160, each frame is processed in 13 ms, sustaining 75 fps on a single card and leaving ample headroom for 60 fps streams. Figure 6 shows side‑by‑side comparisons of the original and super‑resolved League of Legends live video, highlighting reduced noise around health bars and enhanced detail on terrain.
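The headroom claim follows directly from the latency arithmetic:

```python
frame_ms = 13.0                  # measured per-frame latency on a V100
fps = 1000.0 / frame_ms          # ~76.9 fps, quoted conservatively as 75
budget_60_ms = 1000.0 / 60.0     # ~16.7 ms frame budget of a 60 fps stream
headroom_ms = budget_60_ms - frame_ms  # slack left per frame
```

At 13 ms per frame the model finishes well inside the ~16.7 ms budget of a 60 fps stream, so a single card can absorb jitter without dropping frames.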
Figure 6. League of Legends Live Stream Super‑Resolution Comparison
The enhanced texture around health bars, grass, and stone demonstrates clear visual improvement.
05 Summary and Outlook
Game live streaming is a key service for Bilibili, and the S12 League of Legends broadcast attracted massive viewership. The presented high‑efficiency online super‑resolution algorithm transforms low‑quality game footage into high‑quality, richly textured video in real time, delivering a superior viewing experience. The single‑card 4K 75 fps capability expands the applicability of super‑resolution in live streaming. Future work includes extending the method to other real‑time game streams and combining it with 3D super‑resolution techniques.
Bilibili Tech
Provides introductions and tutorials on Bilibili-related technologies.