
Real-CUGAN: An Open‑Source AI Super‑Resolution Model for Anime Video Upscaling

Real‑CUGAN is an open‑source AI super‑resolution model that upscales anime video by up to 4×. Trained on a million‑patch dataset with frequency‑domain supervision, it delivers faster inference than Real‑ESRGAN, drop‑in compatibility with Waifu2x, and better handling of textures, lines, and compression artifacts. The code is released on GitHub.

Bilibili Tech

With the growing demand for ultra‑high‑definition video (4K/8K), the production of such content remains challenging due to high equipment and post‑processing requirements. Upscaling low‑resolution anime footage through AI‑based super‑resolution offers a cost‑effective solution.

AI super‑resolution is a sub‑field of image restoration. In anime production, source material often suffers from aliasing, halos, color blocking, noise, and blur caused by low‑resolution rendering and subsequent upscaling. Traditional manual pipelines require extensive filtering and hand‑crafted repairs, leading to high labor costs.

The proposed workflow first cuts anime frames into patches, scores them with an image‑quality model, and selects a million high‑quality patches to form a private training set. A multi‑stage degradation process then down‑samples these patches to generate low‑quality inputs, enabling the AI model to learn the inverse mapping from low‑ to high‑resolution images.
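The degradation stage of this workflow can be sketched as follows. This is a minimal illustration only, since the actual Real‑CUGAN degradation pipeline and its parameters are not public; it assumes grayscale patches as float arrays in [0, 1] and composes a few generic stages (blur, downsampling, noise, quantization):

```python
import numpy as np

def degrade(patch: np.ndarray, scale: int = 2, seed: int = 0) -> np.ndarray:
    """Toy multi-stage degradation: blur -> downsample -> noise -> quantize.

    `patch` is an HxW float image in [0, 1]; H and W must be divisible
    by `scale`. This only illustrates the idea of synthesizing low-quality
    inputs from high-quality patches; stage choices here are assumptions.
    """
    rng = np.random.default_rng(seed)

    # 1. Mild blur: 3x3 box filter built from shifted sums (edges padded).
    h, w = patch.shape
    padded = np.pad(patch, 1, mode="edge")
    blurred = sum(
        padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)
    ) / 9.0

    # 2. Downsample by averaging non-overlapping scale x scale blocks.
    low = blurred.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))

    # 3. Additive Gaussian noise, clipped back to [0, 1].
    low = np.clip(low + rng.normal(0.0, 0.02, low.shape), 0.0, 1.0)

    # 4. 8-bit quantization to mimic compression rounding.
    return np.round(low * 255.0) / 255.0
```

Pairing each high‑quality patch `hr` with `degrade(hr, scale=2)` yields the (low‑resolution, high‑resolution) training pairs from which the model learns the inverse mapping.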

The resulting model, named Real‑CUGAN (Real Cascaded‑U‑Net‑style Generative Adversarial Networks), adopts the same network architecture as Waifu2x‑CUNet but is trained on a new, large‑scale dataset and incorporates additional frequency‑domain supervision. The inference code and model parameters have been open‑sourced.
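Frequency‑domain supervision is commonly implemented by penalizing differences between the Fourier spectra of the network output and the ground truth, which pushes the model to reproduce high‑frequency detail such as fine lines and textures. A minimal sketch follows; the exact loss formulation Real‑CUGAN uses is not specified in the article, so this is a generic FFT‑magnitude L1 loss:

```python
import numpy as np

def frequency_loss(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean L1 distance between 2-D FFT magnitude spectra of two images.

    A generic illustration of frequency-domain supervision, not the
    specific loss used by Real-CUGAN.
    """
    pred_spec = np.abs(np.fft.fft2(pred))      # magnitude spectrum of output
    target_spec = np.abs(np.fft.fft2(target))  # magnitude spectrum of target
    return float(np.mean(np.abs(pred_spec - target_spec)))
```

In training, such a term would be added to the usual pixel‑space and adversarial losses, weighting errors in high‑frequency content that plain pixel losses tend to under‑penalize.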

Comparative experiments with Waifu2x and Real‑ESRGAN show that Real‑CUGAN achieves:

Speed: Approximately 2.2× faster than Real‑ESRGAN's anime‑specialized model on a V100 GPU (≈6.3 fps), and about 8.4× faster than the generic Real‑ESRGAN model.

Principle: Same architecture as Waifu2x‑CUNet; a million‑scale private training set; additional frequency‑domain supervision, where Real‑ESRGAN instead relies on a spectral‑norm U‑Net discriminator.

Compatibility: Identical model structure to Waifu2x, allowing seamless replacement of parameter files in existing Windows applications and VapourSynth pipelines.

Functionality: Supports 2×, 3×, and 4× upscaling (future support for arbitrary scales); Waifu2x only offers 1×/2×, Real‑ESRGAN only 4×.

Effectiveness: Subjective tests on four challenging cases (texture, line, extreme compression, depth‑of‑field) demonstrate that Real‑CUGAN consistently preserves texture, sharpens lines, reduces artifacts, and maintains intended blur effects better than the other two models.

Future work includes further model lightweighting, adjustable sharpening/denoising strength, arbitrary‑resolution upscaling, improved texture retention, and continuous community‑driven optimization via issue tracking.

The model and inference tools are available on GitHub (https://github.com/bilibili/ailab/tree/main/Real-CUGAN) and target Python/PyTorch developers, VapourSynth video‑processing experts, and Waifu2x‑Caffe users.

Tags: deep learning, video processing, image restoration, AI super-resolution, anime upscaling, Real-CUGAN
Written by Bilibili Tech

Provides introductions and tutorials on Bilibili-related technologies.
