
Real-Time Mobile Super-Resolution Reconstruction in Baidu App

This article describes real-time super-resolution on mobile devices in Baidu App. The model is based on VDSR and accelerated with pruning and depthwise separable convolutions. Application-layer and inference-engine optimizations cut latency and memory use to less than half of the original. The result is on-device high-definition image and video enhancement that reduces server load and can be integrated on both iOS and Android.

Baidu App Technology

1. Background

With the proliferation of mobile devices, creating and consuming content on mobile has become increasingly easy. Baidu App, as a content distribution platform, carries massive amounts of text, image, and video content, both professionally generated (PGC) and user-generated (UGC). Now that 2K screen resolution is mainstream, users naturally expect high-definition resources. However, image and video acquisition, transmission, and storage are constrained by various factors, so many resources have relatively low clarity and resolution, which degrades the user experience. Baidu App, together with the Baidu Vision Technology team, applies a deep-learning-based real-time super-resolution reconstruction technique to improve how images and videos are displayed on the device.

2. How to Improve Resolution

Resolution is the number of pixels per unit area on the imaging plane and reflects the ability to render fine detail. Higher resolution yields more detail and more precise pixel values, which gives a better visual experience on the same hardware, at the cost of larger file sizes.

Different resolution display effects (image from Wikipedia)

Super‑resolution can be understood as the process of generating additional pixels based on the existing pixel content of an image.

Traditional methods such as interpolation compute new pixel values by fixed rules, which often produces mosaic artifacts, jagged edges, and blurred boundaries.
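To make the "fixed rules" point concrete, here is a minimal pure-Python sketch of bilinear interpolation, one common traditional method. It is an illustration only and is not taken from Baidu App; the function name and the tiny test image are assumptions chosen for the example.

```python
def bilinear_upscale(img, scale):
    """Upscale a 2-D grayscale image (a list of rows) by an integer factor."""
    h, w = len(img), len(img[0])
    out = []
    for oy in range(h * scale):
        # Map the output pixel centre back to source coordinates and clamp.
        sy = min(max((oy + 0.5) / scale - 0.5, 0.0), h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, h - 1)
        fy = sy - y0
        row = []
        for ox in range(w * scale):
            sx = min(max((ox + 0.5) / scale - 0.5, 0.0), w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, w - 1)
            fx = sx - x0
            # Weighted average of the four nearest source pixels.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


# A hard vertical edge: black on the left, white on the right.
edge = [[0, 255], [0, 255]]
upscaled = bilinear_upscale(edge, 2)
print([round(v) for v in upscaled[0]])  # [0, 64, 191, 255]
```

The sharp 0-to-255 step becomes a ramp through intermediate grays: the rule can only average existing pixels, so it softens the edge rather than recovering detail.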

In recent years, thanks to advances in deep learning, convolutional neural networks (CNNs) have borrowed concepts from the human visual system to extract and learn image features, achieving more stable and higher‑quality reconstruction.

3. Baidu App Super‑Resolution Model

The reconstruction model is built on the VDSR residual learning framework and accelerated with model pruning and depthwise separable convolutions. The model input is the Y (luma) channel, already upsampled to the target resolution by an upsampling algorithm, and the model supports variable input sizes.
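A rough way to see why depthwise separable convolution speeds things up is to count weights per layer. The sketch below ignores biases and assumes a 3x3 kernel with 64 input and 64 output channels, which is the layer width used in the VDSR paper; the article does not give the channel counts of Baidu App's pruned model, so these numbers are illustrative only.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


k, c_in, c_out = 3, 64, 64  # assumed sizes, see the note above
std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)
print(std, sep, round(sep / std, 3))  # 36864 4672 0.127
```

The ratio equals 1/c_out + 1/k², so for 3x3 kernels a separable layer needs roughly one eighth to one ninth of the weights, and correspondingly fewer multiply-adds per pixel.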

(Image from VDSR paper)
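The residual learning idea can be sketched in a few lines: the network predicts only the detail missing from the upsampled Y channel, and that residual is added back to its input. The code below is a hedged illustration, not Baidu App's implementation. `predict_residual` is a hypothetical stand-in for the trained network, and the luma conversion uses the standard BT.601 weights, which the article does not specify.

```python
def rgb_to_y(r, g, b):
    """BT.601 luma from 8-bit RGB components (assumed colour conversion)."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def reconstruct(upsampled_y, predict_residual):
    """Residual learning: output = upsampled input + predicted detail."""
    residual = predict_residual(upsampled_y)
    return [
        [min(255.0, max(0.0, y + r)) for y, r in zip(y_row, r_row)]
        for y_row, r_row in zip(upsampled_y, residual)
    ]


def zero_residual(y_plane):
    """Hypothetical placeholder for the network: predicts no extra detail."""
    return [[0.0 for _ in row] for row in y_plane]


y_plane = [[rgb_to_y(10, 20, 30), rgb_to_y(200, 180, 160)]]
restored = reconstruct(y_plane, zero_residual)
print(restored == y_plane)  # True: with a zero residual the input passes through
```

Because the output already starts from the upsampled image, the network only has to learn the high-frequency correction, which is the motivation the VDSR paper gives for this design.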

4. Challenges of Real‑Time Mobile Super‑Resolution

5. Strategies and Optimizations for Mobile Real‑Time Super‑Resolution

Application‑layer Optimization:

1. Image super-resolution memory: For very large images, the original picture is split into blocks that are processed through parallel queues, and peak memory usage during prediction is dynamically capped.

2. Real-time video super-resolution: A strategy module offers two modes, extreme super-resolution and safe-frame-rate super-resolution, to keep playback stable.

3. Compute resource scheduling: Some pre- and post-processing steps that previously ran on the CPU are migrated to GPU operators, so that pre-processing, inference, and post-processing are all handled by the GPU.
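The block-splitting strategy in item 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the Baidu App code: `upscale_tile` is a hypothetical stand-in for model inference (it simply repeats pixels), and the tile size and `max_in_flight` limit are made-up parameters.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tiles(img, tile):
    """Yield (top, left, block) for each tile of a 2-D image (a list of rows)."""
    h, w = len(img), len(img[0])
    for top in range(0, h, tile):
        for left in range(0, w, tile):
            yield top, left, [row[left:left + tile] for row in img[top:top + tile]]


def upscale_tile(block, scale):
    """Hypothetical stand-in for inference: nearest-neighbour pixel repetition."""
    out = []
    for row in block:
        wide = [v for v in row for _ in range(scale)]
        out.extend(list(wide) for _ in range(scale))
    return out


def super_resolve_tiled(img, scale=2, tile=2, max_in_flight=2):
    """Upscale tile by tile, with at most max_in_flight tiles inferred at once."""
    h, w = len(img), len(img[0])
    out = [[None] * (w * scale) for _ in range(h * scale)]
    tiles = list(split_into_tiles(img, tile))
    # max_workers caps how many tiles run through "inference" concurrently.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        results = pool.map(lambda t: upscale_tile(t[2], scale), tiles)
        for (top, left, _), sr in zip(tiles, results):
            # Paste each upscaled tile back at its scaled position.
            for dy, row in enumerate(sr):
                start = left * scale
                out[top * scale + dy][start:start + len(row)] = row
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(super_resolve_tiled(image) == upscale_tile(image, 2))  # True
```

With a real convolutional model, pixels near a tile border lack the context their receptive field expects, so tiles would likely need overlapping margins that are cropped after inference to avoid visible seams. Lowering `max_in_flight` or `tile` trades throughput for a smaller working set.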

Inference Engine Optimization:

Optimization results:

1. Prediction latency for image and video super-resolution is reduced to less than 50% of the original. Batch capability: on iOS, latency is reduced to 1/4 of CoreML's. 480p prediction time: ~25 ms on iPhone XR; ~23 ms on Snapdragon 845 Android devices.

2. GPU memory consumption for image and video super-resolution is reduced to below 50% of the original.
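The latency figures above can be read against a per-frame time budget, which is presumably what a playback strategy module has to respect. The sketch below is a worked piece of arithmetic plus a hypothetical policy: the mode names, the `headroom` factor, and the decision rule are assumptions for illustration, and the article does not describe how Baidu App's strategy module actually decides.

```python
def frame_budget_ms(fps):
    """Time available per frame at a given playback rate."""
    return 1000.0 / fps


def max_sustainable_fps(latency_ms):
    """Upper bound on frame rate if inference were the only per-frame cost."""
    return 1000.0 / latency_ms


def choose_mode(latency_ms, fps, headroom=0.8):
    """Hypothetical policy: enhance every frame only if inference fits the budget.

    headroom reserves part of each frame's time for decoding and rendering.
    """
    budget = frame_budget_ms(fps) * headroom
    return "every_frame" if latency_ms <= budget else "reduced"


print(round(max_sustainable_fps(25), 1))  # 40.0, using the ~25 ms iPhone XR figure
print(choose_mode(25, 30))                # every_frame
print(choose_mode(25, 60))                # reduced
```

At ~25 ms per 480p frame, inference alone fits a 30 fps video (33.3 ms per frame) but not a 60 fps one (16.7 ms per frame), which is the kind of case where a fallback mode that protects the frame rate would be needed.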

6. Business Application and Effect Comparison

Both image and video super‑resolution have been deployed in multiple Baidu mobile products. Tens of millions of images and videos are processed by on‑device super‑resolution daily, presented to users without any server‑side intervention, thereby reducing server compute, storage, and bandwidth load for low‑quality resources.

Low‑resolution reconstruction vs. native high‑resolution quality

7. End‑to‑End Integration Plan

Baidu App will open up its video super-resolution capability soon; stay tuned. The image super-resolution interfaces on iOS and Android are declared as follows.

// iOS
/**
 Runs super-resolution on an image.

 @param aImage    the image to super-resolve
 @param scaleType the super-resolution scale factor
 @param block     the result callback
 */
- (void)executeSuperResolutionWithImage:(UIImage *)aImage
                                  scale:(MMLImageSuperResolutionScaleType)scaleType
                             completion:(void (^)(UIImage *srImage, NSError *error))block API_AVAILABLE(ios(9.0));

// Android
/**
 * Runs super-resolution on an image.
 *
 * @param inputBitmap        the image to super-resolve
 * @param scale              the super-resolution scale factor
 * @param onSrResultListener the super-resolution result callback
 */
void sr(Bitmap inputBitmap, float scale, OnSrResultListener onSrResultListener);

8. References

https://en.wikipedia.org/wiki/Image_resolution

https://arxiv.org/abs/1511.04587
