
Design and Implementation of the Cloud Touch Platform for Remote Mobile Device Control and Testing

The article presents the background, full‑scenario construction, core architecture, device‑pool strategy, remote iOS control via WebDriverAgent, screen‑sync using ffmpeg, streaming pipeline, data collection, and practical lessons of the Cloud Touch platform that enables unified remote testing and customer‑support workflows for mobile applications.

Ctrip Technology

The Cloud Touch team at Ctrip builds a foundational platform for hotel‑wireless services, providing a cloud‑based mobile device (Cloud Touch) that supports post‑release acceptance testing, customer‑support scenarios, and internationalized verification across global locations.

Background : After functional testing, many internal teams need to run unrestricted acceptance tests from a guest's point of view (for example, competitor comparison and localization checks), and customer‑support staff need a production‑like environment in which to reproduce and understand user issues. Existing test environments fall short on both operating experience and resource alignment.

Full‑scenario construction : The platform defines target user groups for the Cloud Touch system, illustrated by diagrams showing device management for remote acceptance and a unified entry point for customer‑support workstations.

Technical solution based on Cloud Touch :

3.1 Core platform design : Provides centralized device allocation, cross‑region request distribution, remote control, real‑time screen sync, and standardized preset parameters for different business scenarios.

3.2 Device‑pool design : Handles a large pool of devices shared by test engineers and support staff, with a strategy for allocating devices to various scenarios.
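One way to picture such an allocation strategy is a pool that tags each device with a region and hands out the first idle device matching a request. The sketch below is illustrative only: the `Device` and `DevicePool` names, the `region` tag, and the first-fit policy are assumptions made for the example, not the platform's actual design.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Device:
    serial: str
    region: str                      # hypothetical tag, e.g. "sha" or "sin"
    holder: Optional[str] = None     # user currently holding the device


class DevicePool:
    """Minimal first-fit device pool (hypothetical, not the platform's code)."""

    def __init__(self, devices: List[Device]) -> None:
        self._devices: Dict[str, Device] = {d.serial: d for d in devices}

    def allocate(self, user: str, region: str) -> Optional[Device]:
        # Hand out the first idle device in the requested region.
        for device in self._devices.values():
            if device.holder is None and device.region == region:
                device.holder = user
                return device
        return None  # caller may queue the request or try another region

    def release(self, serial: str) -> None:
        # Return the device to the pool so another user can take it.
        self._devices[serial].holder = None


pool = DevicePool([Device("A1", "sha"), Device("B2", "sin")])
first = pool.allocate("tester-1", "sha")
second = pool.allocate("support-7", "sha")   # no idle device left in "sha"
```

A shared service would also need locking around `allocate` and `release`, plus a timeout that reclaims devices from idle holders; both are left out to keep the sketch short.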

3.3 Remote device control design and implementation : Uses iOS WebDriverAgent (WDA) as the core remote‑control framework. The client sends JSON commands to WDA, which executes mouse, keyboard, and system‑key actions. Example command format:

{
    "serial":"00008030-000D48A40291802E",
    "type":"M_TOUCH",
    "message":{
        "action":0,
        "keycodeType":"ascii",
        "keyCode":60,
        "position":{"x":687,"y":1116}
    }
}
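As a minimal sketch of how a receiving service might validate such a command before acting on it, the Python below parses the JSON and checks the three top-level fields. The `Command` class and `parse_command` function are hypothetical names introduced for illustration; the meaning of individual `message` fields such as `action` is not specified in the article, so the sketch passes them through untouched.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class Command:
    serial: str              # target device identifier
    type: str                # command family, e.g. "M_TOUCH"
    message: Dict[str, Any]  # payload, passed through without interpretation


def parse_command(raw: str) -> Command:
    """Parse one JSON command and check that the top-level fields exist."""
    data = json.loads(raw)
    missing = [k for k in ("serial", "type", "message") if k not in data]
    if missing:
        raise ValueError(f"command is missing fields: {missing}")
    if not isinstance(data["message"], dict):
        raise ValueError("'message' must be a JSON object")
    return Command(data["serial"], data["type"], data["message"])


raw = '''{"serial": "00008030-000D48A40291802E", "type": "M_TOUCH",
          "message": {"action": 0, "position": {"x": 687, "y": 1116}}}'''
cmd = parse_command(raw)
point = (cmd.message["position"]["x"], cmd.message["position"]["y"])
```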

Key components:

WDAClient – Python client library (facebook‑wda) that builds HTTP requests.

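WDAServer – The device running the WDA app, which implements the server side of the WebDriver protocol.

Session – Maintains client state; first request creates a session ID.

WebElement – Represents a UI element on the device screen, analogous to a DOM element in browser automation.

JsonWireProtocol / Mobile JSON Wire Protocol – The specifications that define the HTTP/JSON messages exchanged between client and server.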
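The session flow described above can be sketched with plain HTTP requests. This is a simplified illustration, not the facebook‑wda implementation: the base URL (port 8100 is assumed here), the minimal capabilities body, and the placeholder session ID are all assumptions, and the requests are only constructed, not sent.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:8100"   # assumed address of the WDA server


def new_session_request(base_url: str = BASE_URL) -> Request:
    # The first request asks the server to create a session.
    body = json.dumps({"capabilities": {"alwaysMatch": {}}}).encode("utf-8")
    return Request(f"{base_url}/session", data=body, method="POST",
                   headers={"Content-Type": "application/json"})


def session_request(session_id: str, path: str,
                    base_url: str = BASE_URL) -> Request:
    # Later requests carry the session ID returned by the server in the URL.
    return Request(f"{base_url}/session/{session_id}/{path.lstrip('/')}",
                   method="GET")


create = new_session_request()
shot = session_request("ABC-123", "screenshot")   # "ABC-123" is a placeholder ID
```

Passing either object to `urllib.request.urlopen` would send it, which requires a reachable WDA server; in practice the client library hides these details.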

3.3.1 Command set adaptation : Supports mouse events (click, swipe), keyboard events (ASCII input, system keys), and complex script commands (app launch, login, install/uninstall).
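A command set like this is often adapted through a small dispatch table that maps each command type to a handler. The sketch below assumes that approach; only `M_TOUCH` appears in the article, while `M_KEY`, `M_SCRIPT`, and the `name` field are hypothetical. The handlers return a description string instead of driving a device, so the example runs without hardware; a real handler would forward the action to the WDA client.

```python
from typing import Any, Callable, Dict, List

Handler = Callable[[Dict[str, Any]], str]
HANDLERS: Dict[str, Handler] = {}


def handles(command_type: str) -> Callable[[Handler], Handler]:
    """Register a function as the handler for one command type."""
    def register(func: Handler) -> Handler:
        HANDLERS[command_type] = func
        return func
    return register


@handles("M_TOUCH")                  # type name taken from the example command
def touch(message: Dict[str, Any]) -> str:
    pos = message["position"]
    return f"touch at ({pos['x']}, {pos['y']})"


@handles("M_KEY")                    # hypothetical type name
def key(message: Dict[str, Any]) -> str:
    return f"key {message['keyCode']} ({message['keycodeType']})"


@handles("M_SCRIPT")                 # hypothetical type name
def script(message: Dict[str, Any]) -> str:
    return f"run script '{message['name']}'"


def dispatch(command: Dict[str, Any]) -> str:
    # Look up the handler for this command type and pass it the payload.
    handler = HANDLERS.get(command["type"])
    if handler is None:
        raise ValueError(f"unsupported command type: {command['type']}")
    return handler(command["message"])


log: List[str] = [
    dispatch({"type": "M_TOUCH", "message": {"position": {"x": 687, "y": 1116}}}),
    dispatch({"type": "M_KEY", "message": {"keycodeType": "ascii", "keyCode": 60}}),
    dispatch({"type": "M_SCRIPT", "message": {"name": "login"}}),
]
```

Keeping the table separate from the handlers means a new command type can be added without touching the dispatch logic.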

3.4 Remote screen sync design and implementation :

Screen capture is performed by WDA’s built‑in mjpegServer, which streams compressed screenshots.

ffmpeg processes the mjpeg stream, encodes frames to H.264, and pushes the video to a streaming server.

Static configuration for mjpegServer:

static NSUInteger FBMjpegScalingFactor = 100; // screenshot scaling factor
static NSUInteger FBMjpegServerScreenshotQuality = 25; // 1‑100, higher = better quality
static NSUInteger FBMjpegServerFramerate = 24; // frames per second

The client fetches the mjpeg stream via HTTP, extracts individual JPEG images, and feeds them to ffmpeg for H.264 encoding. The encoded stream is then sent to the internal media server, which generates a live‑playback URL for front‑end consumption.
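The extraction and encoding steps can be sketched as follows. This is a simplified illustration under stated assumptions: frames are found by scanning for the JPEG start and end markers (which would mis-split images that embed a thumbnail), and the ffmpeg options, the FLV output format, and the RTMP publish URL are illustrative choices, since the article does not say which protocol the media server uses.

```python
from typing import List, Tuple

SOI = b"\xff\xd8"   # JPEG start-of-image marker
EOI = b"\xff\xd9"   # JPEG end-of-image marker


def split_jpeg_frames(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Return the complete JPEG frames in buffer plus the unconsumed tail."""
    frames: List[bytes] = []
    while True:
        start = buffer.find(SOI)
        if start < 0:
            # Keep a trailing 0xff in case it is the first half of a marker.
            return frames, buffer[-1:] if buffer.endswith(b"\xff") else b""
        end = buffer.find(EOI, start + 2)
        if end < 0:
            return frames, buffer[start:]   # partial frame, wait for more data
        frames.append(buffer[start:end + 2])
        buffer = buffer[end + 2:]


def ffmpeg_command(fps: int, publish_url: str) -> List[str]:
    # Read concatenated JPEGs from stdin and publish H.264 in an FLV container.
    return ["ffmpeg", "-f", "mjpeg", "-r", str(fps), "-i", "pipe:0",
            "-c:v", "libx264", "-preset", "ultrafast", "-tune", "zerolatency",
            "-pix_fmt", "yuv420p", "-f", "flv", publish_url]


# Simulated network chunk: one complete frame followed by a partial one.
chunk = b"--boundary\r\n" + SOI + b"frame-1" + EOI + b"\r\n" + SOI + b"fra"
frames, rest = split_jpeg_frames(chunk)
cmd = ffmpeg_command(24, "rtmp://media.example.internal/live/device-1")
```

In use, the command would be started with `subprocess.Popen(cmd, stdin=subprocess.PIPE)`, each extracted frame written to its stdin, and `rest` prepended to the next chunk read from the HTTP stream.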

Bandwidth considerations: the stream is capped at 4.5 Mbps, and typical usage consumes ~350 KB/s (roughly 2.8 Mbps), which stays under the cap and well within the available Wi‑Fi capacity. The system achieves smooth 30 fps playback without visual artifacts.
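Because the ceiling is quoted in megabits and the usage in kilobytes, the comparison is easier to check with the units converted. The short calculation below assumes decimal units (1 KB = 1000 bytes); the function name is introduced only for this example.

```python
def kb_per_s_to_mbps(kilobytes_per_second: float) -> float:
    # 1 KB = 1000 bytes and 1 byte = 8 bits (decimal units assumed).
    return kilobytes_per_second * 1000 * 8 / 1_000_000


CEILING_MBPS = 4.5

usage_mbps = kb_per_s_to_mbps(350)           # about 2.8 Mbps
headroom_mbps = CEILING_MBPS - usage_mbps    # about 1.7 Mbps left under the ceiling
utilization = usage_mbps / CEILING_MBPS      # about 62% of the ceiling
```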

Data collection : The platform records health metrics, usage volume, and logs to monitor stability and guide iterative improvements.

Practice summary : The same technology stack (WDA, ffmpeg, cloud device pool) is widely used in UI automation testing across Ctrip. By turning these capabilities into a shared platform, the team reduces manual effort, improves resource utilization, and enables consistent, scalable acceptance testing for both internal and international teams.

Future optimizations include concurrent emulator installations and multi‑scenario reuse of single devices, aiming to fully replace physical device operations with the Cloud Touch experience.
