AI-Driven UI Testing: Data Collection, Model Development, and Deployment for Mobile App Anomaly Detection
This article presents a comprehensive study on applying AI and deep‑learning techniques to mobile UI testing, covering background challenges, feasibility research, abnormal sample construction, model design, training, evaluation, and future directions for intelligent test automation.
With the evolution of software architecture toward AI‑enabled distributed systems, testing techniques have shifted from monolithic, waterfall approaches to agile, end‑to‑end, full‑link testing, especially as complexity moves from server to mobile devices.
Mobile testing can be divided into code‑intrusive methods (SDK integration, system hooks) and non‑intrusive, user‑centric approaches that rely on UI automation tools such as UIAutomator, WDA, GUITree, or computer‑vision‑based image recognition.
“Test by AI” is emerging as a key direction for large internet companies, which leverage AI to generate test paths, diagnose data features, and improve assertion accuracy; notable platforms and tools include Test.AI, Applitools, Mabl, AirTest, AppiumPro, Fastbot, SmartX, RXT, DevEco Studio, PerfDog, and GameAISDK.
Large‑scale apps face numerous quality‑control challenges, such as blank screens, overlapping text, and misaligned graphics, which demand robust detection mechanisms.
The feasibility study proposes three technical routes: (1) using GUITree node information to verify UI elements, (2) applying traditional CV feature matching (SIFT, SURF, ORB) for image similarity, and (3) training deep‑learning models to perform one‑shot classification and threshold decisions.
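Route (1) can be illustrated with a small sketch: walk a UI-hierarchy dump and flag nodes whose attributes suggest a rendering anomaly. The XML format, attribute names, and checks below are illustrative assumptions modeled on UIAutomator-style dumps, not the article's actual GUITree schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical UIAutomator-style hierarchy dump; in practice this would come
# from a GUITree snapshot or `adb shell uiautomator dump`.
HIERARCHY = """
<hierarchy>
  <node class="android.widget.ImageView" resource-id="app:id/banner"
        bounds="[0,0][1080,400]" visible="true"/>
  <node class="android.widget.TextView" resource-id="app:id/title"
        bounds="[40,420][1040,480]" text="" visible="true"/>
</hierarchy>
"""

def parse_bounds(raw):
    """Convert a '[x1,y1][x2,y2]' string into a tuple of four ints."""
    nums = raw.replace("[", " ").replace("]", " ").replace(",", " ").split()
    return tuple(int(n) for n in nums)

def find_suspect_nodes(xml_text):
    """Flag visible nodes with zero area or empty text -- a simple
    stand-in for the article's node-information verification."""
    suspects = []
    for node in ET.fromstring(xml_text).iter("node"):
        x1, y1, x2, y2 = parse_bounds(node.get("bounds"))
        empty_text = (node.get("class", "").endswith("TextView")
                      and node.get("text", "") == "")
        zero_area = (x2 - x1) * (y2 - y1) <= 0
        if node.get("visible") == "true" and (empty_text or zero_area):
            suspects.append(node.get("resource-id"))
    return suspects
```

A real pipeline would combine such structural checks with routes (2) and (3), since node metadata alone cannot catch purely visual defects.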
To build a negative‑sample dataset, the authors construct two anomaly types: missing images, where binary thresholding and contour analysis locate image regions that are then replaced with white blocks; and overlapping text, where OCR extracts text positions and CV techniques (via PIL) synthesize overlapped samples.
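The image-missing construction can be approximated with NumPy alone: threshold a grayscale screenshot, take the bounding box of the dark (content) pixels, and paint it white. This is a rough stand-in for the binarize-and-contour pipeline described above; the threshold value and single-bounding-box simplification are assumptions.

```python
import numpy as np

def make_missing_image_sample(gray, thresh=200):
    """Synthesize an 'image missing' anomaly from a grayscale screenshot
    (uint8 array): locate dark content pixels by thresholding, then blank
    their bounding box with a white block."""
    mask = gray < thresh          # dark pixels likely belong to content
    if not mask.any():
        return gray.copy()        # nothing to blank out
    ys, xs = np.where(mask)
    out = gray.copy()
    out[ys.min():ys.max() + 1, xs.min():xs.max() + 1] = 255  # white block
    return out
```

A contour-based version (e.g. with OpenCV) would blank each detected region separately instead of one global bounding box, giving more realistic anomalies.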
The algorithm development workflow includes environment setup (local GPU or cloud services), data preparation (collection, cleaning, augmentation, labeling), network design (SE‑ResNet18 with SELayer), training (cross‑entropy loss, epoch/batch settings), and iterative optimization.
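The SELayer at the heart of SE-ResNet18 is compact enough to sketch as a plain forward pass: squeeze each channel to a scalar by global average pooling, run a two-layer excitation MLP, and rescale the channels by the resulting sigmoid gates. The NumPy version below uses random illustrative weights, not the article's trained parameters.

```python
import numpy as np

def se_layer(x, w1, b1, w2, b2):
    """Forward pass of a Squeeze-and-Excitation block on a feature map x of
    shape (C, H, W). w1/w2 are the excitation MLP's two fully connected
    layers (channel reduction, then restoration)."""
    s = x.mean(axis=(1, 2))                       # squeeze: global avg pool -> (C,)
    h = np.maximum(0.0, w1 @ s + b1)              # excitation FC1 + ReLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))   # FC2 + sigmoid -> channel weights
    return x * gate[:, None, None]                # scale each channel

# Tiny example: 4 channels, reduction ratio 2.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
w1, b1 = rng.standard_normal((2, 4)), np.zeros(2)
w2, b2 = rng.standard_normal((4, 2)), np.zeros(4)
y = se_layer(x, w1, b1, w2, b2)
```

Because every gate lies in (0, 1), the block can only attenuate channels, letting the network learn which feature maps matter for distinguishing normal from abnormal screens.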
Model evaluation shows ROC curves and confusion matrices for test and real samples, with analysis of bad cases such as mislabeled data and missing sample types (e.g., pink‑background blanks), guiding future data‑collection strategies.
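The evaluation artifacts mentioned above reduce to simple counting. As a minimal sketch, the helpers below compute a 2x2 confusion matrix from hard predictions and the (FPR, TPR) points behind an ROC curve from a threshold sweep over anomaly scores; the label convention (0 = normal, 1 = abnormal) is an assumption.

```python
def confusion_matrix(labels, preds):
    """2x2 counts for binary anomaly detection:
    rows = actual (0 normal, 1 abnormal), cols = predicted."""
    m = [[0, 0], [0, 0]]
    for y, p in zip(labels, preds):
        m[y][p] += 1
    return m

def roc_points(labels, scores, thresholds):
    """(FPR, TPR) pairs over a threshold sweep -- the raw data of an ROC curve."""
    pos = sum(labels)
    neg = len(labels) - pos
    pts = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= t)
        pts.append((fp / neg, tp / pos))
    return pts
```

Plotting these points at many thresholds yields the ROC curve; bad-case analysis then inspects the off-diagonal confusion-matrix cells, which is where mislabeled data and unseen anomaly types (like the pink-background blanks) show up.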
Future plans aim to enhance model performance through data cleaning, architecture improvements, unsupervised learning, double‑check mechanisms, and continued integration of AI to boost test path generation, data diagnostics, and assertion accuracy.
JD Retail Technology
Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.