
Applying Image Recognition in UI Automation Testing with Sikuli

This article introduces how image‑recognition techniques, particularly using the Sikuli tool, can be applied to UI automation testing for both web and mobile applications, covering practical scenarios, core principles, a suite of useful functions, example code, and the advantages and limitations of the approach.

360 Tech Engineering

Whether you are testing web or mobile applications, some elements can only be located by what is rendered on screen, such as page content or images, which is difficult or impossible with traditional selector-based methods. This article explains how image recognition is used in testing, focusing on the Sikuli tool.

Typical scenarios include using screenshots to detect predefined controls, validating test results by matching UI screenshots against expected images, and performing performance testing such as response‑time measurement.

Principle

Sikuli scripts are written in Jython. They use image recognition to locate targets on the screen and then simulate keyboard and mouse events on them. The core is a Java library that calls a C++ OpenCV-based engine (accessed via JNI) to find target images on the screen, while a higher-level Java API exposes simple commands to users.
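To make the idea of "locating a target image on the screen" concrete, here is a minimal pure-Python sketch of template matching. It is only an illustration: the function name find_template, the tiny grayscale grids, and the sum-of-squared-differences score are all assumptions made for this example, and Sikuli's real engine delegates the search to OpenCV rather than running code like this.

```python
def find_template(screen, template):
    """Return (row, col, score) of the best match of template inside screen.

    Both arguments are 2-D lists of grayscale values. The score is the sum of
    squared pixel differences, so 0 means a pixel-perfect match.
    Returns None if the template does not fit inside the screen.
    """
    screen_h, screen_w = len(screen), len(screen[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    best = None
    # Slide the template over every position where it fits entirely.
    for row in range(screen_h - tmpl_h + 1):
        for col in range(screen_w - tmpl_w + 1):
            score = 0
            for dy in range(tmpl_h):
                for dx in range(tmpl_w):
                    diff = screen[row + dy][col + dx] - template[dy][dx]
                    score += diff * diff
            # Keep the position with the lowest difference seen so far.
            if best is None or score < best[2]:
                best = (row, col, score)
    return best


# A 4x5 "screen" containing a bright 2x2 "button" at row 1, column 2.
screen = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 9, 0],
    [0, 0, 0, 0, 0],
]
button = [
    [9, 8],
    [7, 9],
]
match = find_template(screen, button)
print(match)  # prints (1, 2, 0)
```

Once the best position is known, a tool can compute the center of the matched rectangle and send a mouse click there, which is the step Sikuli's higher-level commands wrap.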

Function Overview

find(x) : locate the best match for image x on the screen; raises FindFailed if there is no match.

findAll(x) : locate all occurrences of an image.

wait(x, 10) : wait up to 10 seconds for an image to appear.

waitVanish(x, 10) : wait up to 10 seconds for an image to disappear.

exists(x) : check if an image exists without throwing an exception.

click(x) : left‑click the best‑matched image.

doubleClick(x) : double‑click the best‑matched image.

rightClick(x) : right‑click the best‑matched image.

hover(x) : move the mouse pointer over the best‑matched image.

dragDrop(x, y) : drag image x onto image y .

type(x, "text") : click image x to give it focus, then type the text as keystrokes.

paste(x, "text") : click image x to give it focus, then insert the text through the clipboard.

Each function is illustrated with example screenshots of the corresponding Sikuli code.
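As a rough sketch of how several of these functions combine, the helper below drives a login form purely by image. It assumes a screen object that exposes Sikuli-style methods (wait, click, type, waitVanish, exists), such as a SikuliX Screen() instance. The function name log_in and every .png file name are hypothetical placeholders for screenshots you would capture yourself. The sketch clicks a field and then calls the single-argument type, which keeps the focus step explicit.

```python
def log_in(screen, username, password, timeout=10):
    """Fill in a login form by image; return True if the home page appears.

    `screen` is any object with Sikuli-style methods, for example a SikuliX
    Screen() instance. All image names are hypothetical screenshots.
    """
    screen.wait("login_form.png", timeout)        # block until the form is visible
    screen.click("username_field.png")            # give the field keyboard focus
    screen.type(username)                         # type into the focused field
    screen.click("password_field.png")
    screen.type(password)
    screen.click("login_button.png")
    screen.waitVanish("login_form.png", timeout)  # the form should disappear
    # exists() returns a match or None instead of raising FindFailed.
    return screen.exists("home_banner.png") is not None
```

Inside a SikuliX script this could be called as log_in(Screen(), "demo", "secret"). Passing the screen in as a parameter also lets the flow be exercised with a stub object, without a real desktop.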

Code Example

An example script demonstrates how to measure response time using Sikuli’s image‑based actions.
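The original script is not reproduced here, so the following is only a sketch of one way such a measurement could look, not the author's code. It assumes a screen object with Sikuli-style wait and click methods (for example a SikuliX Screen() instance). The function name measure_response_time and the image names in the usage note are hypothetical.

```python
import time


def measure_response_time(screen, trigger_image, result_image, timeout=10):
    """Click trigger_image and return the seconds until result_image appears.

    `screen` is any object with Sikuli-style wait() and click() methods.
    In Sikuli, wait() raises FindFailed if the image does not appear in time.
    """
    # Locate the control first so the image search is not part of the timing.
    target = screen.wait(trigger_image, timeout)
    screen.click(target)
    # Start timing once the click has been delivered.
    start = time.time()
    screen.wait(result_image, timeout)
    return time.time() - start
```

In a SikuliX script this might be used as elapsed = measure_response_time(Screen(), "search_button.png", "results_header.png"). The result includes the time Sikuli needs to rescan the screen and recognize the image, so it is best read as an upper bound on the application's real response time.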

Advantages

Simple to learn; screenshots are enough to start automation.

Effective for games or applications with custom UI components that are hard to locate with traditional selectors.

Common operations are pre-packaged as ready-made functions, so little scaffolding code is needed.

Open‑source, allowing further customization.

Can handle elements like Flash that lack accessible DOM structures.

Limitations

The target must be visible on screen; any overlapping window or overlay can prevent detection.

Screen resolution changes require new screenshots.

Cannot run in background; tests must be executed on the foreground desktop.

Overall, image‑recognition‑based automation with Sikuli provides a straightforward solution for UI testing scenarios where traditional selector‑based methods fall short.
