Testing Asynchronous Systems: Strategies and Best Practices
Testing asynchronous systems requires specialized strategies: monitoring callbacks with synchronization primitives, and polling with a timeout, an initial delay, and a defined polling frequency. These strategies cope with nondeterministic execution and avoid flaky assertions. Testability also improves when business logic is decoupled from periodic scheduling, as illustrated by real-world polling tests for Elasticsearch and MySQL/Redis jobs.
Internet software systems evolve continuously to handle more complex business scenarios and higher performance requirements. Making parts of a system asynchronous is one such evolution. Asynchronous processing is a good choice for tasks that do not need real-time results but involve intensive computation, large data volumes, or slow I/O operations.
Unlike tests for synchronous systems or methods, tests for asynchronous systems (end-to-end and integration tests) or asynchronous methods (unit tests) can become unreliable and fail intermittently, because the test thread does not wait for the asynchronous task thread to finish.
There are two types of asynchronous tasks:
Tasks that notify the caller after execution (via events or notifications)
Tasks that only change system state without notifying the caller
For the first type, the monitoring approach can be used for testing: the test subscribes to the completion event and blocks on a synchronization primitive, such as a lock object, until the event arrives. Without such a wait, a naive test like the following is unreliable:
@Test
public void testAsynchronousMethod() {
    callAsynchronousMethod();
    assertXXX(...); // The asynchronous task may not have completed yet, so the assertion may fail
}
For the second type, where no callback is provided, the polling approach is used. A reliable polling mechanism should include a timeout, an initial delay, and a polling frequency.
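The three ingredients of a polling check (timeout, initial delay, polling frequency) can be sketched as a small helper. This is a minimal illustration, not the implementation used in the original article: the `Poller` class, its `await` method, and the `AtomicBoolean` standing in for system state are all hypothetical names chosen for the example. The sketch assumes the timeout covers the initial delay as well.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not taken from the original article.
final class Poller {
    private Poller() {}

    /**
     * Sleeps for initialDelay, then evaluates the condition every interval
     * until it holds or the timeout (measured from the call) has elapsed.
     * Returns true if the condition was observed, false on timeout.
     */
    static boolean await(BooleanSupplier condition, Duration initialDelay,
                         Duration interval, Duration timeout) throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        Thread.sleep(initialDelay.toMillis());
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Never sleep past the deadline; sleep at least 1 ms to avoid a busy loop.
            long remainingMillis = Math.max(1, remainingNanos / 1_000_000);
            Thread.sleep(Math.min(interval.toMillis(), remainingMillis));
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicBoolean done = new AtomicBoolean(false);
        // Stand-in for an asynchronous job that changes state without notifying the caller.
        Thread job = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            done.set(true);
        });
        job.start();
        boolean observed = await(done::get, Duration.ofMillis(20),
                Duration.ofMillis(10), Duration.ofSeconds(2));
        System.out.println("state observed: " + observed);
    }
}
```

In a test, the returned boolean would typically be asserted to be true before the real assertions run, so that a timeout is reported as a clear failure rather than as a misleading assertion error.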
Polling-based tests have their own reliability issue. If the system state changes and then changes again between two consecutive polls, the test never observes the intermediate state and may wrongly conclude that the asynchronous operation is incomplete or has failed. The monitoring approach does not have this problem because it detects state changes immediately.
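For comparison, the monitoring approach for tasks that notify the caller can be sketched as follows. The article describes waiting on a lock object; this sketch swaps in `java.util.concurrent.CountDownLatch`, which serves the same purpose with less boilerplate. The callback-taking `callAsynchronousMethod` and the `MonitoringExample` class are assumptions made for the example, not the article's actual code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

// Hypothetical example for illustration only.
final class MonitoringExample {
    private MonitoringExample() {}

    // Assumed asynchronous API: does its work on another thread and
    // invokes the callback when finished.
    static void callAsynchronousMethod(Consumer<String> onComplete) {
        new Thread(() -> onComplete.accept("done")).start();
    }

    // Blocks the calling (test) thread until the callback fires or the
    // timeout elapses. Returns the callback value, or null on timeout.
    static String runAndAwait(long timeoutMillis) throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(1);
        AtomicReference<String> result = new AtomicReference<>();
        callAsynchronousMethod(value -> {
            result.set(value);
            latch.countDown(); // wakes the waiting test thread
        });
        boolean completed = latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        return completed ? result.get() : null;
    }

    public static void main(String[] args) throws Exception {
        String result = runAndAwait(1000);
        if (!"done".equals(result)) {
            throw new AssertionError("callback did not fire in time");
        }
        System.out.println("callback result: " + result);
    }
}
```

The timeout on `latch.await` keeps the test from hanging forever if the notification never arrives, which is worth having even when the callback is expected to be fast.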
For systems that execute periodically, testability can be improved by separating the business logic from the periodic execution logic and adding a RESTful interface that lets a test control when the business logic runs.
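One way to picture this separation, as a sketch under stated assumptions: `CleanupJob` holds only business logic, and `CleanupScheduler` holds only the timer. Both class names and the counter used as a stand-in for real work are hypothetical. The RESTful trigger is omitted here; it would be a thin handler that calls the same `runOnce()` method.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Business logic only: knows nothing about timers. (Hypothetical example class.)
final class CleanupJob {
    private final AtomicInteger runs = new AtomicInteger();

    // One unit of work; a counter stands in for the real business logic.
    void runOnce() {
        runs.incrementAndGet();
    }

    int runCount() {
        return runs.get();
    }
}

// Scheduling only: decides when the job runs, not what it does.
final class CleanupScheduler {
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    void start(CleanupJob job, long periodSeconds) {
        executor.scheduleAtFixedRate(job::runOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    void stop() {
        executor.shutdownNow();
    }

    public static void main(String[] args) {
        CleanupJob job = new CleanupJob();
        // A test (or a REST handler) triggers the logic directly,
        // without waiting for the next scheduled tick.
        job.runOnce();
        System.out.println("runs so far: " + job.runCount());
    }
}
```

With this split, a test can invoke the business logic synchronously and assert on its effects right away, while the scheduler itself needs only a small, separate test.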
Youzan has adopted polling-based testing for some asynchronous jobs. The team encountered two kinds: jobs that interact with Elasticsearch, where tests are constrained by the refresh mechanism (typically 5 seconds), and jobs that interact with MySQL/Redis, where polling works well and tests can complete within 150 ms and be integrated into CI/CD pipelines.
Youzan Coder
Official Youzan tech channel, delivering technical insights and occasional daily updates from the Youzan tech team.