Understanding Repeater: Business‑Invisible Traffic Recording and Replay for Java Services

This article explains the background, concepts, implementation details, and practical deployment considerations of Alibaba's open‑source Repeater tool, which enables low‑impact traffic recording and replay in Java backend services using JVM‑Sandbox bytecode enhancement.

Zhuanzhuan Tech

1 Demand Background

As the Zhuanzhuan e‑commerce platform grew in scale and complexity, developers ran into three recurring problems: constructing complex model parameters for tests, generating realistic traffic for load testing, and running automated regression tests without affecting business performance. Repeater, an open‑source tool built on JVM‑Sandbox, addresses these problems by recording and replaying traffic invisibly to business code, with low performance overhead.

2 Traffic Recording and Replay Concepts

2.1 Traffic Recording

In Java, one recording consists of a single entrance invocation (for example an HTTP, Dubbo, or plain Java method call) and the sub‑invocations it triggers. Recording binds these calls together into one complete trace. Take the following method as an example:

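/**
 * Gets the product price: read it from Redis first and, on a cache miss,
 * fetch it through an RPC call.
 * @param productId the product ID
 * @return the price after the pricing strategy has been applied
 */
public Integer getProductPrice(Long productId) { // entrance invocation
    // 1. Read the price from Redis
    Integer price = redis.get(productId); // remote sub-invocation (Redis)
    if (Objects.isNull(price)) {
        // 2. Cache miss: fetch the price through a remote call
        price = daoRpc.getProductCount(productId); // remote sub-invocation (RPC)
        redis.set(productId, price); // remote sub-invocation (Redis)
    }
    // 3. Apply the pricing strategy
    price = process(price); // local sub-invocation
    return price;
}

private Integer process(Integer price) {
    // The pricing strategy is itself a remote call
    return logicRpc.process(price); // remote sub-invocation (RPC)
}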

The recording captures the entrance method’s input and output plus the remote sub‑calls (redis.get, daoRpc.getProductCount, redis.set, logicRpc.process). Local calls such as process are not recorded.
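To make the shape of one recording concrete, here is a minimal sketch of a data model that binds an entrance invocation and its sub‑invocations under a trace ID. Every class, field, and sample value in it (RecordingSketch, Invocation, Record, the trace ID, the prices) is a hypothetical illustration, not Repeater's actual model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative data model only; all names here are hypothetical,
// not Repeater's real classes.
final class RecordingSketch {

    /** One captured call: its identity, its input, and its output. */
    static final class Invocation {
        final String identity;  // which method was called, e.g. "redis.get"
        final Object request;   // recorded input
        final Object response;  // recorded output (null for void or a cache miss)

        Invocation(String identity, Object request, Object response) {
            this.identity = identity;
            this.request = request;
            this.response = response;
        }
    }

    /** One recording: an entrance invocation plus its remote sub-invocations, bound by a trace ID. */
    static final class Record {
        final String traceId;
        final Invocation entrance;
        final List<Invocation> subInvocations = new ArrayList<>();

        Record(String traceId, Invocation entrance) {
            this.traceId = traceId;
            this.entrance = entrance;
        }
    }

    /** Builds what a cache-miss call to getProductPrice(1001L) might record (values are made up). */
    static Record cacheMissExample() {
        Record record = new Record("trace-1", new Invocation("getProductPrice", 1001L, 90));
        record.subInvocations.add(new Invocation("redis.get", 1001L, null));
        record.subInvocations.add(new Invocation("daoRpc.getProductCount", 1001L, 100));
        record.subInvocations.add(new Invocation("redis.set", Arrays.asList(1001L, 100), null));
        record.subInvocations.add(new Invocation("logicRpc.process", 100, 90));
        return record;
    }

    public static void main(String[] args) {
        Record record = cacheMissExample();
        System.out.println(record.traceId + ": " + record.entrance.identity
                + " with " + record.subInvocations.size() + " sub-invocations");
    }
}
```

The local process call does not appear in the list; only the remote call it makes (logicRpc.process) does.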

2.2 Traffic Replay

During replay, the recorded entrance input is used to invoke the entrance method again. Each sub‑call is intercepted and answered directly with its previously recorded output, so the real remote call is never made. If the replayed return value does not match the recorded one, the replay case is marked as failed.
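The replay flow can be sketched in plain Java as follows. This is a simplified model under stated assumptions: sub‑calls are matched by a string key built from the call identity and its request, and names such as ReplaySketch, RecordedSubCalls, and replaySucceeds are invented for illustration. Repeater itself intercepts the calls through bytecode enhancement instead of passing a lookup object around.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch of replay; not Repeater's real implementation.
final class ReplaySketch {

    /** Answers sub-calls from recorded data instead of touching Redis or RPC services. */
    static final class RecordedSubCalls {
        private final Map<String, Object> responses = new HashMap<>();

        void put(String identity, Object request, Object response) {
            responses.put(identity + "#" + request, response);
        }

        Object answer(String identity, Object request) {
            String key = identity + "#" + request;
            if (!responses.containsKey(key)) {
                throw new IllegalStateException("no recorded sub-call for " + key);
            }
            return responses.get(key);
        }
    }

    /** The price lookup logic, with every remote call answered from the recording. */
    static Integer getProductPrice(RecordedSubCalls recorded, Long productId) {
        Integer price = (Integer) recorded.answer("redis.get", productId);
        if (price == null) {
            price = (Integer) recorded.answer("daoRpc.getProductCount", productId);
            recorded.answer("redis.set", productId + "=" + price);
        }
        return (Integer) recorded.answer("logicRpc.process", price);
    }

    /** Replays one case: invoke the entrance with the recorded input and compare the outputs. */
    static boolean replaySucceeds(RecordedSubCalls recorded, Long recordedInput, Integer recordedOutput) {
        Integer replayed = getProductPrice(recorded, recordedInput);
        return Objects.equals(replayed, recordedOutput);
    }

    /** Sample recording of a cache-miss request (values are made up). */
    static RecordedSubCalls sampleRecording() {
        RecordedSubCalls recorded = new RecordedSubCalls();
        recorded.put("redis.get", 1001L, null);
        recorded.put("daoRpc.getProductCount", 1001L, 100);
        recorded.put("redis.set", "1001=100", null);
        recorded.put("logicRpc.process", 100, 90);
        return recorded;
    }

    public static void main(String[] args) {
        RecordedSubCalls recorded = sampleRecording();
        System.out.println(replaySucceeds(recorded, 1001L, 90)); // true: outputs match
        System.out.println(replaySucceeds(recorded, 1001L, 95)); // false: replay case fails
    }
}
```

Because no real dependency is contacted, a replay like this can run in a test environment that has no Redis data or downstream services prepared.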

3 Repeater Implementation Principle

Repeater consists of a Console module for configuration and heartbeat management, and an Agent plugin module that performs the core recording and replay logic. The agent uses JVM‑Sandbox’s event mechanism (BEFORE, RETURN, THROWS) to weave bytecode at runtime without modifying business code.

3.1 How Recording and Replay Logic Is Woven In

Repeater relies on JVM‑Sandbox to inject BEFORE, RETURN, and THROWS event hooks into target methods. The original diagram of the bytecode changes is omitted here; the code below shows what a woven add method conceptually looks like.

// Conceptual view of the add method after weaving (illustrative, not compilable source)
public int add(int a, int b) {
    try {
        Object[] params = new Object[]{a, b};
        // BEFORE event
        Spy.Ret retOnBefore = Spy.onBefore(10001, "com.taobao.test.Test", "add", this, params);
        if (retOnBefore.state == I_RETURN) return (int) retOnBefore.object;
        if (retOnBefore.state == I_THROWS) throw (Throwable) retOnBefore.object;
        // Read the arguments back from params
        a = (int) params[0];
        b = (int) params[1];
        int r = a + b;
        // RETURN event
        Spy.Ret retOnReturn = Spy.onReturn(10001, r);
        if (retOnReturn.state == I_RETURN) return (int) retOnReturn.object;
        if (retOnReturn.state == I_THROWS) throw (Throwable) retOnReturn.object;
        return r;
    } catch (Throwable cause) {
        // THROWS event
        Spy.Ret retOnThrows = Spy.onThrows(10001, cause);
        if (retOnThrows.state == I_RETURN) return (int) retOnThrows.object;
        if (retOnThrows.state == I_THROWS) throw (Throwable) retOnThrows.object;
        throw cause;
    }
}

Through these hooks, the agent captures request and response data during recording and mocks sub‑call responses during replay; the logic lives in the doBefore and doMock methods.
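The idea that a BEFORE hook can either observe a call or cut it short can be modeled without any bytecode tooling. The sketch below is a hand‑written stand‑in, assuming a simplified Listener interface and a ReturnImmediately exception that play the roles of JVM‑Sandbox's event listener and control‑flow exception; none of these names are the real JVM‑Sandbox API, and real weaving happens at the bytecode level.

```java
// Hand-written simulation of event weaving; all names are hypothetical.
final class EventWeavingSketch {

    /** Control-flow signal meaning "skip the method body and return this value". */
    static final class ReturnImmediately extends RuntimeException {
        final Object value;

        ReturnImmediately(Object value) {
            super(null, null, false, false); // control flow only, no stack trace needed
            this.value = value;
        }
    }

    interface Listener {
        void onBefore(String method, Object[] params);
        void onReturn(String method, Object result);
    }

    /** Recording mode: observe the input and output, never interfere. */
    static final class RecordingListener implements Listener {
        Object[] lastParams;
        Object lastResult;

        public void onBefore(String method, Object[] params) { lastParams = params; }
        public void onReturn(String method, Object result) { lastResult = result; }
    }

    /** Replay mode: answer with the recorded result before the body runs. */
    static final class ReplayingListener implements Listener {
        private final Object recordedResult;

        ReplayingListener(Object recordedResult) { this.recordedResult = recordedResult; }

        public void onBefore(String method, Object[] params) {
            throw new ReturnImmediately(recordedResult);
        }
        public void onReturn(String method, Object result) { /* not reached during replay */ }
    }

    /** What a "woven" add method does, written out by hand. */
    static int addWoven(int a, int b, Listener listener) {
        try {
            listener.onBefore("add", new Object[]{a, b});
        } catch (ReturnImmediately signal) {
            return (Integer) signal.value; // body skipped
        }
        int result = a + b;
        listener.onReturn("add", result);
        return result;
    }

    public static void main(String[] args) {
        RecordingListener recorder = new RecordingListener();
        System.out.println(addWoven(2, 3, recorder));                   // 5, real logic ran
        System.out.println(addWoven(2, 3, new ReplayingListener(42))); // 42, body was skipped
    }
}
```

The second call returns 42 even though 2 + 3 is 5, which shows that the mocked value came from the listener and not from the method body.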

3.2 Core Recording and Replay Code

The doBefore method records metadata, parameters, and responses for non‑replay traffic, while doMock uses the recorded data to either skip, throw, or return immediately during replay.

protected void doBefore(BeforeEvent event) throws ProcessControlException {
    // Replay traffic: hand the call to the mock logic instead of recording it
    if (RepeatCache.isRepeatFlow(Tracer.getTraceId())) {
        processor.doMock(event, entrance, invokeType);
        return;
    }
    // Recording traffic: build the invocation and capture its request and response
    Invocation invocation = initInvocation(event);
    invocation.setEntrance(entrance);
    invocation.setRequest(processor.assembleRequest(event));
    invocation.setResponse(processor.assembleResponse(event));
}

@Override
public void doMock(BeforeEvent event, Boolean entrance, InvokeType type) throws ProcessControlException {
    try {
        // Build the mock request (field setup abridged in this excerpt)
        final MockRequest request = MockRequest.builder().build();
        // Let the configured strategy decide how to answer this sub-call
        final MockResponse mr = StrategyProvider.instance().provide(context.getMeta().getStrategyType()).execute(request);
        switch (mr.action) {
            case SKIP_IMMEDIATELY:
                // Do not mock; let the original call proceed
                break;
            case THROWS_IMMEDIATELY:
                // Throw mr.throwable without running the call
                ProcessControlException.throwThrowsImmediately(mr.throwable);
                break;
            case RETURN_IMMEDIATELY:
                // Return the mocked response without running the call
                ProcessControlException.throwReturnImmediately(assembleMockResponse(event, mr.invocation));
                break;
            default:
                ProcessControlException.throwThrowsImmediately(new RepeatException("invalid action"));
        }
    } catch (ProcessControlException pce) {
        // Control-flow exceptions must propagate to JVM-Sandbox untouched
        throw pce;
    } catch (Throwable throwable) {
        ProcessControlException.throwThrowsImmediately(new RepeatException("unexpected code snippet here.", throwable));
    }
}

4 Repeater Practical Deployment

4.1 Refactoring Points

Redesign the demo management console.

Extend the plugin set with SCF (Zhuanzhuan's RPC framework) support for traffic recording/replay.

Replace MySQL storage with Elasticsearch for scalability.

Support IP changes in Docker environments without interrupting recording.

Add field‑filtering for replay result diffs.

Enable bulk replay.

Implement online‑environment recording.
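Of the points above, field filtering for replay result diffs is the easiest to illustrate in isolation. The sketch below assumes that recorded and replayed results have been flattened into maps keyed by field name; the class DiffFilterSketch, the method diff, and the sample field names are hypothetical and do not reflect how the Zhuanzhuan fork implements the feature.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of diffing replay results while ignoring volatile fields.
final class DiffFilterSketch {

    /** Returns the names of fields whose values differ, skipping the ignored fields. */
    static Set<String> diff(Map<String, Object> recorded,
                            Map<String, Object> replayed,
                            Set<String> ignoredFields) {
        Set<String> fields = new TreeSet<>(recorded.keySet());
        fields.addAll(replayed.keySet());

        Set<String> mismatches = new TreeSet<>();
        for (String field : fields) {
            if (ignoredFields.contains(field)) {
                continue; // e.g. timestamps or random IDs that legitimately change
            }
            if (!Objects.equals(recorded.get(field), replayed.get(field))) {
                mismatches.add(field);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        Map<String, Object> recorded = new HashMap<>();
        recorded.put("price", 90);
        recorded.put("timestamp", 1000L);

        Map<String, Object> replayed = new HashMap<>();
        replayed.put("price", 90);
        replayed.put("timestamp", 2000L);

        System.out.println(diff(recorded, replayed, Collections.<String>emptySet()));       // [timestamp]
        System.out.println(diff(recorded, replayed, Collections.singleton("timestamp")));   // []
    }
}
```

Without such a filter, fields that change on every call would make every replay case fail even when the business logic is unchanged.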

4.2 Online‑Environment Recording

Recording in production increases memory and CPU usage because the entrance call and every sub‑call must be serialized. To limit the impact, a dedicated recording node with a low traffic weight (one tenth that of a normal node) is provisioned through the release system, and it can be taken offline quickly if problems arise.
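As a rough back‑of‑the‑envelope check, the share of traffic that reaches the recording node can be estimated from the weights. The sketch assumes the load balancer distributes requests in proportion to node weight and uses a made‑up cluster of four normal nodes (weight 10) plus one recording node (weight 1); neither assumption comes from the original deployment.

```java
import java.util.Locale;

// Hypothetical arithmetic sketch; the cluster size and weights are made up.
final class TrafficWeightSketch {

    /** Fraction of requests a node receives under weight-proportional load balancing. */
    static double trafficShare(int nodeWeight, int[] allWeights) {
        int total = 0;
        for (int weight : allWeights) {
            total += weight;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("total weight must be positive");
        }
        return (double) nodeWeight / total;
    }

    public static void main(String[] args) {
        int[] weights = {10, 10, 10, 10, 1}; // four normal nodes, one recording node
        double share = trafficShare(1, weights);
        System.out.println(String.format(Locale.ROOT, "%.4f", share)); // 0.0244
    }
}
```

Under these assumptions only about 2.4% of requests pay the recording overhead, and removing the node shifts that small share back to the normal nodes.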

5 Summary

The article introduced the principles behind Repeater’s invisible traffic recording and replay, detailed its core code, and outlined practical refactoring steps for production deployment, aiming to help readers understand, use, and confidently adopt the tool.
