Deep Dive into Sentinel Rate‑Limiting Core Source Code and Sliding‑Window Mechanism
This article analyzes Sentinel's flow-control implementation by walking through the core Java methods such as entry, checkFlow, canPassCheck, passClusterCheck, and canPass, explaining how sliding-window metrics determine whether a request is allowed or throttled.
It continues the series on Sentinel rate limiting, focusing on the core source code that performs flow-control checks using a sliding-window mechanism.
It starts with the entry method, which receives the request context and calls checkFlow to apply flow-control rules before invoking the next slot via fireEntry:
@Override
public void entry(Context context, ResourceWrapper resourceWrapper, DefaultNode node,
                  int count, boolean prioritized, Object... args) throws Throwable {
    // Check and apply the flow-control rules
    checkFlow(resourceWrapper, context, node, count, prioritized);
    // Trigger the next slot in the chain
    fireEntry(context, resourceWrapper, node, count, prioritized, args);
}

The checkFlow method retrieves all applicable FlowRule objects from the rule provider, iterates over them, and throws a FlowException when a rule cannot be passed:
void checkFlow(ResourceWrapper resource, Context context, DefaultNode node,
               int count, boolean prioritized) throws BlockException {
    Collection<FlowRule> rules = ruleProvider.apply(resource.getName());
    if (rules != null) {
        for (FlowRule rule : rules) {
            if (!canPassCheck(rule, context, node, count, prioritized)) {
                throw new FlowException(rule.getLimitApp(), rule);
            }
        }
    }
}

The canPassCheck method decides whether to use cluster-mode checking or local checking based on the rule configuration:
public boolean canPassCheck(@NonNull FlowRule rule, Context context,
                            DefaultNode node, int acquireCount, boolean prioritized) {
    String limitApp = rule.getLimitApp();
    if (limitApp == null) {
        return true;
    }
    if (rule.isClusterMode()) {
        return passClusterCheck(rule, context, node, acquireCount, prioritized);
    }
    return passLocalCheck(rule, context, node, acquireCount, prioritized);
}

When a rule is in cluster mode, passClusterCheck tries to obtain a token from the remote token service; if the service is unavailable, it falls back to the local algorithm:
private static boolean passClusterCheck(FlowRule rule, Context context,
                                        DefaultNode node, int acquireCount, boolean prioritized) {
    try {
        TokenService clusterService = pickClusterService();
        if (clusterService == null) {
            // If the client is absent, fall back to local flow control
            return fallbackToLocalOrPass(rule, context, node, acquireCount, prioritized);
        }
        long flowId = rule.getClusterConfig().getFlowId();
        TokenResult result = clusterService.requestToken(flowId, acquireCount, prioritized);
        return applyTokenResult(result, rule, context, node, acquireCount, prioritized);
    } catch (Throwable ex) {
        RecordLog.warn("[FlowRuleChecker] Request cluster token unexpected failed", ex);
    }
    // Fall back to local flow control when the token client or server is not available
    return fallbackToLocalOrPass(rule, context, node, acquireCount, prioritized);
}

The local check is performed in canPass, which first obtains the current usage count via avgUsedTokens(node) and then compares it with the configured limit. If the request exceeds the limit and priority waiting is enabled, it calculates a wait time and throws a PriorityWaitException:
public boolean canPass(Node node, int acquireCount, boolean prioritized) {
    int curCount = avgUsedTokens(node);
    // Check whether the request must be throttled
    if (curCount + acquireCount > count) {
        if (prioritized && grade == RuleConstant.FLOW_GRADE_QPS) {
            long currentTime = TimeUtil.currentTimeMillis();
            long waitInMs = node.tryOccupyNext(currentTime, acquireCount, count);
            if (waitInMs < OccupyTimeoutProperty.getOccupyTimeout()) {
                node.addWaitingRequest(currentTime + waitInMs, acquireCount);
                node.addOccupiedPass(acquireCount);
                throw new PriorityWaitException(waitInMs);
            }
        }
        return false;
    }
    return true;
}

The helper avgUsedTokens returns either the current thread count or the QPS value, depending on the rule grade:
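Before looking at that helper, the threshold comparison at the top of canPass is worth making concrete. The minimal sketch below (with illustrative numbers, not Sentinel's classes) isolates the curCount + acquireCount > count test:

```java
// Standalone illustration of the admission test in canPass:
// a request passes only if current usage plus the requested
// tokens stays within the configured threshold.
public class ThresholdCheckDemo {
    static boolean canPass(int curCount, int acquireCount, int count) {
        return curCount + acquireCount <= count;
    }

    public static void main(String[] args) {
        // With a 100 QPS rule and 99 passes already counted this second,
        // a single-token request still fits, but a two-token one does not.
        System.out.println(canPass(99, 1, 100)); // true
        System.out.println(canPass(99, 2, 100)); // false
    }
}
```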
private int avgUsedTokens(Node node) {
    if (node == null) {
        return DEFAULT_AVG_USED_TOKENS;
    }
    return grade == RuleConstant.FLOW_GRADE_THREAD ? node.curThreadNum()
        : (int) (node.passQps());
}

The QPS value is calculated by passQps(), which divides the total passed count in the sliding window by the window interval:
@Override
public double passQps() {
    return rollingCounterInSecond.pass() / rollingCounterInSecond.getWindowIntervalInSec();
}

rollingCounterInSecond aggregates the passed counts from all metric buckets within the current sliding window:
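The arithmetic here is simple but easy to gloss over: the pass counts of all valid buckets are summed and divided by the window length in seconds. The sketch below uses hypothetical numbers (two 500 ms buckets forming a 1-second window), not values from Sentinel:

```java
// Illustrates the passQps computation: sum of bucket pass counts
// divided by the window interval in seconds.
public class PassQpsDemo {
    static double passQps(long[] bucketPassCounts, double windowIntervalInSec) {
        long total = 0;
        for (long c : bucketPassCounts) {
            total += c; // aggregate pass counts across valid buckets
        }
        return total / windowIntervalInSec;
    }

    public static void main(String[] args) {
        // 30 passes in the first bucket, 45 in the second: 75 QPS
        System.out.println(passQps(new long[]{30, 45}, 1.0));
    }
}
```

Note that avgUsedTokens then casts this double to int, so a measured rate of, say, 75.9 QPS counts as 75 used tokens in the threshold comparison.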
@Override
public long pass() {
    // Refresh the current sample window
    data.currentWindow();
    long pass = 0;
    List<MetricBucket> list = data.values();
    for (MetricBucket window : list) {
        // Accumulate the pass count of each bucket
        pass += window.pass();
    }
    return pass;
}

The underlying LeapArray.values(long timeMillis) method filters out deprecated windows based on the current timestamp and returns only valid samples:
public List<MetricBucket> values(long timeMillis) {
    if (timeMillis < 0) {
        return new ArrayList<>();
    }
    int size = array.length();
    List<MetricBucket> result = new ArrayList<>(size);
    for (int i = 0; i < size; i++) {
        WindowWrap<MetricBucket> windowWrap = array.get(i);
        if (windowWrap == null || isWindowDeprecated(timeMillis, windowWrap)) {
            continue;
        }
        result.add(windowWrap.value());
    }
    return result;
}
}By tracing these methods, the article demonstrates how Sentinel determines the current pass count, evaluates it against the configured QPS or thread limit, and decides whether to allow, throttle, or defer a request, completing the local rate‑limiting flow before moving on to cluster‑level logic in future posts.
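Putting the pieces together, the bucket indexing and deprecation rules that LeapArray implements can be condensed into a small, self-contained model. This is a sketch for intuition only, not Sentinel's code: the bucket index is (time / bucketLength) % sampleCount, a slot is reset when it is reused for a new time period, and pass() counts only buckets whose start time lies within one window interval of the current time.

```java
// Minimal model of a sliding window with fixed, reusable buckets.
public class MiniLeapArray {
    final int sampleCount;        // number of buckets in the window
    final int intervalInMs;       // total window length
    final int bucketLengthInMs;   // length of a single bucket
    final long[] windowStart;     // start timestamp of each bucket
    final long[] passCount;       // pass counter of each bucket

    MiniLeapArray(int sampleCount, int intervalInMs) {
        this.sampleCount = sampleCount;
        this.intervalInMs = intervalInMs;
        this.bucketLengthInMs = intervalInMs / sampleCount;
        this.windowStart = new long[sampleCount];
        this.passCount = new long[sampleCount];
    }

    void addPass(long timeMillis) {
        int idx = (int) ((timeMillis / bucketLengthInMs) % sampleCount);
        long start = timeMillis - timeMillis % bucketLengthInMs;
        if (windowStart[idx] != start) {
            // The slot is being reused for a new time period: reset it
            windowStart[idx] = start;
            passCount[idx] = 0;
        }
        passCount[idx]++;
    }

    // Mirrors values(): sum only buckets still inside the window
    long pass(long timeMillis) {
        long total = 0;
        for (int i = 0; i < sampleCount; i++) {
            boolean deprecated = timeMillis - windowStart[i] > intervalInMs;
            if (!deprecated) {
                total += passCount[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        MiniLeapArray arr = new MiniLeapArray(2, 1000); // two 500 ms buckets
        arr.addPass(100);   // lands in bucket 0
        arr.addPass(600);   // lands in bucket 1
        arr.addPass(1100);  // reuses bucket 0, discarding the t=100 sample
        System.out.println(arr.pass(1100)); // 2
    }
}
```

Reusing a fixed array of buckets instead of allocating new ones is the key design choice: memory stays constant regardless of traffic, and stale data is discarded lazily, either when a slot is overwritten or when the deprecation check skips it during aggregation.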
ZCY Technology
ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.