Implementing Per‑User Rate Limiting with Alibaba Cloud Service Mesh (ASM) Traffic Scheduling Suite
This article explains how to use Alibaba Cloud Service Mesh (ASM) traffic‑scheduling suite to implement rich traffic‑control scenarios such as per‑user rate limiting, request queuing and priority scheduling in a Kubernetes environment, providing step‑by‑step deployment, configuration and verification instructions.
In distributed systems, protecting and scheduling traffic is essential for stability; common mechanisms include rate limiting, concurrency limits, request queuing, priority scheduling, tenant‑level limiting and circuit breaking. Traditional middleware often tightly couples with business logic, whereas a service mesh offers transparent, non‑intrusive traffic management.
Alibaba Cloud Service Mesh (ASM) is a fully managed, Istio‑compatible mesh that adds a traffic‑scheduling suite capable of unified load dispatch, per‑user limiting, queuing, and other advanced policies without modifying application code.
Step 1 – Deploy demo services
Two sample services (httpbin and sleep) are deployed to demonstrate per‑user limiting; the standard Istio sample manifests (samples/httpbin/httpbin.yaml and samples/sleep/sleep.yaml from the Istio release) provide these deployments and can be applied with kubectl apply -f. After deployment, verify connectivity:

kubectl exec -it deploy/sleep -- curl -I http://httpbin:8000/headers

An HTTP 200 response confirms the services are reachable.
Step 2 – Enable ASM traffic‑scheduling suite
Ensure the ASM instance version is ≥ 1.21 and the Kubernetes cluster is attached. Then patch the mesh configuration to enable the adaptive scheduler:
kubectl patch asmmeshconfig default --type=merge --patch='{"spec":{"adaptiveSchedulerConfiguration":{"enabled":true,"schedulerScopes":[{"namespace":"default"}]}}}'

Step 3 – Create a RateLimitingPolicy for per‑user limiting
The policy uses a token‑bucket algorithm and limits traffic based on the http.request.header.user_id label, providing independent token buckets per user.
kubectl apply -f- <<EOF
apiVersion: istio.alibabacloud.com/v1
kind: RateLimitingPolicy
metadata:
  name: ratelimit
  namespace: istio-system
spec:
  rate_limiter:
    bucket_capacity: 2
    fill_amount: 2
    parameters:
      interval: 30s
      limit_by_label_key: http.request.header.user_id
    selectors:
    - agent_group: default
      control_point: ingress
      service: httpbin.default.svc.cluster.local
EOF

Key fields:
Field | Description
----- | -----------
fill_amount | Number of tokens added each interval (2 tokens every 30 seconds in the example).
interval | Time period for token replenishment (30 s).
bucket_capacity | Maximum number of tokens the bucket can hold; setting it equal to fill_amount disables burst traffic.
limit_by_label_key | Header key used to separate token buckets per user (user_id).
selectors | Target services for the policy (here httpbin.default.svc.cluster.local).
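To make the token‑bucket semantics of these fields concrete, the following is a minimal Python sketch (an illustration only, not the ASM implementation): each distinct value of the limited header gets an independent bucket that holds at most bucket_capacity tokens and gains fill_amount tokens per interval.

```python
import time
from collections import defaultdict


class TokenBucket:
    """A single bucket: fill_amount tokens added every interval, capped at capacity."""

    def __init__(self, capacity, fill_amount, interval, now=time.monotonic):
        self.capacity = capacity
        self.fill_amount = fill_amount
        self.interval = interval
        self.now = now
        self.tokens = capacity          # bucket starts full
        self.last_fill = now()

    def allow(self):
        # Replenish fill_amount tokens for each full interval that has elapsed.
        elapsed = self.now() - self.last_fill
        if elapsed >= self.interval:
            refills = int(elapsed // self.interval)
            self.tokens = min(self.capacity, self.tokens + refills * self.fill_amount)
            self.last_fill += refills * self.interval
        # Admit the request only if a token is available.
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerUserLimiter:
    """One independent bucket per value of the limited header (limit_by_label_key)."""

    def __init__(self, capacity, fill_amount, interval):
        self.buckets = defaultdict(
            lambda: TokenBucket(capacity, fill_amount, interval)
        )

    def allow(self, user_id):
        return self.buckets[user_id].allow()
```

With the example policy's values (capacity 2, 2 tokens per 30 s), user1's first two requests are admitted, a third within the same interval is rejected, and user2's first request still succeeds because it draws from a separate bucket.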
Step 4 – Verify per‑user limiting
Execute the following commands from the sleep pod:
curl -H "user_id: user1" http://httpbin:8000/headers -v
curl -H "user_id: user1" http://httpbin:8000/headers -v
curl -H "user_id: user1" http://httpbin:8000/headers -v

Because the bucket holds two tokens, the first two requests for user1 succeed; the third returns HTTP 429 Too Many Requests, confirming the limit for user1. A request from a different user within the same interval still succeeds:

curl -H "user_id: user2" http://httpbin:8000/headers -v

The response is HTTP 200, demonstrating isolated token buckets per user.
Observability
Each ASM traffic‑scheduling policy emits metrics that can be visualized in Grafana dashboards, enabling monitoring of rate‑limit hits, queue lengths, and other events. Integration details are documented in the official ASM guides.
Conclusion
While native Istio limits (global rate limiting, circuit breaking) may not cover complex scenarios, ASM’s traffic‑scheduling suite extends capabilities to include request priority, queuing, concurrency control, per‑user limiting, and progressive rollouts, providing a non‑intrusive foundation for building highly available cloud‑native microservice systems.