Optimizing the Qingzhou Business Gateway: Performance Boosts, FFI Integration, and Routing Enhancements
This article details the architecture of the Qingzhou Business Gateway, identifies its granular control, data‑loss, and performance issues, and explains a series of optimizations—including FFI usage, table‑pool reuse, coroutine caching, radixtree routing, and connection‑pool tuning—that raise single‑node QPS to 80 k while preserving functional capabilities.
What is the Qingzhou Business Gateway?
It is the entry point for all API services of the Qingzhou student project team. Built with OpenResty and Lua, it handles traffic, decryption, authentication, anti-tampering, routing, caching, mock, and documentation.
Current Issues
1. Fine-grained control: The gateway uses method + path + api_version as its unit of control, allowing per-API settings for signing, authentication, internal/external access, backend path, and so on.
2. Unexplained dict data loss: Data stored in the Nginx shared dictionary (nginx.dict) lives in shared memory that every worker accesses under a lock; entries are occasionally lost without a known cause.
3. Poor performance: The extremely fine granularity and regex-based routing cause high CPU usage, so QPS is low.
Optimization Journey
After refactoring, a 4‑core server can reach 80 k QPS (limited by backend services and NIC). The improvements leveraged many components from the open‑source API7/APISIX ecosystem.
FFI Usage
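LuaJIT's Foreign Function Interface (FFI) lets Lua code declare and call C functions directly. Such calls can be compiled by the JIT, which makes them considerably cheaper than going through Lua's I/O library or the classic Lua/C API. Example: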
-- Load the FFI library and the default C namespace
local ffi = require "ffi"
local C = ffi.C
local _M = {}

-- The struct layout differs per platform, so declare it conditionally
local os = ffi.os
if os == "OSX" then
    ffi.cdef [[
    struct uts {
        char os[256];
        char hostname[256];
        char release[256];
        char version[256];
        char machine[256];
        char domain[256];
    };
    ]]
elseif os == "Linux" then
    ffi.cdef [[
    struct uts {
        char os[65];
        char hostname[65];
        char release[65];
        char version[65];
        char machine[65];
        char domain[65];
    };
    ]]
end

-- Define FFI: uname() fills the struct declared above
ffi.cdef [[
int uname(struct uts *buf);
]]

_M.os = os

function _M:getSystemInfo()
    local res = {}
    -- No struct layout is declared for other platforms (e.g. Windows)
    if self.os ~= "OSX" and self.os ~= "Linux" then
        res["os"] = self.os
        res["hostname"] = "unknown"
        res["release"] = "unknown"
        res["version"] = "unknown"
        res["machine"] = "unknown"
        return res
    end
    local uts = ffi.new("struct uts[1]")
    -- uname() returns 0 on success
    if C.uname(uts) ~= 0 then
        return nil, "uname() failed"
    end
    res["os"] = ffi.string(uts[0].os)
    res["hostname"] = ffi.string(uts[0].hostname)
    res["release"] = ffi.string(uts[0].release)
    res["version"] = ffi.string(uts[0].version)
    res["machine"] = ffi.string(uts[0].machine)
    return res
end

return _M
This shows how declaring C interfaces with ffi.cdef lets Lua call a system function directly, with no C extension module to write or compile.
Table Reuse
Lua tables are expensive to create repeatedly; OpenResty provides a tablepool module to recycle them:
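access_by_lua_block {
    local tablepool = require "tablepool"
    -- fetch(pool_name, narr, nrec): take a table from the "ngx_ctx" pool
    ngx.ctx.api_ctx = tablepool.fetch("ngx_ctx", 0, 10)
}
log_by_lua_block {
    local tablepool = require "tablepool"
    -- the access phase may have been skipped, so guard against nil
    if ngx.ctx.api_ctx then
        tablepool.release("ngx_ctx", ngx.ctx.api_ctx)
    end
}
Tables fetched in the access phase are released in the log phase. Because release clears the table before returning it to the pool, data from one request does not contaminate the next.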
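To make the recycling idea concrete, here is a minimal pure-Lua sketch of what a table pool does. It is not the tablepool module's implementation: the `fetch` and `release` functions below are hypothetical, simplified stand-ins that ignore the size hints and run outside OpenResty.

```lua
-- Hypothetical, simplified table pool; NOT the real tablepool implementation.
local pools = {}  -- pool name -> stack of free tables

local function fetch(name)
    local pool = pools[name]
    if pool and #pool > 0 then
        -- Reuse a previously released table instead of allocating a new one.
        return table.remove(pool)
    end
    return {}
end

local function release(name, tb)
    -- Clear every key so the next user cannot see stale data.
    for k in pairs(tb) do
        tb[k] = nil
    end
    local pool = pools[name]
    if not pool then
        pool = {}
        pools[name] = pool
    end
    pool[#pool + 1] = tb
end

-- Simulate two consecutive requests sharing one pooled table.
local ctx1 = fetch("ngx_ctx")
ctx1.user_id = 42
release("ngx_ctx", ctx1)

local ctx2 = fetch("ngx_ctx")
```

In this sketch the second fetch hands back the same table object as the first, already emptied, so no new table has to be allocated or garbage-collected for the second request.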
Coroutine‑level Cache
Headers are frequently accessed via ngx.req.get_headers(), which internally uses FFI but still incurs overhead on every call. A lightweight cache wrapper stores the headers in the request context:
-- Cache the function lookup once at module load time
local get_headers = ngx.req.get_headers
local function get_header(ctx, key, default)
    -- Fetch the header table only on first use within this request
    if ctx.header == nil then
        ctx.header = get_headers()
    end
    return ctx.header[key] or default
end
This reduces repeated FFI calls.
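The effect of the wrapper can be checked outside OpenResty with a stub. In the sketch below, `fake_get_headers` is a hypothetical stand-in for ngx.req.get_headers that counts its invocations, and `ctx` is a plain table playing the role of the per-request context.

```lua
-- Hypothetical stand-in for ngx.req.get_headers so the sketch runs in plain
-- Lua; it counts how often the "expensive" call happens.
local calls = 0
local function fake_get_headers()
    calls = calls + 1
    return { ["x-api-version"] = "v2", host = "example.com" }
end

local function get_header(ctx, key, default)
    -- Populate the per-request cache on first access only.
    if ctx.header == nil then
        ctx.header = fake_get_headers()
    end
    local value = ctx.header[key]
    if value == nil then
        return default
    end
    return value
end

local ctx = {}  -- plays the role of the per-request context table
local version = get_header(ctx, "x-api-version")
local host = get_header(ctx, "host")
local trace = get_header(ctx, "x-trace-id", "none")
```

Three lookups trigger a single call to the stub. The cache lives only as long as `ctx`, which is why it must be a per-request table and not a module-level variable.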
Routing Optimization: Traversal+Regex vs. radixtree
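The original routing traversed every route and applied a regex to each one, which is O(n) in the number of routes. Switching to lua-resty-radixtree, which combines a hash table lookup (O(1)) for exact paths with a radix tree (O(k), where k is the length of the path) for prefix matches, yields 100-200× speedups in route matching.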
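The two-level lookup can be illustrated with a small pure-Lua sketch. This is not lua-resty-radixtree's API or implementation: `Router` is a hypothetical name, and the prefix tree here is keyed by whole path segments, whereas a real radix tree works on compressed byte prefixes. The route paths and handler names are made up for illustration.

```lua
-- Simplified router: exact paths go in a hash table, prefix routes ("/x/*")
-- go in a tree keyed by path segment. Hypothetical sketch only.
local Router = {}
Router.__index = Router

function Router.new()
    return setmetatable({ exact = {}, root = { children = {} } }, Router)
end

function Router:add(path, handler)
    if path:sub(-2) ~= "/*" then
        self.exact[path] = handler
        return
    end
    -- Strip the trailing "/*" and insert the remaining segments.
    local node = self.root
    for seg in path:sub(1, -3):gmatch("[^/]+") do
        local child = node.children[seg]
        if not child then
            child = { children = {} }
            node.children[seg] = child
        end
        node = child
    end
    node.handler = handler
end

function Router:match(path)
    local handler = self.exact[path]  -- O(1) fast path for exact routes
    if handler then
        return handler
    end
    -- Walk the tree, remembering the deepest prefix route seen so far.
    local node = self.root
    local best = node.handler
    for seg in path:gmatch("[^/]+") do
        node = node.children[seg]
        if not node then
            break
        end
        best = node.handler or best
    end
    return best
end

local r = Router.new()
r:add("/health", "health")
r:add("/api/v1/user/*", "user_service")
r:add("/api/*", "api_fallback")
```

The cost of `match` depends on the number of segments in the request path, not on how many routes are registered, which is the property that removes the O(n) regex scan.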
Connection Pool
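Enabling Nginx's upstream connection pool (the keepalive directive in the upstream block, together with proxy_http_version 1.1 and an empty Connection header, so that connections to the backend are reused instead of reopened per request) roughly doubles throughput in proxy scenarios.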
From dict to In‑memory Cache
Legacy dict storage caused IPC and locking overhead and occasional data loss. The refactor moves the data into worker-level memory: LRU caches hold hot data, and worker-event propagates updates between workers. This also prepares the gateway for a future etcd-based configuration center.
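As an illustration of the worker-level cache, here is a minimal pure-Lua LRU sketch. In OpenResty one would normally use an existing library such as lua-resty-lrucache; the `LRU` class below is a hypothetical stand-in written only to show the eviction behavior, and the keys and values are invented.

```lua
-- Hypothetical LRU cache: a hash map for lookup plus a doubly linked list
-- for recency order. Each worker would hold its own instance.
local LRU = {}
LRU.__index = LRU

function LRU.new(capacity)
    assert(capacity >= 1, "capacity must be at least 1")
    -- head/tail are sentinels: head.next is most recent, tail.prev is oldest.
    local head, tail = {}, {}
    head.next, tail.prev = tail, head
    return setmetatable({
        capacity = capacity, size = 0, map = {}, head = head, tail = tail,
    }, LRU)
end

local function unlink(node)
    node.prev.next, node.next.prev = node.next, node.prev
end

local function push_front(self, node)
    node.prev, node.next = self.head, self.head.next
    self.head.next.prev = node
    self.head.next = node
end

function LRU:get(key)
    local node = self.map[key]
    if not node then
        return nil
    end
    -- A hit makes the entry the most recently used one.
    unlink(node)
    push_front(self, node)
    return node.value
end

function LRU:set(key, value)
    local node = self.map[key]
    if node then
        node.value = value
        unlink(node)
        push_front(self, node)
        return
    end
    if self.size >= self.capacity then
        -- Evict the least recently used entry.
        local oldest = self.tail.prev
        unlink(oldest)
        self.map[oldest.key] = nil
        self.size = self.size - 1
    end
    node = { key = key, value = value }
    self.map[key] = node
    push_front(self, node)
    self.size = self.size + 1
end

local cache = LRU.new(2)
cache:set("route:/a", "svc_a")
cache:set("route:/b", "svc_b")
cache:get("route:/a")           -- touch /a so /b becomes the oldest entry
cache:set("route:/c", "svc_c")  -- cache is full, so /b is evicted
```

Because the cache is an ordinary Lua table inside one worker, reads need no lock and no copy out of shared memory. The price is that each worker has its own copy, which is why an update mechanism between workers is needed.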
ngx.req.set_uri vs. ngx.var
Using ngx.var to rewrite URIs avoids the heavy validation and memory copies performed by ngx.req.set_uri. Example:
# nginx.conf
set $upstream_uri "";
location / {
    proxy_pass http://api_upstream$upstream_uri;
}
-- Lua code
ngx.var.upstream_uri = "/api/v1/user/info"
Similarly, setting headers via ngx.var avoids the overhead of ngx.req.set_header when static values suffice.
Final Thoughts
The Qingzhou Business Gateway optimization demonstrates that systematic profiling, leveraging high‑performance OpenResty components, and careful code‑level tweaks can dramatically improve latency and throughput while maintaining functional richness.
TAL Education Technology
TAL Education is a technology-driven education company committed to the mission of 'making education better through love and technology'. The TAL technology team has always been dedicated to educational technology research and innovation. This is the external platform of the TAL technology team, sharing weekly curated technical articles and recruitment information.