How to Seamlessly Integrate CloudWeGo with APMPlus for Full‑Stack Observability
This article outlines the observability challenges of distributed microservice and LLM architectures, introduces CloudWeGo and APMPlus, and walks through integrating the Kitex, Hertz, and Eino frameworks step by step. It covers code samples, data reporting methods, and monitoring features such as RED metrics, LLM-specific indicators, and service topology, and closes with the future roadmap.
Background
Distributed architectures and microservices improve scalability but create three observability challenges: data dispersion, complex traceability, and fault propagation. Monitoring designed for monoliths cannot link a full call chain across services. The gap is widest in LLM applications, where developers typically have to add custom instrumentation themselves.
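To make the call-chain problem concrete: cross-service traces are usually linked by passing a trace context along with every request. The sketch below uses the W3C Trace Context `traceparent` header layout (`version-traceid-parentid-flags`) with only the Go standard library. The type and function names are hypothetical, and it is an illustration of the idea rather than what CloudWeGo or APMPlus do internally; in practice the OpenTelemetry SDK used later in this article handles propagation.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"strings"
)

// TraceContext holds the fields carried by a W3C traceparent header.
type TraceContext struct {
	TraceID string // 32 hex chars, shared by every span in one call chain
	SpanID  string // 16 hex chars, identifies the calling span
	Sampled bool
}

// randomHex returns n random bytes encoded as 2n hex characters.
func randomHex(n int) string {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// NewRoot starts a new trace at the edge of the system.
func NewRoot() TraceContext {
	return TraceContext{TraceID: randomHex(16), SpanID: randomHex(8), Sampled: true}
}

// Child keeps the trace ID and allocates a new span ID for the next hop.
func (tc TraceContext) Child() TraceContext {
	return TraceContext{TraceID: tc.TraceID, SpanID: randomHex(8), Sampled: tc.Sampled}
}

// Header renders the context as a traceparent header value.
func (tc TraceContext) Header() string {
	flags := "00"
	if tc.Sampled {
		flags = "01"
	}
	return fmt.Sprintf("00-%s-%s-%s", tc.TraceID, tc.SpanID, flags)
}

// ParseTraceparent extracts the context from an incoming header value.
func ParseTraceparent(h string) (TraceContext, error) {
	parts := strings.Split(h, "-")
	if len(parts) != 4 || parts[0] != "00" ||
		len(parts[1]) != 32 || len(parts[2]) != 16 || len(parts[3]) != 2 {
		return TraceContext{}, fmt.Errorf("malformed traceparent %q", h)
	}
	// Every field after the version must be valid hex.
	for _, p := range parts[1:] {
		if _, err := hex.DecodeString(p); err != nil {
			return TraceContext{}, fmt.Errorf("malformed traceparent %q: %w", h, err)
		}
	}
	flags, _ := hex.DecodeString(parts[3]) // already validated above
	return TraceContext{TraceID: parts[1], SpanID: parts[2], Sampled: flags[0]&1 == 1}, nil
}

func main() {
	// Service A starts the trace and sends the header with its outgoing call.
	root := NewRoot()
	header := root.Header()

	// Service B parses the header and continues the same trace.
	incoming, err := ParseTraceparent(header)
	if err != nil {
		panic(err)
	}
	next := incoming.Child()
	fmt.Println("same trace:", next.TraceID == root.TraceID)
}
```

Because every hop reuses the trace ID and only replaces the span ID, a backend can reassemble spans reported by different services into one call chain.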
What is CloudWeGo
CloudWeGo is ByteDance’s open‑source enterprise‑grade cloud‑native microservice middleware suite, focusing on high performance, scalability, and reliability. It includes sub‑projects such as Kitex, Hertz, Netpoll, Volo, Thriftgo, Fastpb, Pilota, and many others.
What is APMPlus
APMPlus is Volcano Engine’s APM service offering full‑stack performance monitoring, custom tracing, and alerting. It provides:
Exception detection and alerts: quickly locate bottlenecks and failures.
Rich attribution: stack, scheduling, dimension, and custom point analysis.
Trace and log query: combine call‑chain and logs for rapid debugging.
Flexible reporting: trend analysis for system health.
APMPlus underpins the stability of products such as Toutiao, Douyin, and Feishu, and is used across a range of industries.
Observability Integration: CloudWeGo + APMPlus
APMPlus server-side monitoring integrates closely with the CloudWeGo frameworks (Kitex, Hertz, Eino), so metrics and traces are collected automatically once the integration is in place.
Kitex Integration
Kitex is a high‑performance Go RPC framework supporting multiple protocols and service governance.
Step 1: Initialize Kitex
<code>import (
	"github.com/kitex-contrib/obs-opentelemetry/provider"
	"github.com/kitex-contrib/obs-opentelemetry/tracing"
	...
)

func main() {
	serviceName := "echo"

	// Set up the OpenTelemetry provider; WithInsecure disables TLS on the exporter connection.
	p := provider.NewOpenTelemetryProvider(
		provider.WithServiceName(serviceName),
		provider.WithInsecure(),
	)
	// Flush pending telemetry when main returns.
	defer p.Shutdown(context.Background())

	// Attach the tracing suite so RPCs handled by this server are traced.
	svr := echo.NewServer(
		new(EchoImpl),
		server.WithSuite(tracing.NewServerSuite()),
		server.WithServerBasicInfo(&rpcinfo.EndpointBasicInfo{ServiceName: serviceName}),
	)
	if err := svr.Run(); err != nil {
		klog.Fatalf("server stopped with error: %v", err)
	}
}
</code>
Step 2: Data Reporting
Either report directly to APMPlus or forward the data via an OpenTelemetry collector.
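The two reporting paths differ mainly in where the OTLP exporter points. As a rough sketch (not part of the Kitex or APMPlus API), the helper below picks a target from environment variables. `OTEL_EXPORTER_OTLP_ENDPOINT` is the standard OpenTelemetry variable; `APMPLUS_DIRECT_ENDPOINT`, the `ReportTarget` type, the `ResolveTarget` function, and the `localhost:4317` collector default are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// ReportTarget describes where telemetry is sent.
type ReportTarget struct {
	Endpoint string // address of the OTLP receiver
	Direct   bool   // true when reporting straight to the APM backend
}

// ResolveTarget picks a reporting target using a lookup function
// (os.Getenv in production, a map-backed function in tests).
func ResolveTarget(getenv func(string) string) ReportTarget {
	// Hypothetical variable: set it to send data straight to the backend.
	if ep := getenv("APMPLUS_DIRECT_ENDPOINT"); ep != "" {
		return ReportTarget{Endpoint: ep, Direct: true}
	}
	// Standard OpenTelemetry variable, typically pointing at a collector.
	if ep := getenv("OTEL_EXPORTER_OTLP_ENDPOINT"); ep != "" {
		return ReportTarget{Endpoint: ep}
	}
	// Assumed default: a local collector on the usual OTLP gRPC port.
	return ReportTarget{Endpoint: "localhost:4317"}
}

func main() {
	t := ResolveTarget(os.Getenv)
	fmt.Printf("endpoint=%s direct=%v\n", t.Endpoint, t.Direct)
}
```

The resolved address could then be handed to the provider from Step 1, for example through an option such as `provider.WithExportEndpoint`. Direct reporting normally also needs credentials for the backend, which this sketch leaves out.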
Hertz Integration
Hertz is a Go HTTP framework supporting HTTP/1.1, HTTP/2, HTTP/3, and WebSocket.
<code>import (
	"github.com/hertz-contrib/obs-opentelemetry/provider"
	hertztracing "github.com/hertz-contrib/obs-opentelemetry/tracing"
	...
)

func main() {
	// Set up the OpenTelemetry provider and flush pending telemetry on exit.
	p := provider.NewOpenTelemetryProvider(provider.WithInsecure())
	defer p.Shutdown(context.Background())

	// The tracer is passed as a server option; the middleware shares its config.
	tracer, cfg := hertztracing.NewServerTracer()
	h := server.Default(tracer)
	h.Use(hertztracing.ServerMiddleware(cfg))

	h.Spin()
}
</code>
Eino Integration
Eino is a Go‑centric LLM application framework offering extensibility, reliability, and performance.
<code>import (
	"github.com/cloudwego/eino-ext/callbacks/apmplus"
	"github.com/cloudwego/eino/callbacks"
	...
)

func main() {
	// Create the APMPlus callback handler; replace the placeholder values with your own.
	cbh, shutdown, err := apmplus.NewApmplusHandler(&apmplus.Config{
		Host:        "apmplus-cn-beijing.volces.com:4317",
		AppKey:      "appkey-xxx",
		ServiceName: "eino-app",
		Release:     "release/v0.0.1",
	})
	if err != nil {
		log.Fatal(err)
	}

	// Register the handler globally so Eino components report through it.
	callbacks.InitCallbackHandlers([]callbacks.Handler{cbh})

	// Application logic (building and running the Eino graph or chain) goes here.

	// Wait for all traces and metrics to be sent before exit.
	shutdown(context.Background())
}
</code>
Eino can be deployed locally (Docker, Redis) or on Volcano Engine (container service, model platform).
Application Observability
Through APMPlus, developers can view call chains (traces), performance metrics, and runtime status. This includes the RED golden metrics (request rate as QPS, error rate, and latency), LLM‑specific metrics (call count, token usage, response latency, time to first token (TTFT), and time per output token (TPOT)), service topology, Go runtime metrics, and detailed trace analysis such as flame graphs and call lists.
Future Outlook
APMPlus and CloudWeGo will deepen integration for micro‑services and LLM workloads, expanding framework support, reducing integration cost, and adding richer LLM‑specific observability (inference latency, resource usage, generation state) to help optimize model‑calling strategies.
ByteDance Cloud Native
Sharing ByteDance's cloud-native technologies, technical practices, and developer events.