
Solving GenAI Observability Standardization with LoongSuite’s Unified Data Language

The article details how Alibaba and Ant Group’s LoongSuite GenAI observability semantic conventions use a unified data language to standardize GenAI telemetry, introduce entry/step spans, skill semantics, and token‑level tracing, and provide a reusable GenAI Utils library for scalable deployment across agents and inference engines.

AntTech

Background

Generative AI (GenAI) agents introduce core concepts such as Model, Prompt, Token, Tool Calling, Agent, Memory, and Session. Like HTTP requests or database calls, these objects need to be collected, displayed, and consumed in a standardized way. OpenTelemetry (OTel) began working on GenAI semantic conventions (SemConv) in early 2024 to provide a unified observability data model.

SemConv Position and Value

Beyond auto‑instrumentation and SDKs, OTel’s primary purpose is to define a common data language through SemConv. Standard field names (e.g., gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens) enable consistent analysis and governance across models, frameworks, and platforms.
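As a minimal sketch of why shared field names matter, the snippet below maps two differently shaped payloads onto the same SemConv keys. The payload shapes and the helper name to_semconv are assumptions made for illustration and are not part of any SDK; only the three attribute names come from the text above.

```python
from typing import Any


def to_semconv(system: str, model: str, input_tokens: int) -> dict[str, Any]:
    """Return span attributes keyed by the SemConv names cited above."""
    return {
        "gen_ai.system": system,
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
    }


# Two hypothetical raw payloads that name the same facts differently.
raw_a = {"vendor": "dashscope", "model_id": "qwen-max", "prompt_tokens": 42}
raw_b = {"provider": "openai", "engine": "gpt-4o", "tokens_in": 17}

spans = [
    to_semconv(raw_a["vendor"], raw_a["model_id"], raw_a["prompt_tokens"]),
    to_semconv(raw_b["provider"], raw_b["engine"], raw_b["tokens_in"]),
]

# Once the keys agree, one query works across both sources.
total_input_tokens = sum(s["gen_ai.usage.input_tokens"] for s in spans)
```

Once every source emits the same keys, downstream dashboards and governance rules need no per-framework mapping.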

Benefits of a Unified Data Language

Technical troubleshooting: Trace IDs allow operators to locate latency or error sources across agents within minutes.

Business analysis: Cross‑business metrics become comparable, supporting product decisions.

Evaluation: Continuous user‑trajectory collection builds end‑to‑end test datasets for multi‑agent scenarios.

Compliance: A unified audit trail satisfies security and regulatory requirements.

Entry/Step Span Enhancements

Long‑running agents generate hundreds of spans, making traces unreadable. LoongSuite introduces two new span types:

Entry Span: Captures the original user request and model input at the entry point, preserving data untouched by system or framework prompts.

Step Span: Represents each ReAct iteration (reflect → tool call → model call). Operators can work top-down: first locate the problematic iteration, then drill into the specific call inside it.
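The top-down lookup described above can be sketched with a toy span tree. The Span class, the kind labels, and the span names below are hypothetical stand-ins and do not reflect LoongSuite's actual data model; the point is only that an entry span with step children turns a flat list of spans into a two-level search.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    name: str
    kind: str                 # "entry", "step", or "operation" (labels assumed here)
    duration_ms: float
    error: bool = False
    children: list["Span"] = field(default_factory=list)


def find_failing_operation(entry: Span) -> Optional[tuple[str, str]]:
    """Pick the first failing step, then the failing operation inside it."""
    for step in entry.children:
        if step.kind == "step" and step.error:
            for op in step.children:
                if op.error:
                    return step.name, op.name
            return step.name, step.name  # the step failed without a failing child
    return None


entry = Span("entry: user request", "entry", 5400.0, error=True, children=[
    Span("step 1", "step", 1200.0, children=[
        Span("chat qwen-max", "operation", 900.0),
        Span("execute_tool search_products", "operation", 300.0),
    ]),
    Span("step 2", "step", 4200.0, error=True, children=[
        Span("chat qwen-max", "operation", 800.0),
        Span("execute_tool search_products", "operation", 3400.0, error=True),
    ]),
])

located = find_failing_operation(entry)
```

Here located names the failing iteration first and the failing call second, which mirrors the order in which an operator would read the trace.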

Skill Semantics

Agents often route user intents to Skills: small, reusable business functions that orchestrate LLM and tool calls. Existing OTel conventions lack a Skill abstraction, which causes three pain points:

Inability to attribute performance issues to a specific functional domain.

No health metrics (P99 latency, success rate, call frequency) at the Skill level.

Mixed traces when multiple Skills run concurrently.

LoongSuite adds gen_ai.skill.* attributes (identity, version) to existing execute_tool spans and proposes a dedicated invoke_skill span. The proposal is tracked at https://github.com/open-telemetry/semantic-conventions-genai/issues/86.
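To illustrate what a Skill-level attribute makes possible, the sketch below groups finished spans by skill and derives call count, success rate, and P99 latency. The key gen_ai.skill.name is an assumed member of the gen_ai.skill.* family, and the span dictionaries are invented sample data; the real attribute set is defined by the proposal linked above.

```python
import math
from collections import defaultdict

# Hypothetical finished spans; "gen_ai.skill.name" is an assumed key under gen_ai.skill.*.
spans = [
    {"gen_ai.skill.name": "refund", "duration_ms": 120.0, "ok": True},
    {"gen_ai.skill.name": "refund", "duration_ms": 480.0, "ok": False},
    {"gen_ai.skill.name": "search", "duration_ms": 90.0, "ok": True},
    {"gen_ai.skill.name": "refund", "duration_ms": 150.0, "ok": True},
]


def p99(values: list[float]) -> float:
    """Nearest-rank 99th percentile."""
    ordered = sorted(values)
    rank = math.ceil(0.99 * len(ordered))
    return ordered[rank - 1]


def skill_health(spans: list[dict]) -> dict[str, dict[str, float]]:
    """Group spans by skill and compute per-skill health metrics."""
    grouped = defaultdict(list)
    for span in spans:
        grouped[span["gen_ai.skill.name"]].append(span)
    return {
        skill: {
            "calls": len(items),
            "success_rate": sum(s["ok"] for s in items) / len(items),
            "p99_ms": p99([s["duration_ms"] for s in items]),
        }
        for skill, items in grouped.items()
    }


health = skill_health(spans)
```

Without a skill identifier on each span, the same grouping would have to be inferred from tool names or trace structure, which is the attribution gap the three pain points describe.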

Token‑Level Inference Observation

Performance and accuracy anomalies often originate from individual token generation. LoongSuite defines attributes to capture:

Timestamps at which each token enters and exits each iteration, enabling calculation of scheduling time, execution time, and user-perceived latency.

Top‑K token probability distribution for each token, allowing detection of precision problems and sampling‑parameter misconfigurations.

These metrics are collected for vLLM, SGLang, and TensorRT‑LLM, powering an “engine microscope” that visualizes token‑wise latency, concurrency, and probability distributions.
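The arithmetic behind these two signals can be sketched as follows. The TokenRecord fields and the definitions used here are assumptions for illustration: scheduling time is taken as the wait between the previous exit and the next entry, and perceived latency as the sum of scheduling and execution. The source does not spell out the exact formulas or attribute names.

```python
from dataclasses import dataclass


@dataclass
class TokenRecord:
    token: str
    enter_s: float            # time the token's iteration started
    exit_s: float             # time the token's iteration finished
    top_k: dict[str, float]   # candidate token -> probability


def token_timings(arrival_s: float, records: list[TokenRecord]) -> list[dict[str, float]]:
    """Split each token's latency into scheduling (waiting) and execution (running)."""
    timings = []
    previous_exit = arrival_s
    for rec in records:
        scheduling = rec.enter_s - previous_exit
        execution = rec.exit_s - rec.enter_s
        timings.append({
            "scheduling_s": scheduling,
            "execution_s": execution,
            "perceived_s": scheduling + execution,
        })
        previous_exit = rec.exit_s
    return timings


def off_peak_tokens(records: list[TokenRecord]) -> list[str]:
    """Tokens that were sampled although another candidate had higher probability."""
    return [r.token for r in records if max(r.top_k, key=r.top_k.get) != r.token]


records = [
    TokenRecord("Hello", 0.50, 0.75, {"Hello": 0.9, "Hi": 0.1}),
    TokenRecord("world", 1.00, 1.25, {"there": 0.6, "world": 0.4}),
]
timings = token_timings(arrival_s=0.0, records=records)
suspicious = off_peak_tokens(records)
```

A high share of off-peak tokens is one possible hint that temperature or top-p is set more aggressively than intended, while a growing scheduling share points at queueing rather than compute.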

GenAI Utils – Engineering the SemConv

GenAI Utils is a thin abstraction layer that handles span creation, attribute attachment, metric recording, event emission, and context management, preventing duplicated instrumentation code across frameworks.

Architecture

The instrumentation layer extracts data and populates Invocation objects; the ExtendedTelemetryHandler (a subclass of OTel’s TelemetryHandler) performs all OTel API interactions. When the SemConv evolves, only the Utils library requires updating.
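The separation of concerns can be sketched in a few lines. ToolInvocation and SketchHandler below are simplified, hypothetical stand-ins and not the library's classes; the attribute keys shown are the OTel GenAI names for tool execution. The instrumentation side fills a plain data object, and only the handler knows the attribute keys.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToolInvocation:
    """Plain data filled in by the instrumentation layer; no SemConv names here."""
    tool_name: str
    tool_call_id: Optional[str] = None


class SketchHandler:
    """Hypothetical stand-in for the handler: the only place that knows attribute keys."""

    def attributes_for(self, inv: ToolInvocation) -> dict[str, str]:
        attrs = {
            "gen_ai.operation.name": "execute_tool",
            "gen_ai.tool.name": inv.tool_name,
        }
        if inv.tool_call_id is not None:
            attrs["gen_ai.tool.call.id"] = inv.tool_call_id
        return attrs


# Instrumentation code for any framework only touches the data object.
inv = ToolInvocation(tool_name="search_products", tool_call_id="call-1")
attrs = SketchHandler().attributes_for(inv)
```

If a key is renamed in a later SemConv revision, only attributes_for changes; every instrumentation that builds a ToolInvocation stays untouched.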

API Usage Example (Python)

from opentelemetry import trace
from opentelemetry._logs import get_logger_provider
from opentelemetry.util.genai.extended_handler import get_extended_telemetry_handler
# Message types; assumed here to be exported by opentelemetry.util.genai.types.
from opentelemetry.util.genai.types import InputMessage, OutputMessage, Text

# Reuse the globally configured tracer and logger providers.
handler = get_extended_telemetry_handler(
    tracer_provider=trace.get_tracer_provider(),
    logger_provider=get_logger_provider(),
)

# Agent invocation: fill in the invocation object; the handler manages the span.
with handler.invoke_agent() as invocation:
    invocation.provider = "dashscope"
    invocation.request_model = "qwen-max"
    invocation.agent_name = "ShoppingAssistant"
    invocation.agent_id = "agent-001"
    invocation.input_messages = [InputMessage(role="user", parts=[Text(content="Please recommend a laptop for me")])]
    # ... actual agent call ...
    invocation.output_messages = [OutputMessage(role="assistant", parts=[Text(content="Let me search for you, one moment please...")], finish_reason="tool_calls")]
    invocation.input_tokens = 42
    invocation.output_tokens = 18

# Tool execution: arguments and result are attached to the same kind of object.
with handler.execute_tool() as invocation:
    invocation.tool_name = "search_products"
    invocation.tool_call_arguments = {"query": "laptop", "category": "electronics"}
    # ... actual tool execution ...
    invocation.tool_call_result = {"products": [{"name": "MacBook Pro", "price": 12999}]}

The handler automatically creates spans, sets gen_ai.agent.* and gen_ai.tool.* attributes, records durations, and captures errors.
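How a context-manager API can record durations and capture errors without extra code at the call site is sketched below using only the standard library. execute_tool_sketch and the finished list are hypothetical illustrations of the pattern and say nothing about how the real handler is implemented.

```python
import time
from contextlib import contextmanager
from types import SimpleNamespace

finished = []  # stands in for an exporter


@contextmanager
def execute_tool_sketch():
    """Yield a mutable invocation; on exit, record duration and any error, then re-raise."""
    invocation = SimpleNamespace(tool_name=None, error_type=None, duration_s=None)
    start = time.monotonic()
    try:
        yield invocation
    except Exception as exc:
        invocation.error_type = type(exc).__name__
        raise
    finally:
        invocation.duration_s = time.monotonic() - start
        finished.append(invocation)


try:
    with execute_tool_sketch() as invocation:
        invocation.tool_name = "search_products"
        raise TimeoutError("backend did not answer")
except TimeoutError:
    pass  # the caller still sees the original exception

recorded = finished[0]
```

Because the error is re-raised after being recorded, instrumentation built this way does not change the control flow of the application it observes.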

Supported Instrumentations

Using GenAI Utils, LoongSuite has instrumented Python, Java, Go, and JavaScript agents covering major GenAI ecosystems (e.g., OpenClaw, LangChain, Spring AI). Upgrading the Utils package propagates SemConv changes to all downstream libraries.

Conclusion and Future Work

The unified semantics move GenAI observability from ad‑hoc logging to a full‑stack, standardized data model that supports performance, cost, quality, and security governance. Ongoing efforts focus on faster community response, richer multimodal support, end‑to‑end AI‑microservice tracing, and tighter collaboration with upstream OTel maintainers.
