
How Kelemetry Transforms Kubernetes Observability with Object‑Centric Tracing

Kelemetry, an open‑source tracing system from ByteDance, connects Kubernetes control‑plane components by treating each object as a span. It aggregates audit logs and events into unified traces that can be visualized as trees or timelines, and supports multi‑cluster monitoring and custom conversion pipelines.

ByteDance Cloud Native

Background

Kelemetry is a tracing system developed by ByteDance for the Kubernetes control plane. It connects the behavior of multiple Kubernetes components from a global perspective, tracking the full lifecycle of individual objects and the interactions between different objects.

Traditional distributed tracing follows a request‑centric model where a root span is created at the start of a user request and child spans are generated for each internal RPC. Kubernetes, however, operates asynchronously and declaratively: components update the desired state in the apiserver and other components continuously reconcile toward that state, making the classic span model unsuitable.

Because each component reconciles independently, direct causal relationships are hard to observe. Existing component‑specific traces only capture isolated reconciliations, leaving observability islands.

Design

1. Object as Span

Kelemetry creates a span for each Kubernetes object instead of for each operation. Events occurring on the object become child spans. Objects are linked through ownership, so child‑object spans become sub‑spans of the parent‑object span. This yields two dimensions: a tree hierarchy representing object relationships and a timeline showing event order, usually matching causality.

For example, the interactions among the deployment controller, replica‑set controller, and kubelet for a single‑pod Deployment can be displayed as a single trace built from audit logs and events.
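The object‑as‑span model can be sketched as follows. The `Span` class, field names, and input shapes here are illustrative stand‑ins, not Kelemetry's actual types: objects become spans, events become child spans, and ownership nests child‑object spans under their owner.

```python
from dataclasses import dataclass, field

@dataclass
class Span:
    name: str          # object name or event reason
    kind: str          # "object" or "event"
    start: float       # creation/event timestamp (epoch seconds)
    children: list = field(default_factory=list)

def build_trace(objects, events):
    """Build an object-centric trace: each object is a span, each event a
    child span, and owned objects nest under their owner's span."""
    spans = {o["uid"]: Span(o["name"], "object", o["created"]) for o in objects}
    roots = []
    for o in objects:
        span = spans[o["uid"]]
        owner = o.get("owner")
        if owner in spans:
            spans[owner].children.append(span)   # ownership -> sub-span
        else:
            roots.append(span)                   # no owner -> trace root
    for e in events:                             # events become leaf child spans
        spans[e["uid"]].children.append(Span(e["reason"], "event", e["time"]))
    for s in spans.values():                     # timeline: order children by time
        s.children.sort(key=lambda c: c.start)
    return roots
```

For a Deployment owning a ReplicaSet owning a Pod, this yields one trace rooted at the Deployment, with event spans interleaved among the object sub‑spans in time order.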

2. Audit Log Collection

Kelemetry’s primary data source is the apiserver audit log, which records detailed information about each controller operation, including the client, involved objects, and precise duration. Audit logs are exposed either as log files or via a webhook. Kelemetry provides an audit webhook to receive native audit events and a plugin API for consuming logs from vendor‑specific message queues.
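A minimal sketch of turning one audit record into span‑like fields. The field names (`verb`, `user`, `objectRef`, `requestReceivedTimestamp`, `stageTimestamp`) follow the Kubernetes audit Event schema, but the extraction logic is illustrative, not Kelemetry's implementation:

```python
from datetime import datetime

def audit_to_span(record):
    """Extract span-like fields from a Kubernetes audit event (dict form).
    Duration is stageTimestamp - requestReceivedTimestamp, which at stage
    ResponseComplete covers the whole apiserver request."""
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ"
    start = datetime.strptime(record["requestReceivedTimestamp"], fmt)
    end = datetime.strptime(record["stageTimestamp"], fmt)
    ref = record["objectRef"]
    return {
        "operation": record["verb"],                  # e.g. create, patch, update
        "client": record["user"]["username"],         # which controller acted
        "object": f'{ref["resource"]}/{ref.get("namespace", "")}/{ref["name"]}',
        "duration_ms": (end - start).total_seconds() * 1000,
    }
```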

3. Event Collection

Controllers emit Kubernetes Event objects during reconciliation. Kelemetry watches the event API to retrieve the latest version of each event. To avoid duplicate spans, it uses heuristics such as persisting the timestamp of the last processed event (so that earlier events are ignored after a restart) and checking for changes in resourceVersion.

For example, a scheduler event attached to a pod span might read:

0/4022 nodes are available to run pod xxxxx: 1072 Insufficient memory, 1819 Insufficient cpu, 1930 node(s) didn't match node selector, 71 node(s) had taint {xxxxx}, that the pod didn't tolerate.
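The deduplication heuristics above can be sketched as a small stateful filter. This is an illustrative outline, not Kelemetry's code; the watermark timestamp stands in for the persisted last‑processed time:

```python
class EventDeduper:
    """Skip events already turned into spans: anything at or before the
    persisted watermark timestamp (e.g. replays after a restart), and
    re-deliveries whose resourceVersion has not changed."""

    def __init__(self, last_seen_time=0.0):
        self.last_seen_time = last_seen_time   # persisted across restarts
        self.seen_rv = {}                      # event UID -> resourceVersion

    def should_process(self, event):
        if event["time"] <= self.last_seen_time:
            return False                       # older than the watermark
        if self.seen_rv.get(event["uid"]) == event["resourceVersion"]:
            return False                       # unchanged since last delivery
        self.seen_rv[event["uid"]] = event["resourceVersion"]
        self.last_seen_time = max(self.last_seen_time, event["time"])
        return True
```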

4. Linking Object State with Audit Logs

Kelemetry runs a controller that watches object create, update, and delete events. When an audit event arrives, it links the audit span to the corresponding object span using the object's resourceVersion. Diffs and snapshots of each resourceVersion are cached in a distributed KV store, allowing later correlation of audit logs with the fields changed by a controller. This also helps identify 409 conflicts by grouping audit logs that share the same old resourceVersion.
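The linking step can be sketched as a cache keyed by (object, resourceVersion). The diffing and conflict grouping below are illustrative; Kelemetry's real cache is a distributed KV store, not an in‑memory dict:

```python
class RevisionCache:
    """Cache object snapshots by resourceVersion so an audit record can later
    be correlated with the exact fields its request changed."""

    def __init__(self):
        self.snapshots = {}   # (uid, resourceVersion) -> object spec dict

    def record(self, uid, rv, spec):
        self.snapshots[(uid, rv)] = dict(spec)

    def diff(self, uid, old_rv, new_rv):
        """Return {field: (old_value, new_value)} for fields that changed."""
        old = self.snapshots.get((uid, old_rv), {})
        new = self.snapshots.get((uid, new_rv), {})
        return {k: (old.get(k), new.get(k))
                for k in set(old) | set(new) if old.get(k) != new.get(k)}

def group_conflicts(audit_logs):
    """Audit logs sharing the same *old* resourceVersion raced each other;
    all but one of those writes typically failed with 409 Conflict."""
    groups = {}
    for log in audit_logs:
        groups.setdefault(log["old_rv"], []).append(log["client"])
    return {rv: clients for rv, clients in groups.items() if len(clients) > 1}
```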

5. Front‑end Trace Conversion

Because traditional tracing protocols (e.g., OTLP) cannot modify a span after it ends, Kelemetry intercepts results between the Jaeger query front‑end and the storage back‑end, applying custom conversion pipelines. Four pipelines are supported:

tree – a simplified trace tree with shortened service/operation names.

timeline – flattens nested pseudo‑spans, placing all event spans under the root.

tracing – expands non‑object spans into logs attached to related object spans.

grouping – creates a new pseudo‑span for each data source (audit/event) and merges spans from different components.

Users select a pipeline via the “service name” filter; the intermediate storage plugin generates a new CacheID that maps to the actual TraceID and the chosen pipeline.
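As an illustration of what such a post‑query transform does, the timeline pipeline's flattening step could be sketched like this; the span shape (nested dicts with `kind`, `start`, `children`) is an assumption for the example, not Kelemetry's wire format:

```python
def timeline(trace):
    """Flatten a nested trace: keep the root object span and reparent every
    event span directly under it, ordered by start time (the 'timeline' view)."""
    root = dict(trace, children=[])

    def collect(span):
        for child in span.get("children", []):
            if child.get("kind") == "event":
                root["children"].append(dict(child, children=[]))
            collect(child)   # descend regardless, to find nested events

    collect(trace)
    root["children"].sort(key=lambda s: s["start"])
    return root
```

Because this runs between the Jaeger query front‑end and storage, the stored spans stay immutable; only the returned view changes.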

6. Breaking Duration Limits

Traces are limited to 30‑minute windows to avoid storage overload. Kelemetry’s storage plugin merges spans that share the same object label across windows, presenting a continuous story as a single trace while eliminating duplicate object spans.
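Merging across windows can be sketched as grouping per‑window object spans by an object key and splicing their children; the key format and span shape are illustrative:

```python
def merge_windows(window_traces):
    """Merge per-window object spans that refer to the same object into one
    continuous span, concatenating their child spans in time order."""
    merged = {}
    for trace in window_traces:
        key = trace["object_key"]   # e.g. "deployments/default/nginx"
        if key not in merged:
            merged[key] = dict(trace, children=list(trace["children"]))
        else:
            # same object seen in a later window: splice, don't duplicate
            merged[key]["children"].extend(trace["children"])
    for span in merged.values():
        span["children"].sort(key=lambda c: c["start"])
    return list(merged.values())
```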

7. Multi‑Cluster Support

Kelemetry can be deployed to monitor events from multiple clusters. At ByteDance, it creates about 8 billion spans per day (excluding pseudo‑spans) using a multi‑Raft cache backend. Objects can be linked across clusters, enabling cross‑cluster component tracing.

Future Enhancements

1. Custom Trace Sources

Kelemetry will ingest traces from existing components and integrate them into its unified view, extending observability beyond audit logs and events.

2. Batch Analysis

Aggregated traces will enable queries such as “how long does a deployment upgrade take from start to first image pull?” Future work includes periodic analysis of half‑hourly trace outputs to identify patterns and correlate scenarios.
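Such a query could be sketched as measuring the gap between two marker events in each trace. The event reasons used here (`ScalingReplicaSet` for the deployment update, `Pulling` for the first image pull) are plausible Kubernetes event names chosen for illustration:

```python
def upgrade_to_first_pull(events):
    """Given a trace's event spans as (reason, timestamp) pairs, return the
    seconds from the deployment update to the first image pull, or None."""
    start = min((t for n, t in events if n == "ScalingReplicaSet"), default=None)
    pull = min((t for n, t in events if n == "Pulling"), default=None)
    if start is None or pull is None:
        return None
    return pull - start
```

Running this over every trace in a half‑hour window would yield the latency distribution for deployment upgrades.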

Use Cases

1. ReplicaSet Controller Anomaly

A deployment repeatedly created new Pods. A Kelemetry trace revealed that the ReplicaSet controller emitted a SuccessfulCreate event for each Pod creation but never updated the ReplicaSet status or observed the Pods, indicating a possible informer consistency issue.

2. Floating minReadySeconds

A deployment’s rolling update was unusually slow because the minReadySeconds field had been temporarily set to 3600 by the federation component. A Kelemetry trace pinpointed the change, allowing a quick rollback to the intended value of 10.

Try Kelemetry

Kelemetry is open‑source on GitHub: github.com/kubewharf/kelemetry . Follow the docs/QUICK_START.md guide to get started, or view an online preview at kubewharf.io/kelemetry/trace-deployment/ . Contributions and issue reports are welcome.


Volcano Engine Cloud‑Native Team

The team builds PaaS‑class products for Volcano Engine’s public and private cloud, leveraging ByteDance’s cloud‑native expertise to accelerate digital transformation with services such as container platforms, image registries, serverless, service mesh, continuous delivery, and observability.

Tags: Cloud Native, observability, Kubernetes, tracing, Audit Logs, Kelemetry

Written by ByteDance Cloud Native

Sharing ByteDance's cloud-native technologies, technical practices, and developer events.
