Simplify Monitoring with Categraf: All‑in‑One Agent for Metrics, Logs, and Traces
Categraf is an all‑in‑one, Go‑based monitoring agent that consolidates metric, log, and trace collection, offering remote_write support, lightweight deployment, and extensive plugin configurations to replace multiple exporters in Prometheus‑based observability stacks.
What Is Categraf
Categraf is a monitoring collection agent similar to Telegraf, Grafana‑Agent, and Datadog‑Agent, designed to provide out‑of‑the‑box data collection for common monitoring targets, including metrics, logs, and traces, using an all‑in‑one architecture.
Key Advantages
Supports the remote_write protocol and can write to Prometheus, M3DB, VictoriaMetrics, InfluxDB, etc.
Collects only numeric metric values; string values are omitted, and tags maintain a stable structure.
All‑in‑one design: a single agent handles metrics and logs, with trace collection planned.
Pure Go implementation with static compilation, minimal dependencies, easy distribution and installation.
Implements best‑practice defaults, avoiding unnecessary data collection and reducing high‑cardinality issues at the source.
Provides ready‑made dashboards and alert rules for quick import.
Planned as a core component of the KuaiMao SaaS product; community contributions are welcome.
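The "numeric values only" point can be illustrated with a small sketch. This is not Categraf's code; the function name `numeric_fields` and the sample field names are hypothetical, and mapping booleans to 0/1 is an assumption made here for illustration.

```python
from numbers import Real


def numeric_fields(fields):
    """Keep only fields whose values are real numbers; drop everything else."""
    kept = {}
    for name, value in fields.items():
        if isinstance(value, bool):
            # bool is a subclass of int in Python, so handle it explicitly.
            # Mapping it to 0/1 is an assumption of this sketch.
            kept[name] = 1.0 if value else 0.0
        elif isinstance(value, Real):
            kept[name] = float(value)
        # Strings, None, lists and similar values are skipped.
    return kept


# Hypothetical raw reading from a disk check.
raw = {"used_percent": 42.5, "inodes_free": 1024, "fstype": "ext4"}
print(numeric_fields(raw))
```

Dropping string values at the source keeps every sample compatible with a time-series store that only accepts floating-point values.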
Installation
Installation is straightforward using binary releases:
<code># download
$ wget https://download.flashcat.cloud/categraf-v0.2.38-linux-amd64.tar.gz
# extract
$ tar xf categraf-v0.2.38-linux-amd64.tar.gz
# enter directory
$ cd categraf-v0.2.38-linux-amd64/</code>
After extraction, edit conf/config.toml to set the remote write URL and heartbeat options, then start the agent:
<code>$ nohup ./categraf &>categraf.log &</code>
Configuration Details
The default configuration directory conf contains several TOML/YAML files:
config.toml – main configuration.
logs.toml – log‑agent settings.
prometheus.toml – Prometheus‑agent settings.
traces.yaml – trace‑agent settings.
conf/input.*/*.toml – plugin‑specific configurations, one directory per plugin.
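To see which plugin configurations are present in a given installation, a short script can walk the configuration directory. This is a minimal sketch, assuming the per-plugin layout conf/input.<plugin>/<name>.toml that the procstat path later in this article follows; the helper name `list_plugin_configs` is hypothetical.

```python
from pathlib import Path


def list_plugin_configs(conf_dir):
    """Return {plugin_name: [toml file names]} for each input.<plugin> directory."""
    result = {}
    for plugin_dir in sorted(Path(conf_dir).glob("input.*")):
        if not plugin_dir.is_dir():
            continue
        # Strip the "input." prefix to get the plugin name.
        plugin = plugin_dir.name[len("input."):]
        result[plugin] = sorted(p.name for p in plugin_dir.glob("*.toml"))
    return result


if __name__ == "__main__":
    # "conf" is assumed to be relative to the extracted Categraf directory.
    for plugin, files in list_plugin_configs("conf").items():
        print(plugin, files)
```

Running it from the extracted release directory prints one line per plugin, which is a quick way to find the file to edit for a given collector.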
Main Config (config.toml)
<code>[global]
print_configs = false
hostname = ""
omit_hostname = false
precision = "ms"
interval = 15
providers = ["local"]
[log]
file_name = "stdout"
max_size = 100
max_age = 1
max_backups = 1
local_time = true
compress = false
[writer_opt]
batch = 1000
chan_size = 1000000
[[writers]]
url = "http://127.0.0.1:17000/prometheus/v1/write"
basic_auth_user = ""
basic_auth_pass = ""
timeout = 5000
dial_timeout = 2500
max_idle_conns_per_host = 100
[http]
enable = false
address = ":9100"
print_access = false
run_mode = "release"
[ibex]
enable = false
interval = "1000ms"
servers = ["127.0.0.1:20090"]
meta_dir = "./meta"
[heartbeat]
enable = true
url = "http://127.0.0.1:17000/v1/n9e/heartbeat"
interval = 10
basic_auth_user = ""
basic_auth_pass = ""
timeout = 5000
dial_timeout = 2500
max_idle_conns_per_host = 100
</code>
Log Collection (logs.toml)
<code>[logs]
api_key = "ef4ahfbwzwwtlwfpbertgq1i6mq0ab1q"
enable = false
send_to = "127.0.0.1:17878"
send_type = "http"
topic = "flashcatcloud"
use_compress = false
send_with_tls = false
batch_wait = 5
run_path = "/opt/categraf/run"
open_files_limit = 100
scan_period = 10
frame_size = 9000
collect_container_all = true
[[logs.items]]
type = "file"
path = "/opt/tomcat/logs/*.txt"
source = "tomcat"
service = "my_service"
</code>
Log processing rules can be defined under logs.Processing_rules or within each item via logs.items.logs_processing_rules. Supported rule types include:
exclude_at_match – drop matching log lines.
include_at_match – keep only matching lines.
mask_sequences – replace sensitive patterns.
multi_line – merge multi‑line logs based on a start‑line pattern.
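The effect of the four rule types can be sketched in a few lines of Python. This is an illustration of the semantics described above, not Categraf's implementation: the function `apply_rules` and the rule keys `pattern` and `replace_placeholder` are assumptions of this sketch, and a real agent processes lines as a stream rather than as a list.

```python
import re


def apply_rules(lines, rules):
    """Apply rules in order to a list of log lines and return what survives."""
    for rule in rules:
        pattern = re.compile(rule["pattern"])
        kind = rule["type"]
        if kind == "exclude_at_match":
            lines = [line for line in lines if not pattern.search(line)]
        elif kind == "include_at_match":
            lines = [line for line in lines if pattern.search(line)]
        elif kind == "mask_sequences":
            lines = [pattern.sub(rule["replace_placeholder"], line) for line in lines]
        elif kind == "multi_line":
            merged = []
            for line in lines:
                # A line matching the start pattern opens a new record;
                # any other line is appended to the record in progress.
                if pattern.match(line) or not merged:
                    merged.append(line)
                else:
                    merged[-1] += "\n" + line
            lines = merged
        else:
            raise ValueError(f"unknown rule type: {kind}")
    return lines


# Hypothetical log lines and rules.
logs = [
    "2024-01-01 INFO start",
    "2024-01-01 ERROR boom password=secret",
    "  at Foo.bar(Foo.java:1)",
    "2024-01-01 DEBUG noisy",
]
rules = [
    {"type": "multi_line", "pattern": r"\d{4}-\d{2}-\d{2}"},
    {"type": "exclude_at_match", "pattern": r"DEBUG"},
    {"type": "mask_sequences", "pattern": r"password=\S+",
     "replace_placeholder": "password=***"},
]
for record in apply_rules(logs, rules):
    print(repr(record))
```

In this example the stack-trace line is folded into the ERROR record, the DEBUG line is dropped, and the password value is masked before anything would be shipped.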
Metric Collection (prometheus.toml)
Categraf can also scrape Prometheus‑style metrics. Example configuration to scrape kube-state-metrics:
<code>[prometheus]
enable = false
scrape_config_file = "/path/to/in_cluster_scrape.yaml"
log_level = "info"
</code>
The referenced in_cluster_scrape.yaml follows the standard Prometheus configuration format:
<code>global:
  scrape_interval: 15s
  external_labels:
    scraper: ksm-test
    cluster: test
scrape_configs:
  - job_name: "kube-state-metrics"
    metrics_path: "/metrics"
    kubernetes_sd_configs:
      - role: endpoints
        api_server: "https://172.31.0.1:443"
        tls_config:
          ca_file: /etc/kubernetes/pki/ca.crt
          cert_file: /etc/kubernetes/pki/apiserver-kubelet-client.crt
          key_file: /etc/kubernetes/pki/apiserver-kubelet-client.key
          insecure_skip_verify: true
    scheme: http
    relabel_configs:
      - source_labels: [__meta_kubernetes_namespace,__meta_kubernetes_service_name,__meta_kubernetes_endpoint_port_name]
        action: keep
        regex: kube-system;kube-state-metrics;http-metrics
remote_write:
  - url: "http://172.31.62.213/prometheus/v1/write"
</code>
Trace Collection (traces.yaml)
The trace configuration wraps an OpenTelemetry Collector, allowing integration with various back‑ends; detailed settings are omitted for brevity.
Plugin Example: Process Monitoring
To monitor an Nginx process, edit conf/input.procstat/procstat.toml:
<code># collect interval
interval = 15
[[instances]]
search_exec_substring = "nginx"
metrics_name_prefix = "nginx"
labels = { region="cloud", product="n9e" }
gather_total = true
gather_per_pid = false
</code>
After updating the plugin configuration, restart Categraf to apply the changes. The collected metrics will appear in your monitoring dashboard, and you can add custom labels (e.g., group="ops") to enrich the data.
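How the options in that file plausibly interact can be sketched as follows. This is an illustration under assumptions, not the plugin's code: the function `gather`, the sample field names `process_count` and `rss_bytes`, and the process snapshot are all hypothetical, and the reading of gather_total as "one aggregate across matches" and gather_per_pid as "one sample per process" is inferred from the option names.

```python
def gather(processes, search_exec_substring, labels,
           gather_total=True, gather_per_pid=False):
    """Select processes by executable substring and build labeled samples."""
    matched = [p for p in processes if search_exec_substring in p["exe"]]
    samples = []
    if gather_total:
        # One aggregated sample across every matched process.
        samples.append({
            "labels": dict(labels),
            "process_count": len(matched),
            "rss_bytes": sum(p["rss_bytes"] for p in matched),
        })
    if gather_per_pid:
        # One sample per process, distinguished by a pid label.
        for p in matched:
            samples.append({
                "labels": {**labels, "pid": str(p["pid"])},
                "rss_bytes": p["rss_bytes"],
            })
    return samples


# Hypothetical process snapshot; a real agent would read this from the OS.
processes = [
    {"pid": 101, "exe": "/usr/sbin/nginx", "rss_bytes": 4096},
    {"pid": 102, "exe": "/usr/sbin/nginx", "rss_bytes": 8192},
    {"pid": 200, "exe": "/usr/bin/python3", "rss_bytes": 1024},
]
labels = {"region": "cloud", "product": "n9e", "group": "ops"}
print(gather(processes, "nginx", labels))
```

Keeping per-pid collection off by default avoids a pid label whose values change on every restart, which is one common source of high-cardinality series.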
Conclusion
Categraf supports around 60 plugins covering most middleware and cloud platforms, making it a comprehensive replacement for multiple exporters. While it simplifies many monitoring scenarios, the impact on system resources and performance should be evaluated for heavily‑instrumented environments.
Ops Development Stories
Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.