Build a Docker Container Monitoring Stack with CAdvisor, InfluxDB, Grafana
To effectively monitor Dockerized services, this guide walks through selecting a monitoring solution, deploying CAdvisor, integrating it with InfluxDB for persistent storage, visualizing metrics via Grafana, and addressing common issues such as missing utilities, absent memory statistics, and inaccurate network traffic figures.
As online services become fully Dockerized, monitoring containers becomes essential. Traditional host‑level monitoring cannot distinguish resource usage of individual containers, which is needed for both operational insight and feeding dynamic scheduling algorithms.
1. Choosing a Container Monitoring Solution
Various options exist, including the built‑in <code>docker stats</code> command, Scout, DataDog, Sysdig Cloud, Sensu Monitoring Framework, and CAdvisor.
<code>docker stats</code> provides real‑time CPU, memory, and network metrics for all containers on a host, but it lacks persistence, alerting, and multi‑host aggregation.
Hosted services like Scout, DataDog, and Sysdig Cloud offer comprehensive features but are commercial. Sensu is free but complex to deploy. Ultimately, CAdvisor was selected because it is open‑source, feature‑rich, easy to deploy via an official Docker image, and can be extended with external storage.
2. Container Resource Monitoring – CAdvisor
2.1 Deployment and Operation
CAdvisor monitors container memory, CPU, network I/O, and disk I/O, exposing a web UI for real‑time status. By default it stores only two minutes of data locally, but it supports exporting metrics to InfluxDB, Redis, Kafka, Elasticsearch, etc.
Deployment is straightforward; run the container with the following command:
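A minimal sketch of such a command, assembled in Python so that each piece is explicit. The image name `google/cadvisor:latest`, the container name, and the port mapping are assumptions, and the bind mounts follow the standard CAdvisor layout; adjust the Docker directory if the host uses a non-default location.

```python
import shlex

# Assumptions: image name, container name, and port mapping are illustrative;
# the bind mounts follow the standard CAdvisor layout.
BINDS = [
    "/:/rootfs:ro",
    "/var/run:/var/run:rw",
    "/sys:/sys:ro",
    "/var/lib/docker/:/var/lib/docker:ro",
]


def cadvisor_run_command(image="google/cadvisor:latest", host_port=8080):
    """Return the argv list for starting a CAdvisor container."""
    argv = ["docker", "run", "--detach", "--name=cadvisor"]
    argv += [f"--volume={bind}" for bind in BINDS]
    argv.append(f"--publish={host_port}:8080")  # CAdvisor listens on 8080 inside the container
    argv.append(image)
    return argv


if __name__ == "__main__":
    # Print a copy-pastable shell command instead of executing it.
    print(shlex.join(cadvisor_run_command()))
```

Printing the command rather than running it keeps the sketch safe to try on a machine without Docker; passing the list to `subprocess.run` would execute it.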
After starting, access the UI at http://<em>host_ip</em>:8080 to view container metrics.
2.2 Integrating InfluxDB
Because CAdvisor only retains two minutes of data locally, persisting metrics to InfluxDB—a purpose‑built time‑series database—enables long‑term storage and unified querying.
The CAdvisor container is launched with a custom configuration that points to an InfluxDB instance (e.g., <code>influxdb.service.consul:8086</code>) and supplies credentials:
<code>{
"binds": [
"/:/rootfs:ro",
"/var/run:/var/run:rw",
"/sys:/sys:ro",
"/home/docker/var/lib/docker/:/var/lib/docker:ro"
],
"image": "forum-cadvisor",
"labels": {"type": "cadvisor"},
"command": "-docker_only=true -storage_driver=influxdb -storage_driver_db=cadvisor -storage_driver_host=influxdb.service.consul:8086 -storage_driver_user=testuser -storage_driver_password=testpwd",
"tag": "latest",
"hostname": "cadvisor-{{lan_ip}}"
}
</code>
A custom <code>forum-cadvisor</code> image is used to incorporate fixes and enhancements.
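When the same configuration has to be rendered for several environments, the storage flags can be generated rather than typed by hand. The sketch below is one possible way to do that; the helper name `storage_flags` is hypothetical, and only the flags that already appear in the configuration above are used.

```python
def storage_flags(host, db, user, password, docker_only=True):
    """Build the CAdvisor command string for the InfluxDB storage driver."""
    flags = {
        "docker_only": str(docker_only).lower(),
        "storage_driver": "influxdb",
        "storage_driver_db": db,
        "storage_driver_host": host,
        "storage_driver_user": user,
        "storage_driver_password": password,
    }
    # Dicts preserve insertion order, so the flags come out in the order above.
    return " ".join(f"-{name}={value}" for name, value in flags.items())


command = storage_flags(
    "influxdb.service.consul:8086", "cadvisor", "testuser", "testpwd"
)
print(command)
```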
2.3 Common CAdvisor Issues and Fixes
1) Startup Errors
A missing <code>findutils</code> package caused container startup failures; installing the package resolves the issue.
2) Missing Memory Metrics
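Debian kernels may disable cgroup memory support. Adding <code>cgroup_enable=memory</code> to <code>GRUB_CMDLINE_LINUX</code> in <code>/etc/default/grub</code>, regenerating the GRUB configuration with <code>update-grub</code>, and rebooting enables memory statistics.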
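Before changing the boot line, it can help to confirm that the memory controller really is disabled. A minimal sketch, assuming a host that exposes `/proc/cgroups` in its usual four-column layout; the sample text is illustrative, not captured from a real machine.

```python
def memory_cgroup_enabled(proc_cgroups_text):
    """Return True if the 'memory' controller is listed as enabled."""
    for line in proc_cgroups_text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip the header and blank lines
        fields = line.split()
        # Columns: subsys_name, hierarchy, num_cgroups, enabled
        if len(fields) >= 4 and fields[0] == "memory":
            return fields[3] == "1"
    return False  # controller not listed at all


# Illustrative sample; on a real host read the file instead:
#   text = open("/proc/cgroups").read()
SAMPLE = """#subsys_name\thierarchy\tnum_cgroups\tenabled
cpuset\t3\t1\t1
cpu\t4\t60\t1
memory\t0\t1\t0
"""

print(memory_cgroup_enabled(SAMPLE))
```

If the function returns False on the host, the kernel parameter is the likely fix: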
<code>GRUB_CMDLINE_LINUX="cgroup_enable=memory"</code>
3) Incorrect Network Traffic Data
CAdvisor originally reports only the first network interface. Modifying its source to aggregate all interfaces and rebuilding the binary fixes the discrepancy.
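The actual fix lives in CAdvisor's Go source, but the aggregation idea can be sketched in Python against the `/proc/<pid>/net/dev` format. Skipping the loopback interface is an assumption made for this sketch; the sample counters are the ones shown in section 2.4.

```python
def sum_interface_traffic(net_dev_text, skip=("lo",)):
    """Sum received/transmitted bytes over all interfaces not listed in skip."""
    rx_total = tx_total = 0
    for line in net_dev_text.splitlines():
        if ":" not in line:
            continue  # the two header lines have no "iface:" prefix
        name, counters = line.split(":", 1)
        if name.strip() in skip:
            continue
        values = counters.split()
        rx_total += int(values[0])  # 1st counter: receive bytes
        tx_total += int(values[8])  # 9th counter: transmit bytes
    return rx_total, tx_total


SAMPLE = """Inter-| Receive | Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
eth0: 6266314 512 0 0 0 0 0 0 22787 292 0 0 0 0 0 0
eth1: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
lo: 5926805 5601 0 0 0 0 0 0 5926805 5601 0 0 0 0 0 0
"""

print(sum_interface_traffic(SAMPLE))
```

Reading only the first interface would miss any traffic on `eth1`; summing every non-loopback row avoids depending on interface order.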
2.4 CAdvisor Internals
CAdvisor mounts the host’s root and Docker directories, reading container runtime information from Linux namespaces and cgroups. It accesses metrics such as CPU usage from <code>/sys/fs/cgroup/cpu/docker/<container_id>/cpuacct.stat</code> and network statistics from <code>/proc/<pid>/net/dev</code>.
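The counters in `cpuacct.stat` are cumulative tick counts, so a usage percentage comes from the difference between two reads. The sketch below illustrates that calculation; it is not CAdvisor's own code. It assumes 100 ticks per second (the usual USER_HZ value, available on Linux via `os.sysconf("SC_CLK_TCK")`), and the second sample is hypothetical.

```python
def parse_cpuacct_stat(text):
    """Parse 'user N' / 'system N' lines into a dict of tick counts."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def cpu_percent(before, after, interval_s, ticks_per_second=100):
    """CPU usage over the interval, as a percentage of one core."""
    used_ticks = (after["user"] + after["system"]) - (
        before["user"] + before["system"]
    )
    return 100.0 * used_ticks / (ticks_per_second * interval_s)


first = parse_cpuacct_stat("user 95191\nsystem 5028\n")
# Hypothetical second read, taken one second later.
second = parse_cpuacct_stat("user 95241\nsystem 5028\n")

print(cpu_percent(first, second, interval_s=1.0))
```

The raw files look like this: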
<code># cat /sys/fs/cgroup/cpu/docker/b1f25723c5c3a17df5026cb60e1d1e1600feb293911362328bd17f671802dd31/cpuacct.stat
user 95191
system 5028
</code>
<code># cat /proc/6748/net/dev
Inter-| Receive | Transmit
face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
eth0: 6266314 512 0 0 0 0 0 0 22787 292 0 0 0 0 0 0
eth1: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
lo: 5926805 5601 0 0 0 0 0 0 5926805 5601 0 0 0 0 0 0
</code>
3. Storing Monitoring Data – InfluxDB
InfluxDB is an open‑source time‑series database written in Go, ideal for persisting CAdvisor metrics. The InfluxDB container is deployed alongside CAdvisor, with data directories mounted and service registration via Consul.
After launching the InfluxDB container, create the <code>cadvisor</code> database and a user with appropriate privileges:
<code># influx
Connected to http://localhost:8086 version 1.3.5
> create database cadvisor
> create user testuser with password 'testpwd'
> grant all on cadvisor to testuser
> create retention policy "cadvisor_retention" on "cadvisor" duration 30d replication 1 default
</code>
CAdvisor automatically creates the necessary measurements and writes data via its HTTP API.
3.2 Key InfluxDB Concepts
database: logical container for measurements (e.g., <code>cadvisor</code>).
timestamp: the time column for each point.
fields: key‑value pairs storing metric values.
tags: indexed key‑value pairs useful for filtering (e.g., <code>container_name</code>, <code>host</code>).
retention policy: defines how long data is kept (e.g., 30 days).
measurement: analogous to a table, grouping fields, tags, and timestamps.
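These concepts map directly onto InfluxDB 1.x line protocol, where a single point is written as `measurement,tags fields timestamp`. A minimal formatter sketch follows; the measurement name, tag values, and timestamp in the example are illustrative assumptions, not values taken from a real CAdvisor deployment.

```python
def _escape(text, chars):
    """Backslash-escape each of the given characters in text."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as InfluxDB 1.x line protocol."""
    head = _escape(measurement, ", ")
    for key in sorted(tags):  # sorted tag keys are recommended for write performance
        head += "," + _escape(key, ",= ") + "=" + _escape(tags[key], ",= ")
    parts = []
    for key, value in fields.items():
        if isinstance(value, bool):
            rendered = "true" if value else "false"
        elif isinstance(value, int):
            rendered = f"{value}i"  # trailing "i" marks an integer field
        elif isinstance(value, float):
            rendered = repr(value)
        else:
            text = str(value).replace("\\", "\\\\").replace('"', '\\"')
            rendered = f'"{text}"'
        parts.append(_escape(key, ",= ") + "=" + rendered)
    return f"{head} {','.join(parts)} {timestamp_ns}"


line = to_line_protocol(
    "memory_usage",                                  # measurement
    {"container_name": "web", "host": "node-1"},     # tags (indexed)
    {"value": 52428800},                             # fields (metric values)
    1500000000000000000,                             # timestamp in nanoseconds
)
print(line)
```

A line like this can be sent as the body of a POST to the `/write?db=cadvisor` endpoint of an InfluxDB 1.x server.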
3.3 InfluxDB Features
Special functions such as <code>FILL()</code>, <code>INTEGRAL()</code>, <code>STDDEV()</code>, <code>MEAN()</code>, <code>DERIVATIVE()</code>, etc.
Continuous queries for down‑sampling data over time.
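As a rough illustration of both features, the sketch below builds two InfluxQL statements as Python strings: a rate query that combines `DERIVATIVE()`, `MEAN()`, and `FILL()`, and a continuous query that down-samples into a new measurement. The measurement names (`rx_bytes`, `memory_usage`), the `value` field, and the query and target names are assumptions; inputs are interpolated without escaping, so they must be trusted values.

```python
RATE_TEMPLATE = (
    'SELECT DERIVATIVE(MEAN("value"), 1s) FROM "{measurement}" '
    "WHERE \"container_name\" = '{container}' AND time > now() - {window} "
    "GROUP BY time({bucket}) FILL(null)"
)

CQ_TEMPLATE = (
    'CREATE CONTINUOUS QUERY "{name}" ON "{database}" BEGIN '
    'SELECT MEAN("value") INTO "{target}" FROM "{measurement}" '
    "GROUP BY time({bucket}), * END"
)


def rate_query(measurement, container, window="1h", bucket="1m"):
    """Per-second rate of a cumulative counter, averaged per time bucket."""
    return RATE_TEMPLATE.format(
        measurement=measurement, container=container, window=window, bucket=bucket
    )


def downsample_cq(name, database, measurement, target, bucket="5m"):
    """Continuous query that stores per-bucket means in a target measurement."""
    return CQ_TEMPLATE.format(
        name=name, database=database, measurement=measurement,
        target=target, bucket=bucket,
    )


print(rate_query("rx_bytes", "web"))
print(downsample_cq("cq_memory_5m", "cadvisor", "memory_usage", "memory_usage_5m"))
```

The statements can be pasted into the `influx` shell or used as the query of a Grafana panel.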
4. Visualizing Data – Grafana
Grafana provides a powerful, open‑source dashboard for visualizing metrics from InfluxDB (as well as other sources). It runs in a container, exposing port 8888.
After starting Grafana, configure the InfluxDB data source and create panels to display CPU, memory, and network statistics. When visualizing byte‑based metrics, select the "data (IEC)" unit.
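IEC units scale by powers of 1024 (KiB, MiB, GiB), which matches how memory and traffic byte counters are usually read. The sketch below shows the conversion; it illustrates the unit system only and is not Grafana's own formatting code.

```python
def format_iec(num_bytes):
    """Render a byte count using IEC (1024-based) units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1024 or unit == units[-1]:
            return f"{value:.1f} {unit}"
        value /= 1024


print(format_iec(1536))
print(format_iec(1048576))
```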
5. Conclusion
Combining CAdvisor, InfluxDB, and Grafana offers a lightweight yet comprehensive container monitoring solution that runs entirely in containers, aligns with a Docker‑centric infrastructure, and provides the data needed for both real‑time observability and downstream analytics such as anomaly detection and intelligent scheduling.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.