
Mastering OpenStack Monitoring: Key Metrics and Best Practices

This article explains what OpenStack is, outlines its core modules, and details the most important monitoring metrics for Nova, Neutron, Keystone, hypervisors, tenants, and RabbitMQ, helping engineers build a robust, scalable OpenStack monitoring solution.

Efficient Ops

What is OpenStack

OpenStack is an open-source IaaS platform, originally developed jointly by NASA and Rackspace, that lets anyone build and offer cloud computing services, including private clouds that enterprises run behind their own firewalls.

OpenStack Module Composition

OpenStack consists of six core modules:

Nova – Compute service

Keystone – Identity (authentication) service

Glance – Image service

Neutron – Virtual networking service

Cinder – Block storage service

Horizon – UI component

Nova

Nova provides instance lifecycle management, compute resource management, network and authorization management, a RESTful API, and asynchronous communication. It supports several hypervisors, including Xen, KVM, VMware vSphere, and Hyper‑V.

Key Nova metrics for monitoring include:

openstack.nova.current_workload – current workload (build, snapshot, migration, resize, etc.)

openstack.nova.running_vms – number of running VMs

openstack.nova.hypervisor_load.1 – hypervisor load averaged over the past minute; usually tracked together with the hypervisor's disk, RAM, and CPU metrics

openstack.nova.limits.max_personality – one example of the per‑project (tenant) limit metrics

Nova communicates via AMQP using RabbitMQ, enabling asynchronous callbacks that keep API calls non‑blocking.
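The non-blocking pattern described above can be sketched in a few lines. This is only an in-process illustration, not Nova's actual code: a `queue.Queue` stands in for a RabbitMQ queue, and the names `api_boot_instance`, `worker`, `broker`, and `results` are hypothetical.

```python
import queue
import threading
import uuid

broker = queue.Queue()   # in-process stand-in for a RabbitMQ queue (assumption)
results = {}             # request_id -> status, updated by the worker


def api_boot_instance(name):
    """Enqueue the request and return at once, like a non-blocking API call."""
    request_id = str(uuid.uuid4())
    results[request_id] = "accepted"
    broker.put((request_id, name))
    return request_id


def worker():
    """Consume messages and record the outcome (the asynchronous 'callback')."""
    while True:
        item = broker.get()
        if item is None:          # sentinel value: stop the worker
            broker.task_done()
            break
        request_id, name = item
        results[request_id] = f"built {name}"
        broker.task_done()
```

The caller gets a request ID back immediately and can look up the status later, which is the property that keeps the API responsive while long-running work such as building an instance happens elsewhere.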

Neutron & Keystone

Neutron provides virtual network management, simplifying network configuration. Keystone offers authentication and access‑policy services for all OpenStack components, using a REST‑based Identity API.

Important Monitoring Metrics

Monitoring should focus on four categories:

Hypervisor metrics – VM count, hypervisor load, etc.

Nova Server metrics – disk I/O, RAM usage, etc.

Tenant metrics – resource usage per tenant, CPU cores, instance count

Message Queue metrics – RabbitMQ queue size and performance

Hypervisor Metrics

Key hypervisor metrics include:

hypervisor_load – system load over the past minute, similar to OS load average

current_workload – number of active tasks (build, snapshot, migrate, resize)

running_vms – total running VMs

vcpus_available – available CPU cores (useful for capacity planning)

free_disk_gb – free disk space, affecting VM creation

free_ram_mb – free RAM, a critical resource metric
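As a rough illustration of how these hypervisor metrics can feed capacity checks, the sketch below takes one snapshot of the values listed above and answers two questions: can this host take another VM of a given size, and is any resource running low? The class and function names and all threshold defaults are assumptions chosen for the example, not values recommended by OpenStack.

```python
from dataclasses import dataclass


@dataclass
class HypervisorStats:
    """One snapshot of the hypervisor metrics listed above for a single host."""
    hypervisor_load: float   # load over the past minute
    current_workload: int    # active build/snapshot/migrate/resize tasks
    running_vms: int
    vcpus_available: int
    free_disk_gb: int
    free_ram_mb: int


def can_host(stats, vcpus, ram_mb, disk_gb):
    """Return True if the host has room for a VM of the requested size."""
    return (
        stats.vcpus_available >= vcpus
        and stats.free_ram_mb >= ram_mb
        and stats.free_disk_gb >= disk_gb
    )


def capacity_warnings(stats, max_load=8.0, min_free_ram_mb=4096, min_free_disk_gb=50):
    """Return human-readable warnings; the default thresholds are illustrative."""
    warnings = []
    if stats.hypervisor_load > max_load:
        warnings.append(f"load {stats.hypervisor_load} exceeds {max_load}")
    if stats.free_ram_mb < min_free_ram_mb:
        warnings.append(f"free RAM {stats.free_ram_mb} MB below {min_free_ram_mb} MB")
    if stats.free_disk_gb < min_free_disk_gb:
        warnings.append(f"free disk {stats.free_disk_gb} GB below {min_free_disk_gb} GB")
    return warnings
```

In practice the thresholds would be tuned per host class, since a sensible load limit depends on how many physical cores the hypervisor has.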

Nova Server Metrics

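Monitoring Nova Server metrics helps detect issues such as the “Noisy Neighbor” problem, where one VM's heavy resource use degrades the other VMs on the same host. Per‑VM metrics such as hdd_read_req (disk read requests) reflect VM performance, and a sudden spike is a signal to investigate.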
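One simple way to flag such spikes is to compare each sample with the mean and standard deviation of the samples just before it. The sketch below assumes `samples` is a list of hdd_read_req readings taken at a fixed interval; the function name, window size, and threshold are hypothetical choices, and a production system would more likely rely on the anomaly detection built into its monitoring tool.

```python
from statistics import mean, stdev


def find_spikes(samples, window=5, threshold=3.0):
    """Return indices of samples that exceed the trailing-window mean
    by more than `threshold` standard deviations."""
    spikes = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        # Floor sigma at 1.0 so a perfectly flat history does not
        # turn every tiny change into a spike.
        sigma = max(stdev(history), 1.0)
        if samples[i] > mu + threshold * sigma:
            spikes.append(i)
    return spikes
```

For example, `find_spikes([100, 102, 98, 101, 99, 100, 900, 101])` flags only index 6, the 900 reading.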

Tenant Metrics

Tenant metrics reflect business‑related resource consumption. Monitoring total_cores_used, max_total_cores, total_instances_used, and max_total_instances helps allocate resources efficiently across different user groups.
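Pairing each "used" metric with its "max" counterpart gives a utilization ratio per tenant. The sketch below is one way to flag tenants approaching a quota; the input dictionary layout, the function names, and the 80% threshold are assumptions made for illustration, and a limit of -1 is treated as unlimited.

```python
def quota_utilization(used, limit):
    """Fraction of a quota in use, or None when no ratio can be computed.

    Assumption: -1 marks an unlimited quota; 0 is also skipped to avoid
    dividing by zero.
    """
    if limit <= 0:
        return None
    return used / limit


def tenants_near_quota(tenants, threshold=0.8):
    """Return sorted names of tenants whose core or instance usage
    is at or above `threshold` of the corresponding limit."""
    flagged = []
    for name, m in tenants.items():
        ratios = [
            quota_utilization(m["total_cores_used"], m["max_total_cores"]),
            quota_utilization(m["total_instances_used"], m["max_total_instances"]),
        ]
        known = [r for r in ratios if r is not None]
        if known and max(known) >= threshold:
            flagged.append(name)
    return sorted(flagged)
```

Tenants that sit near their limits are candidates for a quota increase, while tenants far below theirs may be holding capacity that could be reallocated.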

RabbitMQ Metrics

RabbitMQ, the message queue used by OpenStack, provides several important metrics:

consumer_utilisation – the share of time a queue can deliver messages to its consumers immediately; the ideal value is 100%, and lower values indicate processing delays

memory – high memory usage can trigger disk paging and throttling

count – number of queues; a count of 0 should raise an alarm

consumers – number of active consumers; zero indicates a serious issue
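The alerting rules implied by this list can be sketched as a small function. It assumes the queue statistics have already been collected into a list of dictionaries with the keys `name`, `consumers`, `consumer_utilisation` (as a fraction between 0 and 1, or None when unavailable), and `memory` (bytes); that layout, the function name, and the thresholds are hypothetical choices for the example.

```python
def rabbitmq_alerts(queues, min_utilisation=0.9, max_memory_bytes=256 * 1024 * 1024):
    """Return alert strings for a list of per-queue metric dictionaries."""
    if len(queues) == 0:
        # A queue count of 0 is itself an alarm condition.
        return ["no queues found"]

    alerts = []
    for q in queues:
        name = q["name"]
        if q["memory"] > max_memory_bytes:
            alerts.append(f"{name}: memory above limit")
        if q["consumers"] == 0:
            alerts.append(f"{name}: no consumers")
            continue  # utilisation is meaningless without consumers
        util = q.get("consumer_utilisation")
        if util is not None and util < min_utilisation:
            alerts.append(f"{name}: consumer_utilisation {util:.0%}")
    return alerts
```

Checking for zero consumers before looking at utilisation keeps the more serious condition from being buried under a secondary symptom.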
