
Introduction to ELKB: Architecture, Components, and Typical Use Cases of Elasticsearch, Logstash, Kibana, and Beats

This article introduces the ELKB stack (Elasticsearch, Logstash, Kibana, and Beats): its background, the user needs it addresses, its architecture and component roles, typical scenarios, and our team's practical work on real-time log and time-series data processing.

Tencent Database Technology

1. Background

ELKB (Elasticsearch, Logstash, Kibana, Beats) is an open‑source distributed log‑management solution. With its closed‑loop processing pipeline, high search performance, linear scalability, and low operational cost, ELKB has quickly become the leading choice for real‑time log handling in recent years. This article first gives a brief overview of the ELK ecosystem and its application scenarios, with more detailed ELK work to follow.

2. User Requirements

In log processing, users often encounter the following needs:

Operations engineers want to analyze error logs across a distributed environment and locate problems in real time using keyword search. Problem: Grep‑style per‑machine log searching becomes painful in distributed setups; traditional big‑data solutions (Hadoop ecosystem) can centralize logs but using MapReduce or SparkSQL for grep‑like queries incurs high cost and poor timeliness, often delayed by hours or days.

Business teams need up‑to‑date request volume, latency, and error‑rate metrics, while operations teams require real‑time visibility of service status. Problem: Hadoop + analytical databases (OLAP) can clean and analyze logs, but this approach also suffers from poor timeliness and adds multiple systems, increasing maintenance overhead.

ELKB was created precisely to satisfy the above log‑analysis requirements.

3. ELKB Architecture

ELKB is an open-source, closed-loop distributed log-management solution that covers the entire lifecycle: collection, cleaning, storage, analysis, and visualization. A typical deployment architecture looks like the diagram below:

Component functions are as follows:

Beats: Lightweight agents deployed on business machines to collect data in real time and forward it downstream.

With Beats, the original ELK stack is now referred to as the ELKB ecosystem.

Elastic provides many Beat types for collecting files, network packets, probe data, etc., and users can easily extend Beats to capture custom sources.

FileBeat is commonly used to harvest log files; a single FileBeat can collect multiple log formats from one host and send them to different downstream pipelines while limiting CPU and memory usage.
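A minimal Filebeat configuration sketch along these lines, where the paths, tags, and Logstash endpoint are all made up for illustration:

```yaml
# Hypothetical filebeat.yml: harvest two log formats on one host and
# tag them so downstream pipelines can route them differently.
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/access.log   # hypothetical path
    tags: ["access"]
  - type: log
    paths:
      - /var/log/app/error.log    # hypothetical path
    tags: ["error"]

# Cap the CPU cores Filebeat may use on the business machine.
max_procs: 1

output.logstash:
  hosts: ["logstash.internal:5044"]  # hypothetical endpoint
```

Tagging each input lets a single agent feed multiple downstream pipelines without running one Filebeat per log format.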

Logstash: A component that provides both log collection and log cleaning.

Logstash Agent: Uses Logstash's collection capabilities and can act as an agent for virtually any log type, including system, error, and custom application logs.

Logstash Indexer: Uses Logstash's cleaning capabilities, applying regular expressions to extract information and transform raw text into structured or semi-structured data before sending it to Elasticsearch.

Elasticsearch: A distributed search and analytics engine built on Apache Lucene that offers easy cluster scaling, full-text search, and analytical-database capabilities.

It was initially used for site-wide search (e.g., GitHub's code search), competing with Solr.

Elastic now focuses on log collection, storage, analysis, and monitoring, making ELKB the flagship open‑source log‑processing solution.

Some users also employ Elasticsearch as a document‑oriented database, similar to MongoDB.

Kibana: A web-based graphical interface for searching, analyzing, and visualizing data stored in Elasticsearch.

It leverages Elasticsearch’s powerful search and analytics, offering rich visualizations (line charts, pie charts, maps, etc.) to help users intuitively explore data.

Notes:

Logstash's heavy resource footprint makes it unsuitable as an on-host agent; since Elasticsearch 5.x, many of its functions have been taken over by the lighter Beats.

To simplify the architecture and reduce maintenance, Elastic has gradually integrated data-cleaning capabilities directly into Elasticsearch as ingest pipelines, achieving up to a 10× performance improvement for single-rule parsing.
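As an illustration, a single cleaning rule can be expressed as an Elasticsearch ingest pipeline with a grok processor. The sketch below only builds the pipeline definition as a Python dict (the field name and pattern are assumptions); it would then be registered via the `_ingest/pipeline` API:

```python
import json

# Hypothetical ingest pipeline: one grok processor that parses a
# simplified access-log line inside Elasticsearch itself, replacing a
# Logstash Indexer for this single rule.
pipeline = {
    "description": "Parse simplified access logs (illustrative)",
    "processors": [
        {
            "grok": {
                "field": "message",  # assumed raw-text field
                "patterns": [
                    "%{IP:ip} %{WORD:method} %{URIPATH:path} "
                    "%{NUMBER:status:int} %{NUMBER:latency_ms:int}"
                ],
            }
        }
    ],
}

print(json.dumps(pipeline, indent=2))
```

Documents indexed with this pipeline attached arrive already structured, with no separate cleaning tier to operate.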

When additional consumers (e.g., backup, offline analysis) need the same logs, Kafka is often introduced as a distribution hub; for pure log analysis and search, the green‑boxed part of the diagram can be omitted, yielding a much simpler architecture.

4. Typical Application Scenarios

ELKB’s primary market is log processing, with typical scenarios including:

Fuzzy Search: Real-time log collection from business machines, storage in Elasticsearch, and rapid issue location via fuzzy search capabilities, meeting sub-second latency requirements. Example: using error or slow logs for troubleshooting.
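A search body for such a query could look like the sketch below (the `message` field and sort key are assumptions); `"fuzziness": "AUTO"` lets the match tolerate small typos in the keyword:

```python
import json

# Hypothetical search body for an error-log index: full-text match on
# the raw message, newest hits first, tolerant of small typos.
query = {
    "query": {
        "match": {
            "message": {
                "query": "connection timeout",
                "fuzziness": "AUTO",  # edit-distance tolerance
            }
        }
    },
    "sort": [{"@timestamp": {"order": "desc"}}],  # newest errors first
    "size": 20,
}

print(json.dumps(query))
```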

Structured Analysis: Real-time log collection followed by cleaning with Logstash or Elasticsearch, storage in Elasticsearch, and flexible visualization/analysis with Kibana. Example: analyzing business, audit, or transaction logs to monitor service health.
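One way such a health metric could be expressed is a date-histogram aggregation; the sketch below builds an illustrative request body (field names are assumptions) that buckets the last hour per minute and averages latency in each bucket:

```python
# Hypothetical aggregation body: per-minute request count and average
# latency over the last hour, for a service-health dashboard.
body = {
    "size": 0,  # aggregations only, no raw hits
    "query": {"range": {"@timestamp": {"gte": "now-1h"}}},
    "aggs": {
        "per_minute": {
            "date_histogram": {"field": "@timestamp", "fixed_interval": "1m"},
            "aggs": {"avg_latency": {"avg": {"field": "latency_ms"}}},
        }
    },
}
```

The document count of each bucket gives request volume, while the nested `avg` sub-aggregation gives latency, so one request serves both metrics.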

Scenarios where ELKB is not suitable:

Massive Offline Processing: For low-timeliness, petabyte-scale log archives, Hadoop-based solutions are more cost-effective.

5. Team Work Overview

Currently we use ELK for two main scenarios:

Time-Series Data Processing: We develop on top of Elasticsearch to support a time-series data model, delivering high write throughput, high-concurrency queries, and low storage cost, along with features such as data down-sampling, access control, and hot-cold data management. Compared with InfluxDB, our write throughput is 30% higher (≈200k writes/s per node) and query concurrency is nearly four times higher (≈20k QPS per node), while offering richer analytics.
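Down-sampling itself is conceptually simple. Here is a minimal Python sketch (our actual implementation is not shown here) that rolls raw points up into per-minute averages, the kind of rollup a time-series store applies as data ages:

```python
from collections import defaultdict

def downsample(points, bucket_s=60):
    """Roll (timestamp_seconds, value) points up into fixed-width buckets,
    keeping the average value per bucket."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each point to the start of its bucket.
        buckets[ts - ts % bucket_s].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}

# Two points in the first minute collapse to their average; the third
# point lands alone in the second minute.
raw = [(0, 10.0), (30, 20.0), (60, 40.0)]
```

Replacing aged raw points with such rollups is what trades query resolution for the lower storage cost mentioned above.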

Log Processing: Through interactive log ingestion, automated collection/reporting, and performance tuning of log parsing/analysis, we provide a simple yet efficient log platform. Additional features such as monitoring alerts, data export, permission control, and log management further enhance log-analysis capabilities and help users extract maximum value from their logs.

We will share more details about this work over time; stay tuned.

Tags: Elasticsearch, observability, logging, ELK, Logstash, Beats, Kibana
Written by

Tencent Database Technology

Tencent's Database R&D team supports internal services such as WeChat Pay, WeChat Red Packets, Tencent Advertising, and Tencent Music, and provides external support on Tencent Cloud for TencentDB products like CynosDB, CDB, and TDSQL. This public account aims to promote and share professional database knowledge, growing together with database enthusiasts.
