Centralized Log Collection with Filebeat and Graylog
This article explains how to use Filebeat together with Graylog to collect, process, and visualize logs from multiple services and environments, covering tool introductions, configuration files, component details, deployment methods, and practical code examples.
When a company runs many services across test and production environments, centralized log collection becomes essential. The article compares exposing logs externally through Nginx with running a dedicated log collection stack such as ELK, and recommends Graylog as a lighter-weight alternative that saves deployment effort.
1. Filebeat Tool Introduction
Log collection solution: Filebeat + Graylog!
[1] Filebeat – Log file shipping service
Filebeat is a log shipper that monitors specified log directories or files, continuously reads new entries, and forwards them to Elasticsearch, Logstash, or Graylog.
2. Filebeat Configuration File
The core of configuring Filebeat is writing its configuration file.
The default configuration file is /etc/filebeat/filebeat.yml for RPM/DEB installations. For Mac or Windows, refer to the extracted files. The main configuration includes the inputs.d directory where all .yml files define log sources.
# Configure input sources
# All files under inputs.d are loaded
filebeat.config.inputs:
  enabled: true
  path: ${path.config}/inputs.d/*.yml
  # Uncomment for JSON logs
  # json.keys_under_root: true

# Load modules
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 1

# Output to Logstash (Graylog)
output.logstash:
  hosts: ["11.22.33.44:5500"]

processors:
  - add_host_metadata: ~
  - rename:
      fields:
        - from: "log"
          to: "message"
  - add_fields:
      target: ""
      fields:
        # Token to prevent unauthenticated data submission
        token: "0uxxxxaM-1111-2222-3333-VQZJxxxxxwgX"

A simple inputs.d example shows how to collect logs from specific files, filter by keywords, and add tags.
# Log type definition
- type: log
  enabled: true
  paths:
    - /var/log/supervisor/app_escape_worker-stderr.log
    - /var/log/supervisor/app_escape_prod-stderr.log
  symlinks: true
  include_lines: ["WARNING", "ERROR"]
  tags: ["app", "escape", "test"]
  multiline.pattern: '^\[?[0-9]...{3}'
  multiline.negate: true
  multiline.match: after

Filebeat also provides modules for common services such as PostgreSQL, Redis, and iptables.
# iptables module
- module: iptables
  log:
    enabled: true
    var.paths: ["/var/log/iptables.log"]
    var.input: "file"

# postgresql module
- module: postgresql
  log:
    enabled: true
    var.paths: ["/path/to/log/postgres/*.log*"]

# nginx module
- module: nginx
  access:
    enabled: true
    var.paths: ["/path/to/log/nginx/access.log*"]
  error:
    enabled: true
    var.paths: ["/path/to/log/nginx/error.log*"]

3. Graylog Service Introduction
Log collection solution: Filebeat + Graylog!
[1] Graylog – Log monitoring system
Graylog is an open‑source log aggregation, analysis, and alerting platform. Compared with ELK, it is simpler to deploy but less extensible; a commercial version is also available.
In a typical deployment, Graylog consists of three components: Elasticsearch for storing and searching logs, MongoDB for Graylog configuration, and the Graylog server itself for the web UI and APIs.
Graylog's main components and their functions:

1. Dashboards – Fixed data panels: save specific search-based panels
2. Searching – Conditional log search: keyword, time, saved searches, panels, grouping, export, highlighting, custom time ranges
3. Alert – Alert configuration: email, HTTP callback, custom script
4. Inputs – Log ingestion: Sidecar active collection or passive reporting
5. Extractors – Log field conversion: JSON, KV, timestamp, regex parsing
6. Streams – Log classification: route logs to different indices
7. Indices – Persistent storage: configure storage performance
8. Outputs – Log forwarding: send streams to other Graylog clusters or services
9. Pipelines – Log filtering: define cleaning rules, field add/remove, conditional filters, custom functions
10. Sidecar – Lightweight collector: client-server mode for large scale
Graylog processes logs through Inputs → Extractors → Streams → Pipelines, allowing end-to-end handling without extra post-processing. For example, the following pipeline rule drops any message whose level field is above 6 (i.e., debug messages):
rule "discard debug messages"
when
    to_long($message.level) > 6
then
    drop_message();
end

4. Service Installation and Deployment
Main steps to deploy Filebeat + Graylog
Filebeat can be installed via RPM/DEB packages, source compilation, Docker, or Kubernetes. Example for Ubuntu (DEB):
# Ubuntu (deb)
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.8.1-amd64.deb
sudo dpkg -i filebeat-7.8.1-amd64.deb
sudo systemctl enable filebeat
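```shell
# Optional sanity check: Filebeat ships a "test" subcommand that can
# validate the configuration and output connectivity before the service
# is started (sketch; config path as installed by the RPM/DEB package):
sudo filebeat test config -c /etc/filebeat/filebeat.yml
sudo filebeat test output -c /etc/filebeat/filebeat.yml
```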
sudo service filebeat start

Docker example:
# Run Filebeat container
docker run -d --name=filebeat --user=root \
--volume "./filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro" \
--volume "/var/lib/docker/containers:/var/lib/docker/containers:ro" \
--volume "/var/run/docker.sock:/var/run/docker.sock:ro" \
docker.elastic.co/beats/filebeat:7.8.1 filebeat -e -strict.perms=false \
-E output.elasticsearch.hosts=["elasticsearch:9200"]

Graylog can be deployed with Docker Compose. First generate a password_secret of at least 16 characters and a SHA-256 hash of the admin password, then place them in docker-compose.yml:
# Generate password_secret (at least 16 chars)
sudo apt install -y pwgen
pwgen -N 1 -s 16
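```shell
# Sketch of what hashing produces: piping a password through sha256sum
# yields the hex digest Graylog expects. For example, the literal string
# "admin" (illustration only, never use it as a real password) hashes to:
echo -n "admin" | sha256sum | cut -d " " -f1
# 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
```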
# Generate SHA‑256 of admin password
echo -n "Enter Password: " && head -1 /dev/stdin | tr -d '\n' | sha256sum | cut -d " " -f1

version: "3"
services:
  mongo:
    restart: on-failure
    container_name: graylog_mongo
    image: "mongo:3"
    volumes:
      - "./mongodb:/data/db"
    networks:
      - graylog_network
  elasticsearch:
    restart: on-failure
    container_name: graylog_es
    image: "elasticsearch:6.8.5"
    volumes:
      - "./es_data:/usr/share/elasticsearch/data"
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      - "ES_JAVA_OPTS=-Xms512m -Xmx5120m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    deploy:
      resources:
        limits:
          memory: 12g
    networks:
      - graylog_network
  graylog:
    restart: on-failure
    container_name: graylog_web
    image: "graylog/graylog:3.3"
    ports:
      - "9000:9000"       # Web UI
      - "5044:5044"       # Filebeat input
      - "12201:12201"     # GELF TCP
      - "12201:12201/udp" # GELF UDP
      - "1514:1514"       # Syslog TCP
      - "1514:1514/udp"   # Syslog UDP
    volumes:
      - "./graylog_journal:/usr/share/graylog/data/journal"
    environment:
      - GRAYLOG_PASSWORD_SECRET=zscMb65...FxR9ag
      - GRAYLOG_ROOT_PASSWORD_SHA2=77e29e0f...557515f
      - GRAYLOG_HTTP_EXTERNAL_URI=http://11.22.33.44:9000/
      - GRAYLOG_TIMEZONE=Asia/Shanghai
      - GRAYLOG_ROOT_TIMEZONE=Asia/Shanghai
    networks:
      - graylog_network
    depends_on:
      - mongo
      - elasticsearch

networks:
  graylog_network:
    driver: bridge

When using Docker containers, the GELF log driver can send logs directly to Graylog:
# Run a container with GELF driver
docker run --rm=true \
--log-driver=gelf \
--log-opt gelf-address=udp://11.22.33.44:12201 \
--log-opt tag=myapp \
myapp:0.0.1

Docker-Compose example for a service using the GELF driver:
version: "3"
services:
  redis:
    restart: always
    image: redis
    container_name: "redis"
    logging:
      driver: gelf
      options:
        gelf-address: udp://11.22.33.44:12201
        tag: "redis"
  # ... other services

5. Graylog Web Interface Features
Overview of Graylog UI functions and characteristics
The UI provides dashboards, search, alerts, streams, inputs, extractors, pipelines, sidecar management, and more, allowing users to visualize, query, and manage logs efficiently.
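Everything the UI exposes is also reachable over Graylog's REST API, served under /api on the web port, which is useful for scripting recurring queries. A minimal sketch using the relative-search endpoint available in Graylog 3.x (the host, user, and password below are placeholders):

```shell
# Search the last hour (range is in seconds) for messages containing ERROR
curl -s -u admin:yourpassword \
  -H 'Accept: application/json' \
  'http://11.22.33.44:9000/api/search/universal/relative?query=ERROR&range=3600&limit=5'
```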