Filebeat + Graylog: A Complete Guide to Log Collection, Processing, and Visualization
This article presents a practical solution for centralized log collection: Filebeat as a lightweight log shipper and Graylog as an open-source platform for log aggregation, analysis, and alerting. It explains their architectures and workflows, provides detailed configuration examples for Filebeat inputs, modules, and the core Graylog components, and walks through deployment with Docker, Docker-Compose, and native packages.
Filebeat Overview – Filebeat monitors specified log files or directories, reads new entries, and forwards them to a configured output such as Elasticsearch, Logstash, or Graylog. It runs one or more inputs (formerly called prospectors) that spawn a harvester for each discovered file.
Filebeat Workflow – On startup, Filebeat launches its configured inputs to locate log sources; each input spawns a harvester per file, each harvester reads new content and hands events to the spooler, and the spooler batches the events and forwards them to the target output (e.g., Graylog).
Filebeat Configuration Example (filebeat.yml)

# Configure input sources (inputs.d/*.yml)
filebeat.config.inputs:
  enabled: true
  path: ${path.config}/inputs.d/*.yml

# Load modules
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

# Output to Graylog (via its Beats input, which speaks the Logstash protocol)
output.logstash:
  hosts: ["11.22.33.44:5500"]

# Processors example
processors:
  - add_host_metadata: ~
  - rename:
      fields:
        - from: "log"
          to: "message"
  - add_fields:
      target: ""
      fields:
        token: "0uxxxxaM-1111-2222-3333-VQZJxxxxxwgX"

Sample Input Definition (inputs.d/example.yml)
# Collect log type
- type: log
  enabled: true
  paths:
    - /var/log/supervisor/app_escape_worker-stderr.log
    - /var/log/supervisor/app_escape_prod-stderr.log
  symlinks: true
  include_lines: ["WARNING", "ERROR"]
  tags: ["app", "escape", "test"]
  multiline.pattern: '^\[?[0-9]...{3}'
  multiline.negate: true
  multiline.match: after

Filebeat also supports built-in modules for common services such as PostgreSQL, Redis, and Nginx, which simplify the configuration of log parsing and enrichment.
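As a sketch of that module workflow: a module is enabled with `filebeat modules enable nginx` (which activates modules.d/nginx.yml), and its file can then be pointed at the actual log locations. The paths below are the common Nginx defaults and may differ on your hosts:

```yaml
# modules.d/nginx.yml – enable access and error log collection
- module: nginx
  access:
    enabled: true
    var.paths: ["/var/log/nginx/access.log*"]
  error:
    enabled: true
    var.paths: ["/var/log/nginx/error.log*"]
```

If var.paths is omitted, the module falls back to its built-in default paths for the operating system.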
Graylog Overview – Graylog aggregates logs, stores them in Elasticsearch, and keeps configuration in MongoDB. It provides a web UI for searching, creating streams, applying extractors, and defining pipelines for advanced processing.
Graylog Core Components
Input – receives log data (e.g., GELF, Syslog, Beats).
Extractor – parses fields from raw messages.
Stream – groups messages based on rules and forwards them to specific index sets.
Index Set – defines Elasticsearch shard/replica settings and retention policies.
Pipeline – scripted processing for filtering or enriching messages.
Example pipeline rule to discard debug messages (syslog level 7 is debug, so level > 6 matches them):
rule "discard debug messages"
when
  to_long($message.level) > 6
then
  drop_message();
end

Sidecar – a lightweight agent that pulls its configuration from the Graylog server and manages collectors such as NXLog, Filebeat, or Winlogbeat, allowing centralized management of log collectors across Linux and Windows hosts.
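A minimal sketch of a Sidecar configuration (typically /etc/graylog/sidecar/sidecar.yml); the server address reuses the example IP from above, and the API token is a placeholder that would be created in the Graylog UI:

```yaml
# Graylog server the sidecar polls for collector configurations
server_url: "http://11.22.33.44:9000/api/"
server_api_token: "<api-token-created-in-graylog>"
node_name: "web-01"
update_interval: 10   # seconds between configuration polls
send_status: true     # report collector status back to Graylog
```

The collectors themselves (e.g., a Filebeat configuration) are then assigned to the node centrally from System → Sidecars in the web UI.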
Deployment Instructions
Filebeat can be installed via DEB/RPM packages, compiled from source, or run as a Docker container. Example Docker command:
docker run -d --name=filebeat --user=root \
  --volume "./filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro" \
  --volume "/var/lib/docker/containers:/var/lib/docker/containers:ro" \
  --volume "/var/run/docker.sock:/var/run/docker.sock:ro" \
  docker.elastic.co/beats/filebeat:7.8.1 filebeat -e -strict.perms=false \
  -E 'output.elasticsearch.hosts=["elasticsearch:9200"]'

Graylog is commonly deployed with Docker-Compose. Below is a minimal compose file that starts MongoDB, Elasticsearch, and Graylog, exposing ports for the web UI and the Beats, GELF, and Syslog inputs:
version: "3"
services:
  mongo:
    restart: on-failure
    container_name: graylog_mongo
    image: "mongo:3"
    volumes:
      - "./mongodb:/data/db"
    networks:
      - graylog_network

  elasticsearch:
    restart: on-failure
    container_name: graylog_es
    image: "elasticsearch:6.8.5"
    volumes:
      - "./es_data:/usr/share/elasticsearch/data"
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      - "ES_JAVA_OPTS=-Xms512m -Xmx5120m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    deploy:
      resources:
        limits:
          memory: 12g
    networks:
      - graylog_network

  graylog:
    restart: on-failure
    container_name: graylog_web
    image: "graylog/graylog:3.3"
    ports:
      - 9000:9000        # Web UI
      - 5044:5044        # Beats input
      - 12201:12201      # GELF TCP
      - 12201:12201/udp  # GELF UDP
      - 1514:1514        # Syslog TCP
      - 1514:1514/udp    # Syslog UDP
    volumes:
      - "./graylog_journal:/usr/share/graylog/data/journal"
    environment:
      - GRAYLOG_PASSWORD_SECRET=zscMb65...FxR9ag
      - GRAYLOG_ROOT_PASSWORD_SHA2=77e29e0f...557515f
      - GRAYLOG_HTTP_EXTERNAL_URI=http://11.22.33.44:9000/
      - GRAYLOG_TIMEZONE=Asia/Shanghai
      - GRAYLOG_ROOT_TIMEZONE=Asia/Shanghai
    networks:
      - graylog_network
    depends_on:
      - mongo
      - elasticsearch

networks:
  graylog_network:
    driver: bridge

After starting the stack, create a GELF UDP input in Graylog (System → Inputs), then run containers with Docker's gelf log driver pointing at that input, e.g.:
docker run --rm=true \
  --log-driver=gelf \
  --log-opt gelf-address=udp://11.22.33.44:12201 \
  --log-opt tag=myapp \
  myapp:0.0.1

The original article concludes with promotional notes for the author's "Spring Cloud Alibaba" video series and PDF resources, but the technical content above provides a self-contained guide to building a Filebeat-to-Graylog log collection pipeline.
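As a final check that the GELF input is reachable independently of Docker, a GELF message can be hand-crafted: it is a small JSON document with a version, host, short_message, and level, where any custom fields carry a leading underscore. A minimal Python sketch (the host name, message text, and tag field are illustrative values, not part of any Graylog API):

```python
import json


def gelf_message(host, short_message, level=6, **extra):
    """Build a GELF 1.1 payload; custom fields must be prefixed with '_'."""
    msg = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "level": level,
    }
    # Promote extra keyword arguments to underscore-prefixed custom fields.
    msg.update({"_" + key: value for key, value in extra.items()})
    return json.dumps(msg).encode("utf-8")


# Roughly the shape of message the gelf log driver emits for "myapp".
payload = gelf_message("test-host", "hello graylog", tag="myapp")
print(payload.decode())
```

Sending it is a single UDP datagram to port 12201, e.g. `socket.socket(socket.AF_INET, socket.SOCK_DGRAM).sendto(payload, ("11.22.33.44", 12201))`; Graylog also accepts GELF over TCP, where consecutive messages are delimited by null bytes.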