
Filebeat + Graylog: A Complete Guide to Log Collection, Processing, and Visualization

This article introduces the Filebeat log‑shipping tool and the Graylog log‑management platform, explains their architectures and workflows, provides detailed configuration examples for Filebeat inputs, modules, and Graylog components, and walks through deployment with Docker, Docker Compose, and native package installations.

Code Ape Tech Column

This article presents a practical solution for centralized log collection using Filebeat as a lightweight log shipper and Graylog as an open‑source log aggregation, analysis, and alerting platform.

Filebeat Overview – Filebeat monitors specified log files or directories, reads new entries, and forwards them to a configured output such as Elasticsearch, Logstash, or Graylog. It runs one or more inputs (formerly called prospectors), each of which spawns a harvester per discovered file.

Filebeat Workflow – On startup, Filebeat launches its prospectors to detect the configured log sources. Each prospector creates a harvester per file; the harvester reads new content and hands events to the spooler, which batches them and forwards the data to the target address (e.g., Graylog).

Filebeat Configuration Example

# Configure input sources (inputs.d/*.yml)
filebeat.config.inputs:
  enabled: true
  path: ${path.config}/inputs.d/*.yml
# Load modules
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
# Output to Graylog (connects to a Graylog Beats input over the Logstash protocol)
output.logstash:
  hosts: ["11.22.33.44:5500"]
# Processors example
processors:
  - add_host_metadata: ~
  - rename:
      fields:
        - from: "log"
          to: "message"
  - add_fields:
      target: ""
      fields:
        token: "0uxxxxaM-1111-2222-3333-VQZJxxxxxwgX"
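Before restarting the service, the configuration can be sanity-checked with Filebeat's built-in `test` subcommands; the sketch below assumes the default config path on a package installation:

```shell
# Verify the YAML syntax and option names in filebeat.yml
filebeat test config -c /etc/filebeat/filebeat.yml

# Verify that the configured output (here 11.22.33.44:5500) is reachable
filebeat test output -c /etc/filebeat/filebeat.yml
```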

Sample Input Definition (inputs.d/example.yml)

# Collect log type
- type: log
  enabled: true
  paths:
    - /var/log/supervisor/app_escape_worker-stderr.log
    - /var/log/supervisor/app_escape_prod-stderr.log
  symlinks: true
  include_lines: ["WARNING", "ERROR"]
  tags: ["app", "escape", "test"]
  multiline.pattern: '^\[?[0-9]...{3}'
  multiline.negate: true
  multiline.match: after

Filebeat also supports built‑in modules for common services such as PostgreSQL, Redis, and Nginx, which simplify configuration of log parsing and enrichment.
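Modules are toggled with the `filebeat modules` subcommand, which activates the matching configuration files under modules.d/. A sketch for the services named above:

```shell
# Show available modules and which ones are enabled
filebeat modules list

# Enable parsing/enrichment for these services
filebeat modules enable nginx redis postgresql
```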

Graylog Overview – Graylog aggregates logs, stores them in Elasticsearch, and keeps configuration in MongoDB. It provides a web UI for searching, creating streams, applying extractors, and defining pipelines for advanced processing.

Graylog Core Components

Input – receives log data (e.g., GELF, Syslog, Beats).

Extractor – parses fields from raw messages.

Stream – groups messages based on rules and forwards them to specific index sets.

Index Set – defines Elasticsearch shard/replica settings and retention policies.

Pipeline – scripted processing for filtering or enriching messages.

Example pipeline rule to discard debug messages (level > 6):

rule "discard debug messages"
when
  to_long($message.level) > 6
then
  drop_message();
end
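Pipeline rules can also enrich messages rather than drop them. A minimal sketch of a rule that flags messages containing a stack trace; the field name has_traceback and the search string are illustrative:

rule "tag error traces"
when
  has_field("message") && contains(to_string($message.message), "Traceback")
then
  set_field("has_traceback", true);
end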

Sidecar – a lightweight agent that pulls configuration from Graylog and can run as NXLog, Filebeat, or Winlogbeat, allowing centralized management of log collectors across Linux and Windows hosts.
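A minimal sketch of a Sidecar configuration file (typically /etc/graylog/sidecar/sidecar.yml), assuming an API token has already been created in the Graylog UI; the URL, token, and node name below are placeholders:

server_url: "http://11.22.33.44:9000/api/"
server_api_token: "<token created under System > Sidecars>"
node_name: "app-server-01"
update_interval: 10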

Deployment Instructions

Filebeat can be installed via DEB/RPM packages, compiled from source, or run as a Docker container. Example Docker command:

docker run -d --name=filebeat --user=root \
  --volume "$(pwd)/filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro" \
  --volume "/var/lib/docker/containers:/var/lib/docker/containers:ro" \
  --volume "/var/run/docker.sock:/var/var/run/docker.sock:ro" \
  docker.elastic.co/beats/filebeat:7.8.1 filebeat -e --strict.perms=false \
  -E output.elasticsearch.hosts=["elasticsearch:9200"]
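To ship to Graylog instead of Elasticsearch, the same image can point its Logstash (Beats) output at a Graylog Beats input. A sketch, assuming filebeat.docker.yml defines output.logstash and a Beats input listening on port 5044 has been created in the Graylog UI (the address is illustrative):

```shell
docker run -d --name=filebeat --user=root \
  --volume "$(pwd)/filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro" \
  --volume "/var/lib/docker/containers:/var/lib/docker/containers:ro" \
  --volume "/var/run/docker.sock:/var/run/docker.sock:ro" \
  docker.elastic.co/beats/filebeat:7.8.1 filebeat -e --strict.perms=false \
  -E output.logstash.hosts=["11.22.33.44:5044"]
```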

Graylog is commonly deployed with Docker Compose. Below is a minimal compose file that starts MongoDB, Elasticsearch, and Graylog, exposing ports for the web UI and the Beats, GELF, and Syslog inputs:

version: "3"
services:
  mongo:
    restart: on-failure
    container_name: graylog_mongo
    image: "mongo:3"
    volumes:
      - "./mongodb:/data/db"
    networks:
      - graylog_network

  elasticsearch:
    restart: on-failure
    container_name: graylog_es
    image: "elasticsearch:6.8.5"
    volumes:
      - "./es_data:/usr/share/elasticsearch/data"
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      - "ES_JAVA_OPTS=-Xms512m -Xmx5120m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    deploy:
      resources:
        limits:
          memory: 12g
    networks:
      - graylog_network

  graylog:
    restart: on-failure
    container_name: graylog_web
    image: "graylog/graylog:3.3"
    ports:
      - 9000:9000   # Web UI
      - 5044:5044   # Beats input
      - 12201:12201   # GELF TCP
      - 12201:12201/udp   # GELF UDP
      - 1514:1514   # Syslog TCP
      - 1514:1514/udp   # Syslog UDP
    volumes:
      - "./graylog_journal:/usr/share/graylog/data/journal"
    environment:
      - GRAYLOG_PASSWORD_SECRET=zscMb65...FxR9ag
      - GRAYLOG_ROOT_PASSWORD_SHA2=77e29e0f...557515f
      - GRAYLOG_HTTP_EXTERNAL_URI=http://11.22.33.44:9000/
      - GRAYLOG_TIMEZONE=Asia/Shanghai
      - GRAYLOG_ROOT_TIMEZONE=Asia/Shanghai
    networks:
      - graylog_network
    depends_on:
      - mongo
      - elasticsearch

networks:
  graylog_network:
    driver: bridge
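GRAYLOG_ROOT_PASSWORD_SHA2 must contain the SHA-256 hash of the desired admin password, not the password itself. A sketch of generating it and bringing the stack up; the password "admin" is only an example:

```shell
# SHA-256 of the admin password, for GRAYLOG_ROOT_PASSWORD_SHA2
echo -n 'admin' | sha256sum | cut -d' ' -f1
# -> 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918

# Start the stack and check container state:
#   docker-compose up -d
#   docker-compose ps
```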

After starting the stack, create a GELF input in Graylog, then run containers with the Docker log driver pointing to that input, e.g.:

docker run --rm=true \
  --log-driver=gelf \
  --log-opt gelf-address=udp://11.22.33.44:12201 \
  --log-opt tag=myapp \
  myapp:0.0.1
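Before wiring real containers to the input, the GELF UDP endpoint can be smoke-tested by hand. A sketch, assuming netcat (`nc`) is available; `version`, `host`, and `short_message` are the mandatory GELF fields:

```shell
# A minimal GELF 1.1 message
payload='{"version":"1.1","host":"smoke-test","short_message":"hello graylog","level":6}'
echo "$payload"

# Send it to the GELF UDP input (uncomment and adjust the address):
#   echo -n "$payload" | nc -u -w1 11.22.33.44 12201
```

If the input is working, the message appears in the Graylog search view within a few seconds.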

The article concludes with promotional notes for the author’s “Spring Cloud Alibaba” video series and PDF resources, but the technical content above provides a self‑contained guide for building a Filebeat‑to‑Graylog log collection pipeline.

Tags: monitoring, Docker, operations, log collection, Filebeat, Graylog
Written by Code Ape Tech Column

Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview, and career advice through a column. Site: java-family.cn
