
Centralized Log Collection with Filebeat and Graylog

This article explains how to use Filebeat together with Graylog to collect, ship, store, and analyze logs from multiple environments, covering tool introductions, configuration files, Docker deployment, Spring Boot integration, and practical search syntax for effective log monitoring.

Code Ape Tech Column

When a company runs many services across test and production environments, centralized log collection becomes essential. This article compares exposing logs through Nginx with running a dedicated log collection stack such as ELK, and recommends Graylog as a simpler, extensible alternative that stores log data in Elasticsearch and keeps its own configuration in MongoDB.

Filebeat Overview

Filebeat is a lightweight log shipper that monitors specified log directories or files, reads new entries, and forwards them to Elasticsearch, Logstash, or Graylog. When enabled, Filebeat starts one or more prospectors to detect log files, spawns a harvester for each file, and sends harvested events to a spooler before finally delivering them to the configured Graylog address.

Because Filebeat is lighter than Logstash, it is recommended for environments with limited resources or simpler log collection needs.

Filebeat Configuration

The main configuration file is typically located at /etc/filebeat/filebeat.yml. Below is a sample configuration that enables input files from the inputs.d directory, loads modules, sets up Elasticsearch templates, and points the Logstash output at the Graylog Beats input.

# Configure input sources
# We have configured all *.yml files under inputs.d
filebeat.config.inputs:
  enabled: true
  path: ${path.config}/inputs.d/*.yml
# If logs are JSON, enable this
#json.keys_under_root: true

# Load Filebeat modules
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 1

# Output to Logstash (Graylog)
output.logstash:
  hosts: ["11.22.33.44:5500"]

#output.file:
#  enabled: true

processors:
  - add_host_metadata: ~
  - rename:
      fields:
        - from: "log"
          to: "message"
  - add_fields:
      target: ""
      fields:
        # Add Token to prevent unauthenticated data submission
        token: "0uxxxxaM-1111-2222-3333-VQZJxxxxxwgX"

An example inputs.d file for collecting logs from specific services is shown below.

# Log type
- type: log
  enabled: true
  # Paths to log files
  paths:
    - /var/log/supervisor/app_escape_worker-stderr.log
    - /var/log/supervisor/app_escape_prod-stderr.log
  symlinks: true
  # Include only lines containing these keywords
  include_lines: ["WARNING", "ERROR"]
  # Tag the data
  tags: ["app", "escape", "test"]
  # Multiline handling
  multiline.pattern: '^\[?[0-9]...{3}'
  multiline.negate: true
  multiline.match: after
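The multiline settings can be unintuitive: with negate: true and match: after, any line that does not match the pattern is appended to the preceding matching line. A rough Python sketch of that grouping logic (the timestamp pattern here is an assumption for illustration, not necessarily the exact pattern above):

```python
import re

# Assumed timestamp-like prefix that marks the start of a new log event
EVENT_START = re.compile(r"^\[?[0-9]{4}")

def group_multiline(lines):
    """Mimic multiline.negate: true + multiline.match: after."""
    events = []
    for line in lines:
        if EVENT_START.match(line) or not events:
            events.append(line)           # line starts a new event
        else:
            events[-1] += "\n" + line     # continuation, e.g. a stack trace
    return events

raw = [
    "[2024-01-01 12:00:00] ERROR boom",
    "Traceback (most recent call last):",
    '  File "app.py", line 1, in <module>',
    "[2024-01-01 12:00:01] WARNING recovered",
]
events = group_multiline(raw)  # two events: ERROR (with traceback) and WARNING
```

This is why the traceback lines end up inside the same Graylog message as the error that produced them.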

# Additional log types can be added similarly
- type: log
  enabled: true
  ...

Filebeat also provides built‑in modules for common services such as iptables, PostgreSQL, and Nginx, each with its own configuration snippet.

# iptables module
- module: iptables
  log:
    enabled: true
    var.paths: ["/var/log/iptables.log"]
    var.input: "file"

# PostgreSQL module
- module: postgresql
  log:
    enabled: true
    var.paths: ["/path/to/log/postgres/*.log*"]

# Nginx module
- module: nginx
  access:
    enabled: true
    var.paths: ["/path/to/log/nginx/access.log*"]
  error:
    enabled: true
    var.paths: ["/path/to/log/nginx/error.log*"]

Graylog Service Overview

Graylog is an open‑source log aggregation, analysis, and alerting platform. It consists of three core components: Elasticsearch for storing and searching log data, MongoDB for Graylog configuration, and the Graylog server itself for the web UI and processing.

Deployments can range from a single‑node setup to a clustered architecture for high scalability. Images in the original article illustrate both minimal and optimized cluster deployments.

Graylog Core Concepts

Input – the source of log data; each input can have Extractors to transform fields.

Stream – groups logs based on criteria; each stream can write to its own Elasticsearch index set.

Extractor – configured under System → Input to parse and convert fields.

Index Set – defines shard and replica settings, retention policies, and performance parameters.

Pipeline – allows custom processing scripts; for example, the following rule discards any message whose syslog level is above 6 (i.e., Debug):

rule "discard debug messages"
when
  to_long($message.level) > 6
then
  drop_message();
end

Sidecar – a lightweight collector daemon (supports NXLog, Filebeat, Winlogbeat) that pulls configuration from Graylog via REST API.
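The level field in the pipeline rule above follows the syslog severity scale, where 7 is Debug. A small Python sketch of the same filter (illustrative only, not Graylog's actual pipeline engine):

```python
# Syslog severities as used in Graylog's "level" field
SYSLOG_LEVELS = {
    0: "Emergency", 1: "Alert", 2: "Critical", 3: "Error",
    4: "Warning", 5: "Notice", 6: "Informational", 7: "Debug",
}

def keep(message: dict) -> bool:
    """Equivalent of the pipeline rule: drop anything above level 6."""
    return int(message.get("level", 6)) <= 6

messages = [
    {"level": 7, "message": "cache miss for key=42"},   # debug noise
    {"level": 3, "message": "disk write failed"},
]
kept = [m for m in messages if keep(m)]  # only the level-3 error survives
```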

Logs stored in Graylog can be searched directly, or forwarded to other services via Graylog outputs.

Installation and Deployment

Filebeat can be installed via Debian/Ubuntu packages, Docker, or source compilation. Example commands for Ubuntu:

# Ubuntu (deb)
curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.8.1-amd64.deb
sudo dpkg -i filebeat-7.8.1-amd64.deb
sudo systemctl enable filebeat
sudo service filebeat start

Docker deployment example:

docker run -d --name=filebeat --user=root \
  --volume="./filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro" \
  --volume="/var/lib/docker/containers:/var/lib/docker/containers:ro" \
  --volume="/var/run/docker.sock:/var/run/docker.sock:ro" \
  docker.elastic.co/beats/filebeat:7.8.1 filebeat -e -strict.perms=false \
  -E output.elasticsearch.hosts=["elasticsearch:9200"]

Graylog can be deployed with Docker Compose. After generating a password_secret of at least 16 characters and a SHA-256 hash of the admin password, the following docker-compose.yml defines MongoDB, Elasticsearch, and Graylog services, exposing ports for the web UI (9000) and various inputs (5044, 12201, 1514, etc.).
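The Graylog docs generate these two values with pwgen and sha256sum; an equivalent sketch in Python (the "admin" password is just an example):

```python
import hashlib
import secrets

# GRAYLOG_PASSWORD_SECRET: any random string of at least 16 characters
password_secret = secrets.token_urlsafe(48)

# GRAYLOG_ROOT_PASSWORD_SHA2: SHA-256 hex digest of the desired admin password
root_password_sha2 = hashlib.sha256("admin".encode("utf-8")).hexdigest()

print(password_secret)
print(root_password_sha2)
# SHA-256 of "admin" is
# 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
```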

version: "3"
services:
  mongo:
    restart: on-failure
    container_name: graylog_mongo
    image: "mongo:3"
    volumes:
      - "./mongodb:/data/db"
    networks:
      - graylog_network

  elasticsearch:
    restart: on-failure
    container_name: graylog_es
    image: "elasticsearch:6.8.5"
    volumes:
      - "./es_data:/usr/share/elasticsearch/data"
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      - "ES_JAVA_OPTS=-Xms512m -Xmx5120m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    deploy:
      resources:
        limits:
          memory: 12g
    networks:
      - graylog_network

  graylog:
    restart: on-failure
    container_name: graylog_web
    image: "graylog/graylog:3.3"
    ports:
      - 9000:9000   # Web UI
      - 5044:5044   # Filebeat input
      - 12201:12201 # GELF TCP
      - 12201:12201/udp
      - 1514:1514   # Syslog TCP
      - 1514:1514/udp
    volumes:
      - "./graylog_journal:/usr/share/graylog/data/journal"
    environment:
      - GRAYLOG_PASSWORD_SECRET=zscMb65...FxR9ag
      - GRAYLOG_ROOT_PASSWORD_SHA2=77e29e0f...557515f
      - GRAYLOG_HTTP_EXTERNAL_URI=http://11.22.33.44:9000/
      - GRAYLOG_TIMEZONE=Asia/Shanghai
      - GRAYLOG_ROOT_TIMEZONE=Asia/Shanghai
    networks:
      - graylog_network
    depends_on:
      - mongo
      - elasticsearch

networks:
  graylog_network:
    driver: bridge

GELF (Graylog Extended Log Format) inputs accept structured events and support compression and chunking. Docker containers can send logs directly to Graylog by specifying the gelf log driver.

# Docker run with GELF driver
docker run --rm=true \
  --log-driver=gelf \
  --log-opt gelf-address=udp://11.22.33.44:12201 \
  --log-opt tag=myapp \
  myapp:0.0.1
# Docker‑compose example for Redis service
version: "3"
services:
  redis:
    restart: always
    image: redis
    container_name: "redis"
    logging:
      driver: gelf
      options:
        gelf-address: udp://11.22.33.44:12201
        tag: "redis"
  ...
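Under the hood, a GELF UDP message is simply (optionally gzip-compressed) JSON sent as a datagram. A minimal hand-rolled sender in Python (the localhost address and all field values are placeholders; point it at your Graylog GELF UDP input):

```python
import gzip
import json
import socket

# Minimal GELF 1.1 event; custom fields must be prefixed with "_"
event = {
    "version": "1.1",
    "host": "myapp-host",
    "short_message": "order created",
    "level": 6,          # syslog Informational
    "_order_id": 11,
}
payload = gzip.compress(json.dumps(event).encode("utf-8"))

# Fire-and-forget UDP datagram to the GELF input
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(payload, ("127.0.0.1", 12201))
sock.close()
```

Messages larger than a single UDP datagram must be split into GELF chunks; libraries such as logback-gelf and Docker's gelf log driver handle chunking and compression for you.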

Graylog Web UI Features

The article includes screenshots of the Graylog UI, demonstrating search, stream management, dashboard creation, and alert configuration.

Spring Boot Integration

To forward Spring Boot logs to Graylog, add the logback‑gelf dependency (version 3.0.0) and create a logback.xml configuration file. The configuration defines a GELF UDP appender pointing to the Graylog host and port, sets chunk size, compression, and includes fields such as application name.

<appender name="GELF" class="de.siegmar.logbackgelf.GelfUdpAppender">
  <!-- Graylog address -->
  <graylogHost>ip</graylogHost>
  <!-- UDP input port -->
  <graylogPort>12201</graylogPort>
  <!-- Chunk size, compression, etc. -->
  <maxChunkSize>508</maxChunkSize>
  <useCompression>true</useCompression>
  <encoder class="de.siegmar.logbackgelf.GelfEncoder">
    <includeRawMessage>false</includeRawMessage>
    <includeMarker>true</includeMarker>
    <includeMdcData>true</includeMdcData>
    <includeCallerData>false</includeCallerData>
    <includeRootCauseData>false</includeRootCauseData>
    <includeLevelName>true</includeLevelName>
    <shortPatternLayout class="ch.qos.logback.classic.PatternLayout">
      <pattern>%m%nopex</pattern>
    </shortPatternLayout>
    <fullPatternLayout class="ch.qos.logback.classic.PatternLayout">
      <pattern>%d - [%thread] %-5level %logger{35} - %msg%n</pattern>
    </fullPatternLayout>
    <staticField>app_name:austin</staticField>
  </encoder>
</appender>

Replace the placeholder ip with the actual Graylog server address, restart the application, and the logs will appear in Graylog's Search view.

Log Search Syntax

Graylog supports simple fuzzy queries (e.g., orderid), exact phrase queries (e.g., "orderid: 11"), field-specific queries (message:http), multi-field queries, and Boolean combinations such as message:http AND level_name:ERROR OR source:192.168.0.4.

Final Note

The author encourages readers to like, share, and follow the article, and promotes a paid knowledge community offering advanced projects and tutorials on Spring, microservices, big-data sharding, DDD, and more.

Written by Code Ape Tech Column

Former Ant Group P8 engineer, pure technologist, sharing full-stack Java, job interview, and career advice through a column. Site: java-family.cn