
Understanding Filebeat Harvester, Prospector, and Configuration for System Log Collection

This article explains how Filebeat’s harvester and prospector components read and forward system logs, and how Filebeat tracks file offsets in a registry. It also provides a sample YAML configuration that collects logs from a specified file and sends them to Elasticsearch.

Practical DevOps Architecture

Filebeat can write syslog entries to a designated file (e.g., fifile.txt) and also parse logs in JSON format.

The harvester component reads each file line by line, forwards the content to the output, and is responsible for opening and closing the file.
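The harvester's job can be sketched in a few lines of Python. This is only an illustration of the open, read line by line, forward, close cycle described above; it is not Filebeat's implementation (Filebeat is written in Go), and the function name harvest is an assumption made for this example.

```python
from typing import Iterator, Tuple


def harvest(path: str, offset: int = 0) -> Iterator[Tuple[str, int]]:
    """Yield (line, next_offset) for each complete line found after `offset`.

    Hypothetical helper for illustration; not part of Filebeat.
    """
    # Binary mode keeps offsets in bytes, so a saved offset stays valid later.
    with open(path, "rb") as handle:  # the harvester opens the file...
        handle.seek(offset)
        while True:
            raw = handle.readline()
            # Stop at end of file, or at a partial line with no newline yet.
            if not raw.endswith(b"\n"):
                break
            offset += len(raw)
            line = raw.rstrip(b"\r\n").decode("utf-8", errors="replace")
            yield line, offset
    # ...and the file is closed when the `with` block exits.
```

Each yielded offset points just past the line that was read, so a caller can store it and pass it back in later to continue from the same position.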

The prospector manages harvesters and discovers all input sources. Since Filebeat 6.3 prospectors are called inputs, which is why the configuration in this article uses filebeat.inputs. The description here covers two input types, log and stdin, and each type can be defined multiple times. The prospector checks each file to decide whether to start a new harvester, leave an existing one running, or ignore the file.
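The decision the prospector makes for each file can be sketched as follows. This is a simplified illustration rather than Filebeat's real logic: the function name plan_harvesters, the running map, and the default ignore rule for .gz files are all assumptions made for this example.

```python
import glob
from typing import Dict, Iterable, List


def plan_harvesters(patterns: Iterable[str],
                    running: Dict[str, bool],
                    ignore_suffixes: Iterable[str] = (".gz",)) -> List[str]:
    """Return the paths that need a new harvester.

    Hypothetical helper: `running` maps a path to True when a harvester
    is already reading it.
    """
    suffixes = tuple(ignore_suffixes)
    to_start: List[str] = []
    seen = set()
    for pattern in patterns:
        # Expand each configured path pattern into concrete files.
        for path in sorted(glob.glob(pattern)):
            if path in seen:
                continue  # matched by an earlier pattern already
            seen.add(path)
            if path.endswith(suffixes):
                continue  # ignored file, e.g. a compressed rotation
            if running.get(path):
                continue  # keep the existing harvester running
            to_start.append(path)
    return to_start
```

Calling it repeatedly on a timer approximates the periodic scan: files that appear between scans show up in the returned list the next time around.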

Filebeat keeps the state of each file and frequently flushes it to a registry file on disk. The state records the last offset a harvester has read, which is how Filebeat ensures that all log lines are sent. If the Elasticsearch or Logstash output becomes unreachable, Filebeat keeps track of the last line it sent and continues reading once the output is available again. On restart, Filebeat reads the registry to rebuild this state, so each harvester resumes from its previous position. Because files can be deleted or moved, Filebeat maintains a separate state for every file the prospector finds.
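The offset bookkeeping described above can be sketched like this. It is a minimal model under stated assumptions: the registry is a JSON map from path to byte offset, and the names load_registry, save_registry, and ship_new_lines, along with the send callback, are hypothetical. The real registry stores more than an offset and has its own file format.

```python
import json
import os
from typing import Callable, Dict


def load_registry(path: str) -> Dict[str, int]:
    """Read saved offsets, or start empty on the first run."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def save_registry(path: str, state: Dict[str, int]) -> None:
    """Write to a temp file, then rename, so a crash cannot leave half a file."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp, path)


def ship_new_lines(log_path: str, registry_path: str,
                   send: Callable[[str], bool]) -> int:
    """Send unread lines; advance the offset only for lines `send` accepted."""
    state = load_registry(registry_path)
    offset = state.get(log_path, 0)
    sent = 0
    with open(log_path, "rb") as handle:
        handle.seek(offset)  # resume from the previous position
        for raw in handle:
            if not raw.endswith(b"\n"):
                break  # partial line, wait for the rest
            line = raw.rstrip(b"\r\n").decode("utf-8", errors="replace")
            if not send(line):
                break  # output unreachable: do not advance the offset
            offset += len(raw)
            sent += 1
    state[log_path] = offset
    save_registry(registry_path, state)
    return sent
```

Because the offset moves only after send reports success, a line that could not be delivered is read again on the next call, which mirrors the resume behavior described in the text.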

# cat /etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/fifile.log
  include_lines: ['^ERR', '^WARN', '^INFO']
output.elasticsearch:
  hosts: ["192.168.20.182:9200","192.168.20.181:9200","192.168.20.180:9200"]
  index: "system-%{[agent.version]}-%{+yyyy.MM.dd}"
setup.ilm.enabled: false
setup.template.name: "system"
setup.template.pattern: "system-*"
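To make the effect of include_lines in the configuration above concrete, the sketch below applies the same three patterns to a few sample lines. It uses Python's re module as a stand-in, so it only approximates Filebeat's own regular expression handling; the function name filter_lines and the sample lines are assumptions made for this example.

```python
import re
from typing import Iterable, List

# The patterns from include_lines in the sample configuration.
INCLUDE_LINES = [r"^ERR", r"^WARN", r"^INFO"]


def filter_lines(lines: Iterable[str],
                 patterns: Iterable[str] = INCLUDE_LINES) -> List[str]:
    """Keep only lines that match at least one include pattern."""
    compiled = [re.compile(p) for p in patterns]
    return [line for line in lines
            if any(rx.search(line) for rx in compiled)]


sample = [
    "INFO service started",
    "DEBUG cache warmed",
    "ERROR disk full",       # kept: "^ERR" also matches "ERROR"
    "  WARN indented line",  # dropped: "^" anchors at the line start
]
kept = filter_lines(sample)
```

With these patterns only lines that begin with ERR, WARN, or INFO are forwarded, so DEBUG lines and lines with leading whitespace never reach Elasticsearch.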

