Comprehensive Guide to Deploying and Configuring Logstash in an ELK Stack
This article provides a step‑by‑step walkthrough of Logstash’s role in the ELK stack, covering deployment architecture, core concepts, input/filter/output plugins, multiline handling, JVM tuning, high‑availability risks, and practical startup scripts for reliable log processing in production environments.
Introduction
The article explains how to solve common Logstash issues, understand its runtime mechanism, and deploy an ELK stack in a clustered environment.
1. Deployment Architecture Diagram
A test environment with 12 machines is described: 4 of them host the backend micro‑services together with Filebeat and Logstash, and 3 host Elasticsearch and Kibana. A topology diagram is shown and the deployment steps are listed.
2. What Logstash Is Used For
Logstash collects, parses, and transforms logs, providing a unified search entry for production troubleshooting.
3. Logstash Principles
3.1 Configuration Overview
Logstash reads a configuration file that defines inputs, filters, and outputs.
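As a minimal sketch of that three‑section layout (stdin/stdout are chosen purely for illustration, not taken from the article's setup), a pipeline file might look like this:

```conf
# minimal.conf - read events from stdin, apply no filters, print them
input  { stdin {} }
filter { }
output { stdout { codec => rubydebug } }
```

Each section can contain any number of plugin blocks; events flow through them in order.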
3.2 Input Plugins
Typical input is beats (e.g., Filebeat) on port 5044, but other inputs like Kafka are also possible.
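A typical beats input block, as a sketch using the port named in the text, is:

```conf
input {
  beats {
    port => 5044
  }
}
```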
3.3 Filter Plugins
3.3.1 Grok Plugin
Uses regular expressions to extract fields such as logTime, thread, level, class, and content from the message field.
filter {
  grok {
    match => ["message", "(?&lt;logTime&gt;\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \[(?&lt;thread&gt;.*)\] (?&lt;level&gt;\w*) (?&lt;class&gt;\S*) - (?&lt;content&gt;.*)"]
  }
}

3.3.2 Multiline Plugin
Combines stack‑trace lines into a single event using a time‑stamp pattern. (Note that recent Logstash releases deprecate the standalone multiline filter in favour of the multiline codec, or multiline handling on the Filebeat side.)
filter {
  multiline {
    pattern => "^\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}\.\d{3}"
    negate => true
    what => "previous"
  }
}

3.3.3 Mutate Plugin
Removes unnecessary fields before indexing.
mutate {
  remove_field => ["agent","message","@version","tags","ecs","input","[log][offset]"]
}

3.3.4 Date Plugin
Converts the extracted logTime to the @timestamp field, fixing timezone offsets.
date {
  match => ["logTime","MMM d HH:mm:ss","MMM dd HH:mm:ss","ISO8601"]
  timezone => "Asia/Shanghai"
}

3.4 Output Plugin
Logs are sent to Elasticsearch cluster nodes.
output {
  stdout {}
  elasticsearch {
    hosts => ["10.2.1.64:9200","10.2.1.65:9200","10.2.1.66:9200"]
    index => "qa_log"
  }
}

4. How Logstash Runs
Start Logstash with logstash -f weblog.conf. The JVM heap defaults to -Xms1g -Xmx1g; these and other JVM options can be edited in config/jvm.options for performance tuning.
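For example, raising the heap in config/jvm.options might look like the excerpt below; 4g is an illustrative value, not a recommendation from the article. Size the heap to your event volume and keep -Xms equal to -Xmx to avoid resize pauses:

```conf
# config/jvm.options (excerpt)
-Xms4g
-Xmx4g
```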
5. Logstash Downtime Risks
5.1 Single‑Point Deployment Risks
Potential failures include Logstash crash, host outage, or host reboot. Solutions involve Keepalived for HA, deploying a secondary Logstash instance, and configuring automatic start‑up scripts.
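For the process‑crash case specifically, one alternative sketch to an rc.local script is a dedicated systemd unit with automatic restart. The install path below is hypothetical; adjust it to your environment:

```ini
# /etc/systemd/system/logstash.service (sketch; paths are illustrative)
[Unit]
Description=Logstash
After=network.target

[Service]
ExecStart=/home/software/logstash-7.6.2/bin/logstash -f /home/software/logstash-7.6.2/weblog.conf
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

With Restart=always, systemd relaunches Logstash whenever the process dies, covering the crash scenario without a watchdog script.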
5.2 Enabling Automatic Startup
Creates a systemd service rc-local.service and an /etc/rc.local script that launches Logstash and Filebeat, sets JAVA_HOME, and adjusts permissions.
sudo vim /etc/systemd/system/rc-local.service
# (service definition)
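A commonly used definition for rc-local.service looks like the following; verify it against the unit shipped with your distribution before use:

```ini
[Unit]
Description=/etc/rc.local Compatibility
ConditionPathExists=/etc/rc.local

[Service]
Type=forking
ExecStart=/etc/rc.local start
TimeoutSec=0
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```

After creating the unit and the script, enable it with sudo systemctl enable rc-local so it runs at boot.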
sudo vim /etc/rc.local
#!/bin/sh -e
# start Filebeat in the background; export JAVA_HOME and start Logstash here
# as well (the exact paths depend on your installation)
nohup /home/software/filebeat-7.6.2-linux-x86_64/filebeat -e -c /home/software/filebeat-7.6.2-linux-x86_64/config.yml &
exit 0

Conclusion
The guide covers Logstash deployment architecture, common pitfalls, runtime principles, and high‑availability strategies, encouraging readers to explore further features.
Wukong Talks Architecture
Explaining distributed systems and architecture through stories. Author of the "JVM Performance Tuning in Practice" column, open-source author of "Spring Cloud in Practice PassJava", and independently developed a PMP practice quiz mini-program.