
How to Collect Nginx Logs with Rsyslog, Kafka, and ELK Without Agents

Learn how to set up agent‑less log collection for Nginx using Rsyslog, forward logs via the omkafka module to a Kafka cluster, and process them with Logstash into Elasticsearch for visualization in Kibana, including installation, configuration, and testing steps.

In conventional log‑collection solutions the client must install an extra agent such as Logstash or Filebeat, which adds complexity and resource consumption. Rsyslog provides a way to collect logs without any additional programs.

Rsyslog

Rsyslog is a high‑performance log‑collection and processing service that is fast, reliable, and modular. It can receive logs from many sources (files, TCP, UDP, Unix sockets, etc.) and output them to destinations such as MySQL, MongoDB, Elasticsearch, and Kafka, handling over a million messages per second.

Rsyslog is the enhanced version of syslog and is installed by default on most Linux distributions, so no extra installation is required on the client side.

Collecting Nginx Logs

The log‑collection flow from Nginx to ELK via Rsyslog is:

Nginx → syslog → Rsyslog → omkafka → Kafka → Logstash → Elasticsearch → Kibana.

Nginx sends its access logs to the syslog service, which forwards them to the Rsyslog server. Rsyslog, using the omkafka module, writes the logs into Kafka. Logstash reads from Kafka and stores the data in Elasticsearch, where Kibana can query it.

Rsyslog is included with the OS, so the client does not need to install any extra software.

The omkafka module is not installed by default; it must be added if you want Rsyslog to write to Kafka.

The omkafka module is supported only in Rsyslog versions 8.7.0 and newer, so check the version with rsyslogd -v and upgrade if necessary.

Rsyslog Upgrade

1. Add the Rsyslog repository key:

<code># apt-key adv --recv-keys --keyserver keys.gnupg.net AEF0CF8E</code>

2. Add the repository source:

<code>echo "deb http://debian.adiscon.com/v8-stable wheezy/" >> /etc/apt/sources.list
echo "deb-src http://debian.adiscon.com/v8-stable wheezy/" >> /etc/apt/sources.list</code>

3. Install/upgrade Rsyslog:

<code># apt-get update && apt-get -y install rsyslog</code>

Adding the omkafka Module

1. Install the build tools required for autoreconf:

<code># apt-get -y install pkg-config autoconf automake libtool unzip</code>

2. Install the many dependency packages needed by omkafka:

<code># apt-get -y install libdbi-dev libmysqlclient-dev postgresql-client libpq-dev libnet-dev librdkafka-dev libgrok-dev libgrok1 libpcre3-dev libtokyocabinet-dev libglib2.0-dev libmongo-client-dev libhiredis-dev
# apt-get -y install libestr-dev libfastjson-dev uuid-dev liblogging-stdlog-dev libgcrypt-dev
# apt-get -y install flex bison librdkafka1 librdkafka-dev librdkafka1-dbg</code>

3. Compile and install the omkafka module:

<code># mkdir tmp && cd tmp
# git clone git@github.com:VertiPub/omkafka.git .
# autoreconf -fvi
# ./configure --sbindir=/usr/sbin --libdir=/usr/lib --enable-omkafka && make && make install && cd ..</code>

Rsyslog Collecting Nginx Logs

Client‑side Nginx Configuration

<code>log_format jsonlog '{'
    '"host": "$host",'
    '"server_addr": "$server_addr",'
    '"http_x_forwarded_for": "$http_x_forwarded_for",'
    '"remote_addr": "$remote_addr",'
    '"time_local": "$time_local",'
    '"request_method": "$request_method",'
    '"request_uri": "$request_uri",'
    '"status": $status,'
    '"body_bytes_sent": $body_bytes_sent,'
    '"http_referer": "$http_referer",'
    '"http_user_agent": "$http_user_agent",'
    '"upstream_addr": "$upstream_addr",'
    '"upstream_status": "$upstream_status",'
    '"upstream_response_time": "$upstream_response_time",'
    '"request_time": $request_time'
    '}';

access_log syslog:server=rsyslog.domain.com,facility=local7,tag=nginx_access_log,severity=info jsonlog;</code>

Nginx has supported the syslog logging method since version 1.7.1; make sure your version is at least that recent.

JSON format is used to reduce Logstash processing load and simplify the configuration.

Logs are sent directly to a remote Rsyslog server via syslog, eliminating the need for local log files and log‑rotation.

The access_log parameters are explained below:

syslog: send logs to a syslog service.

server: address of the Rsyslog server (default UDP port 514).

facility: log facility (default local7).

tag: a tag (e.g., nginx_access_log) to identify the source on the server.

severity: log level (default info).
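The facility and severity values combine into the syslog priority number that prefixes every datagram (PRI = facility × 8 + severity; local7 is facility 23, info is severity 6). A minimal Python sketch of this encoding, useful for testing the Rsyslog side without Nginx (the message body is a placeholder, and real Nginx also prepends a timestamp and hostname to the datagram):

```python
import socket

def build_syslog_message(facility, severity, tag, msg):
    """Build a simplified RFC 3164 syslog payload: <PRI>TAG: MSG."""
    pri = facility * 8 + severity  # local7 (23) * 8 + info (6) = 190
    return f"<{pri}>{tag}: {msg}"

# The same facility/tag/severity as the access_log directive above
payload = build_syslog_message(23, 6, "nginx_access_log",
                               '{"host":"example.com","status":200}')
print(payload)  # -> <190>nginx_access_log: {"host":"example.com","status":200}

# To actually send it (the server address is a placeholder):
# sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# sock.sendto(payload.encode(), ("rsyslog.domain.com", 514))
```

Sending such a hand-built datagram with the correct tag should make a test record appear in the Kafka topic, which isolates the Rsyslog→Kafka leg of the pipeline.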

Server‑side Rsyslog Configuration

<code># cat /etc/rsyslog.d/rsyslog_nginx_kafka_cluster.conf
module(load="imudp")
input(type="imudp" port="514")

# nginx access log ==> rsyslog server (local) ==> kafka
module(load="omkafka")

template(name="nginxLog" type="string" string="%msg%")

if $inputname == "imudp" then {
    if ($programname == "nginx_access_log") then
        action(type="omkafka"
            template="nginxLog"
            broker=["10.82.9.202:9092","10.82.9.203:9092","10.82.9.204:9092"]
            topic="rsyslog_nginx"
            partitions.auto="on"
            confParam=[
                "socket.keepalive.enable=true"
            ]
        )
}

:rawmsg, contains, "nginx_access_log" ~</code>

Key configuration points:

module: load imudp to receive UDP syslog messages and omkafka to forward them to Kafka.

input: listen on UDP port 514 (TCP can also be enabled).

template: define a template named nginxLog; because the log is already JSON, no extra formatting is needed.

action: when the input name is imudp and the program name matches the Nginx tag, forward the log to the specified Kafka brokers and topic.

:rawmsg, contains: discard the original message after it has been sent to Kafka to avoid duplicate storage.

The omkafka module will automatically create the Kafka topic if it does not exist.
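To see what the nginxLog template actually forwards, the sketch below splits a raw syslog datagram into its parts (a simplification of Rsyslog's real parser, which also handles timestamps and hostnames): $programname corresponds to the tag, and %msg% is everything after it, i.e. exactly the JSON body Nginx produced.

```python
import re

def parse_syslog(raw):
    """Split '<PRI>tag: msg' into its parts, roughly as Rsyslog does."""
    m = re.match(r"<(\d+)>(\w+): (.*)", raw)
    pri, tag, msg = int(m.group(1)), m.group(2), m.group(3)
    return {"facility": pri // 8, "severity": pri % 8,
            "programname": tag, "msg": msg}

parsed = parse_syslog('<190>nginx_access_log: {"status":200}')
print(parsed["programname"])  # matched by the $programname condition
print(parsed["msg"])          # what the %msg% template sends to Kafka
```

Because the template forwards only %msg%, the Kafka topic receives clean JSON with no syslog framing, which is what keeps the Logstash side simple.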

Server‑side Logstash Configuration

<code>input {
    kafka {
        bootstrap_servers => "10.82.9.202:9092,10.82.9.203:9092,10.82.9.204:9092"
        topics => ["rsyslog_nginx"]
    }
}

filter {
    mutate {
        gsub => ["message", "\\x", "\\\\x"]
    }

    json {
        source => "message"
    }

    date {
        match => ["time_local","dd/MMM/yyyy:HH:mm:ss Z"]
        target => "@timestamp"
    }
}

output {
    elasticsearch {
        hosts => ["10.82.9.205", "10.82.9.206", "10.82.9.207"]
        index => "rsyslog-nginx-%{+YYYY.MM.dd}"
    }
}
</code>

Important parameters:

input: Kafka cluster address and topic name.

filter: no extra processing is needed for JSON, but a gsub replaces escaped characters in case non‑ASCII characters (e.g., Chinese) appear in URLs.

output: Elasticsearch hosts and a daily index pattern.
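The reason for the gsub can be reproduced outside Logstash: Nginx escapes non‑ASCII bytes in URLs as \xHH, which is not a legal JSON escape, so doubling the backslash turns it into a literal \xHH string that a JSON parser accepts. A Python sketch of the same substitution (the sample URL is made up):

```python
import json

# Nginx logs a Chinese URL as \xE4\xB8\xAD... -- invalid JSON escapes.
raw = r'{"request_uri":"/search?q=\xE4\xB8\xAD"}'

try:
    json.loads(raw)
except json.JSONDecodeError:
    pass  # fails: \x is not a legal JSON escape sequence

# Same fix as the Logstash gsub: replace "\x" with "\\x"
fixed = raw.replace("\\x", "\\\\x")
doc = json.loads(fixed)
print(doc["request_uri"])  # -> /search?q=\xE4\xB8\xAD
```

Without this substitution the json filter tags such events with _jsonparsefailure and their fields are lost.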

Testing and Verification

Restart both Rsyslog and Nginx services, then generate some traffic.

Check that the Kafka topic exists:

<code># bin/kafka-topics.sh --list --zookeeper 127.0.0.1:2181
__consumer_offsets
rsyslog_nginx
</code>

Consume messages from the topic to verify log format:

<code># bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic rsyslog_nginx
{"host":"domain.com","server_addr":"172.17.0.2","http_x_forwarded_for":"58.52.198.68","remote_addr":"10.120.89.84","time_local":"28/Aug/2018:14:26:00 +0800","request_method":"GET","request_uri":"/","status":200,"body_bytes_sent":1461,"http_referer":"-","http_user_agent":"Mozilla/5.0 ...","upstream_addr":"-","upstream_status":"-","upstream_response_time":"-","request_time":0.000}
</code>
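A consumed record can also be sanity‑checked programmatically. The sketch below (using a shortened sample line) parses the JSON and verifies that time_local matches the pattern the Logstash date filter expects; the Logstash pattern dd/MMM/yyyy:HH:mm:ss Z corresponds to %d/%b/%Y:%H:%M:%S %z in Python's strptime syntax:

```python
import json
from datetime import datetime

line = ('{"host":"domain.com","time_local":"28/Aug/2018:14:26:00 +0800",'
        '"request_method":"GET","status":200,"request_time":0.000}')

record = json.loads(line)
# Same pattern as the Logstash date filter, in strptime syntax
ts = datetime.strptime(record["time_local"], "%d/%b/%Y:%H:%M:%S %z")

assert isinstance(record["status"], int)  # unquoted in the log_format
print(ts.isoformat())  # -> 2018-08-28T14:26:00+08:00
```

If status or request_time arrive quoted, check the log_format above: those two variables are intentionally left without surrounding quotes so they index as numbers in Elasticsearch.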

In Kibana, create an index pattern rsyslog-nginx-*, select @timestamp as the time field, and explore the data. Use the Discover page to filter by fields such as status or request_uri, enable auto‑refresh, and visualize request volume, error rates, top IPs, etc. Build dashboards with Visualize for deeper analysis.

Conclusion

Nginx access logs are a treasure trove: they reveal traffic trends, service reliability, campaign popularity, and can guide operational improvements.

Rsyslog can be made highly available by deploying multiple instances behind a load balancer; in our experience a single instance processes around 200 k logs per minute without downtime.

We use UDP because Nginx’s syslog mode only supports UDP. UDP offers higher performance and avoids the retry overhead that TCP could introduce under unstable network conditions.

Written by Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
