Step-by-Step Guide to Installing and Configuring Apache Flume on a Cluster
This guide walks through downloading Apache Flume 1.6.0 for a three-node cluster (one master, two slaves) and configuring NetCat, Exec, and Avro sources with logger or HDFS sinks and memory channels. Each configuration is followed by the commands used to verify that the agent runs correctly.
1. Software download
wget http://mirror.bit.edu.cn/apache/flume/1.6.0/apache-flume-1.6.0-bin.tar.gz
tar zxvf apache-flume-1.6.0-bin.tar.gz
cd apache-flume-1.6.0-bin
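Before writing any configuration, it can help to confirm that the archive unpacked as expected. The sketch below is one way to do that; `flume_installed` and `FLUME_DIR` are names made up for this example, not part of Flume. It assumes the tarball extracts to `apache-flume-1.6.0-bin`, so run it from the download directory, or set `FLUME_DIR=.` if you are already inside the Flume directory. `flume-ng version` also needs a working Java installation.

```shell
# flume_installed DIR: succeeds if DIR looks like an unpacked Flume distribution.
# Hypothetical helper for illustration; not part of Flume.
flume_installed() {
  [ -x "$1/bin/flume-ng" ] && [ -d "$1/conf" ]
}

# Assumed location of the extracted archive; override with FLUME_DIR=... if needed.
FLUME_DIR="${FLUME_DIR:-apache-flume-1.6.0-bin}"

if flume_installed "$FLUME_DIR"; then
  # Prints the Flume version banner (requires Java on the PATH).
  "$FLUME_DIR/bin/flume-ng" version
else
  echo "Flume not found in $FLUME_DIR; check the download and extraction steps"
fi
```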
2. Cluster environment
Master: 172.16.11.97
Slave1: 172.16.11.98
Slave2: 172.16.11.99
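Later steps refer to the master by the hostname `master` (for example in the HDFS path), so every node needs to resolve that name. A minimal sketch, assuming name resolution is handled through `/etc/hosts` and that the slaves are called `slave1` and `slave2` (lowercase names chosen for this example; the scratch file name `flume-hosts.txt` is also made up):

```shell
# Write the candidate host entries to a scratch file so they can be reviewed first.
cat > flume-hosts.txt <<'EOF'
172.16.11.97 master
172.16.11.98 slave1
172.16.11.99 slave2
EOF

# After reviewing, append them on each node (needs root), for example:
#   sudo sh -c 'cat flume-hosts.txt >> /etc/hosts'
```

If the cluster already uses DNS or has these entries, this step can be skipped.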
3. NetCat source configuration (conf/flume-netcat.conf)
vim conf/flume-netcat.conf
# Name the components on this agent
agent.sources = r1
agent.sinks = k1
agent.channels = c1

# Source configuration
agent.sources.r1.type = netcat
agent.sources.r1.bind = 127.0.0.1
agent.sources.r1.port = 44444

# Sink configuration
agent.sinks.k1.type = logger

# Channel configuration
agent.channels.c1.type = memory
agent.channels.c1.capacity = 1000
agent.channels.c1.transactionCapacity = 100

# Bind source and sink to the channel
agent.sources.r1.channels = c1
agent.sinks.k1.channel = c1
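Every property in the file is prefixed with the agent name (`agent` here), and that prefix has to match the value passed to `--name` when the agent is started; otherwise the agent starts without loading any components. The helper below is a rough pre-flight check for that. `check_agent_name` is a hypothetical function written for this guide, not a Flume command, and it only looks for the three component declarations.

```shell
# check_agent_name CONF NAME: hypothetical helper, not part of Flume.
# Succeeds only if CONF declares sources, sinks and channels under the prefix NAME.
check_agent_name() {
  conf="$1"
  name="$2"
  rc=0
  for key in sources sinks channels; do
    # Look for a line such as "agent.sources = r1" at the start of a line.
    if ! grep -Eq "^[[:space:]]*${name}\.${key}[[:space:]]*=" "$conf"; then
      echo "missing: ${name}.${key}"
      rc=1
    fi
  done
  return "$rc"
}

# Example: check a config file against the name you plan to pass to --name.
# check_agent_name conf/flume-netcat.conf agent
```

The same check applies unchanged to the Exec and Avro configurations, since they use the same agent name.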
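Verification (start the agent in one terminal, then connect from a second terminal on the same machine and type a few lines; each line should appear in the agent log as an event). The source is bound to 127.0.0.1, so it is reachable only through the loopback address; to connect through the master address instead, change bind to 0.0.0.0.
bin/flume-ng agent --conf conf --conf-file conf/flume-netcat.conf --name=agent -Dflume.root.logger=INFO,console
telnet 127.0.0.1 44444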
4. Exec source configuration (conf/flume-exec.conf)
vim conf/flume-exec.conf
# Name the components on this agent
agent.sources = r1
agent.sinks = k1
agent.channels = c1

# Source configuration
agent.sources.r1.type = exec
agent.sources.r1.command = tail -f /data/hadoop/flume/test.txt

# Sink configuration
agent.sinks.k1.type = logger

# Channel configuration
agent.channels.c1.type = memory
agent.channels.c1.capacity = 1000
agent.channels.c1.transactionCapacity = 100

# Bind source and sink to the channel
agent.sources.r1.channels = c1
agent.sinks.k1.channel = c1
Verification:
bin/flume-ng agent --conf conf --conf-file conf/flume-exec.conf --name=agent -Dflume.root.logger=INFO,console
while true; do echo `date` >> /data/hadoop/flume/test.txt ; sleep 1; done
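The loop above runs until interrupted. For a repeatable test it can be easier to append a fixed number of lines and then compare that number with the events the agent logs. A minimal sketch; `append_timestamps` is a hypothetical helper written for this guide:

```shell
# append_timestamps FILE COUNT [DELAY]: append COUNT timestamp lines to FILE,
# sleeping DELAY seconds (default 1) after each line. Hypothetical helper.
append_timestamps() {
  file="$1"
  count="$2"
  delay="${3:-1}"
  i=0
  while [ "$i" -lt "$count" ]; do
    date >> "$file"
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example: ten lines, one per second, into the file the exec source tails.
# append_timestamps /data/hadoop/flume/test.txt 10
```

With the logger sink, the agent console should then show the same number of new events.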
5. Avro source configuration (conf/flume-avro.conf)
vim conf/flume-avro.conf
# Define a memory channel
agent.channels.c1.type = memory

# Define Avro source
agent.sources.r1.type = avro
agent.sources.r1.bind = 127.0.0.1
agent.sources.r1.port = 44444
agent.sources.r1.channels = c1

# Define HDFS sink
agent.sinks.k1.type = hdfs
agent.sinks.k1.channel = c1
agent.sinks.k1.hdfs.path = hdfs://master:9000/flume_data_pool
agent.sinks.k1.hdfs.filePrefix = events-
agent.sinks.k1.hdfs.fileType = DataStream
agent.sinks.k1.hdfs.writeFormat = Text
# Roll files by event count or age only (rollSize = 0 disables size-based rolling)
agent.sinks.k1.hdfs.rollSize = 0
agent.sinks.k1.hdfs.rollCount = 600000
agent.sinks.k1.hdfs.rollInterval = 600

# Name the components on this agent
agent.sources = r1
agent.sinks = k1
agent.channels = c1
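Verification (start the agent in one terminal). An Avro source expects Avro RPC rather than plain text, so telnet can only confirm that the port is open. To send real events, run Flume's bundled avro-client from a second terminal on the same machine, then list the HDFS directory to confirm that files are being written:
bin/flume-ng agent --conf conf --conf-file conf/flume-avro.conf --name=agent -Dflume.root.logger=DEBUG,console
bin/flume-ng avro-client --conf conf --host 127.0.0.1 --port 44444 --filename /data/hadoop/flume/test.txt
hdfs dfs -ls hdfs://master:9000/flume_data_pool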