Using Multiple Streams and Groups in Apache Storm Topology

This article explains how to declare and emit multiple stream IDs in Apache Storm, demonstrates code examples for MultiStream and MultiGroup patterns, discusses common pitfalls, and shows how to abstract stream declarations and bolt configurations for more flexible and dynamic topologies.

Architect

Apache Storm lets a spout or bolt emit tuples to multiple streams by passing a stream‑id to collector.emit and declaring each stream in declareOutputFields. Each emit call targets exactly one stream, but every stream‑id the component might emit to must be declared up front.

The article first presents a simple example in which each word is emitted to either stream1 or stream2 according to its lexical order, while every word is also emitted to stream3. It shows the corresponding Java code for execute and declareOutputFields.
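The routing rule from this example can be sketched in plain Java without any Storm dependency. This is only an illustrative model: the class StreamRouter, its method names, and the pivot value "n" are assumptions, since the source says only that the choice depends on lexical order.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the routing rule; it does not use the Storm API.
final class StreamRouter {
    // Assumed split point: the source only says "lexical order", so this pivot is hypothetical.
    static final String PIVOT = "n";

    /** Returns the stream-ids one word is emitted to, in emit order. */
    static List<String> streamsFor(String word) {
        List<String> targets = new ArrayList<>();
        // Exactly one of stream1 / stream2, chosen by comparing the word with the pivot.
        targets.add(word.compareTo(PIVOT) < 0 ? "stream1" : "stream2");
        // Every word is additionally sent to stream3.
        targets.add("stream3");
        return targets;
    }

    /** Every stream-id the component could emit to; each one has to be declared. */
    static List<String> declaredStreams() {
        return List.of("stream1", "stream2", "stream3");
    }
}
```

In an actual bolt, each id returned by streamsFor would become the first argument of a separate collector.emit(streamId, new Values(word)) call, and each id in declaredStreams would be registered through declarer.declareStream(streamId, new Fields("word")).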

Next, a more complex topology (RandomSentenceSpout → SplitSentenceBolt → WordCountBolt → PrinterBolt) is introduced, illustrating how each component can declare several streams such as split-stream, count-stream, and print-stream. The code snippets demonstrate emitting to specific streams and declaring them with appropriate Fields.

To avoid repetitive stream declarations, the article proposes extracting a helper method declareStream(OutputFieldsDeclarer declarer, Fields fields) that registers the shared set of stream‑ids with whatever field definition the calling component supplies. Each bolt then calls this method from its own declareOutputFields implementation.
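A minimal sketch of such a helper follows. To keep it runnable without Storm on the classpath, the interface StreamDeclarer is a hypothetical stand-in for OutputFieldsDeclarer and a List<String> stands in for Fields; the class name StreamDeclarations and the collect method are likewise assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Storm's OutputFieldsDeclarer, reduced to the one call used here.
interface StreamDeclarer {
    void declareStream(String streamId, List<String> fields);
}

final class StreamDeclarations {
    // Stream-ids named in the article's topology.
    static final List<String> STREAM_IDS =
        List.of("split-stream", "count-stream", "print-stream");

    /** Registers every shared stream-id with the caller's own field list. */
    static void declareStream(StreamDeclarer declarer, List<String> fields) {
        for (String streamId : STREAM_IDS) {
            declarer.declareStream(streamId, fields);
        }
    }

    /** Records the declarations in an ordered map so the result can be inspected. */
    static Map<String, List<String>> collect(List<String> fields) {
        Map<String, List<String>> declared = new LinkedHashMap<>();
        declareStream(declared::put, fields);
        return declared;
    }
}
```

A counting bolt might call declareStream with the fields "word" and "count", while a splitting bolt would pass only "word"; the stream-ids stay identical and only the field list changes.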

Building on this, the concept of MultiGroup is introduced: because every component declares all stream‑ids through a single method but emits to only one of them, the grouping logic can be unified. The article shows an incorrect attempt to apply the same grouping to all bolts, explains why it fails (e.g., mismatched fields between spout and bolt), and provides the correct configuration where each bolt’s groupings match the actual output streams.
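The kind of mismatch described here can be illustrated with a small plain-Java check. This is a model rather than Storm's own validation: the class GroupingCheck, the component names, the field names, and the assignment of streams to components are all assumptions chosen to mirror the word-count chain.

```java
import java.util.List;
import java.util.Map;

// Plain-Java model of a grouping sanity check; it does not call Storm.
final class GroupingCheck {
    // Declared output fields per "component/stream". Names and fields are assumed for illustration.
    static final Map<String, List<String>> DECLARED = Map.of(
        "spout/split-stream", List.of("sentence"),
        "split/count-stream", List.of("word"),
        "count/print-stream", List.of("word", "count"));

    /** True if a fields grouping on groupField can subscribe to the given component and stream. */
    static boolean isValidFieldsGrouping(String component, String streamId, String groupField) {
        List<String> fields = DECLARED.get(component + "/" + streamId);
        // Invalid when the stream is not among the component's outputs or lacks the grouping field.
        return fields != null && fields.contains(groupField);
    }
}
```

Under these assumptions, grouping by "word" on the spout's stream is rejected because the spout only declares "sentence", whereas the same grouping on the split bolt's output is accepted. This is why one grouping cannot simply be copied onto every bolt.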

Finally, the article demonstrates how to abstract bolt registration with a setBolt helper that sets up shuffle and fields groupings based on a common naming pattern (name + "-stream"). This results in a topology where each bolt uses the same grouping strategy while preserving the original data flow (spout → split → count → print).
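The naming pattern can be sketched as a small wiring model. The class TopologyWiring below is hypothetical and records subscriptions as strings instead of calling Storm; it assumes that a bolt named X subscribes to its upstream component on the stream X + "-stream", which is one plausible reading of the pattern.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the wiring helper; it records subscriptions instead of calling Storm.
final class TopologyWiring {
    private final List<String> edges = new ArrayList<>();

    /** Derives the subscribed stream-id from the bolt's name. */
    static String streamFor(String boltName) {
        return boltName + "-stream";
    }

    /** Registers a bolt that subscribes to its upstream component on name + "-stream". */
    TopologyWiring setBolt(String name, String upstream) {
        edges.add(upstream + " -[" + streamFor(name) + "]-> " + name);
        return this; // fluent, so calls can be chained
    }

    /** Returns an immutable snapshot of the recorded subscriptions. */
    List<String> edges() {
        return List.copyOf(edges);
    }

    /** Wires the article's chain: spout -> split -> count -> print. */
    static List<String> wordCountChain() {
        return new TopologyWiring()
            .setBolt("split", "spout")
            .setBolt("count", "split")
            .setBolt("print", "count")
            .edges();
    }
}
```

In a real topology the body of setBolt would delegate to TopologyBuilder.setBolt and then attach shuffleGrouping(upstream, streamId) or fieldsGrouping(upstream, streamId, fields) on the returned declarer, so the derived stream-id is the only part that varies between bolts.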

The discussion highlights that, although multiple streams are declared, a tuple still travels through a single logical path, and the abstraction simplifies building dynamic topologies without altering the runtime behavior.
