Low-Code Generation of Flink StreamGraph, JobGraph, and ExecutionGraph
This article explains how to generate Flink's StreamGraph, JobGraph, and ExecutionGraph through a low-code canvas approach: it covers the underlying concepts, walks through the transformation pipeline from DataStream to DAG, and provides Java code examples for building and assembling operators via drag-and-drop.
A submitted DataStream application passes through the StreamGraph, JobGraph, and ExecutionGraph stages to produce an executable DAG; a Flink SQL application adds one extra step, using the flink-table-planner module to translate the SQL into a StreamGraph first.
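To make the SQL path concrete, here is a minimal sketch (the table name src and the datagen connector are illustrative choices, not from the article): the planner sits behind the StreamTableEnvironment and turns the query into the same DataStream-level transformations before a StreamGraph is built.

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;

public class SqlPathDemo {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

        // A bounded datagen table so the example terminates on its own
        tableEnv.executeSql(
            "CREATE TABLE src (word STRING) WITH ('connector' = 'datagen', 'number-of-rows' = '5')");

        // flink-table-planner translates this query into DataStream transformations,
        // after which the usual StreamGraph -> JobGraph -> ExecutionGraph pipeline applies.
        tableEnv.executeSql("SELECT UPPER(word) AS word FROM src").print();
    }
}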
In Flink, each operator (Source, Filter, Map, Sink) becomes a node in the logical execution graph (StreamGraph). The article shows a Java example that creates a StreamExecutionEnvironment, adds a Kafka source, applies filter and map functions, and finally adds a sink:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Source: the Kafka consumer needs a deserialization schema and connection properties
Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");
DataStream<String> dataStream = env.addSource(
        new FlinkKafkaConsumer<>("topic", new SimpleStringSchema(), props));

// Filter: placeholder predicate that keeps every record
DataStream<String> filteredStream = dataStream.filter(new FilterFunction<String>() {
    @Override
    public boolean filter(String value) throws Exception { return true; }
});

// Map: identity mapping as a placeholder transformation
DataStream<String> mappedStream = filteredStream.map(new MapFunction<String, String>() {
    @Override
    public String map(String value) throws Exception { return value; }
});

// Sink: discard every record
mappedStream.addSink(new DiscardingSink<>());
env.execute("test-job");

StreamGraph is the logical DAG describing sources, transformations, and sinks. The JobGraph is generated from it after optimization (most notably operator chaining, which merges chainable operators), and each resulting vertex (JobVertex) corresponds to a Task that runs on a TaskManager.
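The effect of chaining can be observed directly. The sketch below is illustrative and relies on getStreamGraph() and getJobGraph(), which are internal APIs whose exact signatures vary across Flink versions:

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.runtime.jobgraph.JobGraph;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.graph.StreamGraph;

public class ChainingDemo {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements("a", "b", "c")
           .filter(s -> !s.isEmpty())
           .map(String::trim).returns(Types.STRING)
           .disableChaining() // force this operator into its own JobVertex
           .print();

        // The StreamGraph keeps one node per operator...
        StreamGraph streamGraph = env.getStreamGraph();
        System.out.println("stream nodes: " + streamGraph.getStreamNodes().size());

        // ...while the JobGraph merges chainable neighbors into fewer JobVertex instances.
        JobGraph jobGraph = streamGraph.getJobGraph();
        System.out.println("job vertices: " + jobGraph.getNumberOfVertices());
    }
}

Without the disableChaining() call, chainable neighbors would collapse into even fewer vertices.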
ExecutionGraph is the physical execution plan derived from JobGraph, detailing task scheduling, execution order, data transfer, and state management; each Task may be split into multiple SubTasks according to parallelism.
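A minimal illustration of this expansion (the parallelism values here are arbitrary):

import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ParallelismDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(4); // job default: each task runs as 4 parallel subtasks

        env.fromElements("a", "b", "c")           // fromElements is a non-parallel source (1 subtask)
           .map(String::toUpperCase).returns(Types.STRING)
           .setParallelism(2)                     // this operator's vertex expands to 2 subtasks
           .print();                              // sink falls back to the job default of 4 subtasks

        env.execute("parallelism-demo");
    }
}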
The proposed canvas (drag‑and‑drop) mode stores nodes (operators) and edges in MySQL tables, uses a BFS‑based traversal to assemble the DataStream API program, and aims to enable zero‑code Flink application development.
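The article's table layout and assembly code are not reproduced in this summary, so the following is a minimal sketch of the idea: the hypothetical DagNode and DagEdge classes stand in for rows loaded from the MySQL tables, every stream is simplified to DataStream<String>, and each node is assumed to have at most one input.

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.DiscardingSink;

public class CanvasJobAssembler {

    // Hypothetical shape of a row in the canvas node table
    static class DagNode {
        long id;
        String type; // e.g. "SOCKET_SOURCE", "FILTER", "MAP", "DISCARD_SINK"
    }

    // Hypothetical shape of a row in the canvas edge table: fromId -> toId
    static class DagEdge {
        long fromId;
        long toId;
    }

    /** Walks the stored DAG breadth-first and chains one DataStream operator per node. */
    public static void assemble(StreamExecutionEnvironment env,
                                List<DagNode> nodes, List<DagEdge> edges) {
        Map<Long, DagNode> nodeById = new HashMap<>();
        Map<Long, List<Long>> downstream = new HashMap<>();
        Map<Long, Integer> inDegree = new HashMap<>();
        for (DagNode n : nodes) {
            nodeById.put(n.id, n);
            inDegree.put(n.id, 0);
        }
        for (DagEdge e : edges) {
            downstream.computeIfAbsent(e.fromId, k -> new ArrayList<>()).add(e.toId);
            inDegree.merge(e.toId, 1, Integer::sum);
        }

        Map<Long, DataStream<String>> upstreamOf = new HashMap<>();
        Deque<Long> queue = new ArrayDeque<>();
        inDegree.forEach((id, deg) -> { if (deg == 0) queue.add(id); }); // start from sources

        while (!queue.isEmpty()) {
            long id = queue.poll();
            DataStream<String> result = applyOperator(env, nodeById.get(id), upstreamOf.get(id));
            for (long next : downstream.getOrDefault(id, List.of())) {
                upstreamOf.put(next, result); // single-input simplification
                if (inDegree.merge(next, -1, Integer::sum) == 0) {
                    queue.add(next);
                }
            }
        }
    }

    // Translates one canvas node into a DataStream call; real operators would be
    // instantiated from the node's stored configuration instead of hard-coded.
    private static DataStream<String> applyOperator(StreamExecutionEnvironment env,
                                                    DagNode node,
                                                    DataStream<String> upstream) {
        switch (node.type) {
            case "SOCKET_SOURCE":
                return env.socketTextStream("localhost", 9999);
            case "FILTER":
                return upstream.filter(value -> !value.isEmpty());
            case "MAP":
                return upstream.map(String::trim).returns(Types.STRING);
            case "DISCARD_SINK":
                upstream.addSink(new DiscardingSink<>());
                return upstream;
            default:
                throw new IllegalArgumentException("Unknown operator type: " + node.type);
        }
    }
}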
In practice, additional metadata such as operator parallelism, schema, key‑by fields, and custom UDF support must be handled, making the real implementation more complex than the simplified example.
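For instance, per-node metadata might be applied like this (the field names and values are hypothetical):

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class NodeMetadataDemo {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Values that would come from the node's metadata row in MySQL
        int configuredParallelism = 2;
        String nodeId = "canvas-node-42";

        env.fromElements(Tuple2.of("a", 1), Tuple2.of("b", 2), Tuple2.of("a", 3))
           .keyBy(t -> t.f0)                      // key-by field taken from node metadata
           .sum(1)
           .setParallelism(configuredParallelism) // per-operator parallelism from metadata
           .uid(nodeId)                           // stable uid so state maps back to the canvas node
           .print();

        env.execute("metadata-demo");
    }
}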
JD Tech Talk
The official JD Tech public account, sharing best practices and technology innovation.