Big Data · 9 min read

Understanding Flink Architecture, Job Example, and Execution Graph Layers

This article explains Flink’s cluster architecture, the roles of Client, JobManager and TaskManager, demonstrates a SocketTextStreamWordCount example, and details the four-layer execution graph (StreamGraph, JobGraph, ExecutionGraph, Physical Execution Graph) to illustrate how Flink schedules and runs streaming jobs.

Architecture Digest

When a Flink cluster starts, it launches a JobManager and one or more TaskManagers. A client submits a job to the JobManager, which then schedules tasks across the TaskManagers; the TaskManagers report heartbeats and statistics back to the JobManager and exchange data via streams. All three components run as independent JVM processes.

The three main components are:

Client: the entry point for submitting jobs. It can run on any machine that can reach the JobManager; it may exit after submitting a streaming job or stay alive to await results.

JobManager: responsible for job scheduling, checkpoint coordination, and generating an optimized execution plan, a role similar to Storm’s Nimbus.

TaskManager: launched with a fixed number of slots; each slot can run one task (a thread). It receives tasks from the JobManager, establishes Netty connections to upstream operators, and processes incoming data.

Flink uses a multi‑threaded model in which multiple jobs and tasks share a single TaskManager process. While this improves CPU utilization, the lack of resource isolation between tasks and the added debugging difficulty make this design less appealing than a per‑job process model.
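The number of slots each TaskManager offers is set in the cluster configuration. A minimal fragment of Flink's `flink-conf.yaml` (using its standard configuration keys) might look like:

```yaml
# Each TaskManager offers this many slots; each slot runs one task (thread).
taskmanager.numberOfTaskSlots: 4

# Default parallelism for jobs that do not set their own.
parallelism.default: 1
```

With 4 slots, a single TaskManager JVM can host up to four parallel task threads, which is exactly the shared-process trade-off described above.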

The article then presents a concrete job example using Flink’s built‑in SocketTextStreamWordCount from the examples package, which counts word occurrences from a socket stream.

To run the example, first start a local netcat server, then submit the Flink program. As words are typed into the netcat terminal, the TaskManager’s output shows the word counts.
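Running the actual job requires a Flink cluster, but the computation it performs per line of socket input can be sketched in plain Java. This is an illustrative stand-in for what the example's flatMap (tokenize), keyBy (group by word), and sum (running count) operators do, not Flink's own tokenizer:

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the counting logic a socket word-count job performs:
// split each line into words, then keep a running count per word.
public class WordCountSketch {
    static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            // Split on non-word characters, mirroring a simple tokenizer.
            for (String word : line.toLowerCase().split("\\W+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(java.util.List.of("to be or not to be"));
        // Prints the per-word counts: to=2, be=2, or=1, not=1 (map order unspecified).
        System.out.println(counts);
    }
}
```

In the real job, each incoming line from netcat flows through this same tokenize-group-count pipeline, but distributed across TaskManager slots.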

By replacing the final env.execute() call with System.out.println(env.getExecutionPlan()) and running the job with parallelism 2, the program prints its logical execution plan as a JSON string, which can be visualized with Flink’s plan visualizer tool.
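The substitution can be sketched as the following fragment, assuming Flink's DataStream API on the classpath; the Tokenizer operator is a placeholder for the splitter used in the WordCount examples:

```java
// Fragment, not a complete program; requires flink-streaming-java.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(2);

env.socketTextStream("localhost", 9000)
   .flatMap(new Tokenizer())   // placeholder: emits (word, 1) pairs
   .keyBy(value -> value.f0)
   .sum(1)
   .print();

// Instead of env.execute("WordCount"), print the logical plan as JSON:
System.out.println(env.getExecutionPlan());
```

Because getExecutionPlan() only builds the plan, nothing is actually executed; the printed JSON describes the StreamGraph-level topology discussed next.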

Flink’s execution graphs are organized into four layers:

StreamGraph: the initial graph generated from the user’s Stream API code, representing the program’s topology.

JobGraph: an optimized version of the StreamGraph that is submitted to the JobManager; during its construction, compatible operators are chained into a single node to reduce serialization and transmission overhead.

ExecutionGraph: the distributed execution graph created by the JobManager from the JobGraph; it is the core data structure for scheduling.

Physical Execution Graph: the actual deployment of tasks on TaskManagers; it is not a concrete data structure but a view of the tasks running across the cluster.

Each layer serves a distinct purpose, decoupling concerns such as logical program structure, optimization, scheduling, and physical deployment. This design mirrors similar multi‑graph approaches in systems like Spark, facilitating easier analysis and manipulation at each stage.

Source: http://jm.taobao.org/2017/08/07/20170807/

Tags: Big Data, Flink, Stream Processing, TaskManager, JobManager, Execution Graph
Written by Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
