Understanding Fat-Tree (CLOS) Network Architecture for Data Centers
The article explains the Fat-Tree (CLOS) network topology introduced in 2008, describing its non‑convergent bandwidth design, three‑layer structure, practical benefits, common configurations, and limitations, while also providing references and visual illustrations of the architecture.
In 2008, the paper "A Scalable, Commodity Data Center Network Architecture" introduced a three‑tier CLOS‑based architecture called Fat‑Tree, the third major application of the CLOS model, this time in data‑center networking.
What is Fat‑Tree
Fat‑Tree is a non‑bandwidth‑convergent topology: unlike traditional tree networks where bandwidth shrinks toward the root, Fat‑Tree’s bandwidth remains constant from leaf to root, enabling a non‑blocking network.
To achieve non‑convergent bandwidth, every node (except the root) must have equal upward and downward bandwidth and provide line‑rate forwarding for access traffic.
Traditional single‑root and multi‑root tree topologies suffer from high cost (the root switches must carry enormous aggregate bandwidth) and from performance bottlenecks at the root, which cannot sustain large‑scale east‑west traffic such as MapReduce shuffles or bulk data copies.
Researchers proposed various topologies to solve root bottlenecks; Fat‑Tree has become widely used in recent research, as cited in the SIGCOMM paper by Al‑Fares, Loukissas, and Vahdat.
Practical Value of Fat‑Tree
Fat‑Tree is recognized in the industry as a way to build non‑blocking networks by using many low‑performance switches to achieve full bandwidth for any communication pattern, while keeping all switches identical and inexpensive.
It is a switch‑centric topology that scales horizontally, uses uniform port‑count switches, and reduces network construction costs.
The Fat‑Tree consists of three layers: core, aggregation, and access. A k‑ary Fat‑Tree has five key characteristics:
Each switch has k ports.
The core layer contains (k/2)² switches.
There are k pods; each pod has k/2 aggregation and k/2 access switches.
Each access switch connects to k/2 servers, so a k‑ary Fat‑Tree can host k³/4 servers in total.
Any two pods are connected by (k/2)² equal‑cost paths, one through each core switch.
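The five characteristics above can be checked numerically. The sketch below computes the switch and server counts of a k‑ary Fat‑Tree from those formulas; the function name `fat_tree_sizes` is illustrative, not from the source.

```python
def fat_tree_sizes(k: int) -> dict:
    """Switch and server counts for a k-ary Fat-Tree (k must be even)."""
    if k % 2 != 0:
        raise ValueError("k must be even")
    half = k // 2
    return {
        "core_switches": half * half,    # (k/2)^2 core switches
        "pods": k,                       # k pods
        "agg_per_pod": half,             # k/2 aggregation switches per pod
        "access_per_pod": half,          # k/2 access switches per pod
        "servers_per_access": half,      # k/2 servers per access switch
        "total_servers": k ** 3 // 4,    # k^3/4 servers overall
        "inter_pod_paths": half * half,  # equal-cost paths between two pods
    }

print(fat_tree_sizes(4))
```

For k=4 this yields 4 core switches, 4 pods, and 16 servers; with commodity 48‑port switches (k=48) the same formulas give 27,648 servers.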
k must be even; common small configurations include 2‑ary, 4‑ary, and 6‑ary structures.
In a k=4 Fat‑Tree, servers attached to the same access switch belong to the same subnet and communicate directly at layer 2; traffic between servers on different access switches must be routed.
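In the addressing scheme of the original paper, a host in pod p under access switch s gets an address of the form 10.p.s.ID, so two hosts share a subnet exactly when their pod and switch octets match. A minimal sketch of that check (the function name and sample addresses are illustrative):

```python
def same_access_subnet(host_a: str, host_b: str) -> bool:
    """True if two Fat-Tree hosts (addressed 10.pod.switch.id) sit under
    the same access switch and can communicate at layer 2; otherwise the
    traffic must be routed through the aggregation (or core) layer."""
    pod_a, sw_a = host_a.split(".")[1:3]
    pod_b, sw_b = host_b.split(".")[1:3]
    return (pod_a, sw_a) == (pod_b, sw_b)

# k = 4 example: 10.0.1.2 and 10.0.1.3 share an access switch,
# while 10.0.1.2 and 10.2.0.2 are in different pods and must be routed.
print(same_access_subnet("10.0.1.2", "10.0.1.3"))  # True
print(same_access_subnet("10.0.1.2", "10.2.0.2"))  # False
```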
Limitations of Fat‑Tree
Scalability is ultimately capped by the switch port count k (at most k³/4 servers), which hinders long‑term data‑center growth.
Fault tolerance within a pod is poor; failures of lower‑level switches severely affect service quality.
The topology does not efficiently support one‑to‑all or all‑to‑all communication patterns, making it less suitable for high‑performance distributed workloads like MapReduce or Dryad.
The high switch‑to‑server ratio keeps equipment costs relatively high.
Achieving a true 1:1 oversubscription ratio is difficult in practice: full bandwidth requires spreading traffic evenly across the many equal‑cost paths, and splitting a TCP flow across paths risks packet reordering.
References
M. Al‑Fares, A. Loukissas, and A. Vahdat, "A Scalable, Commodity Data Center Network Architecture," ACM SIGCOMM, 2008.
Fang Guijun, "Data Center Network Architecture Issues and Evolution: CLOS, Fat‑Tree, Spine‑Leaf," 2020.
Jinse Niu Shen, "Fat‑Tree Topo Architecture."
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.