Dragonfly Network Topology and Routing Algorithms for High‑Performance Data Centers
The article explains the Dragonfly network topology, its hierarchical structure, key parameters, routing algorithms (Minimal, Non‑Minimal, UGAL variants) and deadlock avoidance techniques, highlighting how modern data‑center networks address latency bottlenecks in high‑performance computing environments.
According to Hyperion Research, 28‑38 E‑class or near‑E‑class supercomputers will be built worldwide between 2021 and 2026. This article, referencing the "Bus‑Level Data Center Network Technology White Paper," examines how, after compute and storage performance improve, network latency becomes the dominant end‑to‑end delay in data centers.
Network New Topology Architecture
Traditional CLOS architectures prioritize generality at the expense of latency and cost‑performance. New topologies aim to reduce hop count by about 20 % for high‑performance computing workloads, supporting ultra‑large scale traffic with strict static‑delay requirements.
Dragonfly, proposed by John Kim et al. in 2008 ("Technology‑Driven, Highly‑Scalable Dragonfly Topology"), features a small network diameter and low cost, and is used in Cray XC supercomputers.
Topology Structure
A simple Dragonfly network consists of three hierarchical layers:
Switch layer: one switch connected to p compute nodes.
Group layer: a Switch layers; each switch in the group is fully connected (all‑to‑all) with a‑1 links to the other switches.
System layer: g Group layers, also fully connected.
For a single switch, ports are allocated as follows: p ports to compute nodes, a‑1 ports to other switches within the group, and h ports to switches in other groups.
From these parameters we derive key network properties:
Switch port count: k = p + (a‑1) + h
Number of groups: g = a·h + 1
Total compute nodes: N = a·p·(a·h + 1)
If each group of switches is collapsed into a single virtual switch, its port count becomes k' = a·(p + h)
A balanced configuration often uses the relationship a = 2·p = 2·h .
Routing Algorithms
Dragonfly supports two main routing families:
Minimal Routing (MIN) : shortest‑path routing, typically traversing at most one global link and two local links (max three hops).
Non‑Minimal Routing (VAL / VLB) : Valiant‑style routing that first selects a random intermediate group, potentially using up to two global links and three local links (max five hops).
UGAL (Universal Globally‑Adaptive Load‑balanced) : chooses between MIN and VAL based on the sum of queue lengths on the candidate paths.
Several UGAL variants are described:
UGAL‑G: decision based on queue lengths of all switches in the source group.
UGAL‑L: decision based only on local switch queues, which can bias toward MIN.
UGAL‑LVC: separates MIN and VAL into virtual channels (VCs) to improve load balance.
UGAL‑LVC_H: uses VC‑based decision only when MIN and VAL exit ports differ.
UGAL‑LCR: mitigates increased latency at large buffers by adjusting credit return delays.
Deadlock Avoidance
Because Dragonfly’s topology creates many potential cycles, deadlock avoidance requires multiple virtual channels: two VCs for Minimal routing and three VCs for Non‑Minimal routing.
References:
https://ngdcn.com/post/208.html
https://www.cnblogs.com/Nreyab/p/15590684.html
http://blog.sysu.tech
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.