
From OSI Model to RDMA: High‑Performance Networking, Leaf‑Spine Architecture, and Switch Selection

This article examines the evolution of network protocols from the OSI seven‑layer model and TCP/IP to RDMA technologies such as InfiniBand and RoCE, compares traditional three‑tier and leaf‑spine data‑center designs, and evaluates Ethernet, InfiniBand, and RoCE switches for high‑throughput, low‑latency HPC environments.

Architects' Tech Alliance

As computer networks have evolved, protocols have become increasingly critical to data exchange. The OSI seven‑layer model, introduced in the 1980s, standardized inter‑computer communication, while modern high‑performance computing (HPC) demands high throughput and low latency, driving a shift from traditional TCP/IP to RDMA technologies.

The OSI layers each serve specific functions: the Physical layer defines hardware signaling; the Data Link layer handles framing and error control; the Network layer handles logical addressing and routing (e.g., via IP addresses); the Transport layer ensures reliable end‑to‑end data flow; the Session layer manages connections; the Presentation layer performs data formatting and encryption; and the Application layer provides end‑user services such as email and file transfer.

Real‑world protocols often deviate from the pure OSI model. TCP/IP, for example, condenses the seven layers into four: Application, Transport, Internet, and Link, optimizing the stack for practical use.
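The conventional folding of the seven OSI layers into the four TCP/IP layers can be sketched as a simple lookup (layer names and groupings follow common usage):

```python
# Illustrative mapping of OSI layers onto the four-layer TCP/IP model.
# The three upper OSI layers collapse into Application; the two lowest
# collapse into Link.
OSI_TO_TCPIP = {
    "Application":  "Application",
    "Presentation": "Application",
    "Session":      "Application",
    "Transport":    "Transport",
    "Network":      "Internet",
    "Data Link":    "Link",
    "Physical":     "Link",
}

def tcpip_layer(osi_layer: str) -> str:
    """Return the TCP/IP layer that absorbs a given OSI layer."""
    return OSI_TO_TCPIP[osi_layer]

print(tcpip_layer("Session"))  # Session folds into Application
print(tcpip_layer("Network"))  # Network corresponds to Internet
```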

In HPC, the latency and CPU overhead of TCP/IP have led to the adoption of Remote Direct Memory Access (RDMA), which enables direct memory reads/writes over the network without OS intervention, delivering high throughput and low latency. RDMA variants include InfiniBand, RoCE, and iWARP, each with distinct technical and cost considerations.

Leaf‑spine architecture addresses the shortcomings of the traditional three‑tier data‑center design (access, aggregation, core). In the three‑tier model, the Spanning Tree Protocol blocks redundant links and traffic traverses multiple hops, producing wasted bandwidth, large fault domains, and added latency. Leaf‑spine collapses the hierarchy into two tiers, leaf switches (L1) and spine switches (L2), providing a flat, non‑blocking topology in which every leaf connects to every spine, enabling equal‑cost multi‑path (ECMP) routing and rapid fault isolation.
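The full-mesh and ECMP properties above reduce to simple arithmetic: with L leaves and S spines there are L × S inter-switch links, any leaf-to-leaf flow has S equal-cost paths, and the fabric is non-blocking when each leaf's uplink capacity matches its downlink capacity. A minimal sketch (switch counts and port splits here are hypothetical, assuming equal port speeds throughout):

```python
def fabric_links(num_leaves: int, num_spines: int) -> int:
    """Total leaf-spine links in a full mesh: every leaf uplinks to every spine."""
    return num_leaves * num_spines

def ecmp_paths(num_spines: int) -> int:
    """Equal-cost paths between any two leaves: one via each spine."""
    return num_spines

def is_nonblocking(downlinks_per_leaf: int, num_spines: int) -> bool:
    """Non-blocking (1:1 oversubscription) when each leaf has at least as
    many uplinks (one per spine, same port speed) as server-facing ports."""
    return num_spines >= downlinks_per_leaf

# Hypothetical fabric: 8 leaves, 4 spines, 20 server ports per leaf.
print(fabric_links(8, 4))     # 32 inter-switch links
print(ecmp_paths(4))          # 4 equal-cost paths between any leaf pair
print(is_nonblocking(20, 4))  # False: 4 uplinks cannot carry 20 downlinks
```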

NVIDIA’s SuperPOD exemplifies leaf‑spine in practice. A DGX A100 SuperPOD uses QM8790 switches (40 × 200 Gb/s ports) in a non‑blocking layout, with each DGX server connecting eight NICs to eight leaf switches. Scaling rules dictate a server‑to‑switch ratio of roughly 1:1.17 for DGX A100 and up to 1:0.50 for DGX H100, with larger deployments adding L1 switches to maintain bandwidth.
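The switch counts implied by such a layout can be estimated with a back-of-the-envelope calculator. This sketch assumes each 40-port switch is split 20 ports down / 20 ports up for a non-blocking fabric; these splits and the 20-server unit size are illustrative assumptions, not NVIDIA's published bill of materials:

```python
import math

def leaf_count(num_servers: int, nics_per_server: int = 8,
               down_ports_per_leaf: int = 20) -> int:
    """Leaves needed to terminate every server NIC, assuming each 40-port
    switch dedicates 20 ports to servers (hypothetical split)."""
    total_nic_ports = num_servers * nics_per_server
    return math.ceil(total_nic_ports / down_ports_per_leaf)

def spine_count(num_leaves: int, up_ports_per_leaf: int = 20,
                ports_per_spine: int = 40) -> int:
    """Spines needed to terminate every leaf uplink."""
    total_uplinks = num_leaves * up_ports_per_leaf
    return math.ceil(total_uplinks / ports_per_spine)

# Hypothetical 20-server scalable unit, 8 NICs per server:
leaves = leaf_count(20)       # 160 NIC ports / 20 down-ports = 8 leaves
spines = spine_count(leaves)  # 160 uplinks / 40 spine ports  = 4 spines
print(leaves, spines)
```

Note how the non-blocking constraint, not the switch price, drives the switch count: halving the oversubscription ratio roughly doubles the uplink ports (and thus spines) required.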

When selecting switches, Ethernet, InfiniBand, and RoCE each have trade‑offs:

Scalability – InfiniBand supports the largest node counts, scaling to tens of thousands of nodes in a single subnet.

Performance – InfiniBand delivers the lowest latency and highest raw bandwidth; RoCE leverages existing Ethernet infrastructure for better efficiency than TCP/IP; Ethernet with TCP/IP incurs higher CPU overhead.

Manageability – TCP/IP over Ethernet is familiar and easier to administer, while InfiniBand requires specialized hardware and expertise.

Cost – InfiniBand ports and switches are typically more expensive; RoCE and Ethernet offer more cost‑effective solutions.

Equipment – RoCE and TCP/IP run on standard Ethernet switches; InfiniBand requires dedicated IB switches.
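The trade-offs above can be condensed into a toy comparison table with a filter over it. The attribute values below are coarse illustrations drawn from the bullets, not benchmark data:

```python
# Coarse summary of the fabric trade-offs discussed above (illustrative).
FABRICS = {
    "InfiniBand": {"latency": "lowest", "cpu_overhead": "low",
                   "cost": "high",   "reuses_ethernet": False},
    "RoCE":       {"latency": "low",    "cpu_overhead": "low",
                   "cost": "medium", "reuses_ethernet": True},
    "TCP/IP":     {"latency": "higher", "cpu_overhead": "high",
                   "cost": "low",    "reuses_ethernet": True},
}

def candidates(require_ethernet_reuse: bool = False,
               max_cost: str = "high") -> list[str]:
    """Filter fabrics by whether they run on standard Ethernet switches
    and by a coarse cost ceiling."""
    order = {"low": 0, "medium": 1, "high": 2}
    return [name for name, attrs in FABRICS.items()
            if (not require_ethernet_reuse or attrs["reuses_ethernet"])
            and order[attrs["cost"]] <= order[max_cost]]

print(candidates(require_ethernet_reuse=True))  # rules out InfiniBand
print(candidates(max_cost="medium"))            # rules out InfiniBand on cost
```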

Enterprises must weigh performance requirements against budget and operational complexity when choosing between InfiniBand, RoCE, and Ethernet for modern data‑center interconnects.

Tags: High Performance Computing, network protocols, RDMA, InfiniBand, leaf-spine, data center architecture
Written by

Architects' Tech Alliance

Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.
